Integrated Gradients — Papers

1 paper found

Improving Fairness in AI-Powered Recruitment: An Interpretable Resume Screening System

Natalia Agapova, Rustam A. Lukmanov

Mathematics & AI · Vol. 1, No. 2 · May 2026

Modern automated resume screening systems are typically based on neural text classification models that encode a resume as a feature representation and predict a discrete label corresponding to candidate category, suitability level, or job role. Such models commonly produce class logits parameterized by model weights, which are converted into class probabilities via the softmax function over the target classes. These models are typically trained using cross-entropy loss and deployed as the first stage of automated candidate filtering. Despite their effectiveness, resume classifiers may encode implicit bias through correlations between predictions and non-job-related or proxy textual features. To study this effect, we analyze feature influence using Integrated Gradients, which assigns an attribution score to each input feature based on the path integral of partial derivatives between a baseline representation and the actual input. This analysis reveals systematic dependencies on features that should be irrelevant to candidate evaluation. Building on these observations, we evaluate multiple debiasing techniques and propose an interpretability-guided framework for bias mitigation. We compare six methods spanning in-processing approaches (GroupDRO, Focal Loss, Label Smoothing, Adversarial Debiasing) and attribution-based approaches (Data Scrubbing, Attention Regularization) that leverage the interpretability findings directly. This allows explainable analysis to guide the development of fairer resume screening models.
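
The attribution step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain linear softmax classifier in place of the paper's neural model, an all-zeros baseline, and a midpoint Riemann sum in place of the exact path integral. The names `softmax`, `class_prob_grad`, and `integrated_gradients` are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over class logits.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def class_prob_grad(W, b, x, target):
    # Gradient of the softmax probability p[target] with respect to the
    # input x, for logits z = W @ x + b (assumed linear model).
    p = softmax(W @ x + b)
    # dp_t/dz_k = p_t * (1[k == t] - p_k)
    dz = -p[target] * p
    dz[target] += p[target]
    # Chain rule through z = W @ x + b.
    return W.T @ dz

def integrated_gradients(W, b, x, baseline, target, steps=200):
    # Approximate the path integral of gradients along the straight line
    # from baseline to x with a midpoint Riemann sum.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([
        class_prob_grad(W, b, baseline + a * (x - baseline), target)
        for a in alphas
    ])
    # One attribution score per input feature.
    return (x - baseline) * grads.mean(axis=0)

# Toy example: 3 classes, 4 input features (random, illustrative values).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
x = rng.normal(size=4)
baseline = np.zeros(4)

attributions = integrated_gradients(W, b, x, baseline, target=1)
```

A useful sanity check is the completeness property of Integrated Gradients: the attributions should sum, up to discretization error, to the difference between the target-class probability at the input and at the baseline.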

algorithmic fairness · resume screening · bias mitigation · Integrated Gradients · BERT · interpretable machine learning · explainable AI · NLP · human resources

DOI: 10.66693/mathai.1017
