Published Papers

Improving Fairness in AI-Powered Recruitment: An Interpretable Resume Screening System
Mathematics & AI · May 2026
Modern automated resume screening systems are typically based on neural text classification models that encode a resume as a feature representation and predict a discrete label corresponding to candidate category, suitability level, or job role. Such models commonly produce class logits parameterized by model weights, which are converted into class probabilities via the softmax function over the target classes. These models are typically trained using cross-entropy loss and deployed as the first stage of automated candidate filtering. Despite their effectiveness, resume classifiers may encode implicit bias through correlations between predictions and non-job-related or proxy textual features. To study this effect, we analyze feature influence using Integrated Gradients, which assigns an attribution score to each input feature by integrating the partial derivatives of the model output along a path from a baseline representation to the actual input. This analysis reveals systematic dependencies on features that should be irrelevant to candidate evaluation. Building on these observations, we evaluate multiple debiasing techniques and propose an interpretability-guided framework for bias mitigation. We compare six methods spanning in-processing approaches (GroupDRO, Focal Loss, Label Smoothing, Adversarial Debiasing) and attribution-based approaches (Data Scrubbing, Attention Regularization) that directly leverage the interpretability findings. This allows explainable analysis to guide the development of fairer resume screening models.
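As an illustration of the attribution step described in the abstract, the following is a minimal sketch of Integrated Gradients on a toy linear softmax classifier. It is not the paper's model or code: the weights `W`, bias `b`, input `x`, the all-zeros baseline, and every function name are assumptions chosen for illustration, and the path integral is approximated with a midpoint Riemann sum over a straight-line path.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over class logits.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def class_prob(x, W, b, c):
    # Probability of class c for a linear classifier with logits W @ x + b.
    return softmax(W @ x + b)[c]

def grad_class_prob(x, W, b, c):
    # Analytic gradient of p_c with respect to x:
    # d p_c / d x = p_c * (W[c] - sum_k p_k * W[k]).
    p = softmax(W @ x + b)
    return p[c] * (W[c] - p @ W)

def integrated_gradients(x, baseline, W, b, c, steps=200):
    # Midpoint-rule approximation of the path integral of gradients
    # along the straight line from the baseline to the input.
    alphas = (np.arange(steps) + 0.5) / steps
    grads = np.stack([
        grad_class_prob(baseline + a * (x - baseline), W, b, c)
        for a in alphas
    ])
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical toy setup: 5 input features, 3 target classes.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))
b = rng.normal(size=3)
x = rng.normal(size=5)
baseline = np.zeros(5)  # assumed baseline; the paper's choice is not stated here

attr = integrated_gradients(x, baseline, W, b, c=1)

# Completeness check: attributions should sum (approximately) to the
# change in class probability between the baseline and the input.
gap = attr.sum() - (class_prob(x, W, b, 1) - class_prob(baseline, W, b, 1))
print("attributions:", attr)
print("completeness gap:", gap)
```

The completeness check is a common sanity test for an Integrated Gradients implementation: a large gap usually means too few integration steps. In a real text classifier the gradients would come from automatic differentiation over token embeddings rather than from a closed-form expression.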