This function computes aggregate statistics to detect bias in machine learning models by analyzing performance disparities across demographic groups. It calculates key fairness metrics such as demographic parity, equalized odds, and disparate impact ratios to assess compliance with ethical AI standards. The system aggregates predictions and ground-truth labels per subgroup, then applies statistical tests to identify significant deviations from the fairness thresholds defined in enterprise governance policies.
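As a minimal sketch of the two headline metrics, the demographic parity and disparate impact calculations can be expressed as selection-rate comparisons. The group names, sample predictions, and choice of reference group below are illustrative, not taken from the actual system:

```python
# Hypothetical sketch: selection rates and disparate impact ratios for a
# binary classifier, per protected group. Group names are illustrative.

def selection_rate(preds):
    """Fraction of positive (label 1) predictions in a group."""
    return sum(preds) / len(preds)

def disparate_impact(preds_by_group, reference_group):
    """Ratio of each group's selection rate to the reference group's rate."""
    ref_rate = selection_rate(preds_by_group[reference_group])
    return {group: selection_rate(p) / ref_rate
            for group, p in preds_by_group.items()}

# Example: group_a selection rate 0.75, group_b selection rate 0.5.
preds = {"group_a": [1, 1, 0, 1], "group_b": [1, 0, 0, 1]}
ratios = disparate_impact(preds, "group_a")
```

Under the widely used four-fifths rule, a ratio below 0.8 (here, group_b's 0.667) would be flagged for review.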
The function analyzes model outputs segmented by protected attributes such as age, gender, or ethnicity.
It computes specific fairness indicators, including true positive rate parity and false negative rate equality.
Results are compared against predefined regulatory thresholds so that potential algorithmic discrimination risks are flagged immediately.
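The true positive rate and false positive rate comparisons described above can be sketched as an equalized-odds gap: the largest disparity in each rate across groups. The group labels and sample data are illustrative assumptions:

```python
# Hypothetical sketch of an equalized-odds check: per-group TPR/FPR,
# then the maximum gap across groups. Data below is illustrative.

def rates(y_true, y_pred):
    """True positive rate and false positive rate for one group."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr

def equalized_odds_gap(groups):
    """Max absolute TPR gap and FPR gap across all groups."""
    tprs, fprs = zip(*(rates(t, p) for t, p in groups.values()))
    return max(tprs) - min(tprs), max(fprs) - min(fprs)

groups = {
    "a": ([1, 1, 0, 0], [1, 1, 0, 1]),  # TPR 1.0, FPR 0.5
    "b": ([1, 1, 0, 0], [1, 0, 0, 0]),  # TPR 0.5, FPR 0.0
}
tpr_gap, fpr_gap = equalized_odds_gap(groups)
```

A gap near zero on both rates indicates the classifier satisfies equalized odds; either gap exceeding the governance threshold would trigger a flag.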
Extract protected attribute labels from the training and inference datasets.
Segment model predictions into distinct groups based on demographic identifiers.
Compute accuracy, precision, recall, and F1 scores for each subgroup independently.
Apply statistical significance testing to determine if disparities are non-random.
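The steps above can be sketched end to end: per-subgroup classification metrics, followed by a significance test on the disparity. A two-proportion z-test is used here as one plausible choice of test; the source does not specify which test the system applies, and the sample counts are illustrative:

```python
# Hypothetical sketch of steps 3-4: per-subgroup metrics, then a
# two-proportion z-test on the accuracy disparity between two groups.
import math

def subgroup_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for one subgroup."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1}

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: both groups share one underlying success rate."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se

# Illustrative: 80/100 correct in group A vs 60/100 in group B.
z = two_proportion_z(80, 100, 60, 100)
significant = abs(z) > 1.96  # two-sided test at the 5% level
```

Here the disparity is statistically significant (|z| exceeds 1.96), so it would not be dismissed as random variation.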
Secure upload of labeled datasets containing protected attribute fields for subgroup analysis.
Real-time visualization of fairness scores and deviation heatmaps for regulatory review.
Automated alerts triggered when computed metrics exceed acceptable bias tolerance limits.
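A minimal sketch of the automated trigger logic, assuming a configurable tolerance table; the metric names, threshold values, and the four-fifths cutoff for disparate impact are illustrative assumptions, not the system's actual configuration:

```python
# Hypothetical trigger check: flag metrics outside configured tolerances.
# Thresholds below are illustrative, not real governance policy values.
TOLERANCES = {"demographic_parity_diff": 0.1, "disparate_impact": 0.8}

def check_bias_alerts(metrics):
    """Return the names of metrics that breach their tolerance bands."""
    alerts = []
    # Parity difference: any absolute deviation beyond the tolerance alerts.
    if abs(metrics.get("demographic_parity_diff", 0.0)) > TOLERANCES["demographic_parity_diff"]:
        alerts.append("demographic_parity_diff")
    # Disparate impact follows the four-fifths rule: ratios below 0.8 alert.
    if metrics.get("disparate_impact", 1.0) < TOLERANCES["disparate_impact"]:
        alerts.append("disparate_impact")
    return alerts

alerts = check_bias_alerts(
    {"demographic_parity_diff": 0.15, "disparate_impact": 0.7}
)
```

Both metrics breach their bands in this example, so both names are returned and would feed the downstream alerting pipeline.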