Use case #0002

Drift Detection: When Does Your Credit Model Need Retraining?

A credit model does not break dramatically — it degrades quietly. The Gini coefficient slips from 0.68 to 0.64 over eight months while everyone is watching the NPA ratio. The variable that used to predict default reliably has shifted distribution because the economy changed. By the time the degradation is visible in portfolio outcomes, thousands of decisions have already been made with a model that no longer describes the world it is scoring. The Model Validation AI catches the drift when it forms, not when it matures.

The Two Types of Model Drift — and Why They Require Different Responses

Drift is not a single phenomenon. Credit models can degrade in two fundamentally different ways, and diagnosing which type of drift is occurring determines whether the correct response is a full retrain, a recalibration, a variable replacement, or simply enhanced monitoring.

Type 1 · Concept Drift

The Relationship Between Features and Default Has Changed

A borrower with a 720 CIBIL score and ₹8L annual income was a certain level of credit risk during the 2019–2022 economic period. In the 2023–2025 rate hike and inflationary environment, the same borrower profile carries a materially different risk level. The feature values have not changed — the statistical relationship between those features and default has changed. This is concept drift: the model's learned function no longer describes current reality.

Response: Full model retrain on recent data is required. Recalibration alone is insufficient.

Type 2 · Data / Covariate Drift

The Distribution of Input Features Has Shifted

The model was trained on a population where 35% of borrowers were self-employed. The current origination mix is 54% self-employed — a systematic shift in who is applying. The model's learned relationships may still be valid, but the inputs it is receiving look different from the inputs it was trained on. This is data drift: the model is being asked to score a population it was not trained to represent well.

Response: Segment-specific recalibration may suffice; a full retrain is required if the drift is severe or persistent.

"The model that needs retraining is not the one that has stopped working. It is the one that has silently started working differently — giving the same score to borrowers who no longer carry the same risk."

The Drift Detection Battery: What the AI Monitors and What Each Metric Means

| Metric | What It Measures | Green Zone | Yellow Zone | Red Zone (Retrain) | Current Status |
|---|---|---|---|---|---|
| Population Stability Index (PSI) | Overall input feature distribution shift vs training data | < 0.10 | 0.10–0.25 | > 0.25 | 0.28 — retrain zone |
| Gini coefficient trend | Discriminatory power — ability to rank-order risk | < 5% decline from baseline | 5–10% decline | > 10% decline | −8.8% from baseline (0.68 → 0.62) |
| Prediction-to-actual ratio | Whether the model predicts the right default rate | 0.90–1.10 | 0.80–0.90 or 1.10–1.25 | < 0.80 or > 1.25 | 1.20 — approaching red (model under-predicting risk) |
| KS statistic | Maximum separation between good and bad score distributions | < 10% decline from baseline | 10–20% decline | > 20% decline | −11.4% from baseline (0.44 → 0.39) |
| Characteristic Stability Index — top 3 variables | Distribution shift on highest-weight model inputs | All CSI < 0.10 | Any CSI 0.10–0.25 | Any CSI > 0.25 | Employment sector CSI = 0.31 |
| Score distribution shift | Whether the average score has changed without a population change | < 5-point mean shift | 5–10-point mean shift | > 10-point mean shift or bimodal emergence | +3.2-point mean shift — green |
| Approval rate by score decile | Whether cut-off policies produce the same risk composition | ±2% per decile from baseline | ±2–5% per decile | > 5% shift in top or bottom decile | Decile 1 (riskiest) approval rate +4.8% |
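The PSI at the top of the battery reduces to a short calculation. The sketch below is a minimal illustration, not the Model Validation AI's implementation; the key design point is that bin edges are fixed from the training-era sample, so the training and live populations are compared on the same grid.

```python
import numpy as np

def psi(expected, actual, bins=10):
    """Population Stability Index between the training-era sample
    (`expected`) and the current sample (`actual`) of one feature,
    or of the model score itself."""
    # Decile edges come from the training distribution only.
    edges = np.percentile(expected, np.linspace(0, 100, bins + 1))
    # Clip both samples into the training range so no mass falls off the grid.
    e_cnt, _ = np.histogram(np.clip(expected, edges[0], edges[-1]), bins=edges)
    a_cnt, _ = np.histogram(np.clip(actual, edges[0], edges[-1]), bins=edges)
    e_pct = np.clip(e_cnt / len(expected), 1e-6, None)  # floor avoids log(0)
    a_pct = np.clip(a_cnt / len(actual), 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_scores = rng.normal(650, 60, 50_000)  # training-era score distribution
live_scores = rng.normal(665, 75, 20_000)   # shifted live distribution
print(round(psi(train_scores, live_scores), 3))  # a nonzero PSI reflecting the shift
```

An unchanged population produces a PSI of essentially zero; the mean and variance shift above pushes it well past the green threshold.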

The Variable-Level Diagnosis: Which Inputs Are Drifting Most

When the PSI signals population drift at the model level, the Model Validation AI drills to the variable level — computing the Characteristic Stability Index (CSI) for every input variable to identify which specific features are driving the shift. This diagnostic is what determines whether a full retrain is necessary or whether the drift can be addressed by replacing or recalibrating the affected variable.

| Input Variable | Model Weight Rank | CSI Value | Drift Zone | Root Cause | Action |
|---|---|---|---|---|---|
| Employment sector category | #1 (highest weight) | 0.31 | Red — critical | SE manufacturing sector contraction — fewer manufacturing SE applicants | Full retrain required |
| CIBIL score band | #2 | 0.18 | Yellow — monitor | Post-pandemic bureau score distribution shift — more mid-range scorers | Monitor; recalibrate if CSI exceeds 0.25 |
| Monthly income (bank statement) | #3 | 0.14 | Yellow — monitor | Inflation-driven nominal income growth — income bands shifted upward | Income band recalibration candidate |
| Loan-to-value ratio | #4 | 0.07 | Green — stable | LTV distribution stable | No action required |
| Tenure of employment (years) | #5 | 0.06 | Green — stable | Stable | No action required |
| GST filing regularity | #6 | 0.27 | Red — drifting | GST late filing increased across SE borrowers post-rate-hike stress | Variable recalibration required |
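The CSI is the same formula as the PSI applied to a single input variable; for a categorical input such as employment type, the bins are simply the categories. A minimal sketch, using the 35% to 54% self-employed shift described earlier as an illustrative mix:

```python
import math
from collections import Counter

def csi_categorical(expected, actual):
    """Characteristic Stability Index for a categorical variable:
    the PSI formula applied to category shares instead of score bins."""
    e_cnt, a_cnt = Counter(expected), Counter(actual)
    cats = set(e_cnt) | set(a_cnt)
    e_n, a_n = len(expected), len(actual)
    total = 0.0
    for c in cats:
        e = max(e_cnt[c] / e_n, 1e-6)  # floor avoids log(0) for unseen categories
        a = max(a_cnt[c] / a_n, 1e-6)
        total += (a - e) * math.log(a / e)
    return total

train_mix = ["salaried"] * 65 + ["self_employed"] * 35  # training population
live_mix = ["salaried"] * 46 + ["self_employed"] * 54   # current origination mix
print(round(csi_categorical(train_mix, live_mix), 3))   # 0.148, inside the yellow zone (0.10-0.25)
```

A mix shift that large on a low-weight variable would only warrant monitoring; the same shift on the #1-weighted variable is what pushes the framework toward a retrain.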

The Retraining Decision Framework: Four Paths

Path A · Monitoring Only

All Metrics Green — Continue Standard Monitoring

PSI below 0.10, Gini decline below 5%, prediction-to-actual ratio between 0.90 and 1.10, all CSI values green. No intervention required. Monthly monitoring continues. Quarterly review with Board Risk Committee noting continued model health.

Path B · Variable Recalibration

1–2 Yellow Metrics, Yellow CSI on Non-Critical Variables

Targeted recalibration of the drifting variable(s) without rebuilding the full model. Income band boundaries adjusted for inflation. Employment sector weights adjusted for current mix. Recalibration takes 2 to 4 weeks and does not require a full model rebuild or Board Risk Committee approval — only Credit Risk Committee sign-off. Champion-challenger test of recalibrated vs current model runs for 8 weeks before adoption.

Path C · Full Model Retrain

PSI Above 0.25, or Red CSI on High-Weight Variable, or Gini Decline Above 10%

A new model is built from scratch on a recent training window (typically the last 24–30 months of originations with mature outcome data). The new model is validated by the Model Validation AI, tested as a challenger against the incumbent champion, and — when it demonstrates statistically significant superiority — promoted via the Board Risk Committee governance process. Estimated time from retraining decision to production: 12 to 18 weeks including challenger test period.
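The champion-challenger comparison can be framed as a paired bootstrap on the Gini difference. The sketch below is one reasonable test design, not necessarily the platform's: the 95% win-rate promotion criterion and the tie handling in the rank formula are simplifying assumptions.

```python
import numpy as np

def gini(y, scores):
    """Gini = 2*AUC - 1, via the Mann-Whitney rank formula (ties ignored)."""
    order = np.argsort(scores)
    y_sorted = np.asarray(y)[order]
    n_pos = int(y_sorted.sum())
    n_neg = len(y_sorted) - n_pos
    ranks = np.arange(1, len(y_sorted) + 1)
    auc = (ranks[y_sorted == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
    return 2 * auc - 1

def challenger_wins(y, champ, chall, n_boot=500, win_rate=0.95, seed=0):
    """Paired bootstrap: promote the challenger only if its Gini beats
    the champion's on at least `win_rate` of resamples."""
    rng = np.random.default_rng(seed)
    n = len(y)
    wins = 0
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)  # the same resample scores both models (paired)
        if gini(y[idx], chall[idx]) > gini(y[idx], champ[idx]):
            wins += 1
    return wins / n_boot >= win_rate
```

With a clearly stronger challenger this returns True; with overlapping performance it refuses promotion, which is the conservative default a governance process wants.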

Path D · Emergency Governance

PSI Above 0.30, or Prediction-to-Actual Exceeds 1.30 (Model Severely Underestimating Risk)

The model has degraded to the point where it is materially mispricing risk on current originations. Emergency governance is triggered: the Board Risk Committee is notified within 24 hours; origination standards are tightened manually pending model replacement (conservative manual overlays applied); the retrain is prioritised above all other model development work; and the incident is documented in the model risk register as a governance event requiring root cause analysis and remediation plan.
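The four paths reduce to a threshold cascade. The sketch below encodes the cut-offs quoted above on a single metric snapshot; a production system would also weigh persistence across monitoring cycles, and the yellow-zone logic here is a simplification of the full battery.

```python
def retraining_path(psi, gini_decline_pct, pa_ratio, max_csi_high_weight):
    """Map one snapshot of the drift battery to a response path A-D."""
    # Path D: emergency governance -- the model is materially mispricing risk.
    if psi > 0.30 or pa_ratio > 1.30:
        return "D"
    # Path C: full retrain -- PSI red, Gini decline > 10%, or red CSI
    # on a high-weight variable.
    if psi > 0.25 or gini_decline_pct > 10 or max_csi_high_weight > 0.25:
        return "C"
    # Path B: targeted recalibration on yellow signals.
    yellow = (
        0.10 < psi <= 0.25
        or 5 < gini_decline_pct <= 10
        or 0.10 < max_csi_high_weight <= 0.25
        or not (0.90 <= pa_ratio <= 1.10)
    )
    if yellow:
        return "B"
    # Path A: all green -- monitoring only.
    return "A"

# Current status from the drift battery: PSI 0.28, Gini -8.8%, P/A 1.20, CSI 0.31
print(retraining_path(0.28, 8.8, 1.20, 0.31))  # "C": full retrain
```

Note the ordering: the emergency check runs first, so a PSI of 0.32 never silently falls through to the routine retrain path.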

The Retraining Brief the AI Generates

When the Model Validation AI recommends a retrain (Path C or D), it does not simply raise an alert. It generates a complete retraining brief: the specific drift metrics that triggered the recommendation, the variable-level CSI diagnosis identifying which inputs have shifted most, the recommended training data window (start and end date, rationale for the window choice), the borrower segments where drift is most pronounced (and therefore where the retrained model should be most carefully validated), and the governance timeline from retrain initiation to Board Risk Committee approval.
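A brief like this is naturally a structured record. The sketch below shows one plausible shape; the field names, dates, and schema are illustrative assumptions, not the product's actual format.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class RetrainingBrief:
    """Illustrative structure for an auto-generated retraining brief.
    Field names and example values are assumptions, not a real schema."""
    trigger_metrics: dict      # drift metrics that fired, e.g. {"PSI": 0.28}
    drifting_variables: dict   # variable name -> CSI value
    training_window: tuple     # (start, end) of the recommended window
    window_rationale: str      # why this window was chosen
    focus_segments: list       # segments needing the most careful validation
    governance_path: str       # "C" (full retrain) or "D" (emergency)

# Example populated from the drift battery above; dates are hypothetical.
brief = RetrainingBrief(
    trigger_metrics={"PSI": 0.28, "gini_decline_pct": 8.8, "pa_ratio": 1.20},
    drifting_variables={"employment_sector": 0.31, "gst_filing_regularity": 0.27},
    training_window=(date(2023, 1, 1), date(2025, 6, 30)),
    window_rationale="most recent originations with mature outcome data",
    focus_segments=["self-employed manufacturing borrowers"],
    governance_path="C",
)
```

Keeping the brief as data rather than free text is what lets it feed the model risk register and the governance timeline without re-keying.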

This brief is the starting document for the model development team — not an instruction (the team may reach different conclusions on architecture or variable selection), but a structured evidence base that eliminates the weeks of exploratory analysis that typically precede a retrain decision. The decision to retrain has already been made on evidence. What remains is execution.

7 drift metrics monitored continuously — PSI, Gini, KS, P/A ratio, CSI, score distribution, approval decile
0.25 PSI threshold triggers a formal retrain recommendation — the current model sits at 0.28
4 response pathways — monitor only / variable recalibration / full retrain / emergency governance
Complete retraining brief auto-generated — training window, affected variables, governance timeline

The Decision to Retrain Should Never Be Surprising — It Should Always Be Expected

Model drift is not an exception — it is a certainty. Every credit model trained on historical data will eventually stop describing a world that has continued to change. The only question is whether the institution detects this degradation early, when a targeted intervention is sufficient, or late, when emergency governance and manual overlays are the only tools available. The Model Validation AI makes early detection the default outcome — transforming retraining from an episodic emergency into a routine, evidence-based governance decision that the Board Risk Committee can anticipate, approve, and track without the institutional disruption of an unplanned model failure.
