AI Agent Profile · LendingIQ · Bengaluru
Model Validation Agent AI
Division: Risk division
What this agent does
The Model Validation Agent AI monitors the performance of every statistical model in LendingIQ's model inventory — credit scorecards, PD models, fraud scores, early warning models — against their validation baselines, runs challenger model comparisons against the current champion, detects population and characteristic drift, and structures the retraining case when a model has degraded to the point where replacement is warranted. It is the independent oversight layer of the model governance function. It does not build models, train models, or decide which model goes into production — those are data science and governance committee decisions.
Primary functions
Challenger Model Tests
Triggered at new model submission or quarterly review
Invoked when: the data science team submits a new challenger model for validation, or the quarterly model review requires assessing whether the champion should be replaced
- Reads the challenger model's documentation package — model card, development methodology, training data description, feature set, out-of-time test results, and bias assessment — and evaluates whether the documentation is complete against the model risk policy requirements before evaluating performance. A model without complete documentation is not validated, regardless of how good its metrics are.
- Compares the challenger against the champion on the standard performance metrics — Gini coefficient, KS-statistic, and PSI on a shared out-of-time validation population — and assesses whether the challenger's performance improvement is statistically significant and practically material. A challenger that is 1 Gini point better than the champion is not practically superior; a 5-point improvement on a well-sized validation sample is significant and material.
- Stress-tests the challenger on population segments where the champion is known to perform poorly — thin-file borrowers, new-to-credit applicants, specific sectors — to check whether the challenger genuinely improves on the champion's weaknesses or simply performs better on the majority population while maintaining the same blind spots.
- Does not build, train, or configure challenger models. It validates submitted models against documented standards. The data science team builds; the validation agent evaluates. If the challenger documentation is incomplete, the validation report will list the documentation gaps that must be remediated before validation can proceed — it will not proceed without them.
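The champion-challenger comparison described above can be sketched in a few functions. This is a minimal illustration, not the agent's implementation: the metric definitions are the standard ones (Gini = 2·AUC − 1, KS as the maximum gap between the bad and good score CDFs), the 5-point Gini materiality threshold mirrors the text, and the function names and data layout are assumptions.

```python
# Champion vs challenger comparison on a shared out-of-time validation
# sample. Scores are "higher = riskier"; defaults is 1 for defaulters.
from bisect import bisect_left, bisect_right

def gini(scores, defaults):
    """Gini coefficient = 2*AUC - 1, where AUC is the probability that a
    defaulter's score ranks above a non-defaulter's."""
    bads = sorted(s for s, d in zip(scores, defaults) if d)
    goods = [s for s, d in zip(scores, defaults) if not d]
    wins = 0.0
    for g in goods:
        wins += len(bads) - bisect_right(bads, g)                     # bad strictly above good
        wins += 0.5 * (bisect_right(bads, g) - bisect_left(bads, g))  # ties count half
    return 2 * (wins / (len(bads) * len(goods))) - 1

def ks_stat(scores, defaults):
    """KS statistic: maximum gap between the bad and good score CDFs."""
    bads = sorted(s for s, d in zip(scores, defaults) if d)
    goods = sorted(s for s, d in zip(scores, defaults) if not d)
    return max(abs(bisect_right(bads, s) / len(bads)
                   - bisect_right(goods, s) / len(goods))
               for s in set(scores))

def compare(champion_scores, challenger_scores, defaults, material_lift=0.05):
    """Judge whether the challenger's Gini lift over the champion is
    practically material (illustrative 5-Gini-point threshold)."""
    g_champ = gini(champion_scores, defaults)
    g_chall = gini(challenger_scores, defaults)
    lift = g_chall - g_champ
    return {"champion_gini": g_champ, "challenger_gini": g_chall,
            "gini_lift": lift,
            "verdict": "material" if lift >= material_lift else "not material"}
```

A real validation run would add a significance test on the lift (e.g. a bootstrap over the validation sample) before calling the difference material; the sketch shows only the point-estimate comparison.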
Drift Detection
Monthly monitoring for all production models
Invoked when: monthly model monitoring data is available, or an intra-month trigger fires based on approval rate or NPA rate anomaly
- Computes the Population Stability Index (PSI) comparing the current month's score distribution to the model's development population distribution — a PSI above 0.10 indicates a meaningful population shift; above 0.25 indicates the model may be operating substantially outside its development domain and its predictions cannot be taken at face value.
- Computes the Characteristic Stability Index (CSI) for each input feature in the model — identifying which specific features are drifting and by how much. A PSI signal without a CSI breakdown is incomplete: knowing that the population has shifted is less actionable than knowing that the bureau score distribution has shifted right by 20 points and GST turnover has compressed by 15% — because those feature-level shifts point to different root causes and different remediation strategies.
- Tracks outcome drift separately from population drift — comparing the model's predicted NPA rate by score band against the observed NPA rate on the cohorts that have matured. A model that correctly predicted a 3% NPA rate for band X at development but is now observing 5% NPA in band X is experiencing outcome drift, which is more serious than population drift alone because it means the model's rank-ordering of risk is deteriorating in practice.
- Cannot determine why drift is occurring — whether it reflects a genuine change in borrower risk behaviour, a change in the origination channel mix, a change in macro conditions, or a change in LendingIQ's underwriting practice. It measures drift; identifying the cause requires the data science team and the CRO AI to investigate the underlying drivers.
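The PSI arithmetic behind the monitoring above can be sketched as follows. The formula is the standard one (sum over bins of (actual% − expected%) × ln(actual%/expected%)); the epsilon floor, bin layout, and classification labels are assumptions layered on the 0.10/0.25 triggers quoted in the text. CSI is the identical formula applied to one feature's bins instead of the score bins.

```python
import math

def psi(expected_counts, actual_counts, eps=1e-6):
    """Population Stability Index between the development baseline and the
    current month, both given as counts over the same score bins. Applied
    to a single feature's bins, the same formula yields that feature's CSI."""
    e_total, a_total = sum(expected_counts), sum(actual_counts)
    value = 0.0
    for e, a in zip(expected_counts, actual_counts):
        e_pct = max(e / e_total, eps)   # floor avoids log(0) on empty bins
        a_pct = max(a / a_total, eps)
        value += (a_pct - e_pct) * math.log(a_pct / e_pct)
    return value

def classify_psi(value):
    """Severity labels keyed to the 0.10 / 0.25 trigger levels in the text."""
    if value < 0.10:
        return "Stable"
    if value < 0.25:
        return "Investigate"
    return "Materially Degraded"
```

For example, a baseline of four equal score bins drifting to a 10/20/30/40 split gives a PSI of about 0.23, which lands in the "Investigate" band.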
Retraining Trigger
Triggered by drift severity or performance degradation
Invoked when: drift monitoring identifies a model as "Investigate" or "Materially Degraded," or the quarterly performance review shows sustained Gini degradation beyond the policy threshold
- Reads the full drift and performance history for the model under review — how long the degradation has been developing, which metrics have crossed policy thresholds, whether the drift is accelerating or stabilising, and what the current magnitude of performance loss is — and structures a retraining case that documents why retraining is warranted, what data the retrained model should be trained on, and what the target performance improvement should be.
- Assesses the urgency of the retraining decision: a model that has drifted gradually over 18 months and is still marginally above the minimum acceptable Gini threshold is a planned retraining case that can be scheduled into the next model development cycle. A model that has degraded sharply over 60 days, with PSI above 0.25 and observed NPA rates materially above predicted, is an urgent retraining case that may require a temporary mitigation — tightening the policy cut-off or adding a manual review layer — while retraining is completed.
- Recommends the retraining data window: should the model be retrained on the most recent 12 months of data (capturing current population behaviour), a longer lookback (capturing multiple economic cycles for robustness), or a blended approach that weights recent data more heavily? The recommendation is based on the nature of the drift detected — population drift argues for retraining on current population data; outcome drift may require a longer lookback to capture the full default lifecycle.
- Cannot initiate retraining, access training data, or build the retrained model. It structures the business case and the data specification for the retraining exercise. The data science team executes the retraining; the governance committee approves the deployment of the retrained model after the validation agent validates it against the new baseline.
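The urgency triage in the bullets above could be encoded as a simple rule. This is purely a sketch: the 90-day window, the 1.25 observed-to-predicted NPA ratio, and every field name are hypothetical placeholders for whatever thresholds the model risk policy actually specifies.

```python
from dataclasses import dataclass

@dataclass
class DriftReport:
    psi: float                           # population PSI vs development baseline
    gini: float                          # latest observed Gini
    gini_floor: float                    # minimum acceptable Gini per policy
    observed_over_predicted_npa: float   # ratio of observed to predicted NPA rate
    degradation_days: int                # how long the degradation has been developing

def retraining_urgency(r: DriftReport) -> str:
    """Illustrative triage following the text: sharp recent degradation with
    PSI > 0.25 and observed NPA materially above predicted is urgent and may
    need interim mitigation; gradual drift still above the Gini floor is a
    planned case for the next development cycle."""
    sharp = r.degradation_days <= 90     # assumption: "sharp" = within one quarter
    if r.psi > 0.25 and r.observed_over_predicted_npa > 1.25 and sharp:
        return "Urgent: apply interim mitigation and retrain immediately"
    if r.gini < r.gini_floor:
        return "Urgent: performance below policy floor"
    return "Planned: schedule retraining into next development cycle"
```

A model with PSI 0.30, observed NPA two-thirds above predicted, and 60 days of degradation triages as urgent; an 18-month gradual drift still above the Gini floor triages as planned.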
Knowledge base
Model Registry & Documentation
Model cards, development methodology documents, training data descriptions, prior validation reports, and approved model versions for every model in the inventory. Retrieved via RAG — the validation standard is always assessed against the current policy.
Model Performance Database
Monthly Gini, KS-stat, PSI, CSI, approval rate, and NPA rate by score band and vintage for every production model. The longitudinal performance record that drift detection and retraining triggers are computed from.
Model Risk Policy (RAG)
LendingIQ's model risk policy — validation standards, documentation requirements, performance thresholds, drift trigger levels, and governance approval chain. The regulatory and internal standards the agent applies in every validation exercise.
Development Population Baseline
The score distribution, feature distributions, and performance metrics from the model's development and initial validation. The reference point against which all drift is measured. A model cannot be monitored for drift without a documented baseline.
Challenger Model Output Store
Score distributions and decision outputs from challenger models running in shadow mode. Used for champion vs challenger comparison without challenger models touching production decisions.
Model Validation Knowledge
Pre-training knowledge of credit model validation methodology, Basel model risk management principles, PSI/CSI interpretation, scorecard validation standards, and quantitative model governance frameworks up to knowledge cutoff.
Important Reads
Learn more about how to deploy Model Validation Agent AI to your lending workflow.
- Use case #0001 · How Model Validation AI Runs Challenger Tests in Production. A credit model that was validated 18 months ago against data from 24 months ago is not a validated model — it is a model whose validation has expired. The Model Validation AI runs continuous champion-challenger testing in live production, monitoring model performance daily, detecting drift as it forms, and surfacing evidence that a challenger model is ready to replace the champion before the champion's degradation has contaminated the portfolio.
- Use case #0002 · Drift Detection: When Does Your Credit Model Need Retraining? A credit model does not break dramatically — it degrades quietly. The Gini coefficient slips from 0.68 to 0.64 over eight months while everyone is watching the NPA ratio. The variable that used to predict default reliably has shifted distribution because the economy changed. By the time the degradation is visible in portfolio outcomes, thousands of decisions have already been made with a model that no longer describes the world it is scoring. The Model Validation AI catches the drift when it forms, not when it matures.
- Use case #0003 · Model Validation AI and RBI Model Risk Management Compliance. The RBI's expectations around model risk management have become more specific, more exacting, and more enforceable with each supervisory cycle. An institution whose credit models are validated annually by a team that has since moved on, whose model performance monitoring is a quarterly MIS review, and whose Board Risk Committee has not seen a model validation report in 18 months is not running a compliant model risk management function. The Model Validation AI runs one — continuously, automatically, and in the exact format the regulator expects to see.
