An underwriter who manually opens a 12-month bank statement, locates the salary credits, cross-references them against three payslips, computes the average, and then repeats the exercise for two ITR PDFs is not underwriting; they are doing clerical extraction. The Document Ops Agent AI extracts income data from every document type in the Indian lending stack (payslips, bank statements, ITRs, Form 16s, GST returns, CA certificates, and more, 12 types in all) in under 90 seconds, and presents a single structured income profile to the underwriter. The underwriter's job is to assess the income, not to find it.
Why income extraction is hard — and why it matters that it is done correctly
Income extraction is harder than it appears because income is represented differently in every document type, by every employer, and for every income structure. A payslip from a large private sector employer will have a standardised format. A payslip from a small manufacturing firm may be a hand-typed table in a Word document. A salaried borrower's bank statement will show regular credits labelled "SALARY" from one employer. A self-employed borrower's bank statement will show irregular credits from dozens of counterparties, none labelled as "income." An ITR will show total income after deductions, which is not the same as gross income, which is not the same as income available for EMI servicing after statutory deductions.
Getting income extraction wrong in either direction creates a material problem. Overstating income leads to over-lending — the borrower's actual EMI capacity is less than the sanctioned amount. Understating income leads to under-lending and potential customer loss — the borrower qualifies for more than they were offered, and a competitor who assesses their income correctly will offer more. The Document Ops Agent AI extracts from the source document, applies the correct computation for each document type, and flags the confidence level of each extraction — so the underwriter can give higher scrutiny to low-confidence extractions without spending equal time on every field.
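The confidence flagging described above can be pictured as a simple triage step. The following is a minimal sketch, not the product's implementation: the `ExtractedField` structure, the field names, the sample values, and the 0.85 review threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    """One extracted value plus its provenance (hypothetical structure)."""
    name: str
    value: float
    source_document: str
    confidence: float  # 0.0 to 1.0


# Assumed cut-off below which a field is routed for closer review.
REVIEW_THRESHOLD = 0.85


def fields_needing_review(fields, threshold=REVIEW_THRESHOLD):
    """Return the low-confidence fields, least confident first."""
    flagged = [f for f in fields if f.confidence < threshold]
    return sorted(flagged, key=lambda f: f.confidence)


# Illustrative values only; these are not real extraction results.
fields = [
    ExtractedField("gross_salary", 85000.0, "payslip", 0.97),
    ExtractedField("net_take_home", 71200.0, "payslip", 0.62),
    ExtractedField("avg_salary_credit", 71150.0, "bank_statement", 0.91),
    ExtractedField("tds_deducted", 54000.0, "itr", 0.78),
]

for f in fields_needing_review(fields):
    print(f"{f.name} ({f.source_document}): confidence {f.confidence:.2f}")
```

With a structure like this, the underwriter's review queue contains only the fields below the threshold, ordered so that the least reliable extraction is seen first.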
The 12 document types and what the Document AI extracts from each
Payslip (salaried)
Formats: PDF, scanned, photo
- Gross salary — before all deductions · Primary income signal
- Net take-home — after PF, ESI, TDS, professional tax
- PF contribution — employer + employee · Income stability proxy
- Variable components — incentive, bonus, allowances · Flagged separately
- Employer name — cross-referenced against EPFO for authenticity
Bank Statement (salaried)
Formats: PDF, Excel, AA pull
- Salary credits — regular same-source credits classified as salary
- 12-month average salary credit — primary verification figure
- Credit consistency — variation across months (flag if >15% variance)
- NACH debits — existing EMIs extracted and summed for FOIR
- Average end-of-month balance — liquidity signal
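The salaried bank statement figures above reduce to a few lines of arithmetic. The sketch below is illustrative only. It assumes that "variance" means the largest relative deviation of any single month from the 12-month average, and that FOIR is computed against the average salary credit; the document does not specify either, and the function and key names are hypothetical.

```python
from statistics import mean

VARIANCE_FLAG_THRESHOLD = 0.15  # the >15% variance flag from the list above


def summarise_salary_credits(monthly_credits, nach_debits):
    """Summarise 12 months of salary credits and existing NACH EMIs."""
    if len(monthly_credits) != 12:
        raise ValueError("expected exactly 12 monthly salary credits")
    average = mean(monthly_credits)
    # Assumed definition: worst single-month deviation from the average.
    max_deviation = max(abs(c - average) for c in monthly_credits) / average
    obligations = sum(nach_debits)
    return {
        "average_salary_credit": average,
        "max_deviation": max_deviation,
        "variance_flag": max_deviation > VARIANCE_FLAG_THRESHOLD,
        "monthly_obligations": obligations,
        # FOIR: fixed obligations as a share of income (assumed denominator).
        "foir": obligations / average,
    }


# Illustrative data: eleven regular months and one unusually large credit.
credits = [70000] * 11 + [100000]
existing_emis = [12000, 6125]

summary = summarise_salary_credits(credits, existing_emis)
print(summary["average_salary_credit"], summary["variance_flag"], summary["foir"])
```

Using the 12-month average rather than the best month keeps one large credit from inflating the income figure, while the variance flag still surfaces that month for the underwriter.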
Bank Statement (self-employed)
Formats: PDF, Excel, AA pull
- Total gross credits (12 months) — all inward credits summed
- Business-to-business credits — identified by credit narrative pattern
- Cash deposits — flagged separately · Not counted as verifiable income
- Monthly credit trend — growing, stable, or declining classification
- Bounce rate — outbound NACH / cheque return rate
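For the self-employed statement, the same kind of sketch applies. Everything here is an assumption made for illustration: the trend is classified by comparing the average of the last six months with the first six, a change within 10% either way counts as "stable", and the function names and input shapes are hypothetical.

```python
from statistics import mean

TREND_BAND = 0.10  # assumed: changes within +/-10% count as "stable"


def classify_trend(monthly_credits):
    """Compare the last six months' average with the first six months'."""
    first_half = mean(monthly_credits[:6])
    second_half = mean(monthly_credits[6:])
    change = (second_half - first_half) / first_half
    if change > TREND_BAND:
        return "growing"
    if change < -TREND_BAND:
        return "declining"
    return "stable"


def summarise_self_employed(non_cash_credits, cash_deposits,
                            debits_presented, debits_returned):
    """Summarise 12 months of credits, keeping cash deposits separate."""
    return {
        "total_gross_credits": sum(non_cash_credits) + sum(cash_deposits),
        "cash_deposits": sum(cash_deposits),
        # Cash deposits are excluded from the verifiable figure.
        "verifiable_credits": sum(non_cash_credits),
        "trend": classify_trend(non_cash_credits),
        "bounce_rate": (debits_returned / debits_presented
                        if debits_presented else 0.0),
    }


# Illustrative monthly totals, not real borrower data.
non_cash = [400000] * 6 + [500000] * 6
cash = [20000] * 12

profile = summarise_self_employed(non_cash, cash,
                                  debits_presented=24, debits_returned=2)
print(profile["verifiable_credits"], profile["trend"], profile["bounce_rate"])
```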
ITR (Individual) — Salaried
Formats: PDF acknowledgement, Form 26AS
- Gross total income (Schedule S) — salary income before Chapter VI-A deductions
- Taxable income — after deductions · Not income for lending purposes
- TDS deducted — cross-referenced for consistency with payslip
- Year of assessment — currency check · ITR >18 months old flagged
- Acknowledgement number — verified against ITD portal
ITR (Business) — SE / Proprietor
Formats: Profit & Loss schedule, business income
- Net profit after tax — primary income signal for business borrowers
- Gross receipts / turnover — top-line for revenue trend
- Depreciation add-back — non-cash deduction added back for cash income
- Year-on-year trend — 2-year comparison for stability
- Business nature — from ITR filing category
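The depreciation add-back and the two-year comparison listed above are plain arithmetic. A minimal sketch, with hypothetical function names and made-up figures:

```python
def cash_income(net_profit_after_tax, depreciation):
    """Add the non-cash depreciation charge back to net profit."""
    return net_profit_after_tax + depreciation


def year_on_year_change(previous_year, current_year):
    """Relative change between two years' figures (0.20 means +20%)."""
    return (current_year - previous_year) / previous_year


# Illustrative figures for two consecutive ITRs.
previous = cash_income(net_profit_after_tax=900_000, depreciation=100_000)
current = cash_income(net_profit_after_tax=1_050_000, depreciation=150_000)

print(previous, current, year_on_year_change(previous, current))
```

Depreciation reduces reported profit without any cash leaving the business, which is why it is added back before estimating the income available to service an EMI.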
Form 16 / TDS Certificate
Formats: PDF, Part A + Part B
- Gross salary (Part B) — employer-certified income figure
- TAN of employer — verifies employer identity
- TDS amount — cross-referenced with ITR
- Assessment year — must be current or prior AY
- Allowances breakup — HRA, LTA, special allowances separated
GST Returns (GSTR-3B)
Formats: PDF download, GST portal pull
- Outward taxable supply (Table 3.1a) — monthly turnover signal
- 12-month turnover total — annual revenue for income estimation
- Filing regularity — late filings counted and flagged
- Tax paid (cash ledger) — cross-reference for turnover authenticity
- GSTIN status — active / suspended verified
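The turnover total and filing-regularity checks above could be computed from per-month records along the following lines. This is a sketch under assumptions: the record keys (`period`, `outward_taxable_supply`, `due_date`, `filed_date`) are hypothetical, and the amounts and dates are made-up sample data.

```python
from datetime import date


def summarise_gstr3b(returns):
    """Summarise monthly GSTR-3B records (hypothetical dict layout)."""
    # A filing counts as late when it was filed after its due date.
    late = [r["period"] for r in returns if r["filed_date"] > r["due_date"]]
    return {
        "turnover_total": sum(r["outward_taxable_supply"] for r in returns),
        "months_covered": len(returns),
        "late_filings": len(late),
        "late_periods": late,
    }


# Three illustrative months; a full run would pass twelve records.
returns = [
    {"period": "2024-04", "outward_taxable_supply": 800_000,
     "due_date": date(2024, 5, 20), "filed_date": date(2024, 5, 18)},
    {"period": "2024-05", "outward_taxable_supply": 950_000,
     "due_date": date(2024, 6, 20), "filed_date": date(2024, 6, 27)},
    {"period": "2024-06", "outward_taxable_supply": 870_000,
     "due_date": date(2024, 7, 20), "filed_date": date(2024, 7, 20)},
]

summary = summarise_gstr3b(returns)
print(summary["turnover_total"], summary["late_periods"])
```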
CA Certificate
Formats: Signed PDF on letterhead
- Net annual income certified — primary figure from CA
- CA membership number — verified against ICAI register
- Income computation basis — bank statement / books / estimation
- Certification date — must be within 3 months of application
- Business nature certified — confirmed type of SE activity
Salary Certificate (employer)
Formats: Letterhead PDF, email confirmation
- Gross monthly salary — employer-stated figure
- Employment designation and date of joining — tenure signal
- HR signatory name and designation — authenticity signal
- Company letterhead — validated against company registration
- Issue date — must be within 2 months of application
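Several of the lists above carry a date rule: ITRs older than 18 months are flagged, a CA certificate must be within 3 months of the application, and a salary certificate within 2 months. One way to express those rules is a single staleness check. The sketch below is an assumption-laden illustration: the dictionary keys and helper names are hypothetical, and it assumes age is measured in calendar months back from the application date, with the reference date for an ITR taken to be whatever date the lender chooses to measure from.

```python
import calendar
from datetime import date

# Maximum document age in months, taken from the rules in the lists above.
MAX_AGE_MONTHS = {
    "itr": 18,
    "ca_certificate": 3,
    "salary_certificate": 2,
}


def months_before(d, months):
    """Return the date `months` calendar months before `d`,
    clamping the day to the end of the month where needed."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_stale(doc_type, document_date, application_date):
    """True when the document is older than its allowed age."""
    cutoff = months_before(application_date, MAX_AGE_MONTHS[doc_type])
    return document_date < cutoff


application = date(2025, 6, 15)
print(is_stale("ca_certificate", date(2025, 3, 14), application))
print(is_stale("salary_certificate", date(2025, 4, 20), application))
print(is_stale("itr", date(2023, 11, 1), application))
```

Keeping the limits in one table means a change to a freshness rule is a one-line edit rather than a change scattered across document-specific code.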
Rental Income Agreement
Formats: Registered / notarised PDF
- Monthly rental amount — from agreement schedule
- Lease start and end date — tenure remaining
- Property address — cross-referenced with ownership documents
- Rental income credited to bank — bank statement corroboration required
- Registration status — registered agreements given higher weight
Pension Payment Order
Formats: Government PPO document
- Monthly pension amount — fixed, regular, government-backed
- PPO number — verified against pension authority records
- Pension type — service pension vs family pension
- DCRG / commutation status — lump-sum already received or not
- Bank account linked — cross-referenced with application account
Udyam / MSME Registration Certificate
Formats: Udyam portal PDF
- Business category — Micro, Small, or Medium
- Date of registration — business vintage signal
- NIC code — business nature for sector risk classification
- Turnover declared at registration — corroborating revenue signal
- PAN linkage — verified against borrower PAN
A live extraction: what the Document AI produces in 87 seconds
The underwriter's value is in what they do with the income figure — not in locating it
Income extraction is not judgement — it is transcription with rules. The rules are known: gross from Schedule S not taxable income, rental at 70% discount, bank statement average over 12 months not the best month, NACH debits as existing obligations. Every one of those rules can be applied consistently by a machine, and inconsistently by a human who is extracting their fifteenth application of the day. The Document Ops Agent AI applies the extraction rules identically for every document, every applicant, every time — and presents the underwriter with a structured income profile whose provenance is documented, whose confidence level is stated, and whose discrepancies with other documents are already flagged. What remains for the underwriter is the judgement that actually requires one.
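The cross-document discrepancy flagging mentioned above can be sketched as a comparison of the same metric across sources. This is an illustration only: the 5% tolerance, the metric and source names, and the sample figures are assumptions, not values stated by the product.

```python
def find_discrepancies(figures, tolerance=0.05):
    """Flag metrics whose values differ across source documents.

    `figures` maps a metric name to {source_document: value}. A metric is
    flagged when the gap between its largest and smallest value exceeds
    `tolerance` as a fraction of the largest value (assumed rule).
    """
    flagged = {}
    for metric, by_source in figures.items():
        values = list(by_source.values())
        if len(values) < 2 or max(values) == 0:
            continue  # nothing to compare, or no meaningful ratio
        spread = (max(values) - min(values)) / max(values)
        if spread > tolerance:
            flagged[metric] = round(spread, 4)
    return flagged


# Illustrative figures, not real extraction output.
figures = {
    "tds_deducted": {"form_16": 54000, "itr": 54000},
    "monthly_net_salary": {"payslip": 71200, "bank_statement": 64000},
}

print(find_discrepancies(figures))
```

A check of this kind does not decide which document is right; it only ensures the underwriter sees the disagreement before relying on either figure.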
