AI Agent Profile · LendingIQ · Bengaluru

Document Verification Agent AI

Function: Document Verification
Runtime: AWS Bedrock · ap-south-1
Model: Claude Sonnet 4
Context window: 200K tokens

Division: Onboarding

Resume

Agent specs

Agent Type: OCR extraction + Data verification + Policy check

Invocation: Per document set, triggered by Loan Origination Agent

Memory: Session = one application document set

Latency: 5–15 sec per document set

Output Format: Verification report per document + exception log

Decision Authority: None; flags exceptions, credit officer resolves

Reasoning strengths

OCR data extraction accuracy: Strong

Cross-document field consistency: Strong

Policy compliance check: Strong

Forgery pattern flagging: Good

Physical document authentication: Cannot do

Live integrations

Document Management System: Document images for OCR extraction and verification

GSTN / NSDL / MCA APIs: GST, PAN, and company registration cross-verification

Document Policy Corpus (RAG): Acceptable document formats, required fields, policy limits

Forgery Pattern Library: Known forged document templates and anomaly patterns

Loan Origination Agent AI: Receives document set, returns verification report

Fraud Risk Agent AI: Document flags cross-referenced with application fraud signals

What this agent does

The Document Verification Agent AI extracts structured data from every document in a loan application file using OCR, checks the extracted data for internal consistency and cross-document accuracy, flags deviations from the document policy, and identifies patterns associated with forged or manipulated documents. Every exception is documented with the specific field, the expected value, and the observed value, so the credit officer who reviews it has complete information rather than a generic flag. Verification identifies data anomalies; authentication of physical documents requires specialist tools and human examiners.

Primary functions

OCR Extraction

Every document in the application set

Invoked when: document set is assembled by the Loan Origination Agent AI

Extracts all structured data fields from each document type — salary slips (employer name, employer PAN, gross salary, deductions, net salary, pay period, employee name), bank statements (account holder name, account number, bank name, IFSC, statement period, opening and closing balances), ITRs (filed income, tax paid, assessment year, PAN), and business documents (GST registration number, turnover from GSTR, MCA registration number) — using OCR with confidence scoring per extracted field.
Flags fields where OCR confidence is below the configured threshold — poor scan quality, low-contrast print, handwritten fields — and marks these as requiring human verification rather than using the low-confidence extracted value in downstream calculations. A field the agent cannot reliably read is better flagged as unread than passed forward as a wrong value.

Output: Structured JSON of extracted data per document, confidence score per field, low-confidence fields flagged for human verification.
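
The confidence-based flagging described above can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `ExtractedField` type, the `build_extraction_output` function, the output keys, and the 0.85 threshold are hypothetical, since the profile only states that a configured threshold exists.

```python
from dataclasses import dataclass

# Hypothetical value; the profile only refers to "the configured threshold".
CONFIDENCE_THRESHOLD = 0.85


@dataclass
class ExtractedField:
    name: str
    value: str
    confidence: float  # OCR confidence in the range [0, 1]


def build_extraction_output(doc_type: str, fields: list[ExtractedField],
                            threshold: float = CONFIDENCE_THRESHOLD) -> dict:
    """Build per-document output; low-confidence values are withheld, not passed on."""
    output = {"document_type": doc_type, "fields": {}, "needs_human_verification": []}
    for field in fields:
        low_confidence = field.confidence < threshold
        output["fields"][field.name] = {
            # A field that cannot be read reliably is reported as unread.
            "value": None if low_confidence else field.value,
            "confidence": field.confidence,
            "status": "unread" if low_confidence else "extracted",
        }
        if low_confidence:
            output["needs_human_verification"].append(field.name)
    return output
```

Withholding the value (rather than attaching a warning to it) reflects the stated principle that an unreadable field should not reach downstream calculations at all.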

Forgery Detection

Pattern-based — known forgery signals

Invoked on each document as part of the verification pass

Cross-checks document data against authoritative external sources: employer PAN on salary slips against NSDL records; employer GSTN against the GSTN API for the stated employer name; bank IFSC code against the RBI IFSC registry. Where the document's stated details do not match the authoritative record, the discrepancy is flagged as a potential fabrication signal — specific field, expected value from the authoritative source, value stated on the document.
Checks the document against the forgery pattern library — known templates used in prior fraud cases: salary slip formats with specific watermark placements or font combinations associated with previously identified forgeries, income certificate formats from institutions that the fraud team has flagged, or document structures that match known synthetic income document patterns.
Does not authenticate physical document features — ink, paper, embossing, watermark UV response. Pattern detection is data-driven, not forensic.

Output: Forgery signal report — each flag with the specific data point that triggered it, the authoritative source checked against, and the severity classification. Flags are routed to the Fraud Risk Agent AI for cross-referencing with application-level fraud signals.
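
As a rough sketch of the authoritative-source cross-check, the example below compares the employer name on a salary slip with the name registered against the stated employer PAN. The registry lookup is injected as a plain callable because the actual NSDL interface is not described in this profile; `ForgerySignal`, `check_employer_pan`, and the severity assigned to each case are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ForgerySignal:
    field: str
    source: str               # authoritative source checked against
    expected: Optional[str]   # value from the source (None if no record exists)
    observed: str             # value stated on the document
    severity: str


def normalise(name: str) -> str:
    """Upper-case and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(name.upper().split())


def check_employer_pan(stated_pan: str, stated_employer: str,
                       lookup: Callable[[str], Optional[str]]) -> Optional[ForgerySignal]:
    """Return a signal if the PAN is unknown or registered to a different name."""
    registered_name = lookup(stated_pan)  # hypothetical stand-in for the registry call
    if registered_name is None:
        # Assumed severity: a PAN with no record is treated as the strongest signal.
        return ForgerySignal("employer_pan", "NSDL", None, stated_pan, "Fraud flag")
    if normalise(registered_name) != normalise(stated_employer):
        return ForgerySignal("employer_name", "NSDL", registered_name,
                             stated_employer, "Material")
    return None
```

Each signal carries the field, the source, and both values, which matches the report shape described in the output above.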

Policy Checks & Exception Flagging

Every document against the policy corpus

Invoked after OCR extraction is complete

Checks every extracted document against the document policy requirements: salary slips must be from the last 3 months (or as configured per product), bank statements must cover the required period, ITRs must be for the assessment year specified in the product policy, and business registration documents must not have a lapsed validity date. Policy misses are documented with the specific requirement and the actual document date/status.
Cross-checks field consistency across documents: name on the salary slip vs name on the PAN card vs name on the bank statement; income declared on the salary slip vs income shown in the bank credits; stated employer on the application form vs employer named on the salary slip. Each discrepancy is flagged with the conflicting fields and values from each document.
Produces a structured exception log — every exception with the document it was found in, the specific field, the policy requirement or expected value, the observed value, and a severity classification (Minor / Material / Fraud flag). The exception log is the handoff document to the credit officer who must resolve each exception before the underwriting agent is invoked.

Output: Exception log — all exceptions categorised by type (policy / consistency / forgery), each with document, field, expected vs observed, and severity. Overall verification verdict: Pass / Pass with exceptions (listed) / Refer for human review (material exceptions present).
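
One way to picture the exception log and overall verdict is the sketch below. The record fields follow the description above, but the type and function names, the calendar-month reading of "last 3 months", the severity assigned to a stale salary slip, and the rule that fraud flags also trigger referral are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class PolicyException:
    document: str
    field: str
    category: str   # "policy" | "consistency" | "forgery"
    expected: str
    observed: str
    severity: str   # "Minor" | "Material" | "Fraud flag"


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months from `earlier` to `later`, ignoring the day of month."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def check_salary_slip_recency(pay_period: date, today: date,
                              max_age_months: int = 3) -> Optional[PolicyException]:
    """Flag a salary slip older than the configured window (3 months by default)."""
    age = months_between(pay_period, today)
    if age > max_age_months:
        return PolicyException(
            document="salary_slip", field="pay_period", category="policy",
            expected=f"within last {max_age_months} months",
            observed=f"{pay_period.isoformat()} ({age} months old)",
            severity="Material",  # assumed severity for this policy miss
        )
    return None


def overall_verdict(exceptions: list[PolicyException]) -> str:
    """Map the exception log to one of the three verdicts named in the profile."""
    if not exceptions:
        return "Pass"
    if any(e.severity in ("Material", "Fraud flag") for e in exceptions):
        return "Refer for human review"
    return "Pass with exceptions"
```

The verdict is derived purely from the log; it does not close any exception, which stays open until a credit officer resolves it.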

Hard guardrails

Will not: Authenticate physical document features. Forensic authentication of stamps, signatures, paper, and print characteristics requires specialist tools. The agent detects data inconsistencies; it does not determine whether the physical document is genuine.

Will not: Clear an exception without human credit officer resolution. Exceptions in the log remain open until a human officer reviews and either resolves them (obtains a corrected document or explains the discrepancy) or accepts them with documented rationale.

Will not: Use a low-confidence OCR extraction as an input to the credit assessment. Fields the agent cannot read reliably are flagged as unread; the credit officer must either obtain a better-quality document or manually enter the value from physical review.

Known limitations

The forgery pattern library covers known patterns; novel forgery methods not yet in the library will not be detected by pattern matching. The library is only as current as the last fraud case analysis that updated it.
After every confirmed document fraud case, conduct a structured analysis of the document characteristics that the agent did or did not detect, and update the pattern library accordingly. The library is a living corpus, not a static asset.

OCR accuracy degrades on poor-quality document images. Photographs taken at an angle, compressed image files, and multi-generation scans produce extraction errors that the confidence scoring may not fully capture. The 5% QC sampling process on Loan Origination Agent outputs provides a check on systematic OCR errors, but individual errors on low-quality images remain possible.
Build a document image quality gate before OCR: if the image quality (resolution, contrast, angle) falls below the threshold for reliable OCR, reject the document upload and request a higher-quality rescan before processing. A bad scan that produces wrong extracted values is worse than a rejected scan that prompts a correct resubmission.
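
A pre-OCR quality gate of the kind recommended here might look like the following sketch. It assumes the three quality measurements (resolution, contrast, skew angle) have already been computed by an upstream image-analysis step, which is out of scope; the `ImageQuality` type, the `quality_gate` function, and all threshold values are hypothetical and would need tuning against real OCR results.

```python
from dataclasses import dataclass

# Hypothetical thresholds, not taken from the profile.
MIN_DPI = 200
MIN_CONTRAST = 0.4        # on a 0..1 scale
MAX_SKEW_DEGREES = 5.0


@dataclass
class ImageQuality:
    dpi: int
    contrast: float       # 0 (flat) to 1 (high contrast)
    skew_degrees: float   # rotation away from upright; sign gives direction


def quality_gate(quality: ImageQuality) -> list[str]:
    """Return rejection reasons; an empty list means the image may proceed to OCR."""
    reasons = []
    if quality.dpi < MIN_DPI:
        reasons.append(f"resolution {quality.dpi} dpi is below {MIN_DPI} dpi")
    if quality.contrast < MIN_CONTRAST:
        reasons.append(f"contrast {quality.contrast:.2f} is below {MIN_CONTRAST:.2f}")
    if abs(quality.skew_degrees) > MAX_SKEW_DEGREES:
        reasons.append(f"skew {quality.skew_degrees:.1f} degrees exceeds {MAX_SKEW_DEGREES:.1f}")
    return reasons
```

Returning the reasons rather than a bare pass/fail lets the rescan request tell the applicant what to fix.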
