FrenzoCollect
27-02-26
The traditional approach to debt collection in India has always been reactive: wait for the borrower to miss a payment, then spring into action. But by the time an account hits 90 days past due (DPD) and threatens to become a Non-Performing Asset (NPA), recovery becomes exponentially harder and more expensive. What if you could identify accounts at risk 30, 60, or even 90 days before they miss their first payment?
This isn't science fiction - it's the new reality of machine learning-powered collections. And it's transforming how India's smartest lenders manage their portfolios.
India's financial sector grapples with approximately ₹10 lakh crores in NPAs across banking and NBFC sectors. While this includes legacy corporate loans, even digital lenders - once touted as having superior credit models - are seeing stress levels rise. The RBI's recent guidelines on digital lending underscore the urgency: lenders must adopt better risk management practices throughout the loan lifecycle.
Here's the uncomfortable truth: by the time an account is 90 DPD, your chances of full recovery drop to less than 40%. At 180 DPD, it's under 20%. But if you intervene at 15 DPD - or better yet, before the first missed payment - resolution rates exceed 85%.
The question isn't whether to act early. It's how to identify who needs that early intervention when you're managing thousands or lakhs of accounts.
Traditional collection systems are essentially sophisticated record-keepers. They tell you what happened: who paid, who didn't, and how many days past due each account is. They're rear-view mirrors.
Machine learning systems are windshields. They tell you what's about to happen - and give you time to change course.
The Fundamental Shift:
Old Model: Customer misses Day 30 payment → System flags account → Collector calls → Borrower promises to pay → Cycle repeats → 90 DPD → NPA
New Model: ML identifies high-risk indicators → Proactive outreach at Day 10 → Payment plan offered → Issue resolved → No delinquency
The difference? Timing, precision, and cost. A proactive digital intervention can cost as little as ₹50 per account in operational expenses. Resolving an account at 90 DPD can cost ₹2,500 or more once collection efforts, write-offs, and provisioning are counted.
Machine learning models analyze hundreds of data points to identify patterns invisible to human analysts. Here are the key signal categories:
Payment timing patterns: A borrower who typically pays on Day 2-3 suddenly pays on Day 28
Partial payments: First-time partial payment often predicts future delinquency
Customer service interactions: Sudden increase in calls about payment extensions or account issues
Engagement drops: The borrower stops opening SMS/WhatsApp messages or using the borrower app
Transaction velocity: Multiple small withdrawals suggesting cash flow stress
Cross-borrowing patterns: New loan applications to other lenders (visible via credit bureau checks)
Missed utility payments: Late mobile recharges or bill payments indicate broader financial strain
Income disruption: For salaried borrowers, delayed salary credits or reduced amounts
Macroeconomic triggers: Industry-specific downturns (e.g., auto sector slowdown affecting dealership loans)
Geographic risks: Regional events like floods, strikes, or festival season cash flow patterns
Seasonal employment: Gig workers or agricultural-linked borrowers have predictable stress periods
Job changes: For salaried borrowers, employment gaps detected through banking patterns
Medical emergencies: Sudden large healthcare expenses
Relocation: Address or mobile number changes
A sophisticated ML model doesn't just look at one signal - it evaluates combinations. A borrower who changes jobs might be fine, but a borrower who changes jobs AND starts making minimum payments AND increases credit inquiries can be scored far higher: for example, an 87% probability of reaching 60+ DPD within 90 days.
Let's demystify the technology without getting lost in the mathematics.
Step 1: Data Collection
The system ingests data from multiple sources:
Internal: Payment history, loan terms, customer interactions, app usage
Bureau data: Credit score, existing obligations, inquiry patterns
Banking data (with consent): Transaction patterns, balance trends, income regularity
Alternative data: Mobile usage, utility payments, e-commerce behavior
Step 2: Feature Engineering
Raw data is transformed into meaningful indicators. For example, "payment dates" becomes three features: "days to payment after due date," "payment date variance," and "payment acceleration/deceleration trend."
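As a minimal sketch of this step, the snippet below derives those three indicators from a payment history. It assumes each payment is recorded as a (due date, paid date) pair; the function name `payment_features` and the output keys are hypothetical, chosen only for illustration.

```python
from datetime import date
from statistics import pvariance

def payment_features(payments):
    """payments: list of (due_date, paid_date) tuples, oldest first."""
    # Days to payment after the due date (negative means paid early).
    delays = [(paid - due).days for due, paid in payments]
    n = len(delays)
    # Variance of the delay: how erratic the payment timing is.
    variance = pvariance(delays) if n > 1 else 0.0
    # Trend: least-squares slope of delay against installment index.
    # A positive slope means payments are drifting later over time.
    if n > 1:
        mean_x = (n - 1) / 2
        mean_y = sum(delays) / n
        num = sum((i - mean_x) * (d - mean_y) for i, d in enumerate(delays))
        den = sum((i - mean_x) ** 2 for i in range(n))
        slope = num / den
    else:
        slope = 0.0
    return {
        "days_after_due": delays,
        "delay_variance": variance,
        "delay_trend": slope,
    }

# Illustrative history: the borrower pays 0, 2, then 4 days after the due date.
history = [
    (date(2025, 1, 5), date(2025, 1, 5)),
    (date(2025, 2, 5), date(2025, 2, 7)),
    (date(2025, 3, 5), date(2025, 3, 9)),
]
features = payment_features(history)
```

A steadily positive `delay_trend` is the kind of early drift that a raw list of payment dates hides.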
Step 3: Model Training
Using historical data from thousands of accounts, the ML algorithm identifies which combinations of features accurately predicted delinquency in the past. Common techniques include:
Gradient Boosting Models: Excel at handling complex, non-linear relationships
Random Forests: Robust against outliers and missing data
Neural Networks: For lenders with millions of accounts and rich data
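To make the first technique concrete, here is a toy gradient boosting sketch in pure Python: each round fits a one-split decision stump to the current residuals (squared-error loss) and adds a damped copy of it to the ensemble. It is an illustration only, not a production model. A real deployment would more likely use an established library and a classification loss, and the two features shown (average days late, partial payment count) and all function names are assumptions.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lv = sum(left) / len(left)
            rv = sum(right) / len(right)
            err = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return best[1:] if best else None

def predict_stump(stump, row):
    j, t, lv, rv = stump
    return lv if row[j] <= t else rv

def fit_boosted(X, y, rounds=20, lr=0.5):
    """Gradient boosting for squared error: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:
            break
        stumps.append(stump)
        preds = [p + lr * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, lr, stumps

def predict_delinquency(model, row):
    """Return the ensemble output clipped to [0, 1]."""
    base, lr, stumps = model
    score = base + lr * sum(predict_stump(s, row) for s in stumps)
    return min(1.0, max(0.0, score))

# Hypothetical training rows: [average days late, partial payment count].
X = [[1, 0], [2, 0], [3, 0], [15, 1], [20, 2], [25, 1]]
y = [0, 0, 0, 1, 1, 1]  # 1 = account later became delinquent
model = fit_boosted(X, y)
late_payer = predict_delinquency(model, [18, 1])
on_time_payer = predict_delinquency(model, [2, 0])
```

On this tiny, cleanly separated dataset the late payer scores close to 1 and the on-time payer close to 0. Real portfolios are noisier, so held-out validation matters more than the fit itself.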
Step 4: Risk Score Generation
Every account receives a dynamic risk score (0-100) that updates daily or weekly:
0-30: Low risk (routine monitoring)
31-60: Medium risk (automated nudges)
61-85: High risk (proactive outreach)
86-100: Critical risk (immediate intervention with payment plans)
Step 5: Action Triggering
Scores automatically trigger workflows. For example, a borrower whose score moves from 45 to 65 receives a supportive WhatsApp message offering help, not a threatening call.
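Steps 4 and 5 can be sketched together as a band lookup plus a trigger that fires only when an account moves into a higher band. The band boundaries come from the list above; the workflow names, the rule that only escalations trigger an action, and the treatment of fractional scores (anything above 30 counts as medium, and so on) are assumptions made for this example.

```python
import bisect

# Upper bounds of the first three bands; scores above 85 are critical.
BAND_UPPER_BOUNDS = [30, 60, 85]
BANDS = ["low", "medium", "high", "critical"]

# Hypothetical workflow names, one per band.
ACTIONS = {
    "low": "routine_monitoring",
    "medium": "automated_nudge",
    "high": "proactive_outreach",
    "critical": "immediate_intervention_with_payment_plan",
}

def risk_band(score):
    """Map a 0-100 risk score to its band."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    return BANDS[bisect.bisect_left(BAND_UPPER_BOUNDS, score)]

def triggered_action(previous_score, new_score):
    """Return a workflow name only when the account escalates to a higher band."""
    old_band = risk_band(previous_score)
    new_band = risk_band(new_score)
    if BANDS.index(new_band) > BANDS.index(old_band):
        return ACTIONS[new_band]
    return None

# The example from the text: a move from 45 (medium) to 65 (high).
action = triggered_action(45, 65)  # "proactive_outreach"
```

Triggering on band changes rather than on every score update keeps borrowers from receiving a message each time their score moves by a point or two.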
Case Study: Personal Loan Portfolio (₹850 Crores AUM)
A mid-sized NBFC implemented ML-based early intervention:
Before ML (Reactive Collections):
90+ DPD rate: 4.2%
Average resolution cost: ₹3,200 per account
Customer retention post-resolution: 11%
Annual write-offs: ₹35.7 crores
After ML (Predictive Intervention):
90+ DPD rate: 1.8% (57% reduction)
Average intervention cost: ₹680 per account
Customer retention: 47%
Annual write-offs: ₹15.3 crores
The ROI: ₹8.2 crores invested in the ML platform and training, against ₹20.4 crores saved in write-offs, ₹6.8 crores saved in collection costs, and an additional ₹12+ crores in retained customer lifetime value.
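The arithmetic behind these figures can be checked in a few lines. This sketch uses only the numbers quoted in the case study. It leaves out the retained customer lifetime value because that figure is given only as a lower bound, and it assumes the savings and the investment cover the same period, which the case study does not state.

```python
# All amounts in ₹ crores, taken from the case study figures.
write_offs_before = 35.7
write_offs_after = 15.3
collection_cost_savings = 6.8
investment = 8.2

write_off_savings = write_offs_before - write_offs_after
direct_savings = write_off_savings + collection_cost_savings
net_gain = direct_savings - investment
return_multiple = direct_savings / investment

# Relative reduction in the 90+ DPD rate (4.2% down to 1.8%).
dpd_reduction_pct = (4.2 - 1.8) / 4.2 * 100

print(f"Write-off savings: ₹{write_off_savings:.1f} crores")    # 20.4
print(f"Direct savings:    ₹{direct_savings:.1f} crores")       # 27.2
print(f"Net of investment: ₹{net_gain:.1f} crores")             # 19.0
print(f"Return multiple:   {return_multiple:.1f}x")             # 3.3x
print(f"90+ DPD reduction: {dpd_reduction_pct:.1f}%")           # 57.1%
```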
The most successful implementations don't replace collectors - they empower them.
Old workflow: A collector gets a list of 200 delinquent accounts and spends the day making calls with unclear priorities.
New workflow: A collector gets 40 high-risk accounts ranked by probability of resolution, with suggested talk tracks based on why each borrower is struggling. The other 160 accounts are handled via automated digital nudges.
Result? Collectors focus energy where it matters most, job satisfaction increases (they're solving problems, not just making threats), and portfolio performance improves.
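The new workflow amounts to a filter, a sort, and a split. The sketch below assumes each account carries a risk score and a separate resolution probability from some upstream model. The `Account` fields, the function name `split_worklist`, and the use of 61 (the start of the high-risk band) as the cutoff are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Account:
    account_id: str
    risk_score: int                 # 0-100 risk score
    resolution_probability: float   # hypothetical model output in [0, 1]

def split_worklist(accounts, capacity=40, high_risk_threshold=61):
    """Return (collector_queue, automated) for one collector's daily list."""
    # Keep only high-risk accounts, then rank by likelihood of resolution.
    high_risk = [a for a in accounts if a.risk_score >= high_risk_threshold]
    ranked = sorted(high_risk, key=lambda a: a.resolution_probability, reverse=True)
    collector_queue = ranked[:capacity]
    # Everything not assigned to the collector goes to automated nudges.
    queued_ids = {a.account_id for a in collector_queue}
    automated = [a for a in accounts if a.account_id not in queued_ids]
    return collector_queue, automated

# Illustrative portfolio slice: 200 accounts, 60 of them high risk.
accounts = [
    Account(f"A{i}", 70 if i < 60 else 40, i / 200)
    for i in range(200)
]
collector_queue, automated = split_worklist(accounts)
```

With these inputs the collector receives 40 accounts and the remaining 160 go to automated outreach, matching the split described above.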
For lenders considering predictive collections:
Data Infrastructure: Clean, centralized data is the foundation. Garbage in, garbage out.
Technology Stack: Cloud-based platforms that can scale and integrate with existing systems
Change Management: Train teams to act on predictions, not just react to defaults
Privacy Compliance: Ensure ML models and the data they use comply with RBI guidelines and the Digital Personal Data Protection (DPDP) Act
Continuous Learning: Models must be retrained regularly as borrower behavior and economic conditions evolve
The Indian lenders winning the portfolio quality battle aren't those with the strictest underwriting - they're those with the smartest post-disbursement management. Machine learning has moved from competitive advantage to competitive necessity.
The question for your organization isn't whether to adopt predictive collections, but how quickly you can implement it. Because every day you wait is another day of preventable delinquencies becoming unavoidable NPAs.
In the age of digital lending, the best collection is the one that never becomes necessary - because you saw it coming and solved the problem first.
FrenzoFinserv's Connect-To-Collect platform leverages advanced machine learning models to identify at-risk accounts before they become delinquent. Our predictive scores integrate seamlessly with automated interventions and human workflows - helping you shift from reactive collections to proactive portfolio management. Because prevention isn't just better than cure - it's more profitable.