A data-driven exploration of Aadhaar enrolment, update behavior, and societal mobility patterns across India.
This project analyzes UIDAI datasets (as of December 31, 2025) to move beyond basic reporting and uncover the underlying societal mechanics of India's digital identity infrastructure. The goal is to generate actionable insights for resource optimization, service delivery planning, and infrastructure scaling.
All datasets were sourced from UIDAI’s official open data portal and are aggregated and anonymized.
Instead of monolithic notebooks, a structured, production-style pipeline was implemented:
-
Decoupled Logic
src/loader.py: Automated file discoverysrc/processor.py: Uniform data merging
-
Schema Harmonization
- Integrated Enrolment, Demographic Update, and Biometric Update datasets
- Final Master Dataset: 99,619 rows
-
Anomaly Handling
- Standardized Indian date format (DD-MM-YYYY)
- Removed dummy/test districts (e.g.,
district = 100000)
Regions were classified based on Aadhaar usage patterns:
-
Growth Regions (Emerging)
- High enrolment in the 0–5 age group
- Indicates higher birth rates
- Requires mobile enrolment camps
-
Maintenance Hubs (Saturated)
- High volume of Demographic & Biometric updates
- Represents mature, urban digital ecosystems
Pune, Thane, and Nashik recorded the highest Aadhaar activity.
- Insight: These are maintenance-heavy zones where residents frequently update mobile numbers and biometrics for digital service access.
- Recommendation: Deploy Permanent Aadhaar Seva Kendras (ASKs) to reduce wait times.
A System Velocity Metric was developed using month-over-month update trends.
- Finding: Activity levels are stabilizing but remain high.
- Benefit: Enables UIDAI to forecast server load and hardware requirements 3–6 months in advance.
Using the Update-to-Enrolment Ratio, migration-prone districts were identified.
High address update districts such as Kurnool and Murshidabad act as digital proxies for population movement driven by economic migration.
01_multi_file_merge.ipynb→ Master dataset creation02_anomaly_detection.ipynb→ Outlier & migration analysis03_predictive_insights.ipynb→ Time-series forecasting04_report_visuals.ipynb→ Jury-ready infographics
pip install -r requirements.txt
jupyter notebookRun notebooks 01 → 04 sequentially to regenerate all outputs and figures.
This project demonstrates how Aadhaar data can be transformed from static reporting into strategic intelligence for:
- Infrastructure planning
- Migration analysis
- Digital service optimization
- Policy-level decision support
Contributors: @Dr-Venom29 • @anirudhbhoga