Trilingual R, Python, and Stata library for downloading UNICEF child welfare indicators via SDMX API
The unicefData package provides lightweight, consistent interfaces to the UNICEF SDMX Data Warehouse in R, Python, and Stata. Fetch any indicator series by specifying its SDMX key, date range, and optional filters.
| Platform | README | Version |
|---|---|---|
| R | R/README.md | 2.1.0 |
| Python | python/README.md | 2.1.0 |
| Stata | stata/README.md | 2.1.0 |

| Document | Purpose |
|---|---|
| CONTRIBUTING.md | How to contribute |
| CHANGELOG.md | Recent version history |
| NEWS.md | Complete changelog |
| CITATION.cff | Citation metadata for academic use |
| docs/ | Technical documentation index |
All three platforms use the same functions with nearly identical parameters.
Python:
from unicef_api import unicefData, search_indicators, list_categories
# Search for indicators
search_indicators("mortality")
list_categories()
# Fetch data (dataflow auto-detected)
df = unicefData(
    indicator="CME_MRY0T4",
    countries=["ALB", "USA", "BRA"],
    year="2015:2023"
)
R:
library(unicefData)
# Search for indicators
search_indicators("mortality")
list_categories()
# Fetch data (dataflow auto-detected)
df <- unicefData(
  indicator = "CME_MRY0T4",
  countries = c("ALB", "USA", "BRA"),
  year = "2015:2023"
)
Stata:
* Search for indicators
unicefdata, search(mortality)
unicefdata, flows
* Fetch data (dataflow auto-detected)
unicefdata, indicator(CME_MRY0T4) countries(ALB USA BRA) year(2015:2023) clear
R:
devtools::install_github("unicef-drp/unicefData")
library(unicefData)
Python:
git clone https://github.com/unicef-drp/unicefData.git
cd unicefData/python
pip install -e .
Stata:
* Using github package (recommended)
net install github, from("https://haghish.github.io/github/")
github install unicef-drp/unicefData, package(stata)
See platform-specific READMEs for detailed installation options.
| Feature | R | Python | Stata |
|---|---|---|---|
| Unified API | unicefData() | unicefData() | unicefdata |
| Search indicators | search_indicators() | search_indicators() | unicefdata, search() |
| List categories | list_categories() | list_categories() | unicefdata, categories |
| Auto dataflow detection | ✅ | ✅ | ✅ |
| Filter by country, year, sex | ✅ | ✅ | ✅ |
| Wide/long formats | ✅ | ✅ | ✅ |
| Latest value per country | ✅ | ✅ | ✅ |
| MRV (most recent values) | ✅ | ✅ | ✅ |
| Circa (nearest year) | ✅ | ✅ | ✅ |
| Add metadata (region, income) | ✅ | ✅ | 🔜 |
| 700+ indicators | ✅ | ✅ | ✅ |
| Automatic retries | ✅ | ✅ | ✅ |
| Cache management | clear_unicef_cache() | clear_cache() | unicefdata, clearcache |
| Timeout exceptions | ✅ | SDMXTimeoutError | ✅ |
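The "Latest value per country", "MRV", and "Circa" rows describe three ways of thinning a long-format result. The sketch below illustrates those semantics on plain Python records. It is not the package's implementation: the helper names, the record keys (`iso3`, `year`, `value`), the sample values, and the tie-breaking rule for circa (prefer the later year) are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List

Record = Dict[str, object]  # assumed keys: "iso3", "year", "value"


def _group_by_country(records: List[Record]) -> Dict[str, List[Record]]:
    """Group observations by country code."""
    groups: Dict[str, List[Record]] = defaultdict(list)
    for rec in records:
        groups[rec["iso3"]].append(rec)
    return groups


def keep_mrv(records: List[Record], n: int) -> List[Record]:
    """Keep the n most recent observations per country (hypothetical helper)."""
    out: List[Record] = []
    for _, recs in sorted(_group_by_country(records).items()):
        newest_first = sorted(recs, key=lambda r: r["year"], reverse=True)
        out.extend(newest_first[:n])
    return out


def keep_latest(records: List[Record]) -> List[Record]:
    """Keep only the most recent observation per country."""
    return keep_mrv(records, 1)


def keep_circa(records: List[Record], target_year: int) -> List[Record]:
    """Per country, keep the observation closest to target_year.

    Assumption: when two years are equally close, the later one wins.
    """
    out: List[Record] = []
    for _, recs in sorted(_group_by_country(records).items()):
        closest = min(recs, key=lambda r: (abs(r["year"] - target_year), -r["year"]))
        out.append(closest)
    return out


# Synthetic values for illustration only, not UNICEF data.
sample = [
    {"iso3": "ALB", "year": 2012, "value": 1.0},
    {"iso3": "ALB", "year": 2017, "value": 2.0},
    {"iso3": "ALB", "year": 2019, "value": 3.0},
    {"iso3": "BRA", "year": 2014, "value": 4.0},
    {"iso3": "BRA", "year": 2016, "value": 5.0},
]

print([(r["iso3"], r["year"]) for r in keep_latest(sample)])
# [('ALB', 2019), ('BRA', 2016)]
print([(r["iso3"], r["year"]) for r in keep_circa(sample, 2015)])
# [('ALB', 2017), ('BRA', 2016)]
```

Note that circa can return a different year for each country, which is why the year stays attached to every record.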
| Parameter | Type | Description |
|---|---|---|
| indicator | string/vector | Indicator code(s), e.g., "CME_MRY0T4" |
| countries | vector | ISO3 codes, e.g., ["ALB", "USA"] |
| year | int/string | Single (2020), range ("2015:2023"), or list |
| sex | string | "_T" (total), "F", "M", or "ALL" |
| format | string | "long", "wide", or "wide_indicators" |
| latest | boolean | Keep only most recent value per country |
| mrv | integer | Keep N most recent values per country |
| circa | boolean | Find closest available year |
See platform READMEs for complete parameter documentation.
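The `year` parameter accepts three shapes: a single year, a `"start:end"` range, or a list. As a rough illustration of how such a specification could be normalized into explicit years, here is a minimal sketch. The `expand_years` helper is hypothetical and is not part of the package; how the package itself parses `year` is not documented here.

```python
from typing import Iterable, List, Union

YearSpec = Union[int, str, Iterable[int]]


def expand_years(year: YearSpec) -> List[int]:
    """Expand a year specification into a sorted list of distinct years.

    Hypothetical helper: accepts 2020, "2020", "2015:2023", or [2015, 2020].
    """
    if isinstance(year, int):
        return [year]
    if isinstance(year, str):
        if ":" in year:
            start, end = (int(part) for part in year.split(":", 1))
            if start > end:
                raise ValueError(f"range start {start} is after end {end}")
            return list(range(start, end + 1))  # inclusive on both ends
        return [int(year)]
    return sorted({int(y) for y in year})


print(expand_years(2020))          # [2020]
print(expand_years("2015:2018"))   # [2015, 2016, 2017, 2018]
print(expand_years([2020, 2015]))  # [2015, 2020]
```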
| Category | Count | Description |
|---|---|---|
| NUTRITION | 112 | Stunting, wasting, etc. |
| CAUSE_OF_DEATH | 83 | Causes of death |
| CHILD_RELATED_SDG | 77 | SDG targets |
| WASH_HOUSEHOLDS | 57 | Water & Sanitation |
| PT | 43 | Child Protection |
| CHLD_PVTY | 43 | Child Poverty |
| CME | 39 | Child Mortality |
| EDUCATION | 38 | Education |
| HIV_AIDS | 38 | HIV/AIDS |
| MNCH | 38 | Maternal & Child Health |
| IMMUNISATION | 18 | Immunization |
Use list_categories() for the complete list (733 indicators across 22 categories).
| Indicator | Dataflow | Description |
|---|---|---|
| CME_MRY0T4 | CME | Under-5 mortality rate |
| CME_MRM0 | CME | Neonatal mortality rate |
| NT_ANT_HAZ_NE2_MOD | NUTRITION | Stunting prevalence |
| IM_DTP3 | IMMUNISATION | DTP3 coverage |
| IM_MCV1 | IMMUNISATION | Measles coverage |
| WS_PPL_W-SM | WASH | Safely managed water |
| PT_CHLD_Y0T4_REG | PT | Birth registration |
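In the table above, each indicator code starts with a prefix that lines up with its dataflow. The sketch below shows one plausible way to guess a dataflow from that prefix. The mapping is copied from the table rows only, and the `guess_dataflow` helper is hypothetical: the package's actual auto-detection logic is not shown in this README and may work differently.

```python
from typing import Optional

# Prefix -> dataflow pairs taken from the indicator table; not exhaustive.
PREFIX_TO_DATAFLOW = {
    "CME": "CME",
    "NT": "NUTRITION",
    "IM": "IMMUNISATION",
    "WS": "WASH",
    "PT": "PT",
}


def guess_dataflow(indicator: str) -> Optional[str]:
    """Return a candidate dataflow for an indicator code, or None if unknown."""
    prefix = indicator.split("_", 1)[0]
    return PREFIX_TO_DATAFLOW.get(prefix)


for code in ["CME_MRY0T4", "NT_ANT_HAZ_NE2_MOD", "WS_PPL_W-SM", "XX_UNKNOWN"]:
    print(code, "->", guess_dataflow(code))
```

Returning `None` for an unknown prefix leaves room for a caller to fall back to other candidate dataflows rather than failing outright.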
unicefData/
├── R/ # R package
│ ├── *.R # R source files
│ ├── metadata/current/ # R metadata cache
│ └── README.md # R documentation
├── python/ # Python package
│ ├── unicef_api/ # Python module
│ ├── metadata/current/ # Python metadata cache
│ └── README.md # Python documentation
├── stata/ # Stata package
│ ├── src/ # Stata source files
│ ├── metadata/current/ # Stata metadata cache
│ ├── qa/ # QA test suite
│ └── README.md # Stata documentation
├── config/ # Shared configuration
├── tests/ # Cross-platform tests
├── validation/ # Cross-platform validation
├── DESCRIPTION # R package metadata
├── NEWS.md # Changelog
└── README.md # This file
Cross-Language Quality & Testing
- Cache management APIs: clear_cache() (Python), clear_unicef_cache() (R), clearcache (Stata)
- Error handling improvements: Configurable timeouts with SDMXTimeoutError (Python), fixed apply_circa() NA handling (R)
- Portability: Removed all hardcoded paths; R uses system.file(), Stata uses 3-tier resolution
- Error context: All 404 errors now show which dataflows were tried
- Cross-language test suite: 39 shared fixture tests (Python 14, R 13, Stata 12)
- YAML schema documentation: Comprehensive format reference for all 7 YAML file types
Major Quality Milestone
- SYNC-02 fix: Resolved critical metadata enrichment bug
- 100% test coverage: R (26), Python (28), Stata (38/38)
- Cross-platform parity: All platforms aligned
- Fixed 404 fallback behavior
- Added dynamic User-Agent strings
- Added comprehensive test coverage
See NEWS.md for complete changelog.
The package automatically downloads and caches indicator metadata on first use. Cache refreshes every 30 days.
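To make the 30-day refresh rule concrete, the sketch below checks whether a cached file is missing or older than a maximum age, using the file's modification time. It is an illustration only: the `is_stale` helper and the `indicators.yaml` file name are hypothetical, and the package may track cache age differently.

```python
import os
import tempfile
import time
from pathlib import Path
from typing import Optional

MAX_AGE_DAYS = 30  # refresh interval stated in this README
SECONDS_PER_DAY = 86400


def is_stale(path: Path, max_age_days: int = MAX_AGE_DAYS,
             now: Optional[float] = None) -> bool:
    """Return True if the cache file is missing or older than max_age_days."""
    if not path.exists():
        return True
    now = time.time() if now is None else now
    age_seconds = now - path.stat().st_mtime
    return age_seconds > max_age_days * SECONDS_PER_DAY


with tempfile.TemporaryDirectory() as tmp:
    cache_file = Path(tmp) / "indicators.yaml"  # hypothetical cache file name
    print(is_stale(cache_file))  # True: nothing cached yet

    cache_file.write_text("indicators: {}\n", encoding="utf-8")
    print(is_stale(cache_file))  # False: just written

    # Backdate the file by 31 days to simulate an expired cache.
    old = time.time() - 31 * SECONDS_PER_DAY
    os.utime(cache_file, (old, old))
    print(is_stale(cache_file))  # True: older than 30 days
```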
Stata:
unicefdata_refresh_all, verbose
Python:
from unicef_api import refresh_indicator_cache
refresh_indicator_cache()
R:
refresh_indicator_cache()
Python:
from unicef_api import clear_cache
clear_cache() # Clears all 5 cache layers, reloads YAML
R:
clear_unicef_cache() # Clears all 6 cache layers, reloads YAML
Stata:
unicefdata, clearcache
# Sync metadata across all platforms
.\scripts\sync_metadata_cross_language.ps1
See docs/METADATA_GENERATION_GUIDE.md for detailed metadata sync documentation.
R:
devtools::test()
Python:
cd python && pytest
Stata:
cd stata/qa
do run_tests.do
Shared test fixtures validate structural consistency across all three languages:
# Python
python tests/test_cross_language_output.py
# R
Rscript tests/test_cross_language_output.R
# Stata
do tests/test_cross_language_output.do
cd validation
python run_validation.py --limit 10 --languages python r stata
See validation/ for validation documentation, including the Quick Start, Indicator Testing Guide, and Documentation Index.
See CONTRIBUTING.md for full guidelines.
- Report bugs — Open an issue
- Request features — Suggest new indicators or functionality
- Submit code — Fork, create branch, open pull request
git clone https://github.com/unicef-drp/unicefData.git
cd unicefData
# Python
cd python && pip install -e .
# R (in RStudio)
devtools::load_all()
# Stata
cd stata && do install_local.do
- UNICEF Data Portal: https://data.unicef.org/
- SDMX API Docs: https://data.unicef.org/sdmx-api-documentation/
- GitHub: https://github.com/unicef-drp/unicefData
- Issues: https://github.com/unicef-drp/unicefData/issues
This trilingual package ecosystem was developed at the UNICEF Data and Analytics Section. The author gratefully acknowledges the collaboration of Lucas Rodrigues, Yang Liu, and Karen Avanesian, whose technical contributions and feedback were instrumental in the development of this comprehensive data access library.
Special thanks to Yves Jaques, Alberto Sibileau, and Daniele Olivotti for designing and maintaining the UNICEF SDMX data warehouse infrastructure that makes this package possible.
The author also acknowledges the UNICEF database managers and technical teams who ensure data quality, as well as the country office staff and National Statistical Offices whose data collection efforts make this work possible.
Development of this package was supported by UNICEF institutional funding for data infrastructure and statistical capacity building. The author also acknowledges UNICEF colleagues who provided testing and feedback during development, as well as the broader open-source communities across R, Python, and Stata.
This package is provided for research and analytical purposes.
The unicefData package provides programmatic access to UNICEF's public data warehouse. While the author is affiliated with UNICEF, this package is not an official UNICEF product and the statements in this documentation are the views of the author and do not necessarily reflect the policies or views of UNICEF.
Data accessed through this package comes from the UNICEF Data Warehouse. Users should verify critical data points against official UNICEF publications at data.unicef.org.
This software is provided "as is", without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement. In no event shall the authors or UNICEF be liable for any claim, damages or other liability arising from the use of this software.
The designations employed and the presentation of material in this package do not imply the expression of any opinion whatsoever on the part of UNICEF concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.
Important Note on Data Vintages
Official statistics are subject to revisions as new information becomes available and estimation methodologies improve. UNICEF indicators are regularly updated based on new surveys, censuses, and improved modeling techniques. Historical values may be revised retroactively to reflect better information or methodological improvements.
For reproducible research and proper data attribution, users should:
- Document the indicator code - Specify the exact SDMX indicator code(s) used (e.g., CME_MRY0T4)
- Record the download date - Note when data was accessed (e.g., "Data downloaded: 2026-02-09")
- Cite the data source - Reference both the package and the UNICEF Data Warehouse
- Archive your dataset - Save a copy of the exact data used in your analysis
Example citations for data used in research:
- R: Under-5 mortality data (indicator: CME_MRY0T4) accessed from UNICEF Data Warehouse via unicefData R package (v2.1.0) on 2026-02-09. Data available at: https://sdmx.data.unicef.org/
- Python: Under-5 mortality data (indicator: CME_MRY0T4) accessed from UNICEF Data Warehouse via unicefData Python package (v2.1.0) on 2026-02-09. Data available at: https://sdmx.data.unicef.org/
- Stata: Under-5 mortality data (indicator: CME_MRY0T4) accessed from UNICEF Data Warehouse via unicefData Stata package (v2.1.0) on 2026-02-09. Data available at: https://sdmx.data.unicef.org/
This practice ensures that others can verify your results and understand any differences that may arise from data updates. For official UNICEF statistics in publications, always cross-reference with the current version at data.unicef.org.
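The checklist above (indicator code, download date, source, archived copy) can be captured automatically next to the saved dataset. The following is a minimal sketch, assuming the data has already been written to a file. The `write_provenance` helper and the sidecar file layout are hypothetical and are not provided by the package.

```python
import hashlib
import json
import tempfile
from datetime import date
from pathlib import Path
from typing import Iterable, Optional


def write_provenance(data_path: Path, indicators: Iterable[str],
                     package_version: str,
                     accessed: Optional[date] = None) -> Path:
    """Write a JSON sidecar describing where and when a dataset was obtained."""
    data_path = Path(data_path)
    record = {
        "indicators": list(indicators),
        "package": "unicefData",
        "package_version": package_version,
        "accessed": (accessed or date.today()).isoformat(),
        "source": "UNICEF Data Warehouse",
        "file": data_path.name,
        # The checksum lets readers confirm they hold the exact archived file.
        "sha256": hashlib.sha256(data_path.read_bytes()).hexdigest(),
    }
    sidecar = data_path.with_name(data_path.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar


with tempfile.TemporaryDirectory() as tmp:
    archived = Path(tmp) / "cme_mry0t4.csv"  # stand-in for a saved download
    archived.write_text("iso3,year,value\n", encoding="utf-8")
    sidecar = write_provenance(archived, ["CME_MRY0T4"], "2.1.0",
                               accessed=date(2026, 2, 9))
    print(sidecar.name)  # cme_mry0t4.csv.provenance.json
    print(json.loads(sidecar.read_text(encoding="utf-8"))["accessed"])  # 2026-02-09
```

Keeping the sidecar beside the archived file means the citation details travel with the data if the folder is shared or deposited.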
If you use this package in published work, please cite:
Azevedo, J.P. (2026). "unicefdata: Unified access to UNICEF indicators across R, Python, and Stata." Working paper. URL: https://github.com/unicef-drp/unicefData
@article{azevedo2026unicefdata,
title = {unicefdata: Unified access to {UNICEF} indicators across {R}, {Python}, and {Stata}},
author = {Azevedo, Joao Pedro},
year = {2026},
note = {Working paper},
url = {https://github.com/unicef-drp/unicefData}
}
Development assisted by AI coding tools (GitHub Copilot, Claude). All code reviewed and validated by maintainers.
Joao Pedro Azevedo (@jpazvd), Chief Statistician, UNICEF Data and Analytics Section
MIT License — See LICENSE