# Replicable AI for Microplanning (ramp) — ABLE Fork
Our team aspires to turn over control of the data value chain to humanitarians. The Replicable AI for Microplanning (ramp) project is producing an open-source deep learning model that accurately digitizes buildings in low- and middle-income countries from satellite imagery, and enables in-country users to build their own deep learning models for their regions of interest.

This codebase provides Python-based machine learning and data processing tools, built on TensorFlow and the Python geospatial tool set, for using deep learning to predict building footprints from high-resolution satellite imagery.
The ramp online documentation website contains complete documentation of the upstream mission, the ramp project, the codebase, and the associated open source dataset of image chips and label geojson files.
This is ABLE's customised fork of ramp-code, maintained on the able branch. It is tuned for our specific infrastructure (RHEL 9 / Tesla T4 GPU) and includes additional data preparation and prediction scripts contributed by Reut.
Key changes from upstream:
- Dockerfile upgraded to TF 2.15 / Python 3.11, targeting RHEL 9 with Tesla T4 GPU support
- docker-compose.yml replaces the manual `docker run` workflow with a single-command environment that handles GPU passthrough, volume mounts, and environment variables
- reut-scripts/ added — a collection of data preparation, tiling, compression, and prediction pipeline scripts contributed by Reut
- GPU memory and CPU worker counts tuned for Tesla T4 (13 GB VRAM, 28-core host)
## Repository structure

```
ramp-code/
├── colab/
│   ├── README.md
│   ├── jupyter_lab_on_colab.ipynb
│   └── train_ramp_model_on_colab.ipynb
├── data/
├── docker/
│   └── pipped-requirements.txt
├── Dockerfile                # updated: TF 2.15, RHEL 9, Tesla T4
├── docker-compose.yml        # new: single-command environment setup
├── docs/
├── experiments/
├── notebooks/
├── ramp/
├── reut-scripts/             # new: scripts contributed by Reut
│   ├── scripts/
│   │   ├── python/           # data prep and prediction Python scripts
│   │   ├── shell-scripts/    # orchestration shell scripts
│   │   └── notebooks/        # exploratory notebooks
│   ├── filter_rasters/       # raster filtering and clipping tools
│   └── duplicates_geoms/     # duplicate geometry detection
├── scripts/
├── setup.py
├── shell-scripts/
└── solaris/
```
## Quick start

This fork replaces the manual `docker run` commands from the upstream README with a docker-compose workflow. All GPU passthrough, volume mounts, ports, and environment variables are pre-configured.
Prerequisites:

- RHEL 9 (or Ubuntu 20.04) with at least one NVIDIA Tesla T4 GPU
- NVIDIA driver installed
- Docker CE + NVIDIA Container Toolkit installed
- Your user added to the `docker` group
```bash
# Build the image and start the container
docker compose up --build -d

# Open a shell in the running container
docker compose exec able bash
```

The container mounts the following directories from the host into /tf/ inside the container:
| Host path | Container path | Purpose |
|---|---|---|
| `/mnt/able-data` | `/tf/ramp-data` | Training data (fast 1.2 TB drive) |
| `./scripts` | `/tf/scripts` | Core ramp Python scripts |
| `./shell-scripts` | `/tf/shell-scripts` | Core shell orchestration scripts |
| `./reut-scripts` | `/tf/reut-scripts` | Reut's data prep and prediction scripts |
| `./experiments` | `/tf/experiments` | Training config JSON files |
| `./data` | `/tf/data` | Sample data |
| `./notebooks` | `/tf/notebooks` | Jupyter notebooks |
| `./ramp` | `/tf/ramp-code/ramp` | ramp source (live-editable) |
Jupyter is available at http://localhost:8888 and TensorBoard at http://localhost:6006.
To stop the environment:

```bash
docker compose down
```

## Reut's scripts

Reut contributed a set of scripts for the full data preparation and prediction pipeline, available in reut-scripts/ (mounted at /tf/reut-scripts inside the container). These scripts cover the end-to-end workflow from raw GeoTIFF imagery through to predicted building footprints in GeoJSON.
### prepare_dataset_for_training.sh

Takes a folder of source chips and labels, generates multimasks, and splits them into train/validate sets (85/15 split by default).

```bash
./prepare_dataset_for_training.sh -i <dataset_folder_relative_to_/tf>
```

Expected input layout:
```
/tf/<input>/
    source/   # image chips (.tif)
    labels/   # label GeoJSON files
```
Output layout (created by the script):
```
/tf/<input>/
    multimasks/
    train/chips, train/labels, train/multimasks
    validate/chips, validate/labels, validate/multimasks
```
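The 85/15 split can be sketched roughly as follows. This is illustrative only: the real script operates on matched chip/label/multimask triples, and its shuffling and seeding logic may differ.

```python
import random

def split_train_validate(chip_names, train_frac=0.85, seed=42):
    """Shuffle chip names deterministically and split into train/validate lists."""
    names = sorted(chip_names)
    random.Random(seed).shuffle(names)
    n_train = int(len(names) * train_frac)
    return names[:n_train], names[n_train:]

chips = [f"chip_{i:04d}.tif" for i in range(100)]
train, validate = split_train_validate(chips)
print(len(train), len(validate))  # 85 15
```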
### batch_tile_datasets.sh

Tiles a folder of GeoTIFFs into 256x256 chips, with a no-data threshold of 40%.
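The 40% no-data filter can be sketched as below. This is a sketch, not the script's actual code: the assumed no-data value of 0 and the exact threshold comparison may differ in the real implementation, which reads chips from GeoTIFFs.

```python
import numpy as np

NODATA_VALUE = 0          # assumed no-data pixel value
NODATA_THRESHOLD = 0.40   # skip chips with more than 40% no-data

def keep_chip(chip: np.ndarray) -> bool:
    """Return True if the chip's no-data fraction is within the threshold."""
    nodata_fraction = float(np.mean(chip == NODATA_VALUE))
    return nodata_fraction <= NODATA_THRESHOLD

good = np.full((256, 256), 128, dtype=np.uint8)   # fully valid chip
mostly_empty = np.zeros((256, 256), dtype=np.uint8)
mostly_empty[:100, :] = 128                        # ~61% no-data

print(keep_chip(good))          # True
print(keep_chip(mostly_empty))  # False
```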
```bash
./batch_tile_datasets.sh -i <input_folder> -o <output_folder> -p <chip_prefix>
```

### batch_compress_geotiffs.sh

Batch-compresses GeoTIFFs to Cloud-Optimised GeoTIFF (COG) format using JPEG compression at quality 50, processing up to 25 files in parallel.
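The 25-way parallel pattern can be pictured like this. `compress_one` is a stand-in: the real script shells out to a GDAL COG conversion per file (roughly `gdal_translate -of COG -co COMPRESS=JPEG -co QUALITY=50`), and its exact command line may differ.

```python
from multiprocessing.pool import ThreadPool

MAX_PARALLEL = 25  # matches the script's parallelism cap

def compress_one(path: str) -> str:
    # Stand-in for the per-file COG conversion, e.g.:
    #   gdal_translate -of COG -co COMPRESS=JPEG -co QUALITY=50 <in> <out>
    return path.replace(".tif", "_cog.tif")

files = [f"scene_{i}.tif" for i in range(100)]
with ThreadPool(MAX_PARALLEL) as pool:
    results = pool.map(compress_one, files)
print(results[0])  # scene_0_cog.tif
```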
```bash
./batch_compress_geotiffs.sh -i <input_folder> -o <output_folder>
```

### batch_run_predictions_on_geotiffs.sh

The main end-to-end prediction pipeline. Takes raw GeoTIFFs, tiles them, runs the model, and outputs predicted building footprints as GeoJSON. Supports resuming from any step.
```bash
./batch_run_predictions_on_geotiffs.sh \
    -i <input_folder> \
    -o <output_folder> \
    -m <model_path> \
    [-s <start_step>]   # 1=tile (default), 2=predict, 3=geojson
```

Output:
```
<output_folder>/
    geojson/predicted_buildings.geojson
    run.log
```
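The step-resume behaviour of the `-s` flag can be pictured as follows (illustrative only; the shell script's internal dispatch may look different):

```python
STEPS = {1: "tile", 2: "predict", 3: "geojson"}

def steps_to_run(start_step: int = 1) -> list:
    """Return the pipeline step names from start_step onward."""
    return [name for num, name in sorted(STEPS.items()) if num >= start_step]

print(steps_to_run())   # ['tile', 'predict', 'geojson']
print(steps_to_run(2))  # ['predict', 'geojson']
```

Resuming with `-s 2` therefore reuses the chips from an earlier tiling run instead of re-tiling.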
Intermediate chip and mask files are cleaned up automatically on success.
The tiling step parallelises across `RAMP_TILE_WORKERS` processes (default 21, tuned for our 28-core host).
### train_model_process.bash

Orchestrates a full model training run.
### Python scripts

Reut's Python scripts extend the core ramp scripts with additional capabilities:
| Script | Purpose |
|---|---|
| `tile_dataset_1.py` | Single-image tiling (used by the batch prediction pipeline) |
| `get_preds_with_heatmap.py` | Prediction with confidence heatmap output |
| `bgr_to_rgb.py` | Convert BGR-ordered imagery to RGB |
| `remove_nir.py` / `remove_nir_multiprocess.py` | Strip the NIR band from 4-band imagery |
| `ntf_to_tif.py` | Convert NTF format to GeoTIFF |
| `rename_to_tif.py` | Batch-rename imagery files to .tif |
| `select_rgb.py` | Select RGB bands from multi-band rasters |
| `clip_images_to_aoi.py` / `clip_images_with_clean_geom.py` | Clip imagery to AOI geometry |
| `augmentation.py` | Data augmentation utilities |
| `qa_imgs.py` | QA checks on image chips |
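As an illustration, the band-order fix performed by `bgr_to_rgb.py` amounts to reversing the channel axis. This is a sketch on an in-memory array; the real script reads and writes GeoTIFF files.

```python
import numpy as np

def bgr_to_rgb(img: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an (H, W, 3) BGR image to get RGB."""
    return img[..., ::-1]

bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # set the blue channel (first in BGR order)
rgb = bgr_to_rgb(bgr)
print(rgb[0, 0].tolist())  # [0, 0, 255]
```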
### Notebooks

| Notebook | Purpose |
|---|---|
| `maxar_pipeline.ipynb` | End-to-end pipeline for Maxar imagery |
| `QA_images_for_ramp.ipynb` | Visual QA of image chips |
| `calculate_iou.ipynb` | IoU metric calculation |
| `BGR_to_RGB.ipynb` | Band order conversion |
| `remove_nir_multithreading.ipynb` | NIR band removal with multithreading |
| `built_index.ipynb` | Built-up area index calculation |
| `merge_shp.ipynb` | Merge shapefiles |
| `ntf_to_tif.ipynb` | NTF to GeoTIFF conversion |
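For reference, the IoU metric computed in `calculate_iou.ipynb` has this general shape. This is a minimal sketch on binary raster masks; the notebook itself may work on vector footprints instead.

```python
import numpy as np

def iou(pred: np.ndarray, truth: np.ndarray) -> float:
    """Intersection-over-union of two binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    union = np.logical_or(pred, truth).sum()
    return float(intersection / union) if union else 1.0

a = np.zeros((4, 4), dtype=np.uint8); a[:2, :] = 1  # top half
b = np.zeros((4, 4), dtype=np.uint8); b[:, :2] = 1  # left half
print(iou(a, b))  # 4 overlapping pixels / 12 in the union ≈ 0.333
```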
## Performance tuning

The docker-compose environment is pre-tuned for a Tesla T4 GPU:
| Variable | Value | Meaning |
|---|---|---|
| `RAMP_GPU_MEMORY_MB` | 13312 | 13 GiB (13,312 MiB) of the T4's 15,360 MiB VRAM |
| `RAMP_TILE_WORKERS` | 21 | 75% of 28 CPU cores for parallel tiling |
To adjust for a different machine, edit these values in docker-compose.yml.
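When porting to other hardware, the worker count can be derived the same way the default was: roughly 75% of the host's cores. A sketch (the helper name is ours, not part of the repo):

```python
import os

def tile_workers(cpu_count: int, frac: float = 0.75) -> int:
    """Suggest a RAMP_TILE_WORKERS value as a fraction of available cores."""
    return max(1, int(cpu_count * frac))

print(tile_workers(28))                   # 21, the fork's default
print(tile_workers(os.cpu_count() or 1))  # suggestion for the current host
```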
## Development

The ramp/ source directory is mounted live into the container, so edits to the ramp library are reflected immediately without rebuilding the image. If you add new dependencies, rebuild with:
```bash
docker compose up --build
```

## Training

Use the standard ramp training script, or Reut's train_model_process.bash, from inside the container:
```bash
python /tf/scripts/train_ramp.py -config /tf/experiments/<your_config>.json
```

Training configuration files live in experiments/. See docs/using_the_ramp_training_configuration_file.md for full documentation of the config options.
## Credits

This fork is based on devglobalpartners/ramp-code. The upstream project and its full dataset are documented at rampml.global.
## Licensing

This software is licensed under the Apache 2.0 license.