kshitijbhattWHO/ramp-code

Our team aspires to turn over control of the data value chain to humanitarians. The Replicable AI for Microplanning (ramp) project is producing an open-source deep learning model to accurately digitize buildings in low- and middle-income countries from satellite imagery, and to enable in-country users to build their own deep learning models for their regions of interest.

This codebase provides Python-based machine learning and data processing tools, built on TensorFlow and the Python geospatial tool set, for using deep learning to predict building footprints from high-resolution satellite imagery.

The ramp online documentation website contains complete documentation of the upstream mission, the ramp project, the codebase, and the associated open source dataset of image chips and label geojson files.


About this fork

This is ABLE's customised fork of ramp-code, maintained on the able branch. It is tuned for our specific infrastructure (RHEL 9 / Tesla T4 GPU) and includes additional data preparation and prediction scripts contributed by Reut.

Key changes from upstream:

  • Dockerfile upgraded to TF 2.15 / Python 3.11, targeting RHEL 9 with Tesla T4 GPU support
  • docker-compose.yml replaces the manual docker run workflow with a single-command environment that handles GPU passthrough, volume mounts, and environment variables
  • reut-scripts/ added — a collection of data preparation, tiling, compression, and prediction pipeline scripts contributed by Reut
  • GPU memory allocation and CPU worker counts tuned for the Tesla T4 (13 GiB VRAM allocation, 28-core host)

Project structure

ramp-code/
├── colab/
│   ├── README.md
│   ├── jupyter_lab_on_colab.ipynb
│   └── train_ramp_model_on_colab.ipynb
├── data/
├── docker/
│   └── pipped-requirements.txt
├── Dockerfile                    # updated: TF 2.15, RHEL 9, Tesla T4
├── docker-compose.yml            # new: single-command environment setup
├── docs/
├── experiments/
├── notebooks/
├── ramp/
├── reut-scripts/                 # new: scripts contributed by Reut
│   ├── scripts/
│   │   ├── python/               # data prep and prediction Python scripts
│   │   ├── shell-scripts/        # orchestration shell scripts
│   │   └── notebooks/            # exploratory notebooks
│   ├── filter_rasters/           # raster filtering and clipping tools
│   └── duplicates_geoms/         # duplicate geometry detection
├── scripts/
├── setup.py
├── shell-scripts/
└── solaris/

Getting started with Docker (recommended)

This fork replaces the manual docker run commands from the upstream README with a docker-compose workflow. All GPU passthrough, volume mounts, ports, and environment variables are pre-configured.

Prerequisites

  1. RHEL 9 (or Ubuntu 20.04) with at least one NVIDIA Tesla T4 GPU
  2. NVIDIA driver installed
  3. Docker CE + NVIDIA Container Toolkit installed
  4. Your user added to the docker group

Build and start the environment

# Build the image and start the container
docker compose up --build -d

# Open a shell in the running container
docker compose exec able bash

The container mounts the following directories from the host into /tf/ inside the container:

| Host path       | Container path      | Purpose                                  |
|-----------------|---------------------|------------------------------------------|
| /mnt/able-data  | /tf/ramp-data       | Training data (fast 1.2 TB drive)        |
| ./scripts       | /tf/scripts         | Core ramp Python scripts                 |
| ./shell-scripts | /tf/shell-scripts   | Core shell orchestration scripts         |
| ./reut-scripts  | /tf/reut-scripts    | Reut's data prep and prediction scripts  |
| ./experiments   | /tf/experiments     | Training config JSON files               |
| ./data          | /tf/data            | Sample data                              |
| ./notebooks     | /tf/notebooks       | Jupyter notebooks                        |
| ./ramp          | /tf/ramp-code/ramp  | ramp source (live-editable)              |

Jupyter is available at http://localhost:8888 and TensorBoard at http://localhost:6006.

Stopping the environment

docker compose down

Reut's scripts

Reut contributed a set of scripts for the full data preparation and prediction pipeline, available in reut-scripts/ (mounted at /tf/reut-scripts inside the container). These scripts cover the end-to-end workflow from raw GeoTIFF imagery through to predicted building footprints in GeoJSON.

Shell scripts (inside the container at /tf/reut-scripts/scripts/shell-scripts/)

prepare_dataset_for_training.sh

Takes a folder of source chips and labels, generates multimasks, and splits into train/validate sets (85/15 split by default).

./prepare_dataset_for_training.sh -i <dataset_folder_relative_to_/tf>

Expected input layout:

/tf/<input>/
    source/     # image chips (.tif)
    labels/     # label GeoJSON files

Output layout (created by the script):

/tf/<input>/
    multimasks/
    train/chips, train/labels, train/multimasks
    validate/chips, validate/labels, validate/multimasks
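
The 85/15 split can be sketched as follows. This is a minimal illustration of the splitting idea, not the script's actual implementation; the function name and the fixed seed are assumptions for the example:

```python
import random

def split_train_validate(chip_ids, train_fraction=0.85, seed=42):
    """Shuffle chip IDs and split them into train/validate lists."""
    ids = list(chip_ids)
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]
```

Each chip ID would then be used to move the matching chip, label, and multimask files into the train/ or validate/ subfolders.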

batch_tile_datasets.sh

Tiles a folder of GeoTIFFs into 256x256 chips, with a no-data threshold of 40%.

./batch_tile_datasets.sh -i <input_folder> -o <output_folder> -p <chip_prefix>
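
The 40% no-data rule can be sketched like this. It is a hypothetical check, assuming chips arrive as NumPy arrays and that 0 marks no-data pixels; the script's actual test may differ:

```python
import numpy as np

NODATA_VALUE = 0          # assumed no-data marker
NODATA_THRESHOLD = 0.40   # discard chips with more than 40% no-data pixels

def keep_chip(chip: np.ndarray) -> bool:
    """Return True if the chip's no-data fraction is within the threshold."""
    nodata_fraction = np.mean(chip == NODATA_VALUE)
    return nodata_fraction <= NODATA_THRESHOLD
```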

batch_compress_geotiffs.sh

Batch-compresses GeoTIFFs to Cloud-Optimised GeoTIFF (COG) format using JPEG compression at quality 50, processing up to 25 files in parallel.

./batch_compress_geotiffs.sh -i <input_folder> -o <output_folder>
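
One way to express the per-file conversion is a gdal_translate call with the COG driver; the helper below only builds the command list, and the exact options the shell script passes are an assumption:

```python
def cog_command(src, dst, quality=50):
    """Build a gdal_translate command for JPEG-compressed COG output."""
    return [
        "gdal_translate", src, dst,
        "-of", "COG",
        "-co", "COMPRESS=JPEG",
        "-co", f"QUALITY={quality}",
    ]
```

Commands like this could then be dispatched with e.g. `subprocess.run`, up to 25 at a time, to match the script's parallelism.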

batch_run_predictions_on_geotiffs.sh

The main end-to-end prediction pipeline. Takes raw GeoTIFFs, tiles them, runs the model, and outputs predicted building footprints as GeoJSON. Supports resuming from any step.

./batch_run_predictions_on_geotiffs.sh \
  -i <input_folder> \
  -o <output_folder> \
  -m <model_path> \
  [-s <start_step>]   # 1=tile (default), 2=predict, 3=geojson

Output:

<output_folder>/
    geojson/predicted_buildings.geojson
    run.log

Intermediate chip and mask files are cleaned up automatically on success.
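
The resume behaviour amounts to skipping the stages before the requested step. A minimal sketch of that dispatch logic (names hypothetical, not taken from the script):

```python
# Pipeline stages in execution order, keyed by the -s step number.
STEPS = {1: "tile", 2: "predict", 3: "geojson"}

def steps_to_run(start_step=1):
    """Return the pipeline stages to execute, in order, from start_step."""
    if start_step not in STEPS:
        raise ValueError(f"start step must be one of {sorted(STEPS)}")
    return [name for num, name in sorted(STEPS.items()) if num >= start_step]
```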

The tiling step parallelises across RAMP_TILE_WORKERS processes (default 21, tuned for our 28-core host).
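
A sketch of how that parallelism might look, assuming a per-file tiling function and reading the worker count from the environment (the function names are illustrative, not the script's own):

```python
import os
from multiprocessing import Pool

def tile_workers(default=21):
    """Read the worker count from RAMP_TILE_WORKERS, falling back to 21."""
    return int(os.environ.get("RAMP_TILE_WORKERS", default))

def tile_all(paths, tile_one):
    """Tile GeoTIFFs in parallel across RAMP_TILE_WORKERS processes."""
    with Pool(processes=tile_workers()) as pool:
        return pool.map(tile_one, paths)
```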

train_model_process.bash

Orchestrates a full model training run.

Python scripts (inside the container at /tf/reut-scripts/scripts/python/)

Reut's Python scripts extend the core ramp scripts with additional capabilities:

| Script | Purpose |
|--------|---------|
| tile_dataset_1.py | Single-image tiling (used by the batch prediction pipeline) |
| get_preds_with_heatmap.py | Prediction with confidence heatmap output |
| bgr_to_rgb.py | Convert BGR-ordered imagery to RGB |
| remove_nir.py / remove_nir_multiprocess.py | Strip the NIR band from 4-band imagery |
| ntf_to_tif.py | Convert NTF format to GeoTIFF |
| rename_to_tif.py | Batch-rename imagery files to .tif |
| select_rgb.py | Select RGB bands from multi-band rasters |
| clip_images_to_aoi.py / clip_images_with_clean_geom.py | Clip imagery to AOI geometry |
| augmentation.py | Data augmentation utilities |
| qa_imgs.py | QA checks on image chips |
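
As an illustration, the band-order conversion that bgr_to_rgb.py performs reduces to reversing the channel axis. This sketch works on in-memory arrays only; the actual script also reads and writes GeoTIFFs:

```python
import numpy as np

def bgr_to_rgb(image: np.ndarray) -> np.ndarray:
    """Reverse the channel order of an H x W x 3 array (BGR -> RGB)."""
    return image[..., ::-1]
```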

Notebooks (inside the container at /tf/reut-scripts/scripts/notebooks/)

| Notebook | Purpose |
|----------|---------|
| maxar_pipeline.ipynb | End-to-end pipeline for Maxar imagery |
| QA_images_for_ramp.ipynb | Visual QA of image chips |
| calculate_iou.ipynb | IoU metric calculation |
| BGR_to_RGB.ipynb | Band order conversion |
| remove_nir_multithreading.ipynb | NIR band removal with multithreading |
| built_index.ipynb | Built-up area index calculation |
| merge_shp.ipynb | Merge shapefiles |
| ntf_to_tif.ipynb | NTF to GeoTIFF conversion |

GPU and performance tuning

The docker-compose environment is pre-tuned for a Tesla T4 GPU:

| Variable | Value | Meaning |
|----------|-------|---------|
| RAMP_GPU_MEMORY_MB | 13312 | ≈87% of the T4's 15,360 MiB VRAM |
| RAMP_TILE_WORKERS | 21 | 75% of 28 CPU cores for parallel tiling |

To adjust for a different machine, edit these values in docker-compose.yml.
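
For illustration, a script could honour RAMP_GPU_MEMORY_MB roughly as follows. The TensorFlow call shown is an assumption about how the cap gets applied, not a confirmed excerpt from this codebase:

```python
import os

def gpu_memory_limit_mb(default=13312):
    """Read the GPU memory cap (MiB) from RAMP_GPU_MEMORY_MB."""
    return int(os.environ.get("RAMP_GPU_MEMORY_MB", default))

def apply_gpu_limit():
    """Cap TensorFlow's GPU allocation at the configured limit."""
    import tensorflow as tf  # imported lazily; only needed at runtime
    gpus = tf.config.list_physical_devices("GPU")
    if gpus:
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(
                memory_limit=gpu_memory_limit_mb())],
        )
```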


Development workflow

The ramp/ source directory is mounted live into the container, so edits to the ramp library are reflected immediately without rebuilding the image. If you add new dependencies, rebuild with:

docker compose up --build

Training a model

Use the standard ramp training script, or Reut's train_model_process.bash, from inside the container:

python /tf/scripts/train_ramp.py -config /tf/experiments/<your_config>.json

Training configuration files live in experiments/. See docs/using_the_ramp_training_configuration_file.md for full documentation on config options.


Upstream project

This fork is based on devglobalpartners/ramp-code. The upstream project and its full dataset are documented at rampml.global.


LICENSING:

This software is licensed under the Apache License 2.0.
