# Tech Arena 2025 — Anthropometric Measurement Extraction (RGB-D → 11D)

Predicts 11 anthropometric measurements (1 head width and 5 per ear) from multi-view RGB-D images using a ResNet-18 CNN trained on 2D images.
This project was developed for Tech Arena 2025 (Anthropometric Data Extraction Track) by Team HuddsPros.
## 🧠 Core Idea

Instead of complex 3D reconstruction or point-cloud methods, this system predicts anthropometric measurements directly from multiple 2D RGB images of a subject. A pretrained ResNet-18 (ImageNet) backbone is fine-tuned for regression on an 11-dimensional vector. During inference, predictions from multiple views are averaged to form a subject-level output.

**Pipeline summary:**

1. Convert `.HEIC` → `.PNG` (RGB + depth separation).
2. Resize and normalise using ImageNet statistics.
3. Load images and 11D ground truths into a PyTorch `Dataset`.
4. Fine-tune ResNet-18 (frozen backbone + new 11-output regression head).
5. Average predictions across all subject views at inference.
## 📂 Repository Structure

```
Heads_ears/
├─ Environment/
│  ├─ environment_cpu.yml
│  └─ environment.yml
├─ PythonScripts/
│  ├─ train_model.py                          # Main training script
│  ├─ resnet18_anthropometry_best_weights.pt  # Trained weights
│  └─ sorting_data/                           # Preprocessing and utility scripts
│     ├─ Process_DATA.py          # Converts HEIC → PNG, separates RGB & depth
│     ├─ preprocess_size.py       # Resizes large images
│     ├─ csvfileread.py           # Loads anthropometrics.csv → subject labels
│     ├─ checkresolution.py       # Verifies image resolution
│     ├─ remove_depth_images.py   # Moves depth maps into one folder
│     ├─ Split_depth_maps.py      # Reassigns depth maps to subjects
│     ├─ Split_training_test.py   # Splits subjects into train/val/test
│     ├─ HIEF_PNG_liam.py         # Optional HEIC-to-PNG converter (legacy)
│     └─ prediction.py            # Inference and averaging across views
├─ .gitignore
└─ README.md
```
## ⚙️ Setup Instructions

### 1️⃣ Create environment

```bash
conda env create -f Environment/environment_cpu.yml
conda activate anthropometry
pip install pillow-heif
```

### 2️⃣ Verify installation

```bash
python -c "import torch; print('Torch:', torch.__version__, '| CUDA available?', torch.cuda.is_available())"
```
## 🧾 Ground Truth Format

Each subject’s entry in `anthropometrics.csv` corresponds to an 11D vector:

```csv
subject,headwidth,p1_left,p2_left,p3_left,p4_left,p5_left,p1_right,p2_right,p3_right,p4_right,p5_right
P0001,0.14399,0.02679,0.01669,0.02307,0.07029,0.03025,0.02550,0.01709,0.02457,0.07021,0.03342
```
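A CSV in this format can be turned into a subject → vector lookup in a few lines. This is a hedged sketch of what a loader like `csvfileread.py` might do (the function name `load_labels` is ours); it assumes only the column layout shown above.

```python
import numpy as np
import pandas as pd

def load_labels(csv_path: str) -> dict[str, np.ndarray]:
    """Map each subject ID to its 11D ground-truth measurement vector."""
    df = pd.read_csv(csv_path)
    cols = [c for c in df.columns if c != "subject"]  # the 11 measurement columns
    return {row["subject"]: row[cols].to_numpy(dtype=np.float32)
            for _, row in df.iterrows()}
```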
## 🚀 Running the Pipeline

### 1️⃣ Convert HEIC → PNG

```bash
python PythonScripts/sorting_data/Process_DATA.py
```

### 2️⃣ Train the model

```bash
python PythonScripts/train_model.py
```

Training configuration:

- Model: ResNet-18 (ImageNet pretrained)
- Optimiser: SGD (momentum = 0.9, lr = 1e-3, StepLR step = 7, γ = 0.1)
- Loss: MSE
- Batch size: 16
- Epochs: 25
- Device: CUDA if available, else CPU
- Saves weights to `resnet18_anthropometry_best_weights.pt`

### 3️⃣ Run predictions

```bash
python PythonScripts/sorting_data/prediction.py
```

- Loads trained weights
- Averages predictions across all views per subject
- Outputs the final 11D prediction vector per subject
## 📊 Evaluation Metric

Competition metric:

d = √( Σᵢ (ŝᵢ − sᵢ)² )

where both vectors are standardised using the training-set mean and standard deviation. Lower distance = higher accuracy.
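In code, the metric above is a Euclidean distance between z-scored vectors. A minimal sketch (the function name `competition_distance` is ours; `train_mean` and `train_std` are the per-dimension statistics of the training set, as the metric definition requires):

```python
import numpy as np

def competition_distance(pred, truth, train_mean, train_std) -> float:
    """d = sqrt(sum_i (s_hat_i - s_i)^2) over standardised 11D vectors."""
    p = (np.asarray(pred, dtype=np.float64) - train_mean) / train_std
    t = (np.asarray(truth, dtype=np.float64) - train_mean) / train_std
    return float(np.sqrt(np.sum((p - t) ** 2)))
```

A perfect prediction gives d = 0; because both vectors are standardised with the same statistics, errors in millimetre-scale ear measurements and the larger head-width measurement contribute on a comparable scale.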
## 🧮 Model Summary

| Stage    | Description                                                 |
|----------|-------------------------------------------------------------|
| Input    | RGB PNG images (224×224)                                    |
| Backbone | ResNet-18 (ImageNet pretrained)                             |
| Head     | Linear (512 → 11) regression layer                          |
| Loss     | Mean Squared Error (MSE)                                    |
| Training | Image-level; predictions averaged across views at inference |
## 🛠️ Tools & Libraries

| Component     | Tool                          |
|---------------|-------------------------------|
| Language      | Python 3.10                   |
| Framework     | PyTorch, TorchVision          |
| Image I/O     | Pillow, Pillow-HEIF           |
| Data Handling | Pandas, NumPy                 |
| Environment   | Conda (`environment_cpu.yml`) |
| IDE           | VSCode                        |
## 📈 Future Work

- Integrate depth as a fourth CNN channel (RGB-D fusion).
- Automate ear cropping for more localised features.
- Explore attention-based multi-view fusion instead of mean pooling.
- Deploy the model as a lightweight Extractor API for leaderboard testing.
## 👥 Authors

- William Meredith
- Jonathan Southwell

University of Huddersfield — Team HuddsPros
## 📚 References

1. He, K. et al. *Deep Residual Learning for Image Recognition*. CVPR 2016.
2. Fantini, D. et al. *A Survey on ML Techniques for HRTF Individualization*. IEEE OJSP 2025.
3. Torres-Gallegos, E. A. et al. *Photo-Anthropometry for HRTF Personalization*. Applied Acoustics 2015.
4. PyTorch Documentation – https://pytorch.org
5. ResNet-18 – https://pytorch.org/vision/stable/models/generated/torchvision.models.resnet18.html