WHAR Datasets

This library provides tooling for working with widely used WHAR (Wearable Human Activity Recognition) datasets. Its features include:

  • automated downloading from original sources and data extraction
  • parsing into a unified, standardized data format
  • configurable pre-processing (e.g., resampling, windowing) and post-processing (e.g., normalization)
  • dataset splitting for common evaluation protocols such as LOSO and K-Fold cross-validation
  • built-in caching and multi-processing for improved performance
  • seamless integration with PyTorch and TensorFlow
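
To illustrate the LOSO (leave-one-subject-out) protocol mentioned above, here is a minimal, library-independent sketch. It does not use or reproduce the internals of `LOSOSplitter`; the function name `loso_splits` and the toy subject labels are hypothetical and exist only for illustration.

```python
def loso_splits(subject_ids):
    """Build one (held_out_subject, train_indices, test_indices) tuple per subject.

    subject_ids[i] is the subject who produced sample i (hypothetical input).
    """
    splits = []
    for held_out in sorted(set(subject_ids)):
        # Every sample from the held-out subject goes to the test set,
        # all remaining samples go to the training set.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        splits.append((held_out, train, test))
    return splits


# Toy data: six windows recorded from three subjects.
window_subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
splits = loso_splits(window_subjects)
for held_out, train, test in splits:
    print(held_out, train, test)
```

Each subject appears in exactly one test set, so a model is always evaluated on a person it has never seen during training.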

The library currently includes out-of-the-box support for 33 datasets (listed below). Additional WHAR datasets can be easily integrated by defining a custom configuration with an associated parser and registering it with the framework.

Notice

This library does not host any datasets. To use a dataset, please visit its original website and make sure you understand and agree to the dataset’s terms and conditions.

How to Use

Installation

pip install "git+https://github.com/teco-kit/whar-datasets.git"

Example with PyTorch

from whar_datasets import (
    Loader,
    LOSOSplitter,
    PostProcessingPipeline,
    PreProcessingPipeline,
    TorchAdapter,
    WHARDatasetID,
    get_dataset_cfg,
)

# create cfg for WISDM dataset
cfg = get_dataset_cfg(WHARDatasetID.WISDM)

# create and run pre-processing pipeline
pre_pipeline = PreProcessingPipeline(cfg)
activity_df, session_df, window_df = pre_pipeline.run()

# create LOSO splits
splitter = LOSOSplitter(cfg)
splits = splitter.get_splits(session_df, window_df)
split = splits[0]

# create and run post-processing pipeline for the specific split
post_pipeline = PostProcessingPipeline(cfg, pre_pipeline, window_df, split.train_indices)
samples = post_pipeline.run()

# create dataloaders for the specific split
loader = Loader(session_df, window_df, post_pipeline.samples_dir, samples)
adapter = TorchAdapter(cfg, loader, split)
dataloaders = adapter.get_dataloaders(batch_size=64)
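
In the example above, the post-processing pipeline receives `split.train_indices`. One plausible reading, stated here as an assumption rather than a description of the library's internals, is that normalization statistics are fitted on the training windows only, so that no information from the test subject leaks into training. The sketch below shows that general idea in plain Python; `fit_standardizer`, `standardize`, and the toy windows are hypothetical names and data, not part of `whar_datasets`.

```python
from statistics import fmean, pstdev


def fit_standardizer(windows, train_indices):
    """Compute mean and standard deviation from the training windows only."""
    train_values = [v for i in train_indices for v in windows[i]]
    mean = fmean(train_values)
    # Fall back to 1.0 so a constant signal does not cause division by zero.
    std = pstdev(train_values) or 1.0
    return mean, std


def standardize(windows, mean, std):
    """Apply the already-fitted statistics to every window, train and test alike."""
    return [[(v - mean) / std for v in window] for window in windows]


# Toy data: three single-channel windows; the last one belongs to the test split.
windows = [[1.0, 3.0], [5.0, 7.0], [100.0, 200.0]]
train_indices = [0, 1]

mean, std = fit_standardizer(windows, train_indices)
normalized = standardize(windows, mean, std)
```

The test window is scaled with the training statistics, so its unusually large values remain visible to the model instead of being absorbed into the normalization.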

Supported Datasets

Single-Sensor Datasets

| Name | Year | Paper | Citations |
| --- | --- | --- | --- |
| WISDM | 2010 | Activity Recognition using Cell Phone Accelerometers | 3862 |
| UCI-HAR | 2013 | A Public Domain Dataset for Human Activity Recognition using Smartphones | 3372 |
| UTD-MHAD | 2015 | UTD-MHAD: A Multimodal Dataset for Human Action Recognition Utilizing a Depth Camera and a Wearable Inertial Sensor | 997 |
| HAPT | 2016 | Transition-aware human activity recognition using smartphones | 939 |
| USC-HAD | 2012 | USC-HAD: A Daily Activity Dataset for Ubiquitous Activity Recognition Using Wearable Sensors | 753 |
| UniMiB-SHAR | 2017 | Unimib shar: a dataset for human activity recognition using acceleration data from smartphones | 712 |
| MotionSense | 2019 | Mobile Sensor Data Anonymization | 345 |
| RealLifeHAR | 2020 | A Public Domain Dataset for Real-Life Human Activity Recognition Using Smartphone Sensors | 208 |
| WISDM-19-PHONE | 2019 | WISDM: Smartphone and Smartwatch Activity and Biometrics Dataset | 198 |
| WISDM-19-WATCH | 2019 | WISDM: Smartphone and Smartwatch Activity and Biometrics Dataset | 198 |
| KU-HAR | 2021 | KU-HAR: An open dataset for heterogeneous human activity recognition | 187 |
| Hang-Time | 2023 | Hang-time HAR: A benchmark dataset for basketball activity recognition using wrist-worn inertial sensors | 52 |
| CAPTURE-24 | 2024 | CAPTURE-24: A large dataset of wrist-worn activity tracker data collected in the wild for human activity recognition | 45 |

Multi-Sensor Datasets

| Name | Year | Paper | Citations |
| --- | --- | --- | --- |
| PAMAP2 | 2012 | Introducing a New Benchmarked Dataset for Activity Monitoring | 1758 |
| OPPORTUNITY | 2010 | Collecting complex activity datasets in highly rich networked sensor environments | 1024 |
| HHAR | 2015 | Smart Devices are Different: Assessing and Mitigating Mobile Sensing Heterogeneities for Activity Recognition | 1019 |
| MHEALTH | 2014 | mHealthDroid: A Novel Framework for Agile Development of Mobile Health Applications | 887 |
| DSADS | 2010 | Comparative study on classifying human activities with miniature inertial and magnetic sensors | 780 |
| SAD | 2014 | Fusion of Smartphone Motion Sensors for Physical Activity Recognition | 752 |
| Daphnet | 2009 | Ambulatory monitoring of freezing of gait in Parkinson’s disease | 652 |
| RealWorld | 2016 | On-body Localization of Wearable Devices: An Investigation of Position-Aware Activity Recognition | 482 |
| UP-Fall | 2019 | UP-fall detection dataset: A multimodal approach | 462 |
| UMAFall | 2017 | Umafall: A multisensor dataset for the research on automatic fall detection | 243 |
| REALDISP | 2014 | Dealing with the Effects of Sensor Displacement in Wearable Activity Recognition | 216 |
| HuGaDB | 2018 | HuGaDB: Human Gait Database for Activity Recognition from Wearable Inertial Sensor Networks | 154 |
| HARTH | 2021 | HARTH: A Human Activity Recognition Dataset for Machine Learning | 132 |
| w-HAR | 2020 | w-HAR: An Activity Recognition Dataset and Framework Using Low-Power Wearable Devices | 100 |
| WEAR | 2024 | Wear: An outdoor sports dataset for wearable and egocentric activity recognition | 66 |
| HAR70+ | 2021 | A machine learning classifier for detection of physical activity types and postures during free-living | 55 |
| UCA-EHAR | 2022 | UCA-EHAR: A Dataset for Human Activity Recognition with Embedded AI on Smart Glasses | 35 |
| GOTOV | 2022 | A recurrent neural network architecture to model physical activity energy expenditure in older people | 33 |

Citation

If you use the WHAR Datasets library in your research, please cite our paper:

@inproceedings{burzer2025whar,
  title={WHAR Datasets: An Open Source Library for Wearable Human Activity Recognition},
  author={Burzer, Maximilian and King, Tobias and Riedel, Till and Beigl, Michael and R{\"o}ddiger, Tobias},
  booktitle={Companion of the 2025 ACM International Joint Conference on Pervasive and Ubiquitous Computing},
  pages={1315--1322},
  year={2025}
}
