- [February 23, 2026] We have officially moved from willxxy/ECG-Bench to ELM-Research/ELM. There are major updates to the documentation and code flow. Please read the documentation and feel free to post any issues!
A research framework for finetuning and evaluating ECG-language models (ELMs). Supports multiple architectures, training objectives, and data representations with distributed training out of the box. Prepare datasets with ecg_preprocess before use. Additionally, if you want to pretrain an ECG encoder, please view ecg_nn.
We hope to continuously update the repository to support more features, ELMs, and datasets. Please feel free to contribute to the repository! If there are any questions or bugs, please do not hesitate to reach out to wjhan{@}andrew{dot}cmu{dot}edu or submit an issue with corresponding details.
Status: Beta.
We use torch 2.9 with cuda 12.8 and primarily use H100 GPUs.
git clone https://github.com/ELM-Research/ELM.git
cd ELM && uv sync

For BPE symbolic representation with ECG-Byte, compile the Rust tokenizer:
cd src/dataloaders/data_representation/bpe
maturin develop --release

If Rust is not installed:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --default-toolchain=1.82.0 -y
First, preprocess the ECGs using the ecg_preprocess repository.
The data folder should have the following structure:
data
├── csn
│   ├── preprocessed_500
│   ├── preprocessed_1250
│   └── preprocessed_2500
├── cpsc
│   └── ...
├── ptb_xl
│   └── ...
├── mimic_iv
│   └── ...
└── code15
    └── ...
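The layout above can be scaffolded before copying preprocessed files in. Below is a minimal sketch using a hypothetical helper (not part of the repo); the dataset names come from the tree above, and the preprocessed_* variants shown are illustrative:

```python
from pathlib import Path

# Hypothetical helper (not in the repo): create the expected data layout
# before copying preprocessed .npy files in. Adjust VARIANTS to match the
# sampling rates / segment lengths you actually preprocess.
DATASETS = ["csn", "cpsc", "ptb_xl", "mimic_iv", "code15"]
VARIANTS = ["preprocessed_500", "preprocessed_1250", "preprocessed_2500"]

def make_data_tree(root: str = "data") -> None:
    for ds in DATASETS:
        for variant in VARIANTS:
            Path(root, ds, variant).mkdir(parents=True, exist_ok=True)

make_data_tree()
```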
We support the following datasets in a unified way through HuggingFace datasets. Each dataset includes ecg_path, the path to the .npy files in the data folder, as well as the conversational data (text).
Note that we support mixing different datasets by specifying multiple datasets like so:
--data ecg-qa-ptbxl-250-2500 ecg-qa-mimic-iv-ecg-250-2500
We also released synthetic classification datasets on Hugging Face for signal-type identification tasks, where the model predicts whether the input signal is ECG, noise, or flatline. Dataset names follow this format: ecg-comp-ecg-noise-flatline-20000-250-2500. In this example, the dataset contains 20,000 instances per class (ECG, noise, and flatline) across the training and test splits. We also provide binary classification variants, such as ecg-comp-noise-flatline-30000-250-2500, which indicates a binary task with noise and flatline classes and 30,000 instances per class across the training and test splits.
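The naming convention can be unpacked programmatically. Below is a hypothetical parser (not part of the repo); we assume the trailing two numbers are the sampling rate and segment length, matching the *-250-2500 suffix used by the other datasets:

```python
# Hypothetical helper (not in the repo) that unpacks the synthetic-dataset
# naming convention described above:
#   ecg-comp-<class_1>-...-<class_N>-<instances_per_class>-<rate>-<length>
def parse_synthetic_name(name: str) -> dict:
    parts = name.split("-")
    if parts[:2] != ["ecg", "comp"]:
        raise ValueError("not a synthetic classification dataset name")
    instances, rate, length = (int(p) for p in parts[-3:])
    return {
        "classes": parts[2:-3],          # e.g. ["ecg", "noise", "flatline"]
        "instances_per_class": instances,
        "sampling_rate": rate,           # assumed meaning of the 2nd number
        "segment_length": length,        # assumed meaning of the 3rd number
    }

info = parse_synthetic_name("ecg-comp-ecg-noise-flatline-20000-250-2500")
```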
For additional datasets and task details, see HF_DATASETS in src/configs/constants.py and src/dataloaders/system_prompts/.
| --data_representation | Description |
|---|---|
| signal | Raw ECG matrix |
| symbolic | BPE-tokenized symbolic sequence |
| stacked_signal | Synthetic three-channel version of signal: the signal repeated three times along the color dimension |
| rgb | Derived from signal via plotting; represented as an H × W × C′ tensor, where H and W denote the image height and width and C′ is the number of color channels |
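As a concrete illustration of stacked_signal, here is a minimal numpy sketch (not the repo's dataloader code); the 12 × 2500 shape is illustrative only:

```python
import numpy as np

# Sketch of the stacked_signal idea: repeat the raw ECG matrix three times
# along a new leading "color" dimension so that 3-channel vision encoders
# can consume it. Shapes are illustrative: 12 leads x 2500 samples.
signal = np.random.randn(12, 2500).astype(np.float32)
stacked = np.repeat(signal[np.newaxis, ...], 3, axis=0)  # shape (3, 12, 2500)
```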
We utilize the following pretrained LLMs from HuggingFace.
| LLM | --llm |
|---|---|
| Llama 3 | llama-3.2-3b-instruct |
| Llama 3 | llama-3.2-1b-instruct |
| Gemma 2 | gemma-2-2b-it |
| Qwen 2.5 | qwen2.5-7b-instruct |
| Qwen 2.5 | qwen2.5-1.5b-instruct |
We utilize the following ECG-specific encoders.
| ECG Encoders | --encoder | --data_representation |
|---|---|---|
| MERL | merl | signal |
| MLAE | mlae | signal |
| MTAE | mtae | signal |
| ST-Mem | st_mem | signal |
We utilize the following pretrained vision encoders from HuggingFace.
| Vision Encoders | --encoder | --data_representation |
|---|---|---|
| Siglip2 | siglip2-so400m-patch16-naflex | rgb, stacked_signal |
| ViT | vit-base-patch16-224-in21k | rgb, stacked_signal |
| CLIP | clip-vit-base-patch32 | rgb, stacked_signal |
We implement several ELMs and describe how to train each variant.
We implement a Llava-like architecture where we connect the encoder to the LLM with a projection layer.
uv run src/main_trainer.py \
--data pretrain-mimic-250-2500 \
--data_representation $DATA_REPRESENTATION \
--llm qwen2.5-1.5b-instruct \
--encoder $ECG_ENCODER or $VISION_ENCODER \
--elm llava

For multi-GPU training, launch the same script like so. This is general to all ELMs.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
uv run torchrun --standalone --nproc_per_node=4 \
src/main_trainer.py \
--data pretrain-mimic-250-2500 \
--data_representation $DATA_REPRESENTATION \
--llm qwen2.5-1.5b-instruct \
--encoder $ECG_ENCODER or $VISION_ENCODER \
--elm llava \
--distributed

For ECG encoders, you will have to pretrain your own encoder using ecg_nn. We plan to release pretrained encoders soon! To load a pretrained encoder during ELM training, run the following:
uv run src/main_trainer.py \
--data pretrain-mimic-250-2500 \
--data_representation signal \
--llm qwen2.5-1.5b-instruct \
--encoder $ECG_ENCODER \
--elm llava \
--encoder_ckpt $ENCODER_CHECKPOINT.pt

To update the encoder during ELM training, specify like so:
uv run src/main_trainer.py \
--data pretrain-mimic-250-2500 \
--data_representation $DATA_REPRESENTATION \
--llm qwen2.5-1.5b-instruct \
--encoder $ECG_ENCODER or $VISION_ENCODER \
--elm llava \
--update_encoder

We implement an encoder-free ELM, similar to Fuyu-8B.
uv run src/main_trainer.py \
--data pretrain-mimic-250-2500 \
--data_representation signal \
--llm qwen2.5-1.5b-instruct \
--elm fuyu

We implement ECG-Byte and provide a trained BPE tokenizer (src/dataloaders/data_representation/bpe/ecg_byte_tokenizer_10000.pkl). Note that you can also train your own BPE tokenizer in ecg_preprocess; however, we find ECG-Byte to be generalizable across different datasets. To train an ELM with ECG-Byte, run the following:
uv run src/main_trainer.py \
--data pretrain-mimic-250-2500 \
--data_representation symbolic \
--llm qwen2.5-1.5b-instruct \
--ecg_tokenizer src/dataloaders/data_representation/bpe/ecg_byte_tokenizer_10000.pkl \
--elm ecg_byte

To evaluate your model, run main_evaluator.py while specifying your trained ELM checkpoint via --elm_ckpt:
uv run src/main_evaluator.py \
--data ecg-qa-mimic-iv-ecg-250-2500 \
--data_representation signal \
--llm qwen2.5-1.5b-instruct \
--encoder merl \
--elm llava \
--encoder_ckpt $ENCODER_CHECKPOINT.pt \
--elm_ckpt $PATH_TO_ELM_CKPT.pt

To chat with your model, you will need a sample *.npy file and a trained ELM checkpoint. Then run the following:
CUDA_VISIBLE_DEVICES=0 uv run src/main_chat.py \
--llm qwen2.5-0.5b-instruct \
--elm patch_elf \
--system_prompt src/dataloaders/system_prompts/system_prompt.txt \
--peft \
--elm_ckpt $ELM_CHECKPOINT.pt \
--num_encoder_tokens 100 \
--data_representation signal
After running the script, load the ECG by typing the following in the first turn:
============================================================
ELM Chat Interface
============================================================
Commands:
/ecg <path> Load an ECG signal (.npy file)
/clear Clear conversation history
/quit Exit
You: /ecg $PATH_TO_SAMPLE.npy
After this turn, you can ask questions over any number of subsequent turns, and all answers will be conditioned on the loaded ECG. We do not currently support adding additional ECGs to a single conversation.
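If you just want to try the chat interface, a placeholder ECG can be generated as below. This is a hypothetical example; the shape (12, 2500) assumes 12 leads at 250 Hz for 10 s, matching the *-250-2500 dataset naming, and a real preprocessed ECG is needed for meaningful answers:

```python
import numpy as np

# Generate a placeholder .npy for the /ecg command. Shape (12, 2500) is an
# assumption (12 leads, 2500 samples); replace with a real preprocessed ECG.
sample = np.random.randn(12, 2500).astype(np.float32)
np.save("sample_ecg.npy", sample)
```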
| Flag | Description |
|---|---|
| --torch_compile | torch.compile the model |
| --data_subset | Use a dataset fraction for quick runs |
| --augment_ecg / --augment_rgb | Enable augmentations |
| --perturb | noise, zeros, or only_text |
| --optimizer | adam, adamw, muon |
We list the research that has been conducted using this repository. Please feel free to add your own research here!
- ECG-Byte: A Tokenizer for End-to-End Generative Electrocardiogram Language Modeling
- Signal, Image, or Symbolic: Exploring the Best Input Representation for Electrocardiogram-Language Models Through a Unified Framework
- Retrieval-Augmented Generation for Electrocardiogram-Language Models
- Encoder-Free ECG-Language Models
We welcome contributions to the repository! Please feel free to open an issue or pull request for any bugs or features you would like to add. We are always looking for new ECG datasets to benchmark our methods on. If you have any recommendations, please let us know! Also, a good place to start is by looking at the TODO section.
For most processes, we provide a --dev flag to run at a smaller scale and add some verbosity for debugging. Feel free to add this flag when needed!
We thank the following people for their contributions to the repository:
This work is done in collaboration with the Mario Lemieux Center for Heart Rhythm Care at Allegheny General Hospital.
We thank Chaojing Duan, Michael A. Rosenberg, Emerson Liu, Ding Zhao, Hyoeun Kang, Wenhao Ding, Haohong Lin, Shiqi Liu, Xiaoyu (Simon) Song, Tony Chen, Atharva Mhaskar, Zhepeng Cen, Yihang Yao, and Dylan Leong for their helpful discussions, feedback, and support in developing the initial ECG-Bench, which turned into the current ELM repository.
We thank the authors of ECG-Byte, MERL, ST-MEM, ECG-QA, ECG-Chat, PULSE, and GEM for their code and publicly released datasets.
Lastly, we thank HuggingFace for providing the APIs for the models.
MIT, except st_mem.py, mlae.py, mtae.py which are CC BY-NC 4.0.
