ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking

Project Page arXiv YouTube

Official implementation of ReSurgSAM2, which extends the Segment Anything Model 2 (SAM2) with credible long-term tracking for real-time referring segmentation in surgical video.

Haofeng Liu, Mingqi Gao, Xuxiao Luo, Ziyue Wang, Guanyi Qin, Junde Wu, Yueming Jin

Early accepted by MICCAI 2025

Overview

Surgical scene segmentation is critical in computer-assisted surgery, directly affecting surgical quality and patient outcomes. We introduce ReSurgSAM2, a two-stage surgical referring segmentation framework that:

  • Leverages SAM2 to perform text-referred target detection with our Cross-modal Spatial-Temporal Mamba (CSTMamba) for precise detection and segmentation
  • Employs a Credible Initial Frame Selection (CIFS) strategy for reliable tracking initialization
  • Incorporates a Diversity-driven Long-term Memory (DLM) that maintains a credible and diverse memory bank for consistent long-term tracking
  • Operates in real-time at 61.2 FPS, making it practical for clinical applications
  • Achieves substantial improvements in accuracy and efficiency compared to existing methods
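
To make the two-stage design concrete, here is a minimal sketch of the control flow. This is an illustration only, not the repository's API: every class and method name below (detect, track, memory.maybe_add, and so on) is a hypothetical placeholder for the components described above.

# Hypothetical sketch of the two-stage pipeline; all names are placeholders,
# not identifiers from this repository.

def refer_and_track(frames, text_query, detector, tracker, num_candidates=5):
    # Stage 1: text-referred detection (CSTMamba) on a few candidate frames;
    # Credible Initial Frame Selection (CIFS) picks the most confident one.
    candidates = []
    for frame in frames[:num_candidates]:
        mask, confidence = detector.detect(frame, text_query)
        candidates.append((confidence, frame, mask))
    _, init_frame, init_mask = max(candidates, key=lambda c: c[0])

    # Stage 2: SAM2-style tracking from the selected frame. The
    # Diversity-driven Long-term Memory (DLM) decides which (frame, mask)
    # pairs enter the memory bank, keeping it credible and diverse.
    tracker.initialize(init_frame, init_mask)
    masks = []
    for frame in frames:
        mask = tracker.track(frame)
        tracker.memory.maybe_add(frame, mask)
        masks.append(mask)
    return masks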

Video Demo

ReSurgSAM2 Demo

Click the image above to watch the full demo on YouTube

Architecture

architecture

Overall architecture of ReSurgSAM2

Installation

We build the model on PyTorch 2.5.1 and Python 3.10. It is recommended to create a new conda environment:

conda create -n resurgsam2 python=3.10
conda activate resurgsam2
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124

cd project_root
pip install -e ".[dev]" --no-build-isolation -v

cd mamba
pip install -e . --no-build-isolation -v

# To test the installation:
python -c "import mamba_ssm"

However, if you want to use the latest packages, you can install SAM2 without the --no-build-isolation flag:

conda create -n resurgsam2 python=3.10
conda activate resurgsam2

cd project_root
pip install -e ".[dev]" -v

cd mamba
pip install -e . --no-build-isolation -v

# To test the installation:
python -c "import mamba_ssm"

More installation details can be found in SAM2 INSTALL.md and mamba/README.md.
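
A quick way to sanity-check the environment from Python (a generic snippet, not part of this repository; the expected versions are the ones pinned above):

# Generic environment sanity check; not part of this repository.
import torch
import mamba_ssm  # fails here if the mamba build did not succeed

print(torch.__version__)          # expect 2.5.1 with the cu124 wheel above
print(torch.cuda.is_available())  # expect True on a CUDA-capable machine
print("mamba_ssm imported from:", mamba_ssm.__file__)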

Dataset Acquisition and Preprocessing

Please follow the steps in datasets/README.md.

Note: We have corrected minor annotation inconsistencies in the Ref-EndoVis17/18 training dataset; details can be found in datasets/README.md.

Training

Set the working directory to project_root, then run:

For Ref-Endovis17:

export CUDA_VISIBLE_DEVICES=0
python training/train.py --config configs/rvos_training/17/sam2.1_s_ref17_resurgsam_pretrained --num-gpus 1
python training/train.py --config configs/rvos_training/17/sam2.1_s_ref17_resurgsam --num-gpus 1

For Ref-Endovis18:

export CUDA_VISIBLE_DEVICES=0
python training/train.py --config configs/rvos_training/18/sam2.1_s_ref18_resurgsam_pretrained --num-gpus 1
python training/train.py --config configs/rvos_training/18/sam2.1_s_ref18_resurgsam --num-gpus 1

Evaluation

Download the checkpoints from Google Drive and place them in project_root/checkpoints/.

Set the working directory to project_root, then run:

For Ref-Endovis17:

python tools/rvos_inference.py \
  --training_config_file configs/rvos_training/17/sam2.1_s_ref17_resurgsam \
  --sam2_cfg configs/sam2.1/sam2.1_hiera_s_rvos.yaml \
  --sam2_checkpoint checkpoints/sam2.1_hiera_s_ref17.pth \
  --output_mask_dir results/ref-endovis17/hiera_small_long_mem \
  --dataset_root ./data/Ref-Endovis17/valid \
  --gpu_id 0 \
  --apply_long_term_memory \
  --num_cifs_candidate_frame 5

For Ref-Endovis18:

python tools/rvos_inference.py \
  --training_config_file configs/rvos_training/18/sam2.1_s_ref18_resurgsam \
  --sam2_cfg configs/sam2.1/sam2.1_hiera_s_rvos.yaml \
  --sam2_checkpoint checkpoints/sam2.1_hiera_s_ref18.pth \
  --output_mask_dir results/ref-endovis18/hiera_small_long_mem \
  --dataset_root ./data/Ref-Endovis18/valid \
  --gpu_id 0 \
  --apply_long_term_memory \
  --num_cifs_candidate_frame 5
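
To eyeball the predictions, you can overlay a saved mask on its frame. A minimal sketch, assuming the masks are written as single-channel PNGs under the --output_mask_dir above; both paths are illustrative placeholders and need to be adjusted to your layout:

# Minimal visualization sketch; paths are illustrative placeholders.
import numpy as np
from PIL import Image

frame_path = "data/Ref-Endovis17/valid/<sequence>/frame_0000.png"                    # hypothetical
mask_path = "results/ref-endovis17/hiera_small_long_mem/<sequence>/frame_0000.png"   # hypothetical

frame = np.array(Image.open(frame_path).convert("RGB"), dtype=np.float32)
mask = np.array(Image.open(mask_path).convert("L")) > 0  # binary foreground

overlay = frame.copy()
overlay[mask] = 0.5 * overlay[mask] + 0.5 * np.array([255.0, 0.0, 0.0])  # red tint
Image.fromarray(overlay.astype(np.uint8)).save("overlay.png")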

Acknowledgement

This research utilizes datasets from EndoVis 2017 and EndoVis 2018. If you wish to use these datasets, please request access through their respective official websites.

Our implementation builds upon the Segment Anything 2 (SAM2) framework, Mamba, and CLIP. We extend our sincere appreciation to the authors for their outstanding work and significant contributions to the field of video segmentation.

Citation

@inproceedings{resurgsam2,
  title={ReSurgSAM2: Referring Segment Anything in Surgical Video via Credible Long-term Tracking},
  author={Haofeng Liu and Mingqi Gao and Xuxiao Luo and Ziyue Wang and Guanyi Qin and Junde Wu and Yueming Jin},
  booktitle={International Conference on Medical Image Computing and Computer-Assisted Intervention},
  year={2025},
}

License

This repository is released under the Apache-2.0 license (see LICENSE), with portions under the BSD-3-Clause license (see LICENSE_cctorch).
