
PathSearch

Official Repository for Accurate and Scalable Multimodal Pathology Retrieval via Attentive Vision-Language Alignment

PathSearch is an accurate and scalable system for multimodal pathology retrieval. It features an attentive mosaic mechanism that boosts slide-to-slide retrieval accuracy, and leverages slide-report alignment to improve semantic understanding of each slide and enable multimodal retrieval.

PathSearch demonstrates higher slide-to-slide retrieval accuracy and faster slide encoding & matching speed than existing frameworks, making it suitable for real-world clinical applications.


⚠️ Note: The code has been verified for training and inference. If you find any files missing, please raise an issue. We will continue to ensure that the released code behaves the same as in our experiments.

1. Prerequisites

To preprocess WSIs in a unified style, EasyMIL Toolbox is highly recommended.
To process .kfb, .sdpc format slides in Python, please use the ASlide library.

You will need the following libraries to reproduce or deploy PathSearch (tested on Python 3.9.19):

  • torch 2.4.0
  • timm 0.9.8 (switch to the modified version 0.5.4 for CTransPath/CHIEF, provided in EasyMIL)
  • einops 0.8.0
  • numpy 1.25.1
  • scipy 1.13.1
  • scikit-learn 1.6.1
  • pandas

The complete experimental environment is listed in the requirements.txt file; however, not all libraries listed there are required by PathSearch. Installation time varies across devices but normally takes no more than 15 minutes.
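For convenience, the version pins above can be collected into a requirements fragment. This is a sketch derived only from the list above, not the repository's actual requirements.txt; choose the torch build matching your CUDA version, and note the timm caveat for CTransPath/CHIEF:

```text
torch==2.4.0
timm==0.9.8        # use the modified 0.5.4 from EasyMIL for CTransPath/CHIEF
einops==0.8.0
numpy==1.25.1
scipy==1.13.1
scikit-learn==1.6.1
pandas
```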

2. Prepare the data / archive

You can download the TCGA data and corresponding labels from the NIH Genomic Data Commons; a detailed file list is provided in PathSearch/dataset/TCGA_file_list.txt. The Camelyon16 and Camelyon17 datasets are available on the Grand Challenge and Camelyon17 platforms, respectively.
The DHMC-LUAD dataset can be obtained from the Department of Pathology and Laboratory Medicine at Dartmouth–Hitchcock Medical Center via registration and request (link). You can also use your own datasets, as long as the whole slide images are available.

You may continuously add different types of samples to your search archive, building your own diagnostic library.


⚠️ Note: You will need to use EasyMIL for tiling and feature extraction of these slides; please visit EasyMIL's official page for more information about its usage. A demo dataset is already provided in this repo for quick tests.

3. Clone the code

Clone the repository by running:

git clone [email protected]:Dootmaan/PathSearch.git

Then navigate into the project directory:

cd PathSearch

4. Demo

We provide a demo dataset containing 30 TCGA slides for quick testing and verification. The demo dataset is located in demo_dataset/ and includes pre-extracted CONCH v1.5 features in .pt format. The demo currently outputs the indices of candidate WSIs and does not include thumbnail visualization of the retrieved samples.

Run the demo retrieval:

# Run on CPU (default)
bash shells/test_demo.sh

This will write retrieval results to demo_retrieval_results.csv. The demo has been verified, and a reference demo_retrieval_results.csv is already included in the directory for reproducibility checks.
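Downstream, the results file can be consumed with only the Python standard library. The sketch below writes and then parses a tiny example file mimicking an assumed schema — the column names (query_id, rank, candidate_id, score) are assumptions for illustration, not the actual header of demo_retrieval_results.csv, so check the generated file first:

```python
import csv
from collections import defaultdict

# Hypothetical rows mimicking an ASSUMED schema; the real CSV header
# written by the demo may differ.
rows = [
    {"query_id": "q0", "rank": "1", "candidate_id": "w12", "score": "0.91"},
    {"query_id": "q0", "rank": "2", "candidate_id": "w07", "score": "0.88"},
    {"query_id": "q1", "rank": "1", "candidate_id": "w03", "score": "0.95"},
]
with open("example_results.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["query_id", "rank", "candidate_id", "score"])
    writer.writeheader()
    writer.writerows(rows)

# Group retrieved candidates per query, ordered by rank.
results = defaultdict(list)
with open("example_results.csv", newline="") as f:
    for row in csv.DictReader(f):
        results[row["query_id"]].append(
            (int(row["rank"]), row["candidate_id"], float(row["score"]))
        )
for q in results:
    results[q].sort()

print(results["q0"][0])  # best match for query q0
```

Adapting this to the real demo output should only require swapping in the actual column names from the CSV header.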

5. Training

Generally speaking, you can directly use the released weights for the attentive mosaic generator and the report encoder in the PathSearch framework.
These weights can be found on Zenodo.

To train PathSearch with the TCGA data pairs, simply run:

bash shell/train_pathsearch.sh

to train the model from scratch with the default hyperparameters.

6. Testing

This repository provides four ready-to-run scripts for the four public datasets used in the study, three of which are external. Simply run:

bash shell/test.sh

to test the model on these datasets. Be sure to specify the path to your archive.

Note: During testing, cache files are automatically generated to speed up future runs. After modifying the pipeline, you may need to delete these cache files manually so they are regenerated.
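If a cached index goes stale after a pipeline change, deleting the cache files forces a rebuild on the next run. A minimal sketch — the `*.cache` filename pattern and location are assumptions, so verify what the test scripts actually write before deleting anything:

```shell
# Stand-in cache file, created purely for illustration.
touch demo.cache
# Delete cache files in the current directory so the next test run
# regenerates them ("*.cache" is an assumed pattern; verify first).
find . -maxdepth 1 -name "*.cache" -type f -delete
```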

Acknowledgment

We used CONCH to generate patch-level embeddings via EasyMIL. We have partially borrowed code from CLIP and TransMIL to construct PathSearch; therefore, PathSearch also follows the GPL v3 license upon publication.

We sincerely thank these teams for their dedicated efforts in advancing this field. We also would like to thank the authors from the PathologySearchComparison project for the PyTorch reproduction of existing methods.

Citation

If you find this work helpful in your research, please consider citing:

@misc{wang2025accuratescalablemultimodalpathology,
      title={Accurate and Scalable Multimodal Pathology Retrieval via Attentive Vision-Language Alignment}, 
      author={Hongyi Wang and Zhengjie Zhu and Jiabo Ma and Fang Wang and Yue Shi and Bo Luo and Jili Wang and Qiuyu Cai and Xiuming Zhang and Yen-Wei Chen and Lanfen Lin and Hao Chen},
      year={2025},
      eprint={2510.23224},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.23224}, 
}
