  • Protocol

Easy and accurate protein structure prediction using ColabFold

Abstract

Since its public release in 2021, AlphaFold2 (AF2) has made it common practice to investigate biological questions by using predicted protein structures of single monomers or full complexes. ColabFold-AF2 is an open-source Jupyter Notebook inside Google Colaboratory and a command-line tool that makes it easy to use AF2 while exposing its advanced options. ColabFold-AF2 shortens the turnaround times of experiments through its optimized use of AF2’s models. In this protocol, we guide the reader through ColabFold best practices by using three scenarios: (i) monomer prediction, (ii) complex prediction and (iii) conformation sampling. The first two scenarios cover classic static structure prediction and are demonstrated on the human glycosylphosphatidylinositol transamidase protein. The third scenario demonstrates an alternative use case of the AF2 models by predicting two conformations of the human alanine serine transporter 2. Users without computational expertise can run the protocol via Google Colaboratory, and advanced users can run it in a command-line environment. Using Google Colaboratory, each procedure takes <2 h to run. The data and code for this protocol are available at https://protocol.colabfold.com.
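As a rough illustration of how the three scenarios might map onto the command-line tool, the sketch below builds FASTA input and `colabfold_batch` argument lists in Python. It is a minimal sketch, not the protocol's own scripts: the sequences are hypothetical placeholders, the helper names are invented for this example, and the flag values (such as `32:64` and four seeds) are arbitrary assumptions. The `:` separator between chains and the flags shown follow the ColabFold command-line interface as commonly documented; check `colabfold_batch --help` for the installed version.

```python
import shlex


def fasta_record(name, chains):
    """Return one FASTA record; chains of a complex are joined with ':'."""
    return f">{name}\n{':'.join(chains)}\n"


def colabfold_command(fasta_path, out_dir, flags=()):
    """Return the argument list for a colabfold_batch run (nothing is executed)."""
    return ["colabfold_batch", *flags, str(fasta_path), str(out_dir)]


# Hypothetical placeholder sequences, NOT the protocol's real inputs.
CHAIN_A = "MKTAYIAKQR"
CHAIN_B = "GSHMLEDPVA"

# Scenario name -> (chains, extra flags). Flag values are illustrative assumptions.
SCENARIOS = {
    "monomer": ([CHAIN_A], []),
    "complex": ([CHAIN_A, CHAIN_B], ["--model-type", "alphafold2_multimer_v3"]),
    "conformations": (
        [CHAIN_A],
        ["--max-msa", "32:64", "--use-dropout", "--num-seeds", "4"],
    ),
}

if __name__ == "__main__":
    for name, (chains, flags) in SCENARIOS.items():
        # Show the FASTA text and the shell command a user could run.
        print(fasta_record(name, chains), end="")
        print(shlex.join(colabfold_command(f"{name}.fasta", f"{name}_out", flags)))
```

Keeping command construction separate from execution makes it easy to inspect the exact command before committing GPU time to a run.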

Key points

  • We present an outline of how to use ColabFold to perform structure prediction of monomers, complexes and alternative conformations and guidance on interpreting the results through appropriate confidence metrics and visualizations.

  • Integrating MMseqs2’s quick homology search, ColabFold enables accelerated structure prediction compared with AlphaFold2 at similar accuracy, while exposing many advanced parameters. ColabFold can be accessed through a Google Colaboratory notebook for beginners and a command-line interface for advanced users.
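One confidence metric that can be inspected directly is the per-residue pLDDT, which AF2-style PDB output stores in the B-factor column. The sketch below reads it from the Cα atoms of a PDB file and summarizes it. This is a minimal sketch under stated assumptions: the function names are hypothetical, the input is assumed to follow the fixed-column PDB `ATOM` layout, and the cutoff of 70 is a commonly used rule of thumb rather than a value taken from this protocol.

```python
def ca_plddt(pdb_text):
    """Return per-residue pLDDT values read from the B-factor field of CA atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # Columns 13-16 hold the atom name, columns 61-66 the B-factor field.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def summarize(scores, cutoff=70.0):
    """Return (mean pLDDT, fraction of residues at or above the cutoff)."""
    if not scores:
        raise ValueError("no CA atoms found")
    mean = sum(scores) / len(scores)
    confident = sum(score >= cutoff for score in scores) / len(scores)
    return mean, confident
```

For example, `summarize(ca_plddt(open("model.pdb").read()))` would report the mean pLDDT of a model and the fraction of residues above the cutoff, where `model.pdb` stands for any predicted structure file.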

Fig. 1: Protocol overview.
Fig. 2: Quick start.
Fig. 3: Web-based ColabFold-AF2 notebook.
Fig. 4: ColabFold’s output for PIGU monomer prediction.
Fig. 5: ColabFold’s output for GPIT complex prediction.
Fig. 6: ColabFold’s ASCT2 conformation prediction by MSA depth reduction or activating dropout layers.
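The idea behind MSA depth reduction named in the Fig. 6 caption can be sketched as random subsampling of an alignment while always keeping the query sequence. The code below is only a conceptual illustration: it is not how ColabFold or AF2 implement subsampling internally, the function names are hypothetical, and it assumes a plain A3M/FASTA-like text in which the first record is the query.

```python
import random


def parse_a3m(text):
    """Parse A3M/FASTA-like text into a list of (header, sequence) tuples."""
    records = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and '#' comment lines
        if line.startswith(">"):
            records.append([line, ""])
        elif records:
            records[-1][1] += line.strip()
    return [(header, seq) for header, seq in records]


def subsample_msa(records, depth, seed=0):
    """Keep the query (first record) plus up to depth - 1 randomly chosen others."""
    if depth < 1 or not records:
        raise ValueError("need at least one record and depth >= 1")
    query, rest = records[0], records[1:]
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    return [query] + rng.sample(rest, min(depth - 1, len(rest)))
```

Repeating the subsampling with different seeds gives different shallow alignments, which is the intuition for why shallower MSAs can yield more varied predictions.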

Data availability

All sequences used in this protocol can be found in Equipment and in the PDB.

Code availability

ColabFold is available at https://github.com/sokrypton/ColabFold and https://colabfold.com. The localcolabfold installer is available at https://github.com/YoshitakaMo/localcolabfold. Colab prediction notebooks based on ColabFold-AF2 v1.5.3 and local prediction scripts are available at https://github.com/steineggerlab/colabfold-protocol, which also includes all the input and output files.

Acknowledgements

M.S. acknowledges support from the National Research Foundation of Korea (grants 2020M3-A9G7-103933, 2021-R1C1-C102065, 2021-M3A9-I4021220 and RS-2024-00396026); the Samsung DS research fund; the Creative-Pioneering Researchers Program; and the AI-Bio Research Grant through Seoul National University. M.M. acknowledges support from the National Research Foundation of Korea (grant RS-2023-00250470). Y.M. acknowledges support from the Platform Project for Supporting Drug Discovery and Life Science Research (Basis for Supporting Innovative Drug Discovery and Life Science Research (BINDS)) from AMED under grant number JP23ama121027. S.O. was supported by the National Institutes of Health (NIH) grant DP5OD026389 and the National Science Foundation (NSF) grant MCB2032259.

Author information

Contributions

G.K., S.L., E.L.K. and M.S. developed the protocol. Y.M., S.O., M.S. and M.M. developed the ColabFold software and notebooks. G.K., S.L. and H.K. performed predictions and visualized the data. S.O., M.S. and M.M. supervised the monomer and complex prediction procedures. E.L.K., Y.M., M.S. and M.M. supervised the conformation prediction procedure. G.K., S.L. and E.L.K. analyzed the results and wrote the paper, with contributions from all authors.

Corresponding authors

Correspondence to Sergey Ovchinnikov, Martin Steinegger or Milot Mirdita.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Protocols thanks Jianyi Yang and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Related links

Key references using this protocol

Mirdita, M. et al. Nat. Methods 19, 679–682 (2022): https://doi.org/10.1038/s41592-022-01488-1

Lee, S. et al. Cold Spring Harb. Perspect. Biol. 16, a041465 (2024): https://doi.org/10.1101/cshperspect.a041465

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Kim, G., Lee, S., Levy Karin, E. et al. Easy and accurate protein structure prediction using ColabFold. Nat Protoc 20, 620–642 (2025). https://doi.org/10.1038/s41596-024-01060-5

  • DOI: https://doi.org/10.1038/s41596-024-01060-5
