PLM-interact: extending protein language models to predict protein-protein interactions
- PMID: 41145424
- PMCID: PMC12559430
- DOI: 10.1038/s41467-025-64512-w
Abstract
Computational prediction of protein structure from amino acid sequence alone has been achieved with unprecedented accuracy, yet the prediction of protein-protein interactions remains a challenge. Here, we assess the ability of protein language models (PLMs), routinely applied to protein folding, to be retrained for protein-protein interaction prediction. Existing models that exploit PLMs use a pre-trained PLM feature set, ignoring that the proteins are physically interacting. We propose PLM-interact, which goes beyond single proteins by jointly encoding protein pairs to learn their relationships, analogous to the next-sentence prediction task from natural language processing. This approach achieves state-of-the-art performance in a widely adopted cross-species protein-protein interaction prediction benchmark: trained on human data and tested on mouse, fly, worm, E. coli and yeast. In addition, we develop a fine-tuning method for PLM-interact to detect mutation effects on interactions. Finally, we report that the model outperforms existing approaches in predicting virus-host interaction at the protein level. Our work demonstrates that large language models can be extended to learn the intricate relationships among biomolecules from their sequences alone.
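To illustrate what "jointly encoding protein pairs" could look like at the input level, the following is a minimal sketch, assuming a BERT-style layout in which both sequences are concatenated into one token sequence, as in next-sentence prediction. The abstract does not specify the tokenization, so the special tokens (`<cls>`, `<sep>`), the vocabulary, the segment ids, and the function name `encode_pair` are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Tuple

# Hypothetical special tokens and vocabulary (assumptions, not taken from the paper).
CLS, SEP, UNK = "<cls>", "<sep>", "<unk>"
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
VOCAB = {tok: i for i, tok in enumerate([CLS, SEP, UNK] + list(AMINO_ACIDS))}


def encode_pair(seq_a: str, seq_b: str) -> Tuple[List[int], List[int]]:
    """Join two protein sequences into one token sequence with segment ids.

    Layout: <cls> A... <sep> B... <sep>, so a single encoder can attend
    across both proteins instead of embedding each one separately.
    """
    tokens = [CLS] + list(seq_a) + [SEP] + list(seq_b) + [SEP]
    # Segment 0 covers <cls>, protein A and its <sep>;
    # segment 1 covers protein B and its <sep>.
    segments = [0] * (len(seq_a) + 2) + [1] * (len(seq_b) + 1)
    # Residues outside the 20 standard amino acids map to <unk>.
    ids = [VOCAB.get(tok, VOCAB[UNK]) for tok in tokens]
    return ids, segments


ids, segments = encode_pair("MKT", "GAV")
print(ids)
print(segments)
```

In a setup like this, a classification head on top of the encoder output would score whether the two proteins interact. This contrasts with approaches that embed each protein independently with a frozen PLM and only combine the two embeddings afterwards.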
© 2025. The Author(s).
Conflict of interest statement
Competing interests: The authors declare no competing interests.
Grants and funding
- EDDPGM-Nov21\100001/Cancer Research UK (CRUK)
- DRCMDP-Nov23/100010/Cancer Research UK (CRUK)
- BB/V016067/1/RCUK | Biotechnology and Biological Sciences Research Council (BBSRC)
- MA-TIA22-001/PCUK_/Prostate Cancer UK/United Kingdom
- 101016851/EC | Horizon 2020 Framework Programme (EU Framework Programme for Research and Innovation H2020)
- 955974/EC | EU Framework Programme for Research and Innovation H2020 | H2020 Priority Excellent Science | H2020 Marie Skłodowska-Curie Actions (H2020 Excellent Science - Marie Skłodowska-Curie Actions)
- MC_UU_00034/5/RCUK | Medical Research Council (MRC)
- MC_UU_00034/6/RCUK | Medical Research Council (MRC)
- MR/V01157X/1/RCUK | Medical Research Council (MRC)