We introduce a dual-view graph neural network (GNN) framework called scNET that integrates scRNA-seq data with protein–protein interaction networks. This approach enhances the characterization of gene functions, pathways and gene–gene relationships and improves cell clustering and the identification of differentially activated biological pathways across conditions.
The problem
Single-cell RNA sequencing (scRNA-seq) has transformed the ability to explore cellular heterogeneity, uncovering the complexity of biological systems among individual cells. However, this breakthrough has come with technical challenges, such as the high frequency of false zero read counts in scRNA-seq data¹ (termed zero inflation), which obscures true biological signals and hampers the detection of activated gene pathways and gene–gene relationships². Traditional imputation methods³ and graph-based clustering approaches have made strides in addressing this issue but often rely on gene expression data alone and do not leverage the functional context provided by protein–protein interaction (PPI) networks. PPIs offer complementary insights but lack specificity to particular cell types or conditions. Integrating scRNA-seq data with PPIs has the potential to reveal deeper cellular and molecular relationships, driving advancements in both biological research and medical applications.
The solution
We introduce scNET, a novel GNN⁴ framework that enables integration of scRNA-seq with PPI data to elucidate both gene–gene and cell–cell relationships. The scNET framework uses a dual-view architecture that simultaneously leverages a PPI network (representing relationships between different proteins) and a cell k-nearest neighbors (KNN) graph (representing relationships between different cells) to integrate PPI information into gene expression data. An attention mechanism is used to refine the KNN graph so that it better captures the various cell communities within the data (Fig. 1).
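To make the cell-view input concrete, the following is a minimal sketch of how a cell KNN graph might be built from an expression matrix. It is not the scNET implementation: the function name `knn_graph`, the Euclidean distance metric, the toy data and the value of `k` are all assumptions chosen for illustration.

```python
import numpy as np

def knn_graph(expr, k):
    """Return a boolean adjacency matrix linking each cell (row of `expr`)
    to its k nearest cells by Euclidean distance (metric is an assumption)."""
    n_cells = expr.shape[0]
    # Pairwise squared Euclidean distances between cells.
    sq_norms = (expr ** 2).sum(axis=1)
    dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * expr @ expr.T
    np.fill_diagonal(dists, np.inf)  # a cell is never its own neighbor
    # Indices of the k closest cells for every row.
    neighbors = np.argsort(dists, axis=1)[:, :k]
    adj = np.zeros((n_cells, n_cells), dtype=bool)
    adj[np.repeat(np.arange(n_cells), k), neighbors.ravel()] = True
    return adj

# Toy data (made up): 6 cells x 4 genes forming two well-separated groups.
rng = np.random.default_rng(0)
expr = np.vstack([rng.normal(0.0, 0.1, (3, 4)), rng.normal(5.0, 0.1, (3, 4))])
cell_graph = knn_graph(expr, k=2)
```

In this toy example every cell is connected only to the two other cells of its own group, which is the kind of community structure the KNN view is meant to expose.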
Fig. 1 | The protein–protein interaction (PPI) network, the k-nearest neighbors (KNN) graph, and gene expression data are first fed into the dual-view encoder (depicted by the detailed dashed line). Subsequently, graph attention layers are used to extract the latent representations of both cells and genes. The inner product decoder is then used to reconstruct the network connections, and a fully connected (FC) layer is responsible for reconstructing gene expression. Additionally, the KNN graph undergoes pruning using attention coefficients to optimize the performance of the model. GCN, graph convolutional network. © 2025, Sheinin, R. et al., CC BY-NC-ND 4.0.
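The two graph-side operations named in the figure legend can be sketched as follows. This is an illustrative sketch only: the sigmoid form of the inner product decoder is the one commonly used in graph autoencoders, and the rule of keeping a fixed top fraction of edges by attention coefficient is an assumption, since the summary does not state the pruning criterion. The names `inner_product_decoder`, `prune_edges` and `keep_fraction`, and all numeric values, are hypothetical.

```python
import numpy as np

def inner_product_decoder(z):
    """Edge probabilities as the sigmoid of pairwise embedding inner products
    (standard graph-autoencoder form; assumed here, not taken from scNET)."""
    return 1.0 / (1.0 + np.exp(-(z @ z.T)))

def prune_edges(edges, attention, keep_fraction):
    """Keep the edges with the highest attention coefficients.
    The top-fraction rule is a hypothetical pruning criterion."""
    n_keep = max(1, int(round(keep_fraction * len(edges))))
    top = np.argsort(attention)[::-1][:n_keep]
    return edges[np.sort(top)]  # preserve the original edge order

# Toy latent embeddings for three nodes (made-up values).
z = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0]])
probs = inner_product_decoder(z)

# Toy directed KNN edges with made-up attention coefficients.
edges = np.array([[0, 1], [0, 2], [1, 2], [1, 0]])
attention = np.array([0.9, 0.05, 0.1, 0.8])
kept = prune_edges(edges, attention, keep_fraction=0.5)
```

Here nodes 0 and 1 have similar embeddings, so the decoder assigns their edge a probability above 0.5, while the low-attention edges to node 2 are the ones removed by pruning.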
Through comprehensive analysis, we show that scNET can effectively incorporate PPI networks with gene expression data, leading to improved representations of cells and genes. Importantly, the reconstructed gene expression pattern of scNET not only reduced zero inflation but also substantially improved the accuracy of differential pathway enrichment analysis in different cell populations. These findings underscore the benefits of this framework for integrating biological networks with gene expression data, enabling a greater understanding of complex biological systems. Finally, we showed that the attention mechanism of scNET improves the separation of different cell communities (that is, cell clusters) in the data, capturing fine-tuned dynamics more effectively than related approaches in the field.
The implications
We believe that scNET could represent an important tool for single-cell genomics. Through the integration of scRNA-seq data with PPI networks, the scNET framework enhances our understanding of complex biological systems. This capability is particularly valuable in exploring disease pathogenesis, advancing drug discovery and tailoring personalized medicine approaches. The ability of scNET to capture the PPI network is crucial for identifying enriched pathways, potential drug targets and disease-specific biomarkers.
However, there are some limitations that warrant consideration. It is important to note that although PPI networks provide critical insights into gene interactions, they do not provide information on regulatory mechanisms, notably those driven by transcription factors. Thus, the results presented do not fully address the influence of transcription factor-mediated regulatory effects on gene expression. The use of an integrated PPI and protein–DNA interaction (PDI) network may enable a more holistic representation of protein relationships and regulations.
Over the past five years, the landscape of deep learning applications in scRNA-seq data analysis has evolved rapidly, driven by major technical advancements in large language models (LLMs)⁵. Current methods⁴ leverage these developments, focusing on training foundational models on extensive atlas datasets. By contrast, scNET adopts a complementary approach, offering a robust deep learning framework tailored for analyzing novel, small to medium-sized datasets.
Ron Sheinin, Roded Sharan & Asaf Madi
Tel Aviv University, Tel Aviv, Israel
From the editor
“scNET combines single-cell gene expression with protein–protein interaction networks to simultaneously learn gene and cell embeddings for improved downstream applications such as functional gene annotations, and the identification of gene–gene relationships within specific biological contexts.” Arunima Singh, Senior Editor, Nature Methods.
Behind the paper
This project emerged from the combined expertise of the authors, who bring backgrounds in systems immunology, scRNA-seq and biological network analysis. The unique perspectives from these domains, coupled with the understanding that orthogonal complementary information is essential to enrich single-cell analysis, naturally led to the idea of integrating single-cell data with PPI networks to deepen our understanding of cellular and molecular processes. We initially investigated this idea by leveraging PPI networks to pinpoint cellular signaling activation pathways and map cell–cell interactions. Drawing on our experience in the field, we prioritized creating a framework that is not only powerful but also computationally lightweight, ensuring accessibility to researchers with limited computational resources. By making such a tool broadly available, we hope to empower the research community to extract greater insights from their datasets (existing and newly generated alike) and advance the field of single-cell analysis. R.S., R.S. & A.M.
References
1. Jiang, R., Sun, T., Song, D. & Li, J. J. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol. 23, 1–24 (2022). A review article that covers zero inflation in scRNA-seq.
2. Cheng, Y. et al. Evaluating imputation methods for single-cell RNA-seq data. BMC Bioinformatics 24, 302 (2023). An article that reviews and evaluates imputation methods for scRNA-seq.
3. Crow, M. & Gillis, J. Co-expression in single-cell analysis: saving grace or original sin? Trends Genet. 34, 823–831 (2018). This paper discusses co-expression analysis in scRNA-seq.
4. Zhou, J. et al. Graph neural networks: a review of methods and applications. AI Open 1, 57–81 (2020). A review article that discusses GNNs.
5. Szałata, A. et al. Transformers in single-cell omics: a review and new perspectives. Nat. Methods 21, 1430–1443 (2024). A review article that presents LLM-based methods in the scRNA-seq domain.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This is a summary of: Sheinin, R. et al. scNET: learning context-specific gene and cell embeddings by integrating single-cell gene expression data with protein–protein interactions. Nat. Methods. https://doi.org/10.1038/s41592-025-02627-0 (2025).
Cite this article
A graph neural network that combines scRNA-seq and protein–protein interaction data. Nat Methods 22, 660–661 (2025). https://doi.org/10.1038/s41592-025-02628-z
