The problem

Single-cell RNA sequencing (scRNA-seq) has transformed the ability to explore cellular heterogeneity, uncovering the complexity of biological systems among individual cells. However, this breakthrough has come with technical challenges, such as the high frequency of false zero read counts in scRNA-seq data1 (termed zero inflation), which obscures true biological signals and hampers the detection of activated gene pathways and gene–gene relationships2. Traditional imputation methods3 and graph-based clustering approaches have made strides in addressing this issue but often rely on gene expression data alone and do not leverage the functional context provided by protein–protein interaction (PPI) networks. PPIs offer complementary insights but lack specificity to particular cell types or conditions. Integrating scRNA-seq data with PPIs has the potential to reveal deeper cellular and molecular relationships, driving advancements in both biological research and medical applications.

The solution

We introduce scNET, a novel graph neural network (GNN)4 framework that integrates scRNA-seq with PPI data to elucidate both gene–gene and cell–cell relationships. The scNET framework uses a dual-view architecture that simultaneously leverages a PPI network (representing relationships between different proteins) and a cell k-nearest neighbors (KNN) graph (representing relationships between different cells) to integrate PPI information into gene expression data. An attention mechanism refines the KNN graph to better capture the various cell communities within the data (Fig. 1).
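To make the cell-view input concrete, the following is a minimal sketch of how a cell KNN graph could be built from an expression matrix. It is illustrative only: the function name `knn_graph`, the toy matrix, and the choice of Euclidean distance are assumptions, not details taken from scNET.

```python
import math

def knn_graph(expr, k):
    """Map each cell index to the indices of its k nearest cells.

    `expr` is a list of per-cell expression vectors. Euclidean distance
    is an assumption made for this sketch.
    """
    n = len(expr)
    if not 0 < k < n:
        raise ValueError("k must be between 1 and n_cells - 1")
    graph = {}
    for i in range(n):
        # Distance from cell i to every other cell, paired with that cell's index.
        dists = [(math.dist(expr[i], expr[j]), j) for j in range(n) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Toy matrix: 4 cells x 3 genes (hypothetical values).
cells = [
    [5.0, 0.0, 1.0],
    [4.0, 0.0, 1.0],
    [0.0, 6.0, 2.0],
    [0.0, 5.0, 2.0],
]
graph = knn_graph(cells, k=1)
```

In this toy example, cells 0 and 1 link to each other, as do cells 2 and 3, reflecting two small cell communities. For real datasets, an approximate nearest-neighbor search would typically replace this quadratic loop.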

Fig. 1: The autoencoder model architecture.

The protein–protein interaction (PPI) network, the k-nearest neighbors (KNN) graph, and gene expression data are first fed into the dual-view encoder (depicted by the dashed outline). Subsequently, graph attention layers are used to extract the latent representations of both cells and genes. The inner product decoder is then used to reconstruct the network connections, and a fully connected (FC) layer is responsible for reconstructing gene expression. Additionally, the KNN graph is pruned using attention coefficients to optimize the performance of the model. GCN, graph convolutional network. © 2025, Sheinin, R. et al., CC BY-NC-ND 4.0.
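Two steps named in the caption can be sketched in a few lines. The decoder below uses the standard graph-autoencoder formulation, in which the predicted probability of an edge is the sigmoid of the inner product of the two node embeddings. The pruning step keeps edges whose attention coefficient meets a cutoff. The threshold rule, the function names, and all numeric values are assumptions for illustration; the source does not specify how scNET selects which edges to prune.

```python
import math

def inner_product_decoder(z):
    """Return a matrix of edge probabilities: sigmoid(z_i . z_j) for every pair."""
    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))
    return [
        [sigmoid(sum(a * b for a, b in zip(zi, zj))) for zj in z]
        for zi in z
    ]

def prune_edges(edges, attention, threshold):
    """Keep edges whose attention coefficient is at least `threshold`.

    A fixed cutoff is a hypothetical rule chosen for this sketch.
    """
    return [e for e in edges if attention[e] >= threshold]

# Hypothetical 2-dimensional embeddings for three nodes.
z = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
probs = inner_product_decoder(z)

# Hypothetical attention coefficients on three KNN edges.
edges = [(0, 1), (0, 2), (1, 2)]
attention = {(0, 1): 0.8, (0, 2): 0.1, (1, 2): 0.1}
kept = prune_edges(edges, attention, threshold=0.2)
```

Nodes 0 and 1 have aligned embeddings, so the decoder assigns their edge a probability above 0.5, whereas the opposing embeddings of nodes 0 and 2 yield a probability below 0.5. Only the edge (0, 1) survives pruning at this cutoff.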

Through comprehensive analysis, we show that scNET can effectively incorporate PPI networks with gene expression data, leading to improved representations of cells and genes. Importantly, the gene expression reconstructed by scNET not only reduces zero inflation but also substantially improves the accuracy of differential pathway enrichment analysis in different cell populations. These findings underscore the benefits of this framework for integrating biological networks with gene expression data, enabling a greater understanding of complex biological systems. Finally, we show that the attention mechanism of scNET improves the separation of different cell communities (that is, cell clusters) in the data, capturing fine-grained dynamics more effectively than related approaches in the field.

The implications

We believe that scNET could represent an important tool for single-cell genomics. Through the integration of scRNA-seq data with PPI networks, the scNET framework enhances our understanding of complex biological systems. This capability is particularly valuable in exploring disease pathogenesis, advancing drug discovery and tailoring personalized medicine approaches. The ability of scNET to capture the PPI network is crucial for identifying enriched pathways, potential drug targets and disease-specific biomarkers.

However, some limitations warrant consideration. Although PPI networks provide critical insights into gene interactions, they do not provide information on regulatory mechanisms, notably those driven by transcription factors. Thus, the results presented do not fully address the influence of transcription factor-mediated regulatory effects on gene expression. The use of an integrated PPI and protein–DNA interaction (PDI) network may enable a more holistic representation of protein relationships and regulation.

Over the past five years, the landscape of deep learning applications in scRNA-seq data analysis has evolved rapidly, driven by major technical advancements in large language models (LLMs)5. Current methods4 leverage these developments, focusing on training foundational models on extensive atlas datasets. By contrast, scNET adopts a complementary approach, offering a robust deep learning framework tailored for analyzing novel, small to medium-sized datasets.

Ron Sheinin, Roded Sharan & Asaf Madi

Tel Aviv University, Tel Aviv, Israel

From the editor

“scNET combines single-cell gene expression with protein–protein interaction networks to simultaneously learn gene and cell embeddings for improved downstream applications such as functional gene annotations, and the identification of gene–gene relationships within specific biological contexts.” Arunima Singh, Senior Editor, Nature Methods.

Behind the paper

This project emerged from the combined expertise of the authors, who bring backgrounds in systems immunology, scRNA-seq and biological network analysis. The unique perspectives from these domains, coupled with the understanding that orthogonal complementary information is essential to enrich single-cell analysis, naturally led to the idea of integrating single-cell data with PPI networks to deepen our understanding of cellular and molecular processes. We initially investigated this idea by leveraging PPI networks to pinpoint cellular signaling activation pathways and map cell–cell interactions. Drawing on our experience in the field, we prioritized creating a framework that is not only powerful but also computationally lightweight, ensuring accessibility to researchers with limited computational resources. By making such a tool broadly available, we hope to empower the research community to extract greater insights from their datasets (existing and newly generated alike) and advance the field of single-cell analysis. R.S., R.S. & A.M.