Review
Curr Opin Immunol. 2013 Oct;25(5):571-8.
doi: 10.1016/j.coi.2013.09.015. Epub 2013 Oct 19.

Computational deconvolution: extracting cell type-specific information from heterogeneous samples

Shai S Shen-Orr et al. Curr Opin Immunol. 2013 Oct.

Abstract

The basic unit of the immune system is the cell, yet analyzed samples are often heterogeneous with respect to cell subsets, which can mislead the interpretation of results. Experimentally, researchers face a difficult choice: either profile heterogeneous samples, with the ensuing confounding effects, or focus a priori on a few cell subsets of interest, potentially limiting new discoveries. An attractive alternative is to extract cell subset-specific information directly from heterogeneous samples via computational deconvolution techniques, thereby capturing both cell-centered and whole-system context. Such approaches can reveal novel biology that would otherwise go undetected. Here we review the present state of available deconvolution techniques, along with their advantages and limitations, with a focus on blood expression data and immunological studies in general.

Figures

Figure 1. Biological samples are heterogeneous with respect to underlying cell subsets, with strong implications for downstream analysis
A) Most tissue samples are composed of multiple cell subsets, and relative cell subset proportions vary widely from sample to sample, especially under pathological conditions. B) This implies that the total measured transcript abundance of a gene (as well as of many other molecular species) is strongly affected by the cell subset composition of the sample and may be decoupled into three Abundance Components. Implications of sample heterogeneity include (C) an inability to determine whether increased total expression is due to over-expression of a gene or merely to having more cells of a given subset in the sample, as well as (D) a difficulty in interpreting results and identifying the cell subset of origin of any detected differences.
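
The ambiguity described in panel C can be illustrated with a small numerical sketch. It assumes a simple linear mixing model, in which total abundance is the sum over cell subsets of subset proportion times per-cell expression. That model, the function name `total_expression`, and all numbers are illustrative assumptions rather than values or definitions taken from the review.

```python
def total_expression(proportions, per_cell_expression):
    """Total measured abundance of one gene under an assumed linear mixing model.

    proportions: fraction of each cell subset in the sample (hypothetical values).
    per_cell_expression: expression of the gene in each subset (arbitrary units).
    """
    if len(proportions) != len(per_cell_expression):
        raise ValueError("one expression value is needed per cell subset")
    return sum(p * e for p, e in zip(proportions, per_cell_expression))


# Baseline sample: two subsets in equal proportion.
baseline = total_expression([0.5, 0.5], [10.0, 2.0])

# Scenario 1: more cells of the first subset, per-cell expression unchanged.
more_cells = total_expression([0.75, 0.25], [10.0, 2.0])

# Scenario 2: same composition, but the first subset over-expresses the gene.
over_expressed = total_expression([0.5, 0.5], [14.0, 2.0])

print(baseline, more_cells, over_expressed)
```

Both scenarios raise the total by the same amount, so the bulk measurement alone cannot tell a shift in composition from a change in per-cell expression.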
Figure 2. Computational deconvolution methodologies enable capturing both cell-centered and system-wide information
Experimental approaches to sample heterogeneity either isolate the cells of interest, which perturbs the cells, loses perspective on the whole system, and is biased towards prior knowledge of which cells are of interest, or profile the heterogeneous sample directly, which provides a whole-system view but lacks any cellular context. Computational deconvolution methodologies offer an intermediate alternative that captures system-level information in a cell-centered manner, a model well suited to immunology, where cells interact with one another.
Figure 3. Five classes of computational approaches that extract cell type-specific information from heterogeneous sample data
The classes of deconvolution methods are defined according to the combination of input data they require and the type and resolution of output they offer. All methods use data from heterogeneous samples, combined with markers, signatures, or proportions, to (A) detect the presence or implication of cell types, (B) estimate cell proportions, (C) correct for heterogeneity, or (D) estimate cell type-specific expression profiles. The dotted line indicates that the output of one class of methods can be used as input for another. Complete deconvolution methods (E) alternately estimate proportions from cell type-specific expression and vice versa, starting with some limited prior knowledge of proportions or expression profiles (signatures, markers).
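
As a rough illustration of class (B), the sketch below estimates cell proportions from a heterogeneous sample given known signature profiles. It assumes a linear mixing model and uses non-negative least squares (`scipy.optimize.nnls`) as one plausible solver; the caption does not prescribe a specific algorithm. The function name `estimate_proportions`, the signature matrix, and the proportions are all hypothetical.

```python
import numpy as np
from scipy.optimize import nnls


def estimate_proportions(signatures, mixture):
    """Estimate cell subset proportions from one heterogeneous sample.

    signatures: (genes x cell types) matrix of cell type-specific expression.
    mixture:    (genes,) vector of expression measured in the mixed sample.
    Assumes mixture is approximately signatures @ proportions.
    """
    coefficients, _residual_norm = nnls(signatures, mixture)
    total = coefficients.sum()
    if total == 0:
        raise ValueError("no signature explains the mixture")
    # Rescale so that the estimated proportions sum to one.
    return coefficients / total


# Hypothetical signatures for three cell types measured on four genes.
signatures = np.array([
    [10.0, 1.0, 0.5],
    [0.5, 8.0, 1.0],
    [1.0, 0.5, 12.0],
    [3.0, 3.0, 3.0],
])

# Simulate a noise-free mixed sample with known (made-up) proportions.
true_proportions = np.array([0.6, 0.3, 0.1])
mixture = signatures @ true_proportions

estimated = estimate_proportions(signatures, mixture)
print(np.round(estimated, 3))
```

In this noise-free toy case the estimate matches the simulated proportions. With real data, measurement noise and imperfect signatures would make the recovery approximate. A complete deconvolution method in the sense of class (E) would alternate a step like this one with re-estimation of the signatures.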
