Nat Commun. 2013;4:2612.
doi: 10.1038/ncomms3612.

Inferring tumour purity and stromal and immune cell admixture from expression data

Kosuke Yoshihara et al. Nat Commun. 2013.

Abstract

Infiltrating stromal and immune cells form the major fraction of normal cells in tumour tissue; they not only perturb the tumour signal in molecular studies but also have an important role in cancer biology. Here we describe 'Estimation of STromal and Immune cells in MAlignant Tumours using Expression data' (ESTIMATE), a method that uses gene expression signatures to infer the fraction of stromal and immune cells in tumour samples. ESTIMATE scores correlate with DNA copy number-based tumour purity across samples from 11 different tumour types, profiled on Agilent or Affymetrix platforms or by RNA sequencing, and available through The Cancer Genome Atlas. The prediction accuracy is further corroborated using 3,809 transcriptional profiles available elsewhere in the public domain. The ESTIMATE method allows tumour-associated normal cells to be taken into account in genomic and transcriptomic studies. An R library is available at https://sourceforge.net/projects/estimateproject/.
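To make the idea of a signature-based score concrete, here is a minimal Python sketch. The abstract does not describe the scoring procedure ESTIMATE actually uses, so the sketch substitutes a simple stand-in: the mean normalised rank of the signature genes within one sample. The function name, the gene symbols and the signature are all hypothetical and are not taken from the paper or its R library.

```python
from statistics import mean


def signature_score(expression, signature):
    """Mean normalised rank of signature genes within one sample.

    expression: dict mapping gene symbol -> expression value (one sample).
    signature:  iterable of gene symbols (a hypothetical gene set).
    Returns a value in [0, 1]; higher means the signature genes sit
    nearer the top of the sample's expression ranking.

    Assumption: this is a simplified stand-in, not the scoring
    procedure of the published method.
    """
    # Order genes from lowest to highest expression.
    ordered = sorted(expression, key=expression.get)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two genes to rank")
    # Normalise ranks so the lowest gene gets 0 and the highest gets 1.
    rank = {gene: i / (n - 1) for i, gene in enumerate(ordered)}
    present = [g for g in signature if g in rank]
    if not present:
        raise ValueError("no signature genes found in the sample")
    return mean(rank[g] for g in present)


# Toy data: gene names and values are invented for illustration.
sample = {"GENE_A": 9.1, "GENE_B": 2.3, "GENE_C": 7.7, "GENE_D": 0.4, "GENE_E": 5.0}
hypothetical_stromal_signature = ["GENE_A", "GENE_C"]
score = signature_score(sample, hypothetical_stromal_signature)
```

A rank-based score is used here only because it is insensitive to the measurement scale, which matters when samples come from different platforms.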


Figures

Figure 1. An overview of the ESTIMATE algorithm.
The ESTIMATE algorithm takes gene expression data as input and outputs the estimated levels of infiltrating stromal and immune cells and the estimated tumour purity. Infiltrating stromal- and immune cell-related genes were identified through five gene-filtering steps.
Figure 2. Stromal and immune scores for tumour cell and stromal fractions of tumour samples.
Stromal and immune scores were generated using expression data sets obtained from tumour cell- or stromal cell-enriched samples. (a,b) Heatmaps display stromal (upper row) and immune (lower row) scores per sample (each column) for ovarian cancer samples after (a) microbead-based cell sorting and (b) laser-capture microdissection (red=high, blue=low score). (c,d) Box and whisker plots display reduced (c) stromal and (d) immune scores for the tumour cell-enriched samples (tumour part) after laser-capture microdissection compared with matched stromal cell-enriched (ovary, breast) or bulk tumour samples (lung). Boxes show the median (thick line) and the quartiles (lines). Whiskers extend 1.5 times the interquartile range (IQR) beyond the lower or the upper quartile.
Figure 3. The association between tumour purity variables in TCGA’s ovarian cancer data set.
(a–d) Scatterplots between tumour purity and (a) stromal, (b) immune and (c) ESTIMATE scores, and (d) between stromal and immune scores, in the TCGA ovarian cancer data set. TCGA ovarian cancer samples used in the gene selection (n=28) were not included in the figure. Dashed lines denote the median values of the stromal and immune scores. (e) The association between tumour purity and stromal- or immune-dominant pattern. Samples were divided into four subgroups based on the medians of the stromal and immune scores. (f) The ROC curves for four cutoff values in the TCGA ovarian cancer data set. N=417.
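The four-subgroup division in panel (e) can be sketched in Python as a median split on each score. This is an illustrative sketch only: the caption does not say how samples exactly at the median are assigned, so the code assumes that "high" means strictly above the median. The function name, sample identifiers and score values are hypothetical.

```python
from statistics import median


def median_split_groups(stromal, immune):
    """Assign each sample to one of four groups by cohort medians.

    stromal, immune: dicts mapping sample id -> score.
    Assumption: a score strictly above the cohort median is "high";
    the source does not specify how ties at the median are handled.
    """
    s_med = median(stromal.values())
    i_med = median(immune.values())
    groups = {}
    for sample_id in stromal:
        s = "high" if stromal[sample_id] > s_med else "low"
        i = "high" if immune[sample_id] > i_med else "low"
        groups[sample_id] = f"stromal-{s}/immune-{i}"
    return groups


# Toy scores, invented for illustration.
stromal_scores = {"S1": 100, "S2": -50, "S3": 300, "S4": -200}
immune_scores = {"S1": -10, "S2": 400, "S3": 250, "S4": -300}
groups = median_split_groups(stromal_scores, immune_scores)
```

Splitting at the cohort median makes the groups relative to the data set being analysed, so group membership is not comparable across cohorts with different score distributions.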
Figure 4. Evaluation of ESTIMATE algorithm.
The accuracy of the ESTIMATE algorithm was evaluated by the AUC when tumour samples were divided into high- and low-purity groups on the basis of DNA copy number-based tumour purity. (a,b) The ROC curves for four cutoff values in (a) the Agilent, Affymetrix, RNAseq and RNAseqV2 data sets and (b) the validation data set. (c) An example of ESTIMATE applied to a new Affymetrix sample, with an ESTIMATE-predicted tumour purity of 0.58. The black dot and grey dashed lines show the ESTIMATE tumour purity and 95% prediction interval, respectively. The grey dots represent the background distribution based on 955 samples from the TCGA Affymetrix data set.
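The evaluation described above, splitting samples at a purity cutoff and measuring how well a score separates the two groups, can be sketched in Python as follows. Several things here are assumptions rather than details from the source: the cutoff values are hypothetical (the caption mentions four cutoffs but does not list them), the low-purity group is treated as the positive class on the assumption that a higher score indicates lower purity, and all sample data are invented. The AUC is computed with the pairwise-comparison definition rather than by tracing a ROC curve.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive outscores a random negative.

    Ties contribute 0.5, which matches the area under the ROC curve.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def auc_at_cutoff(purity, score, cutoff):
    """Dichotomise samples by reference purity, then compute the AUC.

    purity: dict sample id -> reference purity in [0, 1].
    score:  dict sample id -> expression-based score.
    Assumption: low purity (< cutoff) is the positive class.
    """
    low = [score[s] for s in purity if purity[s] < cutoff]
    high = [score[s] for s in purity if purity[s] >= cutoff]
    return auc(low, high)


# Toy data, invented for illustration.
purity = {"S1": 0.9, "S2": 0.8, "S3": 0.5, "S4": 0.4, "S5": 0.65}
score = {"S1": -1200, "S2": -300, "S3": 800, "S4": 1500, "S5": -500}

# Hypothetical cutoffs; the source does not state which values were used.
results = {cutoff: auc_at_cutoff(purity, score, cutoff) for cutoff in (0.6, 0.7)}
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a rank-based computation would scale better for cohorts of hundreds of samples.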
Figure 5. Correlation of scores with histological findings.
Scatterplots of stromal, immune and ESTIMATE scores and ABSOLUTE-based tumour purity versus the following histological findings: percentage of stromal cells (upper left), percentage of infiltrating lymphocytes (upper right) and percentage of tumour cells (bottom panels). Twenty-eight TCGA ovarian cancer samples used in the gene selection were excluded from this analysis.
Figure 6. Unique distribution of stromal and immune scores.
(a,b) Distinct distributions of (a) stromal and (b) immune scores across different tumour types were observed in the RNAseqV2/Affymetrix platform data sets. Numbers in parentheses indicate the sample size of each data set.

