Review
Cell Metab. 2014 Nov 4;20(5):731-741.
doi: 10.1016/j.cmet.2014.10.003. Epub 2014 Nov 4.

Determining microbial products and identifying molecular targets in the human microbiome

Regina Joice et al. Cell Metab. 2014.

Abstract

Human-associated microbes are the source of many bioactive microbial products (proteins and metabolites) that play key roles both in human host pathways and in microbe-microbe interactions. Culture-independent studies now provide an accelerated means of exploring novel bioactives in the human microbiome; intriguingly, however, a substantial fraction of the microbial metagenome cannot be mapped to annotated genes or isolate genomes and is thus of unknown function. Meta'omic approaches, including metagenomic sequencing, metatranscriptomics, metabolomics, and integration of multiple assay types, represent an opportunity to efficiently explore this large pool of potential therapeutics. Paired with appropriate follow-up validation, high-throughput culture-independent assays and computational approaches can together identify and characterize novel and biologically interesting microbial products. Here we briefly review the state of microbial product identification and characterization and discuss possible next steps to catalog and leverage the large uncharted fraction of the microbial metagenome.

Figures

Figure 1. Uncharacterized microbial genes represent over half of the human gut metagenome
Data from 139 Human Microbiome Project (HMP) stool sample shotgun metagenomes (Human Microbiome Project Consortium, 2012b). The HMP Data Analysis and Coordinating Center (http://hmpdacc.org) annotated microbial community genes with a GO term (Ashburner et al., 2000), EC number (Bairoch, 2000), and/or gene name when possible. (A) Relative abundances of GO and/or EC annotated genes, uncharacterized genes with a homology-based gene name, and completely uncharacterized genes. (B) Distribution of EC-annotated genes in the Enzyme Commission functional hierarchy. Tree shows log-scaled percent coverages of direct EC annotations at each level as observed in the stool samples. With the exception of transferases and hydrolases, even when microbial genes receive EC annotations, these are often assigned at non-specific higher levels of the hierarchy rather than to specific EC subclasses, highlighting the need for deeper microbial gene product characterization in the human microbiome.
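
The specificity question raised in panel B can be illustrated with a minimal sketch that measures how many of the four EC levels each annotation actually specifies. This is not the HMP analysis itself: the function names are hypothetical, the example annotations are invented, and the sketch assumes unspecified levels are written as "-" (as in "2.7.-.-").

```python
from collections import Counter


def ec_depth(ec: str) -> int:
    """Count the leading specified levels of an EC number (0 to 4).

    Assumes unspecified levels are written as '-', e.g. '2.7.-.-' has depth 2.
    """
    depth = 0
    for part in ec.split("."):
        if part == "-" or not part:
            break
        depth += 1
    return depth


def depth_distribution(ec_numbers):
    """Return the fraction of annotations found at each depth."""
    counts = Counter(ec_depth(ec) for ec in ec_numbers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {depth: counts[depth] / total for depth in sorted(counts)}


# Invented example annotations, for illustration only (not HMP data).
example = ["2.7.7.6", "3.4.-.-", "1.-.-.-", "2.7.-.-", "3.1.3.48"]
print(depth_distribution(example))
```

A distribution weighted toward depths 1 and 2 would correspond to the pattern the caption describes, where genes carry only a broad enzyme class rather than a specific subclass.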
Figure 2. Identification and validation of microbe-derived gene product functions
An overview of the process of microbial gene functional annotation and validation. In microbial isolate genomes and metagenomes alike, gene function is typically first assigned using standard sequence analysis methods (homology-based assignment (Loewenstein et al., 2009) and domain profiling (Finn et al., 2013)). These predictions can be further refined by additional bioinformatic approaches, such as comparative metagenomics (“guilt by association” of uncharacterized microbial products with characterized genes across samples through the use of data integration), supervised curation (manual determination of a consensus among multiple complementary automated annotations (Richardson and Watson, 2013)), phylogenetic profiling (analysis of co-occurrence of genes across isolates (Eisen and Fraser, 2003)), and network context (“guilt by association” in isolate coexpression, interaction, or functional linkage networks (Sharan et al., 2007)). Following putative classification, bioactivity must be validated and further characterized by experimental methods. When standard culture is challenging (as is common for the microbiome), microculture and induction culture, as well as heterologous expression of genes and direct isolation of products, are particularly useful. Functional assays for investigating the activity of microbial products include enzymatic/metabolic activity assays (Craciun and Balskus, 2012), microbial co-culture (Yan et al., 2013), host cell profiling (Wieland Brown et al., 2013), and in vivo host phenotype assessments (Olle, 2013).
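
One of the refinement steps named above, phylogenetic profiling, can be sketched as a comparison of gene presence/absence patterns across isolate genomes. The example below is a simplified illustration, not the method of any cited work: it uses Jaccard similarity as one possible co-occurrence score, and the gene names, isolate names, function names, and the 0.75 threshold are all hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets of isolates (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cooccurring_pairs(profiles, threshold=0.75):
    """Return gene pairs whose presence profiles overlap at or above threshold.

    profiles maps a gene name to the set of isolates in which it is present.
    """
    pairs = []
    for g1, g2 in combinations(sorted(profiles), 2):
        score = jaccard(profiles[g1], profiles[g2])
        if score >= threshold:
            pairs.append((g1, g2, score))
    return pairs


# Invented presence/absence data, for illustration only.
profiles = {
    "geneA": {"iso1", "iso2", "iso3", "iso4"},  # characterized gene
    "geneB": {"iso1", "iso2", "iso3"},          # uncharacterized gene
    "geneC": {"iso5"},
}
print(cooccurring_pairs(profiles))
```

Under this kind of scheme, an uncharacterized gene that consistently co-occurs with a characterized one becomes a candidate for a related function, which still requires the experimental validation described in the caption.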
Figure 3. Integration methods for multiple data types or datasets
Schematic of approaches for data integration either (A) among different data types within the same study or (B) across different studies assaying the same data type. Integration methods include (i) network analyses capturing similarity of genes/gene products/microbes (correlation, co-abundance, co-expression, etc.); (ii) ordination projections, showing overall patterns of clustering or co-variation (shown here applying to samples, can also apply to gene/microbial features); and (iii) hierarchical statistical models such as regression that quantify the degree of association among genes/microbes and sample phenotypes. Each of these methods can be applied to one or more assay types (and phenotype metadata) within a study, or they can be applied to a combination of multiple studies. (Ai) Networks of covarying features can be generated separately for different data types (e.g. gene and transcript) or using both data types in one unified network by correlating multiple feature types. (Bi) Networks of covarying features can also be generated separately for different studies or can be summarized in one network to relate features that covary in both studies. (Aii) A combined ordination (or biplot) of multiple data types (e.g. gene and transcript) can reveal patterns of variation that enrich one or more data types or metadata (e.g. red meat consumption) in particular subsets of samples. In this example, samples are ordinated jointly with metadata, genes, and transcripts. (Bii) Ordination can be used to understand patterns of variation either independently in different studies, or a joint ordination can reveal patterns of sample co-variation across studies, possibly as linked to common metadata (e.g. consumption of non-dairy diet). (Aiii) Statistically significantly (un)related features can be identified by formal models such as linear regression. Regression among linked data types (e.g. genes and transcripts) can quantify the degree to which features or metadata associate across data types. In this example, we show feature levels that are similar between data types (close to the diagonal) as well as those that are significantly up- or down-regulated. (Biii) Statistical models can be meta-analyzed by applying them within each study, determining the significance and variability of a result within each study individually, and then comparing the resulting significance and effect sizes across studies. Meta-analysis can be used to detect signals too weak to see in any one study or to assess the reproducibility of a result across studies.
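
The meta-analysis idea in (Biii) can be illustrated with a minimal sketch. The caption does not name a specific procedure, so the example below assumes fixed-effect inverse-variance pooling, one common choice, applied to per-study effect sizes (for instance regression slopes) and their standard errors. The function names and the numbers are hypothetical.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))


def fixed_effect_pool(effects, std_errors):
    """Pool per-study effects with inverse-variance weights.

    Assumes a fixed-effect model: every study estimates the same
    underlying effect, and each is weighted by 1 / SE^2.
    Returns (pooled effect, pooled standard error, two-sided p-value).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se, two_sided_p(pooled / pooled_se)


# Invented per-study results, for illustration only.
effects = [0.30, 0.20]
std_errors = [0.20, 0.20]

for b, se in zip(effects, std_errors):
    print("study p-value:", two_sided_p(b / se))

pooled, pooled_se, pooled_p = fixed_effect_pool(effects, std_errors)
print("pooled effect:", pooled, "SE:", pooled_se, "p-value:", pooled_p)
```

Because the pooled standard error shrinks as studies are combined, a consistent effect can reach a smaller p-value in the pooled estimate than in any single study. When effects are expected to differ between cohorts, a random-effects model would be a more cautious assumption than the fixed-effect one used here.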

References

    1. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402.
    2. Amin SR, Erdin S, Ward RM, Lua RC, Lichtarge O. Prediction and experimental validation of enzyme substrate specificity in protein structures. Proc Natl Acad Sci U S A. 2013;110:E4195–4202.
    3. An D, Oh SF, Olszak T, Neves JF, Avci FY, Erturk-Hasdemir D, Lu X, Zeissig S, Blumberg RS, Kasper DL. Sphingolipids from a symbiotic microbe regulate homeostasis of host intestinal natural killer T cells. Cell. 2014;156:123–133.
    4. Arpaia N, Campbell C, Fan X, Dikiy S, van der Veeken J, deRoos P, Liu H, Cross JR, Pfeffer K, Coffer PJ, et al. Metabolites produced by commensal bacteria promote peripheral regulatory T-cell generation. Nature. 2013;504:451–455.
    5. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25:25–29.