Fig. 4: FUGAsseM predictions greatly expand putative function assignments in the human gut microbial gene catalog.
From: Predicting functions of uncharacterized gene products from microbial communities

a, High-confidence BP annotations were assigned to the top 25 most uncharacterized HMP2 species (that is, those encoding the greatest numbers of novel proteins). These species from the community showed different levels of functional characterization, including sometimes expanding the community pangenomes of species with otherwise well-characterized strains (for example, E. coli). Here, default (prediction probability 0.75) and stringent (0.85) thresholds are used to define high-confidence predictions. ‘No_ann’, protein families that were not assigned any high-confidence predictions; ‘Preserved_ann’, characterized protein families annotated in UniProt; ‘Amp_ann (default)’, characterized proteins assigned new predictions at the default threshold; ‘Amp_ann (stringent)’, the same for the stringent threshold; ‘New_ann (default)’, uncharacterized protein families assigned one or more new predictions using the default threshold; ‘New_ann (stringent)’, the same for the stringent threshold. b, Predicted functions not only expanded the function capacity of well-studied species but also improved the characterization of less studied species in the human microbiome. Both characterized proteins and uncharacterized proteins had better functional annotations. The full list is provided in Supplementary Table 13. c,d, Moreover, these predicted functions spanned all three aspects of GO (c) and benefited from data integration within FUGAsseM, taking advantage of different data types (d). The full list is provided in Supplementary Table 14. ‘Before’, not processed by FUGAsseM; ‘After (default)’, processed with the default threshold; ‘After (stringent)’, processed with the stringent threshold.
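
The category labels in panel a can be read as a simple decision rule over two inputs: whether a protein family is already annotated in UniProt, and the prediction probabilities of any new functions assigned to it. The Python sketch below illustrates that reading only and is not FUGAsseM's implementation. The function name `categorize_family`, its arguments, the inclusive (>=) threshold comparison, and the choice to report the stringent label whenever the best probability clears 0.85 are all assumptions made for illustration.

```python
DEFAULT_THRESHOLD = 0.75    # default high-confidence cutoff from the caption
STRINGENT_THRESHOLD = 0.85  # stringent high-confidence cutoff from the caption


def categorize_family(in_uniprot: bool, new_prediction_probs: list[float]) -> str:
    """Map one protein family to a panel-a label (hypothetical helper).

    in_uniprot: True if the family is already characterized in UniProt.
    new_prediction_probs: probabilities of newly predicted functions.
    """
    # Assumption: the family's tier is set by its best-scoring new prediction.
    best = max(new_prediction_probs, default=0.0)

    # Assumption: thresholds are inclusive, and stringent takes precedence.
    if best >= STRINGENT_THRESHOLD:
        tier = "stringent"
    elif best >= DEFAULT_THRESHOLD:
        tier = "default"
    else:
        tier = None

    if tier is None:
        # No new high-confidence prediction: keep the existing status.
        return "Preserved_ann" if in_uniprot else "No_ann"

    # Characterized families gain annotations; uncharacterized ones get new ones.
    prefix = "Amp_ann" if in_uniprot else "New_ann"
    return f"{prefix} ({tier})"
```

Under these assumptions, an uncharacterized family whose best new prediction scores 0.80 would be labelled ‘New_ann (default)’, while a UniProt-annotated family with no prediction above 0.75 would remain ‘Preserved_ann’.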