Proc Natl Acad Sci U S A. 2002 Apr 30;99(9):5993-8.
doi: 10.1073/pnas.092135699. Epub 2002 Apr 16.

Ab initio protein structure prediction on a genomic scale: application to the Mycoplasma genitalium genome

Daisuke Kihara et al. Proc Natl Acad Sci U S A.

Abstract

An ab initio protein structure prediction procedure, TOUCHSTONE, was applied to all 85 small proteins of the Mycoplasma genitalium genome. TOUCHSTONE is based on a Monte Carlo refinement of a lattice model of proteins, which uses threading-based tertiary restraints. Such restraints are derived by extracting consensus contacts and local secondary structure from at least weakly scoring structures that, in some cases, can lack any global similarity to the sequence of interest. Selection of the native fold was done by using the convergence of the simulation from two different conformational search schemes and the lowest energy structure by a knowledge-based atomic-detailed potential. Among the 85 proteins, for 34 proteins with significant threading hits, the template structures were reasonably well reproduced. Of the remaining 51 proteins, 29 proteins converged to five or fewer clusters. In the test set, 84.8% of the proteins that converged to five or fewer clusters had a correct fold among the clusters. If this statistic is simply applied, 24 proteins (84.8% of the 29 proteins) may have correct folds. Thus, the topology of a total of 58 proteins probably has been correctly predicted. Based on these results, ab initio protein structure prediction is becoming a practical approach.
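The estimate of 58 correctly predicted topologies follows from simple arithmetic on the counts given in the abstract. The sketch below reproduces that arithmetic. The constant and function names are illustrative only, and rounding the expected count down to a whole protein is an assumption made here to match the reported figure of 24; the abstract does not state how that number was rounded.

```python
import math

# Counts and rate as stated in the abstract.
TOTAL_PROTEINS = 85            # small proteins in the M. genitalium genome
THREADING_HITS = 34            # proteins with significant threading hits
CONVERGED_NO_HIT = 29          # remaining proteins that converged to <= 5 clusters
TEST_SET_SUCCESS_RATE = 0.848  # test-set fraction with a correct fold among the clusters


def estimate_correct_folds(threading_hits, converged, success_rate):
    """Return (expected, counted, total) for the abstract's estimate.

    Assumption: the expected count is rounded down to a whole protein,
    which matches the 24 reported in the abstract.
    """
    expected = converged * success_rate
    counted = math.floor(expected)
    return expected, counted, threading_hits + counted


remaining = TOTAL_PROTEINS - THREADING_HITS
expected, counted, estimated_total = estimate_correct_folds(
    THREADING_HITS, CONVERGED_NO_HIT, TEST_SET_SUCCESS_RATE
)

print(f"Proteins without significant threading hits: {remaining}")
print(f"Expected correct folds among converged proteins: {expected:.1f} (counted as {counted})")
print(f"Estimated correctly predicted topologies: {estimated_total} of {TOTAL_PROTEINS}")
```

The estimate applies the test-set rate directly to the genome set, so it is only as reliable as the assumption that the two sets behave alike.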

Figures

Figure 1
The number of clusters obtained for the 85 proteins. The subsets of proteins whose fold can also be assigned by sequence comparison (fasta and psi-blast, using an E value of 0.01) or by a threading method (PROSPECTOR, Z score >10) are shown separately: cross hatched, by threading; black bar, by fasta or psi-blast. All of the latter cases are included in the former.
Figure 2
The number of clusters obtained by RE with respect to the number of contacts predicted from the threading results. Nc, the number of contacts; L, the length of the chain. ●, proteins whose fold can be assigned by fasta or psi-blast; ▵, those whose fold can be assigned by threading but not by fasta or psi-blast; □, the rest of the proteins.
Figure 3
The predicted structures of MG129 (A), MG132 (B), MG353 (C), and MG449 (D), where the cluster centroid that has the largest overlap with the threading template is shown. The structures shown for MG132 and MG353 have the lowest energy in the entire simulation by the knowledge-based atomic-detailed potential, and that of MG449 has the second lowest energy by the potential. The N terminus of each protein is colored blue, and the C terminus is red.
Figure 4
Predicted structures of MG059 (small protein B homolog) (A), MG158 (50S ribosomal protein L16) (B), MG198 (50S ribosomal protein L20) (C), MG232 (50S ribosomal protein L21) (D), and MG335.1 (function unknown) (E). The functional annotations in parentheses are according to the KEGG database. The N terminus of each protein is colored blue, and the C terminus is red.
Figure 5
Similar fragments found in PDB for all of the clusters of the 85 proteins. The top five fragments are selected for each cluster centroid (both from RE and PHS) according to the Z score of rrmsd (19). The rmsd of the fragments is shown with respect to the fraction of their length to that of the proteins.

References

    1. Fetrow J S, Godzik A, Skolnick J. J Mol Biol. 1998;282:703–711.
    2. Rychlewski L, Zhang B, Godzik A. Protein Sci. 1999;8:614–624.
    3. Jones D T. J Mol Biol. 1999;287:797–815.
    4. Wolf Y I, Brenner S E, Bash P A, Koonin E V. Genome Res. 1999;9:17–26.
    5. Gerstein M. Proteins. 1998;33:518–534.