Comparative Study
Nucleic Acids Res. 2005 Apr 22;33(7):2302-9.
doi: 10.1093/nar/gki524. Print 2005.

TM-align: a protein structure alignment algorithm based on the TM-score

Yang Zhang et al. Nucleic Acids Res. 2005.

Abstract

We have developed TM-align, a new algorithm to identify the best structural alignment between protein pairs that combines the TM-score rotation matrix and Dynamic Programming (DP). The algorithm is approximately 4 times faster than CE and 20 times faster than DALI and SAL. On average, the resulting structure alignments have higher accuracy and coverage than those provided by these widely used methods. TM-align is applied to an all-against-all structure comparison of 10 515 representative protein chains from the Protein Data Bank (PDB) with a sequence identity cutoff <95%: 1996 distinct folds are found when a TM-score threshold of 0.5 is used. We also use TM-align to match the models predicted by TASSER for solved non-homologous proteins in the PDB. For both folded and misfolded models, TM-align can almost always find close structural analogs, with an average root mean square deviation (RMSD) of 3 Å and 87% alignment coverage. Nevertheless, there exists a significant correlation between the correctness of the predicted structure and the structural similarity of the model to the other proteins in the PDB. This correlation could be used to assist in model selection in blind protein structure predictions. The TM-align program is freely downloadable at http://bioinformatics.buffalo.edu/TM-align.
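The abstract relies on the TM-score without restating it. As a rough illustration, the sketch below evaluates the standard TM-score formula, TM = (1/L) Σ 1/(1 + (d_i/d0)²) with d0 = 1.24·(L − 15)^(1/3) − 1.8 Å, for one fixed superposition. The function names, the 0.5 Å floor on d0 for short chains, and the input format (a plain list of distances between aligned residue pairs) are assumptions made for this example; TM-align itself additionally searches over superpositions and alignments to maximize the score, which is not shown here.

```python
def tm_score_d0(l_norm: int) -> float:
    """Length-dependent distance scale d0 in Å.

    The 0.5 Å floor for short chains is an assumption of this sketch.
    """
    if l_norm <= 15:
        return 0.5
    return max(0.5, 1.24 * (l_norm - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(distances, l_norm: int) -> float:
    """TM-score for one fixed superposition.

    distances: distances in Å between aligned residue pairs.
    l_norm:    chain length used for normalization (for example, the
               length of the first protein in the pair).
    """
    d0 = tm_score_d0(l_norm)
    # Each aligned pair contributes between 0 and 1; unaligned residues
    # contribute nothing but still count in the normalization length.
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_norm
```

Because the sum is divided by the normalization length rather than by the number of aligned pairs, a perfect superposition that covers only half of the chain scores 0.5, so the score reflects both accuracy and coverage.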

Figures

Figure 1
Illustrative example of structure alignments by different alignment methods for 1atzA and 1auoA. The first row is the ribbon diagram of the native structures of 1atzA (184 residues) and 1auoA (218 residues), which have a sequence identity of 16% and adopt the common αβα-sandwich topology. The second and third rows show the superpositions of the aligned residues produced by CE (17) and SAL (18), and by DALI (38) and TM-align, respectively. The thick and thin backbones denote the aligned residues from 1atzA and 1auoA, respectively. The indicated numbers are the number of aligned residues, the RMSD between the aligned residues, and the TM-score normalized by the length of 1atzA. All the pictures are generated by RASMOL, with blue to red running from the N- to the C-terminus.
Figure 2
Number of folds included in the representative protein sets collected from the PDB library on January 28, 2005 using different sequence identity cutoffs. A fold is defined using a TM-score threshold of 0.5.
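The caption defines a fold by a TM-score threshold of 0.5 but does not say how folds are enumerated. One simple possibility, shown here purely as an illustration, is a greedy pass that opens a new fold whenever a chain scores below the threshold against every existing fold representative. The function `count_folds`, the `pair_score` callback, the use of `>=` at the threshold, and the toy scores are all hypothetical and are not taken from the paper.

```python
def count_folds(names, pair_score, threshold=0.5):
    """Greedy fold enumeration (an assumed procedure, not the paper's).

    names:      iterable of chain identifiers, processed in order.
    pair_score: function (a, b) -> TM-score between two chains.
    Returns the list of fold representatives.
    """
    representatives = []
    for name in names:
        # A chain joins an existing fold if it matches any representative.
        if not any(pair_score(name, rep) >= threshold for rep in representatives):
            representatives.append(name)
    return representatives


# Toy, made-up scores for four hypothetical chains.
TOY_SCORES = {
    ("a", "b"): 0.80, ("a", "c"): 0.30, ("a", "d"): 0.20,
    ("b", "c"): 0.35, ("b", "d"): 0.25, ("c", "d"): 0.60,
}


def toy_score(x, y):
    """Symmetric lookup into the toy score table."""
    return TOY_SCORES[tuple(sorted((x, y)))]


toy_folds = count_folds(["a", "b", "c", "d"], toy_score)
```

With these toy values, chains a and b fall into one fold and chains c and d into another. The result of a greedy pass depends on the processing order, which is one reason a real fold count may use a different procedure.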
Figure 3
Two examples of protein pairs that have high sequence identities but adopt entirely different folds. In both examples, the upper parts show the sequence alignments of the proteins, where ‘:’ denotes residues with identical amino acids; the lower parts are the cartoon structures of the proteins, with blue to red running from the N- to the C-terminus. The proteins in the first example are 1a64A (32) and the N-terminal domain of 1hngB (39). Deletion of two key residues (K44 and M45) induces domain swapping between the two proteins. The proteins in the second example are from the calmodulin binding domain (CaMBD), where 1g4yB is the crystal structure of Ca2+-loaded CaMBD in complex with calmodulin (40) and 1kkdA is the NMR structure of Ca2+-free CaMBD in complex with calmodulin (33). Ca2+ binding is responsible for the conformational difference between the two structures.
Figure 4
Structure alignments of the computer models generated by TASSER (8) to non-homologous proteins in the PDB library (6). (A) TM-score between the closest template to the native structure found by TM-align and the native structure versus the TM-score between the TASSER model and the native. (B) TM-score between the TASSER model and the closest found (highest TM-score) template versus the TM-score between the TASSER model and the native. (C) RMSD between the closest template to the native structure and the native structure versus RMSD between the model and the native. (D) RMSD between the model and the closest template versus the RMSD between the model and the native. The stars denote the alignment coverage of the closest templates found by TM-align. The solid yellow circles denote the averages of the points falling within each interval of the horizontal axis in each panel. The black lines are to guide the eye.
Figure 5
A comparison of a computer model generated by TASSER (8) and the closest PDB structure (template) found by TM-align. This is a typical example where the model has a much larger RMSD than the template because of misoriented tails and loops. The thick backbones are the model or template, and the thin ones are the native structure of 1c0fS. The red residues are those whose distance to the corresponding native residue is <5 Å after superposition with the TM-score rotation matrix.

References

    1. Murzin A.G., Brenner S.E., Hubbard T., Chothia C. SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 1995;247:536–540.
    2. Orengo C.A., Michie A.D., Jones S., Jones D.T., Swindells M.B., Thornton J.M. CATH—a hierarchic classification of protein domain structures. Structure. 1997;5:1093–1108.
    3. Moult J., Fidelis K., Zemla A., Hubbard T. Critical assessment of methods of protein structure prediction (CASP)-round V. Proteins. 2003;53:334–339.
    4. Skolnick J., Fetrow J.S., Kolinski A. Structural genomics and its importance for gene function analysis. Nat. Biotechnol. 2000;18:283–287.
    5. Baker D., Sali A. Protein structure prediction and structural genomics. Science. 2001;294:93–96.

Publication types