Comprehensive evaluation of protein structure alignment methods: scoring by geometric measures

Rachel Kolodny¹, Patrice Koehl, Michael Levitt

PMID: 15701525
PMCID: PMC2692023
DOI: 10.1016/j.jmb.2004.12.032

Rachel Kolodny et al. J Mol Biol. 2005 Mar 4;346(4):1173-88. doi: 10.1016/j.jmb.2004.12.032. Epub 2005 Jan 16.

Affiliation

¹ Department of Structural Biology, Fairchild Building, Stanford University, Stanford, CA 94305, USA.

Abstract

We report the largest and most comprehensive comparison of protein structural alignment methods. Specifically, we evaluate six publicly available structure alignment programs: SSAP, STRUCTAL, DALI, LSQMAN, CE and SSM by aligning all 8,581,970 protein structure pairs in a test set of 2930 protein domains specially selected from CATH v.2.4 to ensure sequence diversity. We consider an alignment good if it matches many residues, and the two substructures are geometrically similar. Even with this definition, evaluating structural alignment methods is not straightforward. At first, we compared the rates of true and false positives using receiver operating characteristic (ROC) curves with the CATH classification taken as a gold standard. This proved unsatisfactory in that the quality of the alignments is not taken into account: sometimes a method that finds poorer alignments scores better than a method that finds better alignments. We correct this intrinsic limitation by using four different geometric match measures (SI, MI, SAS, and GSAS) to evaluate the quality of each structural alignment. With this improved analysis we show that there is a wide variation in the performance of different methods; the main reason for this is that it can be difficult to find a good structural alignment between two proteins even when such an alignment exists. We find that STRUCTAL and SSM perform best, followed by LSQMAN and CE. Our focus on the intrinsic quality of each alignment allows us to propose a new method, called "Best-of-All", that combines the best results of all methods. Many commonly used methods miss 10-50% of the good Best-of-All alignments. By putting existing structural alignments into proper perspective, our study allows better comparison of protein structures. By highlighting limitations of existing methods, it will spur the further development of better structural alignment methods. This will have significant biological implications now that structural comparison has come to play a central role in the analysis of experimental work on protein structure, protein function and protein evolution.
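
The "Best-of-All" idea described in the abstract can be illustrated with a minimal sketch. It assumes each program's result for a structure pair has already been reduced to a single geometric match score where lower is better (as the Figure 2 caption states for all four measures). The function name `best_of_all`, the dictionary layout, and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # hypothetical identifiers of the two aligned domains


def best_of_all(
    scores_by_method: Dict[str, Dict[Pair, float]]
) -> Dict[Pair, Tuple[str, float]]:
    """For every structure pair, keep the method with the lowest score.

    Assumes a geometric match measure for which lower values are better.
    """
    best: Dict[Pair, Tuple[str, float]] = {}
    for method, scores in scores_by_method.items():
        for pair, score in scores.items():
            if pair not in best or score < best[pair][1]:
                best[pair] = (method, score)
    return best


# Made-up scores for illustration only.
example = {
    "STRUCTAL": {("d1", "d2"): 2.1, ("d1", "d3"): 6.0},
    "SSM": {("d1", "d2"): 2.8, ("d1", "d3"): 4.5},
}
result = best_of_all(example)
```

In this toy input, the first pair is taken from one program and the second pair from the other, which is the sense in which the combined method can outperform each individual program.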

Figures

**Figure 1**
Receiver operating characteristic (ROC) curves for the structural alignment methods SSAP, STRUCTAL, DALI, LSQMAN, CE, and SSM. A true positive is assumed when the two aligned structures have the same Class/Architecture/Topology CATH classification. We sort all alignments and calculate the fraction of false positives (FP) and fraction of true positives (TP) with values lower than a particular threshold. As the threshold is increased to include poorer alignments, we get pairs of FP and TP values that are plotted in the ROC curve. Here the alignments are sorted by their native scores (those given by the programs, shown as broken lines) or by the geometric match measure SAS (continuous lines). In continuous black, we plot the ROC curve of the Best-of-All method, the best alignments (in terms of SAS) found by all methods. In (a) we plot the fractions of FP against the fractions of TP, and in (b), we plot log₁₀ (fraction FP) against fractions of TP, to better see performance at low rates of false positives. In (c) and (d) we plot for every threshold the average SAS value of the TP and the FP alignments below that threshold. Methods that perform better in terms of their ROC curves climb to high TP values very quickly (i.e. at low FP values). We see that the performance of the methods depends on whether the alignments are sorted by the SAS geometric match measure or by the native scores. Furthermore, some of the best methods as judged by the ROC curves (such as DALI and SSAP) do not produce the best alignments as indicated by the average TP SAS value; they seem to do well because they find even worse average FP SAS values.
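
The threshold sweep described in this caption can be sketched as follows. This is an illustrative reading of the caption, not the authors' code: it assumes one score per alignment where lower is better, a boolean label marking pairs with the same CAT classification as true positives, and it adds one point per alignment without grouping tied scores. The name `roc_points` and the example values are hypothetical.

```python
from typing import List, Tuple


def roc_points(
    scores: List[float], is_true_positive: List[bool]
) -> List[Tuple[float, float]]:
    """Return (fraction FP, fraction TP) as the threshold is raised.

    Alignments are visited from best (lowest score) to worst; each step
    corresponds to a threshold that just includes one more alignment.
    """
    n_pos = sum(is_true_positive)
    n_neg = len(is_true_positive) - n_pos
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    tp = fp = 0
    points: List[Tuple[float, float]] = []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        frac_fp = fp / n_neg if n_neg else 0.0
        frac_tp = tp / n_pos if n_pos else 0.0
        points.append((frac_fp, frac_tp))
    return points


# Made-up scores and labels for illustration only.
points = roc_points([1.0, 2.0, 3.0, 4.0], [True, False, True, False])
```

Sorting the same alignments by a native score instead of SAS only changes the `scores` argument, which is why the two kinds of curves in the figure can differ for the same program.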

**Figure 2**
Comparison of the quality of the alignments produced by the methods SSAP, STRUCTAL, DALI, LSQMAN, CE, and SSM, using four geometric match measures: GSAS, SAS, SI, and MI. For each geometric measure and for each method, we plot a cumulative distribution. This gives the number of alignments (expressed as a percentage of the total number of alignments in the set considered) that is found with a geometric match score better than the particular threshold plotted along the x-axis. A lower value of the geometric match measure is better in all cases. In the upper panels, we consider the set of 104,309 pairs that have the same Class/Architecture/Topology (CAT) classification; in the lower panels we consider all pairs (these number 4,290,985). Better performing methods find more alignments (greater values along the y-axis) with better scores (smaller values on the x-axis). The MI measure is always between 0 and 1, whereas the other measures are unbounded. For GSAS, SAS, and SI, we use a cutoff value of 5 Å, which allows us to focus on good matches. The Figure also shows the cumulative distribution of the Best-of-All method, a method that returns the best alignment found by any of the above methods. This method is clearly the best performer in all categories. Among the existing methods, for each of the geometric match measures, STRUCTAL is the best performer; the next best method is SSM.
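
The cumulative distribution described in this caption can be sketched as below. It assumes a flat list of scores for one method and one measure, with lower values better, and reads "better than the threshold" as strictly lower. The function name `cumulative_percent` and the example numbers are hypothetical.

```python
import bisect
from typing import List, Sequence


def cumulative_percent(
    scores: Sequence[float], thresholds: Sequence[float]
) -> List[float]:
    """Percentage of alignments whose score is strictly below each threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    total = len(ordered)
    # bisect_left returns the number of elements strictly less than t.
    return [100.0 * bisect.bisect_left(ordered, t) / total for t in thresholds]


# Made-up scores for illustration only.
curve = cumulative_percent([1.0, 2.5, 4.0, 7.0], [2.0, 5.0])
```

A method whose curve lies higher at a given threshold has found more alignments of at least that quality in the same set of pairs.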

**Figure 3**
The composition of the set of structure pairs that have good alignments demonstrates the significant amount of structural similarity across CATH fold classes. These Best-of-All alignments are divided into four categories depending on the similarity of the CATH classification of the two aligned structures. These four categories (color-coded from black to light gray) are: (1) both structures have the same C, A and T classifiers (CAT set), (2) both structures have only the same C and A classifiers (CA set), (3) both structures have only the same C classifier (C set), and (4) both structures have different C classifiers (other pair set). We consider all good alignments, i.e. of low GSAS, SAS or SI value (left hand six panels), as well as the subset of good alignments with more than 50 matched residues (right hand six panels). The upper panels give the number of good alignments and the lower panels plot the percentage of alignments found at that level of similarity for each category. All methods find many examples of highly similar structures that CATH classifies differently.
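
The four categories in this caption can be expressed as a small function. This sketch assumes CATH classifications are written as dot-separated codes whose first three fields are the C, A and T classifiers; the function name `cath_pair_category` and the example codes are used for illustration only.

```python
def cath_pair_category(code_a: str, code_b: str) -> str:
    """Classify a structure pair as 'CAT', 'CA', 'C' or 'other'.

    Compares the first three dot-separated fields (C, A, T) of each code.
    """
    a = code_a.split(".")[:3]
    b = code_b.split(".")[:3]
    if len(a) < 3 or len(b) < 3:
        raise ValueError("expected at least C.A.T in each code")
    if a[0] != b[0]:
        return "other"  # different C classifiers
    if a[1] != b[1]:
        return "C"      # same C only
    if a[2] != b[2]:
        return "CA"     # same C and A only
    return "CAT"        # same C, A and T


categories = [
    cath_pair_category("3.40.50.720", "3.40.50.300"),
    cath_pair_category("3.40.50", "3.40.30"),
    cath_pair_category("3.40.50", "3.30.50"),
    cath_pair_category("1.10.8", "2.60.40"),
]
```

Because the hierarchy is nested, a pair with different A classifiers falls in the C set even if the T numbers happen to coincide.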

**Figure 4**
Comparing the performance of the different structural alignment programs when aligning different classes of structures. We consider all 104,309 pairs of the same fold class (i.e. same CAT), and partition them according to their C classification into four classes: Mainly α, Mainly β, Mixed α/β and Few Secondary Structure (from left to right). For each of the programs, and for each SAS threshold value, we plot the percent of alignments found below it. The percentage of pairs of each group, among all pairs, is the horizontal line. We see that Mainly α pairs and Mixed α/β pairs are over-represented in the high geometric similarity region, while the Mainly β pairs are under-represented. Generally, all methods have similar behavior, with the exception of LSQMAN, which is less successful than all the other methods at detecting Mixed α/β pair similarity when the geometric similarity is high (this leads to a compensatory increase in recognition of alignments of Mainly α, Mainly β, and Few Secondary Structure pairs).
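
One possible reading of this caption is that each panel shows the share of a given class among the alignments below a SAS threshold, compared with that class's share among all pairs (the horizontal line). The sketch below follows that reading and is an assumption, not the authors' procedure. The function name `class_percentages`, the use of class labels "1" to "4" for the four classes in the order listed, and the example data are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def class_percentages(
    pairs: List[Tuple[str, float]], sas_threshold: Optional[float] = None
) -> Dict[str, float]:
    """Percentage of each class among pairs with SAS below the threshold.

    Each pair is (class_label, sas). With sas_threshold=None all pairs are
    used, which gives the baseline composition (the horizontal line).
    """
    if sas_threshold is None:
        selected = pairs
    else:
        selected = [p for p in pairs if p[1] < sas_threshold]
    counts = Counter(cls for cls, _ in selected)
    total = len(selected)
    classes = sorted({cls for cls, _ in pairs})
    return {
        cls: (100.0 * counts[cls] / total if total else 0.0) for cls in classes
    }


# Made-up data for illustration only: (class label, SAS value).
data = [
    ("1", 2.0), ("1", 3.0), ("2", 4.0), ("3", 1.0),
    ("2", 8.0), ("2", 9.0), ("3", 7.0), ("4", 6.0),
]
below = class_percentages(data, sas_threshold=5.0)
baseline = class_percentages(data)
```

In this toy data, class "1" makes up a larger share below the threshold than in the baseline, which is what "over-represented in the high geometric similarity region" would look like under this reading.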
