High-throughput computational and experimental techniques in structural genomics
- PMID: 15489337
- PMCID: PMC528931
- DOI: 10.1101/gr.2537904
Abstract
Structural genomics aims to provide structural information for all possible ORF sequences through a combination of experimental and computational approaches. Access to genome sequences and cloning resources from an ever-widening array of organisms is driving high-throughput structural studies by the New York Structural Genomics Research Consortium. In this report, we outline the Consortium's progress in establishing its structural genomics pipeline, along with some of the experimental and bioinformatics efforts leading to structural annotation of proteins. The Consortium has established a pipeline for structural biology studies, automated modeling of ORF sequences using solved (template) structures, and a novel high-throughput approach (metallomics) for examining metal binding to purified protein targets. The Consortium has so far produced 493 purified proteins from >1077 expression vectors. A total of 95 have resulted in crystal structures, and 81 are deposited in the Protein Data Bank (PDB). Comparative modeling based on these structures has generated >40,000 structural models. We also initiated a high-throughput metal analysis of the purified proteins; this has determined that 10%-15% of the targets contain a stoichiometric structural or catalytic transition metal atom. The progress of the structural genomics centers in the U.S. and around the world suggests that the goal of providing useful structural information on almost all ORF domains will be realized. This projected resource will provide structural biology information important to understanding the function of most proteins of the cell.
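The pipeline counts reported in the abstract can be turned into rough stage-to-stage yields. The following is a minimal sketch, not part of the original report: the stage labels and the `stage_yields` helper are illustrative names, and the calculation assumes each stage is a subset of the one before it. Because the vector count is given only as a lower bound (>1077), the first ratio should be read as an upper bound.

```python
# Counts as reported in the abstract; the stage labels are illustrative.
# The vector count is a lower bound (">1077"), so the first yield is an upper bound.
stages = [
    ("expression vectors", 1077),
    ("purified proteins", 493),
    ("crystal structures", 95),
    ("PDB depositions", 81),
]


def stage_yields(stages):
    """Return (from_stage, to_stage, fraction) for each consecutive pair of stages."""
    return [
        (src_name, dst_name, dst_count / src_count)
        for (src_name, src_count), (dst_name, dst_count) in zip(stages, stages[1:])
    ]


for src, dst, frac in stage_yields(stages):
    print(f"{src} -> {dst}: {frac:.1%}")
```

Read this way, fewer than half of the expression vectors yielded a purified protein and roughly a fifth of the purified proteins yielded a crystal structure, while most solved structures had already been deposited.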
WEB SITE REFERENCES
- www.nigms.nih.gov/psi; NIH Web site providing information and relevant links for the Protein Structure Initiative.
- http://targetdb.pdb.org; Web site operated by the Protein Databank to allow searching of targets from the structural genomics centers.
- www.nysgxrc.org; Web site operated by the NYSGRC. Its functions are to provide a public target list and progress as well as to allow consortium members to enter target data.
- http://salilab.org/modbase; MODBASE, a comprehensive database of comparative protein structure models.
- www-archbac.u-psud.fr/genomics/COG_Guess.html; Clusters of Orthologous Groups Database Query Page to perform similarity search in COG database. This provides a function and COG category guess for input sequence.