Abstract
We give a test for protein-coding regions that is based on simple and universal differences between protein-coding and noncoding DNA. The test is simple enough to use without a computer and is completely objective. The test has been thoroughly proven on 400,000 bases of sequence data: it misclassifies 5% of the regions tested and gives an answer of "No Opinion" one fifth of the time. We predict some new coding and noncoding regions in published sequences.
