Proteomics. 2015 Sep;15(18):3163-8.
doi: 10.1002/pmic.201400441. Epub 2015 Mar 12.

Version 4.0 of PaxDb: Protein abundance data, integrated across model organisms, tissues, and cell-lines

Mingcong Wang et al. Proteomics. 2015 Sep.

Abstract

Protein quantification at proteome-wide scale is an important aim, enabling insights into fundamental cellular biology and serving to constrain experiments and theoretical models. While proteome-wide quantification is not yet fully routine, many datasets approaching proteome-wide coverage are becoming available through biophysical and MS techniques. Data of this type can be accessed via a variety of sources, including publication supplements and online data repositories. However, access to the data is still fragmentary, and comparisons across experiments and organisms are not straightforward. Here, we describe recent updates to our database resource "PaxDb" (Protein Abundances Across Organisms). PaxDb focuses on protein abundance information at proteome-wide scope, irrespective of the underlying measurement technique. Quantification data is reprocessed, unified, and quality-scored, and then integrated to build a meta-resource. PaxDb also allows evolutionary comparisons through precomputed gene orthology relations. Recently, we have expanded the scope of the database to include cell-line samples and have begun to scan the literature more systematically for suitable datasets. We report that a significant fraction of published experiments cannot readily be accessed and/or parsed for quantitative information, requiring additional steps and effort. The current update brings PaxDb to 414 datasets in 53 organisms, with (semi-)quantitative abundance information covering more than 300,000 proteins.

Keywords: Absolute protein abundance; Bioinformatics; Evolution; Spectral counting.

Figures

Figure 1
The PaxDb website. Screenshot of the entry page of the PaxDb website, at http://pax-db.org/. Model organisms can be browsed via the navigable taxonomy to the left; proteins of interest are accessible directly as well, via a variety of identifiers and full-text searches (input box at top of the page).
Figure 2
Data updates for version 4.0. (A) Growth of datasets and organisms covered by PaxDb. Note that PaxDb focuses mainly on normal, unperturbed, physiological cells and tissues; some prominent published datasets are thus not included. (B) Sources for the new data in PaxDb version 4.0 (i.e., not already contained in version 3.0; the counts in the Venn diagram represent “publications” as opposed to “samples” or “replicates”). Only published data, according to the references stated at the respective sources, are included. Panel (B) focuses on data availability: datasets found at multiple sources are imported only once, from the most convenient/applicable source.
Figure 3
Abundance conservation and stoichiometry. (A) Abundance correlations among orthologous proteins, across different evolutionary distances. (B) Inferred stoichiometry ratios between functionally related cellular processes, across multiple datasets and organisms. In this plot, each data point denotes one organism; the abundance averages were taken from the integrated datasets (where available). Boxplots denote the medians, as well as the 25th and 75th percentiles. Below each boxplot, the median is also given as text. With the exception of the two ribosomal subunits, all stoichiometries are significantly different from 1:1 (p-values are indicated). All data in this figure are from version 3.0 of PaxDb.
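
As an illustration of the kind of summary shown in panel (B), the following minimal sketch computes one abundance ratio per organism between two groups of proteins and then reports the median and the 25th and 75th percentiles across organisms. It is not the authors' pipeline: the function names, the protein identifiers, the use of mean abundance per process, and the choice of the "inclusive" quantile method are all assumptions made for the example.

```python
from statistics import quantiles


def process_ratio(abundances, process_a, process_b):
    """Ratio of the mean abundance of process_a proteins to process_b proteins.

    `abundances` maps a (hypothetical) protein identifier to its abundance,
    e.g. in ppm. Averaging within each process is an assumption of this sketch.
    """
    mean_a = sum(abundances[p] for p in process_a) / len(process_a)
    mean_b = sum(abundances[p] for p in process_b) / len(process_b)
    return mean_a / mean_b


def summarize(ratios):
    """Median and quartiles of per-organism ratios (one value per organism)."""
    q25, q50, q75 = quantiles(ratios, n=4, method="inclusive")
    return {"q25": q25, "median": q50, "q75": q75}


if __name__ == "__main__":
    # Made-up abundances for a single organism, for illustration only.
    organism = {"a1": 100.0, "a2": 300.0, "b1": 50.0, "b2": 150.0}
    print(process_ratio(organism, ["a1", "a2"], ["b1", "b2"]))

    # Made-up ratios for five organisms.
    print(summarize([1.0, 2.0, 3.0, 4.0, 5.0]))
```

A ratio close to 1 would correspond to the 1:1 stoichiometry reported for the two ribosomal subunits; testing whether a set of ratios differs significantly from 1:1 would need a statistical test that this sketch does not include.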
