Nucleic Acids Res. 2022 Jan 7;50(D1):D543-D552.
doi: 10.1093/nar/gkab1038.

The PRIDE database resources in 2022: a hub for mass spectrometry-based proteomics evidences

Yasset Perez-Riverol et al. Nucleic Acids Res.

Abstract

The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world's largest data repository of mass spectrometry-based proteomics data. PRIDE is one of the founding members of the global ProteomeXchange (PX) consortium and an ELIXIR core data resource. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2019. Submissions to PRIDE Archive (the archival component of PRIDE) averaged around 500 datasets per month during 2021. In addition to continuous improvements in PRIDE Archive data pipelines and infrastructure, the PRIDE Spectra Archive has been developed to provide direct access to the submitted mass spectra using Universal Spectrum Identifiers. A key development is the MAGE-TAB file format for proteomics, which improves the annotation of sample metadata. Additionally, the PRIDE Peptidome resource provides access to aggregated peptide and protein evidence across PRIDE Archive. Furthermore, we describe how PRIDE has increased its efforts to reuse high-quality proteomics data and disseminate it to other added-value resources such as UniProt, Ensembl and Expression Atlas.
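
To make the Universal Spectrum Identifier (USI) mentioned above more concrete, here is a minimal sketch of splitting a USI into its colon-separated parts. It assumes the commonly documented layout `mzspec:<collection>:<run>:<index type>:<index>[:<interpretation>]` and, as a simplification, that the run name itself contains no colons. The `ParsedUSI` class and `parse_usi` function are hypothetical names introduced for illustration and are not part of any PRIDE library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ParsedUSI:
    """Hypothetical container for the parts of a USI (illustration only)."""
    collection: str                 # dataset accession, e.g. a PXD identifier
    ms_run: str                     # name of the MS run (raw file) in the dataset
    index_type: str                 # how the spectrum is addressed, e.g. "scan"
    index: str                      # the spectrum index within the run
    interpretation: Optional[str]   # optional peptide/charge annotation


def parse_usi(usi: str) -> ParsedUSI:
    # Split at most 5 times so that colons inside the optional interpretation
    # (for example in modification names) stay in the last field.
    # Assumption: the run name does not contain a colon.
    parts = usi.split(":", 5)
    if len(parts) < 5 or parts[0] != "mzspec":
        raise ValueError(f"not a recognizable USI: {usi!r}")
    _, collection, ms_run, index_type, index = parts[:5]
    interpretation = parts[5] if len(parts) == 6 else None
    return ParsedUSI(collection, ms_run, index_type, index, interpretation)


if __name__ == "__main__":
    example = "mzspec:PXD000561:Adult_Frontalcortex_bRP_Elite_85_f09:scan:17555:VLHPLEGAVVIIFK/2"
    print(parse_usi(example))
```

A production parser would need to follow the full USI specification, which allows cases this sketch does not handle.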

Figures

Figure 1.
Schema of the PRIDE resources ecosystem. PRIDE Archive users must provide the raw files, the processed results files, and metadata for each dataset. Standard file formats (for processed result files) can be provided for 'Complete' submissions. The PX Submission tool and the PRIDE pipelines use a group of open-source libraries to validate the data, assess the quality of the reported peptides and proteins, and store the information (metadata, peptides/proteins and spectra) in multiple databases. The PRIDE Peptidome resource selects high-quality peptides across all the datasets in PRIDE Archive. All the data from PRIDE Archive and PRIDE Peptidome are served to external users such as Ensembl and UniProt through the PRIDE API and PRIDE web interface. Additionally, proteomics quantitative datasets are reanalyzed and integrated into Expression Atlas.
Figure 2.
PRIDE Archive users can now provide SDRF-Proteomics files to represent the experimental design and the relationship between the samples analyzed and the instrument raw files. The samples included in the SDRF-Proteomics files are submitted to BioSamples, where each one receives a unique accession number. In addition, the PRIDE web interface presents the information contained in SDRF-Proteomics files in an ‘Experimental Design’ table, including all samples and data files.
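
The sample-to-file relationship described in this caption can be sketched as follows. This is a minimal illustration, assuming a tab-delimited SDRF-Proteomics file with the column headers `source name` and `comment[data file]`; the sample names, file names and the `files_per_sample` helper are hypothetical and chosen only for the example.

```python
import csv
import io
from collections import defaultdict


def files_per_sample(sdrf_text: str) -> dict:
    """Group raw data files by sample, reading tab-delimited SDRF text."""
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    mapping = defaultdict(list)
    for row in reader:
        # Each row links one sample ("source name") to one raw file.
        mapping[row["source name"]].append(row["comment[data file]"])
    return dict(mapping)


# Hypothetical three-row SDRF fragment (values invented for illustration).
EXAMPLE_SDRF = "\n".join(
    "\t".join(cells)
    for cells in [
        ["source name", "characteristics[organism]", "assay name", "comment[data file]"],
        ["sample 1", "Homo sapiens", "run 1", "sample1_fraction1.raw"],
        ["sample 1", "Homo sapiens", "run 2", "sample1_fraction2.raw"],
        ["sample 2", "Homo sapiens", "run 3", "sample2_fraction1.raw"],
    ]
)

if __name__ == "__main__":
    for sample, files in files_per_sample(EXAMPLE_SDRF).items():
        print(sample, files)
```

Real SDRF-Proteomics files may repeat some column headers; `csv.DictReader` keeps only the last value for a repeated header, so a fuller reader would need to handle those columns by position.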
Figure 3.
The PRIDE web interface provides functionality to assess the quality of each Complete submission, including components to: (A) visualize the sequence coverage of a particular protein; and (B) visualize the spectrum used to identify a given peptide.
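
As an illustration of the sequence coverage shown in panel (A), the sketch below uses one common definition: the fraction of residues in a protein sequence that are covered by at least one identified peptide. The source does not describe how the PRIDE web interface computes coverage, so this is an assumption; the function name and the sequences are hypothetical.

```python
from typing import Iterable


def sequence_coverage(protein: str, peptides: Iterable[str]) -> float:
    """Fraction of protein residues covered by at least one peptide.

    Peptides are matched as exact substrings; every occurrence counts.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue
        start = protein.find(peptide)
        while start != -1:
            # Mark every residue spanned by this occurrence as covered.
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return sum(covered) / len(protein)


if __name__ == "__main__":
    # Hypothetical 10-residue protein and two identified peptides.
    print(sequence_coverage("MKTAYIAKQR", ["MKT", "AKQR"]))
```

Exact substring matching ignores modified residues and isobaric ambiguities such as I/L, which a real implementation would have to consider.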
Figure 4.
(A) Number of datasets submitted to PRIDE Archive per month (from the beginning of PX in 2012 until August 2021); (B) cumulative size of PRIDE Archive data since 2012; (C) number of submitted datasets per species or taxonomy identifier (as of August 2021), with all species that had fewer than 100 datasets grouped in one category; (D) distribution of the number of datasets submitted to PRIDE Archive per annotated disease.
Figure 5.
(A) Volume of PRIDE Archive data downloads per year, from 2013 to 2020. (B) Number of manuscripts (including pre-prints) per year (2013–2021) in which datasets from PRIDE Archive are reused. The 2021 figures are year-end estimates based on the data available at the end of September. The figures are underestimates, since they include only those manuscripts that could be tracked successfully.
