PRIDE Database

Overview

The PRIDE Archive (PRoteomics IDEntifications database) at EMBL-EBI is the world's largest public repository of mass spectrometry-based proteomics data, containing more than 30,000 datasets from peer-reviewed publications. The REST API v2 at https://www.ebi.ac.uk/pride/ws/archive/v2/ provides project discovery, file listing, peptide/PSM identification retrieval, and protein-level evidence, all without authentication. Data types include RAW files, peak lists (mzML, MGF), PRIDE XML result files, and processed identification tables.
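As a minimal sketch of how requests against this base URL might be assembled, the helpers below build URLs for a single project, its file listing, and a keyword search. The endpoint paths (`projects/{accession}`, `projects/{accession}/files`, `search/projects`) and the `keyword` and `pageSize` parameters are assumptions about the v2 API, not taken from the text above, and should be checked against the official API documentation. `PXD000001` is used only as an example accession.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

# Base URL as given in the overview above.
BASE_URL = "https://www.ebi.ac.uk/pride/ws/archive/v2/"


def project_url(accession):
    """URL for one project's metadata (path is an assumption)."""
    return f"{BASE_URL}projects/{quote(accession)}"


def project_files_url(accession):
    """URL for a project's file listing (path is an assumption)."""
    return f"{BASE_URL}projects/{quote(accession)}/files"


def search_url(keyword, page_size=20):
    """URL for a keyword search (path and parameter names are assumptions)."""
    query = urlencode({"keyword": keyword, "pageSize": page_size})
    return f"{BASE_URL}search/projects?{query}"


def fetch_json(url, timeout=30):
    """GET a URL and decode the JSON body; no authentication header is sent."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Building URLs needs no network access; fetch_json() does.
    print(project_url("PXD000001"))
    print(search_url("human liver", page_size=5))
```

Keeping URL construction separate from the network call makes it possible to inspect or log the request before sending it.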

When to Use

Finding published proteomics datasets by organism, tissue, disease keyword, or instrument type for meta-analysis or benchmarking
Downloading raw mass spectrometry data (RAW, mzML) or peak files (MGF) from a specific PRIDE project accession
Retrieving peptide identification tables with sequences, modifications, and confidence scores for a project
Querying protein-level evidence (PSMs, unique peptides) for a protein of interest across PRIDE projects
Checking whether a protein has experimental proteomics evidence in a specific tissue or disease context
Building training datasets of confident peptide-spectrum matches (PSMs) for proteomics ML applications
For protein domain and family classification, use interpro-database; PRIDE provides experimental identification evidence only
For protein sequences, Swiss-Prot annotations, and ID mapping, use uniprot-protein-database
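For the download use case, one plausible first step is to sort a project's file names into the data types mentioned above (RAW, mzML, MGF) before deciding what to fetch. The sketch below works on plain file-name strings. How those names are obtained from a file-listing response is not specified here, and the handling of `.gz` and `.zip` wrappers is an assumption about how files may be packaged.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Extensions for the data types named in this document (lowercased).
KNOWN_TYPES = {".raw": "RAW", ".mzml": "mzML", ".mgf": "MGF"}

# Compression wrappers to look through (assumed, not from the source).
WRAPPERS = (".gz", ".zip")


def classify(file_name):
    """Return 'RAW', 'mzML', 'MGF', or 'OTHER' based on the file extension."""
    name = file_name.lower()
    for wrapper in WRAPPERS:
        if name.endswith(wrapper):
            name = name[: -len(wrapper)]
    return KNOWN_TYPES.get(PurePosixPath(name).suffix, "OTHER")


def group_by_type(file_names):
    """Group file names by data type, preserving input order within a group."""
    groups = defaultdict(list)
    for file_name in file_names:
        groups[classify(file_name)].append(file_name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    names = ["run1.raw", "run1.mzML.gz", "run1.mgf", "README.txt"]
    for data_type, members in group_by_type(names).items():
        print(data_type, members)
```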
