pride-database

PRIDE Database

Overview

The PRIDE Archive (PRoteomics IDEntifications database) at EBI is the world's largest public repository of mass spectrometry-based proteomics data, containing 30,000+ datasets from peer-reviewed publications. The REST API v2 at https://www.ebi.ac.uk/pride/ws/archive/v2/ supports project discovery, file listing, peptide/PSM identification retrieval, and protein-level evidence queries, all without authentication. Available data types include RAW files, peak lists (mzML, MGF), PRIDE XML result files, and processed identification tables.
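As a minimal sketch of unauthenticated access, the snippet below builds a project URL under the v2 base address given above and fetches it as JSON using only the standard library. The `projects/{accession}` path and the helper names `project_url` and `fetch_json` are assumptions made for illustration; check the live API documentation before relying on them.

```python
import json
import urllib.request

# Base URL as stated in the overview.
BASE_URL = "https://www.ebi.ac.uk/pride/ws/archive/v2/"


def project_url(accession: str) -> str:
    """Build the URL for one project record.

    Assumption: project metadata lives under `projects/{accession}`.
    """
    return f"{BASE_URL}projects/{accession.strip().upper()}"


def fetch_json(url: str, timeout: float = 30.0):
    """GET a URL and decode the JSON body. No API key or login is sent."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `fetch_json(project_url("PXD000001"))` would then return the decoded project record, assuming the endpoint path is correct and the network is reachable.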

When to Use

  • Finding published proteomics datasets by organism, tissue, disease keyword, or instrument type for meta-analysis or benchmarking
  • Downloading raw mass spectrometry data (RAW, mzML) or peak files (MGF) from a specific PRIDE project accession
  • Retrieving peptide identification tables with sequence, modification, and confidence score for a project
  • Querying protein-level evidence (PSMs, unique peptides) for a protein of interest across PRIDE projects
  • Checking whether a protein has experimental proteomics evidence in a specific tissue or disease context
  • Building training datasets of confident peptide-spectrum matches (PSMs) for proteomics ML applications
  • Not for protein domain and family classification: use interpro-database instead, since PRIDE provides experimental identification evidence only
  • Not for protein sequences, Swiss-Prot annotations, or ID mapping: use uniprot-protein-database instead
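The first two use cases above (keyword discovery, then picking RAW, mzML, or MGF files from a project) can be sketched as follows. The `search/projects` path and the `keyword`, `pageSize`, and `page` parameter names are assumptions about the v2 API, and `search_url` and `select_by_extension` are hypothetical helpers, not part of any PRIDE client library.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.ebi.ac.uk/pride/ws/archive/v2/"


def search_url(keyword: str, page_size: int = 20, page: int = 0) -> str:
    """Build a keyword search URL for projects.

    Assumption: the endpoint is `search/projects` and accepts
    `keyword`, `pageSize`, and `page` query parameters.
    """
    query = urlencode({"keyword": keyword, "pageSize": page_size, "page": page})
    return f"{BASE_URL}search/projects?{query}"


def select_by_extension(file_names, extensions):
    """Keep only file names ending in one of the given extensions.

    Matching is case-insensitive, so ".mzml" also matches "run1.mzML".
    """
    wanted = tuple(ext.lower() for ext in extensions)
    return [name for name in file_names if name.lower().endswith(wanted)]
```

For example, `select_by_extension(names, (".raw", ".mzml"))` narrows a project's file listing to raw data, while `(".mgf",)` keeps only MGF peak files.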

Prerequisites

More from jaechang-hits/sciagent-skills

First Seen
Mar 16, 2026