sustainability-fulltext-fetch
Sustainability Fulltext Fetch
Core Goal
- Read relevant DOI entries from the RSS metadata DB.
- Write fetched content into a separate fulltext DB.
- Process only relevant entries (`is_relevant=1`).
- Prefer API metadata retrieval by DOI (OpenAlex first, Semantic Scholar fallback).
- Fall back to webpage fulltext extraction when API metadata is unavailable.
- Persist one content row per DOI in `entry_content`.
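The source order described above (OpenAlex, then Semantic Scholar, then webpage extraction) can be sketched as a simple fallback chain. This is a minimal illustration, not the skill's actual code: `fetch_content` and the stub fetchers are hypothetical names, and the real fetchers would call the respective APIs or a page extractor.

```python
from typing import Callable, List, Optional, Tuple

# A fetcher takes a DOI and returns text, or None when it has nothing.
Fetcher = Callable[[str], Optional[str]]


def fetch_content(
    doi: str, sources: List[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each source in order and return (source_name, content) for the first hit."""
    for name, fetch in sources:
        try:
            content = fetch(doi)
        except Exception:
            # A failing source should not stop the chain; try the next one.
            continue
        if content:
            return name, content
    return None, None


# Hypothetical stand-ins for the real fetchers (no network access here).
def openalex_stub(doi: str) -> Optional[str]:
    return None  # pretend OpenAlex has no abstract for this DOI


def semantic_scholar_stub(doi: str) -> Optional[str]:
    raise TimeoutError("simulated API timeout")


def webpage_stub(doi: str) -> Optional[str]:
    return f"extracted page text for {doi}"


SOURCES = [
    ("openalex", openalex_stub),
    ("semantic_scholar", semantic_scholar_stub),
    ("webpage", webpage_stub),
]
```

Calling `fetch_content("10.1000/example", SOURCES)` with these stubs falls through both API sources and returns the webpage result, so the caller can record which source produced the stored content.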
Triggering Conditions
- Receive a request to enrich relevant DOI records with abstract/fulltext content.
- Receive a request to replace webpage-first crawling with API-first enrichment.
- Need retry-safe incremental updates without duplicate rows.
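One way to get retry-safe incremental updates without duplicate rows is to key `entry_content` on the DOI and skip DOIs that already have a row. The sketch below assumes SQLite for both databases; the `entries` table name and all columns other than `is_relevant` and the `entry_content` table are assumptions made for illustration.

```python
import sqlite3
from typing import List


def init_fulltext_db(full_conn: sqlite3.Connection) -> None:
    # Assumed schema: the DOI primary key enforces one content row per DOI.
    full_conn.execute(
        "CREATE TABLE IF NOT EXISTS entry_content ("
        "doi TEXT PRIMARY KEY, source TEXT, content TEXT)"
    )
    full_conn.commit()


def pending_dois(
    meta_conn: sqlite3.Connection, full_conn: sqlite3.Connection
) -> List[str]:
    """Relevant DOIs from the metadata DB that have no content row yet."""
    relevant = meta_conn.execute(
        # `entries` is a hypothetical name for the upstream table.
        "SELECT doi FROM entries "
        "WHERE is_relevant = 1 AND doi IS NOT NULL ORDER BY doi"
    ).fetchall()
    done = {row[0] for row in full_conn.execute("SELECT doi FROM entry_content")}
    return [row[0] for row in relevant if row[0] not in done]


def save_content(
    full_conn: sqlite3.Connection, doi: str, source: str, content: str
) -> None:
    # INSERT OR REPLACE overwrites the existing row for a DOI, so re-running
    # after a partial failure never creates a second row.
    full_conn.execute(
        "INSERT OR REPLACE INTO entry_content (doi, source, content) "
        "VALUES (?, ?, ?)",
        (doi, source, content),
    )
    full_conn.commit()
```

Because the metadata DB is only read and the fulltext DB is only written, a failed run can simply be restarted: `pending_dois` returns whatever is still missing.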
Workflow
- Ensure upstream DOI/relevance data exists.