sustainability-fulltext-fetch
Sustainability Fulltext Fetch
Core Goal
- Read relevant DOI entries from RSS metadata DB.
- Write fetched content into a separate fulltext DB.
- Process only relevant entries (is_relevant=1).
- Prefer API metadata retrieval by DOI (OpenAlex first, Semantic Scholar fallback).
- Fall back to webpage fulltext extraction when API metadata is unavailable.
- Persist one content row per DOI in entry_content.
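The source order described above (OpenAlex, then Semantic Scholar, then webpage extraction) can be sketched as an ordered fallback chain. This is a minimal illustration rather than the skill's actual implementation: `fetch_content` and the three stub fetchers are hypothetical names, and the stubs stand in for real HTTP calls so the example runs offline.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher takes a DOI and returns text, or None when nothing is available.
Fetcher = Callable[[str], Optional[str]]


def fetch_content(
    doi: str, sources: Sequence[Tuple[str, Fetcher]]
) -> Tuple[Optional[str], Optional[str]]:
    """Try each source in order and return (source_name, content) for the first hit."""
    for name, fetch in sources:
        try:
            content = fetch(doi)
        except Exception:
            # Treat a failing source as unavailable and move on to the next one.
            continue
        if content:
            return name, content
    return None, None


# Hypothetical stand-ins for the real API and webpage fetchers.
def openalex_stub(doi: str) -> Optional[str]:
    return None  # simulate "no metadata available"


def semantic_scholar_stub(doi: str) -> Optional[str]:
    return "Abstract from fallback API"


def webpage_stub(doi: str) -> Optional[str]:
    return "Fulltext from webpage"


SOURCES = [
    ("openalex", openalex_stub),
    ("semantic_scholar", semantic_scholar_stub),
    ("webpage", webpage_stub),
]

result = fetch_content("10.1000/example", SOURCES)
```

Keeping the sources in an ordered list makes the preference explicit and lets the webpage extractor stay in place as the last resort.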
Triggering Conditions
- Receive a request to enrich relevant DOI records with abstract/fulltext content.
- Receive a request to replace webpage-first crawling with API-first enrichment.
- Need retry-safe incremental updates without duplicate rows.
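One way to get retry-safe updates without duplicate rows is an upsert keyed on the DOI. The sketch below assumes the fulltext DB is SQLite (3.24 or newer, for `ON CONFLICT ... DO UPDATE`) and that entry_content has `doi`, `source`, and `content` columns; only the table name comes from this document, while the column names and helper functions are hypothetical.

```python
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Assumed schema: the PRIMARY KEY on doi enforces one row per DOI.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS entry_content (
            doi     TEXT PRIMARY KEY,
            source  TEXT NOT NULL,
            content TEXT NOT NULL
        )
        """
    )


def upsert_content(conn: sqlite3.Connection, doi: str, source: str, content: str) -> None:
    # Insert a new row, or overwrite the existing row for the same DOI.
    conn.execute(
        """
        INSERT INTO entry_content (doi, source, content)
        VALUES (?, ?, ?)
        ON CONFLICT(doi) DO UPDATE SET
            source = excluded.source,
            content = excluded.content
        """,
        (doi, source, content),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
init_db(conn)

# Running the same DOI twice simulates a retry; the second call updates in place.
upsert_content(conn, "10.1000/example", "openalex", "first attempt")
upsert_content(conn, "10.1000/example", "webpage", "second attempt")

rows = conn.execute("SELECT doi, source, content FROM entry_content").fetchall()
```

Because the uniqueness constraint lives in the schema, a rerun after a partial failure cannot create a second row for a DOI that was already processed.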
Workflow
- Ensure upstream DOI/relevance data exists.