harvard-artifacts-data-engineering-pipeline
Harvard Artifacts Collection Data Engineering Pipeline
Skill by ara.so — Data Skills collection.
This skill enables AI coding agents to build end-to-end data engineering and analytics applications using the Harvard Art Museums API. The project demonstrates real-world ETL pipelines, SQL analytics, and interactive data visualization using Streamlit.
What It Does
The Harvard Artifacts Collection Data Engineering App provides:
- Dynamic data collection from the Harvard Art Museums API with pagination and rate limiting
- ETL pipeline that transforms nested JSON into relational database tables
- SQL database storage (MySQL/TiDB Cloud) with proper schema design
- 20+ predefined analytical SQL queries for artifact insights
- Interactive Streamlit dashboard with Plotly visualizations
- Analysis of artifact metadata, media availability, and color patterns
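The collection and transform steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the endpoint URL, the query parameter names (`apikey`, `page`, `size`, `classification`), the response keys (`records`, `colors`), and the helper names `fetch_page`, `collect_records`, and `flatten` are all assumptions made for the example.

```python
import json
import time
import urllib.parse
import urllib.request

# Assumed endpoint for object records; check the API docs before relying on it.
API_URL = "https://api.harvardartmuseums.org/object"


def fetch_page(api_key, page, size=100, classification=None):
    """Fetch one page of records as a dict (parameter names are assumptions)."""
    params = {"apikey": api_key, "page": page, "size": size}
    if classification:
        params["classification"] = classification
    url = API_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def collect_records(get_page, max_pages, delay_seconds=0.5):
    """Walk pages 1..max_pages, pausing between requests as a simple rate limit."""
    records = []
    for page in range(1, max_pages + 1):
        batch = get_page(page).get("records", [])
        if not batch:
            break  # an empty page means there is nothing left to fetch
        records.extend(batch)
        if page < max_pages:
            time.sleep(delay_seconds)
    return records


def flatten(record):
    """Split one nested record into a parent metadata row and child color rows."""
    meta = {
        "id": record.get("id"),
        "title": record.get("title"),
        "culture": record.get("culture"),
    }
    colors = [
        {"objectid": record.get("id"), "color": c.get("color"), "percent": c.get("percent")}
        for c in (record.get("colors") or [])  # the list may be missing or null
    ]
    return meta, colors
```

A caller might wire the pieces together with `collect_records(lambda p: fetch_page(API_KEY, p), max_pages=5)`, where `API_KEY` is a placeholder for a real key, and then pass each record through `flatten` before inserting the two row sets into separate tables. Taking the page fetcher as an argument keeps the pagination loop testable without network access.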