Harvard Artifacts Collection Data Engineering Analytics
Skill by ara.so — Data Skills collection.
Overview
This project is an end-to-end data engineering and analytics pipeline built on the Harvard Art Museums API. It demonstrates ETL pipelines that extract artifact metadata, transform the nested JSON responses into relational tables, load them into a SQL database (MySQL or TiDB Cloud), and present the results in an interactive Streamlit dashboard.
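To make the transform step concrete, here is a minimal sketch of splitting one nested artifact record into a parent row and child rows. The function name `flatten_artifact`, the selected fields, and the `objectid` foreign-key column are illustrative assumptions, not the project's actual schema.

```python
from typing import Any


def flatten_artifact(
    record: dict[str, Any],
) -> tuple[dict[str, Any], list[dict[str, Any]]]:
    """Split one nested record into a metadata row and a list of color rows.

    Field names here are assumptions chosen for illustration.
    """
    # Parent row: scalar fields only, one row per artifact.
    metadata_row = {
        "id": record["id"],
        "title": record.get("title"),
        "culture": record.get("culture"),
        "century": record.get("century"),
    }
    # Child rows: one row per nested color entry, linked back by objectid.
    # The nested list may be absent or null, so fall back to an empty list.
    color_rows = [
        {
            "objectid": record["id"],
            "color": entry.get("color"),
            "hue": entry.get("hue"),
            "percent": entry.get("percent"),
        }
        for entry in (record.get("colors") or [])
    ]
    return metadata_row, color_rows


# Hypothetical sample record, not real API output.
sample = {
    "id": 1001,
    "title": "Bowl",
    "culture": "Chinese",
    "century": "12th century",
    "colors": [
        {"color": "#c8c8c8", "hue": "Grey", "percent": 0.6},
        {"color": "#323232", "hue": "Black", "percent": 0.4},
    ],
}
metadata_row, color_rows = flatten_artifact(sample)
```

Keeping the nested list in its own table means each artifact appears once in the parent table, while repeated attributes can be joined back through `objectid`.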
The application handles:
- API pagination and rate limiting
- Nested JSON transformation into normalized tables
- Batch SQL operations for performance
- 20+ analytical queries for artifact insights
- Real-time interactive visualizations with Plotly
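The pagination, rate-limiting, and batch-insert items above can be sketched as follows. This is a hedged illustration, not the project's code: `iter_records`, `chunked`, `fake_fetch`, and the `artifact_metadata` table are hypothetical names, the `info.pages` and `records` response keys are assumed, and `sqlite3` stands in for MySQL/TiDB only so the example runs without a server.

```python
import sqlite3
import time
from typing import Any, Callable, Iterator


def iter_records(
    fetch_page: Callable[[int], dict[str, Any]],
    max_pages: int,
    delay_s: float = 0.5,
) -> Iterator[dict[str, Any]]:
    """Yield records page by page, pausing between requests."""
    page = 1
    while page <= max_pages:
        payload = fetch_page(page)
        yield from payload.get("records", [])
        # Assumed response shape: info.pages holds the total page count.
        total_pages = payload.get("info", {}).get("pages", page)
        if page >= total_pages:
            break
        page += 1
        time.sleep(delay_s)  # simple fixed delay as a rate limit


def chunked(rows: list, size: int) -> Iterator[list]:
    """Split rows into consecutive batches of at most `size` items."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def fake_fetch(page: int) -> dict[str, Any]:
    """Stand-in for an HTTP call; returns canned pages for the demo."""
    pages = {
        1: [{"id": 1, "title": "A"}, {"id": 2, "title": "B"}],
        2: [{"id": 3, "title": "C"}],
    }
    return {"info": {"pages": 2}, "records": pages[page]}


records = list(iter_records(fake_fetch, max_pages=5, delay_s=0))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifact_metadata (id INTEGER PRIMARY KEY, title TEXT)")
rows = [(r["id"], r["title"]) for r in records]
for batch in chunked(rows, 2):
    # One executemany call per batch instead of one INSERT per row.
    conn.executemany(
        "INSERT INTO artifact_metadata (id, title) VALUES (?, ?)", batch
    )
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM artifact_metadata").fetchone()[0]
```

In a real run, `fetch_page` would wrap an HTTP request to the API with an API key, and the insert would go through a MySQL-compatible driver, where the placeholder style is typically `%s` rather than `?`.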