Harvard Artifacts Collection Data Engineering
Skill by ara.so — Data Skills collection.
This project provides an end-to-end data engineering and analytics application built on the Harvard Art Museums API. It demonstrates real-world ETL pipelines, SQL database design, analytical queries, and interactive visualization using Streamlit. The architecture follows: API → ETL → SQL → Analytics → Visualization.
What This Project Does
- API Integration: Fetches artifact data from the Harvard Art Museums API with pagination and rate limiting
- ETL Pipeline: Extracts, transforms, and loads nested JSON into relational database tables
- SQL Database: Stores structured data across the artifactmetadata, artifactmedia, and artifactcolors tables
- Analytics: Executes 20+ predefined SQL queries for insights
- Visualization: Interactive dashboards using Plotly and Streamlit
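
The API, ETL, and SQL steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. Only the three table names come from this document; the column names, the JSON field names (`records`, `info.pages`, `images`, `baseimageurl`, `colors`, `percent`), the helper functions, and the sample data are assumptions made for the example. The HTTP call is replaced by an injected `get_page` function so the sketch runs without an API key, and SQLite stands in for whichever SQL database the project uses.

```python
import sqlite3
import time
from typing import Callable, Iterable, Iterator


def fetch_records(get_page: Callable[[int], dict], delay_s: float = 1.0) -> Iterator[dict]:
    """Yield artifact records page by page, pausing between requests."""
    page = 1
    while True:
        payload = get_page(page)
        yield from payload.get("records", [])
        # Assumed response shape: payload["info"]["pages"] is the total page count.
        if page >= payload.get("info", {}).get("pages", page):
            break
        page += 1
        time.sleep(delay_s)  # simple fixed-delay rate limiting


def transform(record: dict) -> tuple[tuple, list[tuple], list[tuple]]:
    """Flatten one nested record into rows for the three tables."""
    artifact_id = record["id"]
    metadata = (artifact_id, record.get("title"), record.get("culture"))
    media = [(artifact_id, img.get("baseimageurl")) for img in record.get("images") or []]
    colors = [(artifact_id, c.get("color"), c.get("percent")) for c in record.get("colors") or []]
    return metadata, media, colors


def load(conn: sqlite3.Connection, records: Iterable[dict]) -> None:
    """Create the tables if needed and insert the flattened rows."""
    # Column layouts here are hypothetical; only the table names are from the project.
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS artifactmetadata (id INTEGER PRIMARY KEY, title TEXT, culture TEXT);
        CREATE TABLE IF NOT EXISTS artifactmedia (artifact_id INTEGER, image_url TEXT);
        CREATE TABLE IF NOT EXISTS artifactcolors (artifact_id INTEGER, color TEXT, percent REAL);
        """
    )
    for record in records:
        metadata, media, colors = transform(record)
        conn.execute("INSERT OR REPLACE INTO artifactmetadata VALUES (?, ?, ?)", metadata)
        conn.executemany("INSERT INTO artifactmedia VALUES (?, ?)", media)
        conn.executemany("INSERT INTO artifactcolors VALUES (?, ?, ?)", colors)
    conn.commit()


# Made-up sample pages standing in for real API responses.
SAMPLE_PAGES = {
    1: {"info": {"pages": 2}, "records": [
        {"id": 1, "title": "Bowl", "culture": "Greek",
         "images": [{"baseimageurl": "https://example.org/1.jpg"}],
         "colors": [{"color": "#c8c8c8", "percent": 0.6},
                    {"color": "#323232", "percent": 0.4}]},
    ]},
    2: {"info": {"pages": 2}, "records": [
        {"id": 2, "title": "Vase", "culture": None, "images": None, "colors": []},
    ]},
}

conn = sqlite3.connect(":memory:")
load(conn, fetch_records(SAMPLE_PAGES.__getitem__, delay_s=0.0))
```

In a real run, `get_page` would wrap an HTTP request to the API and `delay_s` would be tuned to the API's rate limits. Splitting one nested record into a parent row plus child rows keyed by the artifact id is what lets the analytical queries join metadata with media and color data.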