data-cog-guide
Data Cog Guide
An intelligent data analysis assistant that accepts messy, poorly documented CSV files and automatically infers structure, cleans anomalies, and produces deep analytical reports with minimal user prompting. Designed for researchers who need quick insights from unfamiliar or inherited datasets without spending hours on manual data preparation.
Overview
Researchers frequently receive datasets from collaborators, public repositories, or legacy systems that lack documentation, use inconsistent formatting, and contain mixed data quality. Traditional analysis requires significant upfront effort to understand and prepare such data. Data Cog automates this process by applying heuristic inference, pattern recognition, and iterative cleaning to produce analysis-ready data along with a comprehensive profile report.
The skill implements a "zero-configuration" philosophy: provide the CSV file path and an optional research question, and it handles encoding detection, delimiter inference, type casting, missingness assessment, and initial exploratory statistics automatically.
Automated Ingestion Pipeline
Smart Loading
import pandas as pd
import chardet
import io