implementing-warehouse-sources
Implementing Data warehouse sources
Use this skill when building or updating Data warehouse sources in posthog/temporal/data_imports/sources/.
Read first
Before coding, read:
- posthog/temporal/data_imports/sources/source.template (use the top-of-file TODOs as a starting reference, but verify target files against the current source implementations; the template can drift, e.g. it currently still points at the old posthog/warehouse/types.py path instead of products/data_warehouse/backend/types.py)
- posthog/temporal/data_imports/sources/README.md
- posthog/temporal/data_imports/sources/SOURCES.md: inventory of every registered source with its communication method (HTTP / vendor SDK / gRPC / DB protocol / webhook) and tracked-transport state. Skim this first to see how similar sources are wired and what state the source you're touching is currently in. Keep it in sync; see "Updating SOURCES.md" below.
- posthog/temporal/data_imports/sources/common/base.py: base classes (SimpleSource, ResumableSource, WebhookSource) and the FieldType union
- posthog/temporal/data_imports/sources/common/resumable.py: ResumableSourceManager
- posthog/temporal/data_imports/sources/common/webhook_s3.py: WebhookSourceManager
- One API source with settings.py + transport logic (e.g. klaviyo, github). For dependent-resource fan-out (parent→child with type: "resolve"), also read posthog/temporal/data_imports/sources/common/rest_source/__init__.py and config_setup.py (e.g. process_parent_data_item, make_parent_key_name).
- For webhook-capable sources, read posthog/temporal/data_imports/sources/stripe/source.py as the reference implementation.
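
If the dependent-resource fan-out mentioned in the list above is unfamiliar, the sketch below shows the general idea in plain Python: a child endpoint declares a path parameter with type: "resolve", and each parent row triggers one child request whose rows are tagged with the parent value. Everything here is an assumption made for illustration (the config shape, fan_out, fake_fetch, and the _repos_id key format); it is not the rest_source API, so read __init__.py and config_setup.py for the real behavior.

```python
from typing import Any, Callable, Iterator

# Hypothetical child endpoint config, for illustration only.
CHILD_ENDPOINT: dict[str, Any] = {
    "path": "/repos/{repo_id}/issues",
    "params": {
        # Fill repo_id from the "id" field of each row of the "repos" resource.
        "repo_id": {"type": "resolve", "resource": "repos", "field": "id"},
    },
}


def fan_out(
    parents: list[dict[str, Any]],
    endpoint: dict[str, Any],
    fetch: Callable[[str], list[dict[str, Any]]],
) -> Iterator[dict[str, Any]]:
    """Yield child rows, issuing one fetch per parent row."""
    # Pick the first param that is resolved from a parent resource.
    name, spec = next(
        (key, value)
        for key, value in endpoint["params"].items()
        if value.get("type") == "resolve"
    )
    # Assumed naming scheme for the column that links a child to its parent.
    parent_key = f"_{spec['resource']}_{spec['field']}"
    for parent in parents:
        value = parent[spec["field"]]
        path = endpoint["path"].format(**{name: value})
        for child in fetch(path):
            # Tag each child row so it can be joined back to its parent.
            yield {**child, parent_key: value}


def fake_fetch(path: str) -> list[dict[str, Any]]:
    """Stand-in for an HTTP call; returns one canned row per path."""
    return [{"path": path, "title": "example"}]


rows = list(fan_out([{"id": 1}, {"id": 2}], CHILD_ENDPOINT, fake_fetch))
```

Passing fetch in as a callable keeps the sketch runnable without network access; a real source would page through the child endpoint instead of returning a single canned row.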