implementing-warehouse-sources


Implementing Data warehouse sources

Use this skill when building or updating Data warehouse sources in posthog/temporal/data_imports/sources/.

Read first

Before coding, read:

  • posthog/temporal/data_imports/sources/source.template (use the top-of-file TODOs as a starting reference, but verify target files against the current source implementations — the template can drift, e.g. it currently still points at the old posthog/warehouse/types.py path instead of products/data_warehouse/backend/types.py)
  • posthog/temporal/data_imports/sources/README.md
  • posthog/temporal/data_imports/sources/SOURCES.md — inventory of every registered source with its communication method (HTTP / vendor SDK / gRPC / DB protocol / webhook) and tracked-transport state. Skim this first to see how similar sources are wired and what state the source you're touching is in today. Keep it in sync — see "Updating SOURCES.md" below.
  • posthog/temporal/data_imports/sources/common/base.py — base classes (SimpleSource, ResumableSource, WebhookSource) and the FieldType union
  • posthog/temporal/data_imports/sources/common/resumable.py (ResumableSourceManager)
  • posthog/temporal/data_imports/sources/common/webhook_s3.py (WebhookSourceManager)
  • One API source with settings.py plus transport logic (e.g. klaviyo, github). For dependent-resource fan-out (parent→child with type: "resolve"), also read posthog/temporal/data_imports/sources/common/rest_source/__init__.py and config_setup.py (e.g. process_parent_data_item, make_parent_key_name).
  • For webhook-capable sources, read posthog/temporal/data_imports/sources/stripe/source.py as the reference implementation.
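
To make the dependent-resource fan-out mentioned above concrete before you read rest_source, here is a minimal, self-contained sketch of the idea: a child endpoint declares a path parameter with type: "resolve", and each parent row produces one child request. Everything in it is an assumption for illustration only: the issues/issue_comments resources, the config shape, the fan_out and find_resolve_param helpers, and the parent key naming are hypothetical and are not the actual process_parent_data_item or make_parent_key_name implementations. Check config_setup.py for the real behavior.

```python
from typing import Any, Iterator

# Hypothetical child-resource config. The shape is an assumption for
# illustration; verify the real schema in rest_source/config_setup.py.
CHILD_RESOURCE: dict[str, Any] = {
    "name": "issue_comments",
    "endpoint": {
        "path": "issues/{issue_number}/comments",
        "params": {
            # "resolve" means: take this value from a field of a parent resource's rows.
            "issue_number": {"type": "resolve", "resource": "issues", "field": "number"},
        },
    },
}


def find_resolve_param(resource: dict[str, Any]) -> tuple[str, dict[str, Any]]:
    """Return (param_name, param_config) for the first param marked type: "resolve"."""
    for name, param in resource["endpoint"]["params"].items():
        if isinstance(param, dict) and param.get("type") == "resolve":
            return name, param
    raise ValueError(f"{resource['name']} has no resolve param")


def fan_out(
    parent_rows: list[dict[str, Any]], resource: dict[str, Any]
) -> Iterator[tuple[str, dict[str, Any]]]:
    """Yield one (child_path, parent_keys) pair per parent row."""
    param_name, param = find_resolve_param(resource)
    # Illustrative key name used to link child rows back to their parent.
    parent_key = f"_{param['resource']}_{param['field']}"
    for row in parent_rows:
        value = row[param["field"]]
        path = resource["endpoint"]["path"].format(**{param_name: value})
        yield path, {parent_key: value}


if __name__ == "__main__":
    for path, keys in fan_out([{"number": 7}, {"number": 9}], CHILD_RESOURCE):
        print(path, keys)
```

The point of the sketch is the shape of the problem: the child request count scales with the number of parent rows, and each child row needs a key that ties it back to its parent.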

Picking the right base class

implementing-warehouse-sources — posthog/posthog-foss