The Agent Skills Directory

[SAFE]: The skill implements legitimate functionality for data engineering tasks. It follows security best practices by requiring users to explicitly provide Unity Catalog and schema names rather than using defaults.
[EXTERNAL_DOWNLOADS]: The skill uses standard Python package managers (pip, uv) to install well-known libraries (faker, numpy, pandas, holidays, polars) from the official registry (PyPI). These are standard dependencies for synthetic data generation.
[COMMAND_EXECUTION]: The skill provides instructions for standard Databricks CLI operations such as workspace import, libraries install, and jobs submit. These are routine administrative tasks for deploying and running code within a Databricks environment.
[DATA_EXFILTRATION]: No unauthorized data access or external exfiltration patterns were detected. Data generation logic is self-contained, and output is directed to user-specified Unity Catalog volumes and tables.
[PROMPT_INJECTION]: Instructions include clear operational rules and safety guardrails (e.g., 'NEVER use .cache() or .persist() with serverless compute') that are functional and do not attempt to bypass core AI safety filters.
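
The guardrail quoted in the [PROMPT_INJECTION] finding can also be checked mechanically. The following is a minimal sketch, not part of the skill itself: it assumes the generated code is available as Python source text and uses the standard-library `ast` module to flag `.cache()` and `.persist()` calls. The function name `find_forbidden_calls` and the `SAMPLE` script are hypothetical, introduced only for illustration.

```python
import ast

# Method names the guardrail forbids on serverless compute (from the rule quoted above).
FORBIDDEN_METHODS = {"cache", "persist"}


def find_forbidden_calls(source: str) -> list[tuple[int, str]]:
    """Return (line number, method name) for each .cache() or .persist() call in `source`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # A method call parses as Call(func=Attribute(attr=<method name>)).
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in FORBIDDEN_METHODS
        ):
            hits.append((node.lineno, node.func.attr))
    return sorted(hits)


# Hypothetical generated script, used only to exercise the check.
SAMPLE = '''
df = spark.range(100)
df = df.cache()
df.write.saveAsTable("my_catalog.my_schema.events")
'''

print(find_forbidden_calls(SAMPLE))  # prints [(3, 'cache')]
```

The check matches on method name only, so it would also flag an unrelated object that happens to expose a `cache()` or `persist()` method. For a review aid this errs on the side of reporting too much rather than too little.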

databricks-synthetic-data-gen