sql-ml-features
SQL for Machine Learning
Patterns for extracting ML-ready features from SQL Server — turning normalized relational data into the wide, denormalized, NULL-free datasets that models consume.
When to Use
- Building a feature table or training dataset from application data
- Feature engineering in T-SQL (rolling aggregations, lag features, RFM scoring)
- Encoding categorical variables (one-hot, ordinal, frequency encoding)
- Handling NULLs for ML (imputation strategies in SQL)
- Sampling and train/test splitting from SQL Server
- Exporting large datasets to CSV, Parquet, or pandas
- Preventing data leakage in temporal feature queries
- Writing queries that feed scikit-learn, XGBoost, or PyTorch pipelines
When NOT to use: application schema design (table design, naming conventions, access control), query performance tuning (execution plans, index tuning, wait stats), BI dashboards and summary reports (GROUPING SETS, pivot tables, dashboard queries), or running R/Python inside SQL Server (SQL Server ML Services).
Feature Table Build Workflow
More from damusix/skills
htmx
Implements HTMX interactions, configures swap behaviors, debugs hx-* requests, and builds hypermedia-driven UI components. Use when tasks involve hx-* attributes, HTMX AJAX requests, swap strategies, server-sent events, WebSockets, or hypermedia-driven UIs.
90hapi
Creates, configures, and debugs `@hapi/hapi` servers — implements routes, plugins, auth schemes, validation, caching, and request lifecycle hooks. Use when building HTTP APIs, setting up stale-while-revalidate caching, registering server methods, configuring views, managing startup sequences, or troubleshooting response marshalling.
32mssql-server
Writes, optimizes, and debugs T-SQL queries. Explains SQL Server internals, troubleshoots performance issues, and guides database administration tasks including backup/restore, high availability, security, and index design. Use when the user asks about T-SQL syntax, SQL Server administration, query performance, stored procedures, indexes, locking, transactions, backup/restore, high availability, security, or any MSSQL-related topic — even without saying 'SQL Server' explicitly. Also trigger on terms like SSMS, tempdb, bcp, sqlcmd, MSSQL, sp_executesql, NOLOCK, columnstore, Hekaton, RCSI, param sniffing, or execution plan.
31joi
Use when building joi schemas, validating input data, defining custom types, conditional validation with .when(), cross-field references, custom error messages, or writing joi extensions. Standalone package that integrates with the hapijs ecosystem.
27google-zx-scripting
Writes and executes JavaScript-based shell scripts using Google's zx library. Use when writing shell scripts, automation, build tools, file processing, CLI tools, deployment scripts, data pipelines, or batch operations. Also covers piping, streams, parallel execution, retries, cross-platform scripting, built-in fs utilities, and minimist argument parsing.
26sql-bi-reporting
Use when writing T-SQL for business intelligence, analytics, or reporting. Includes building summary reports with GROUPING SETS, ROLLUP, and CUBE, writing time-series queries with date bucketing, creating pivot/unpivot transformations, generating tally/numbers tables for gap-filling, building running totals and moving averages with window functions, writing year-over-year comparisons, designing materialized views for dashboards, or producing CSV/JSON exports from SQL Server.
9