spark-principal-engineer
Spark Mastery (Senior → Principal)
Operate
- Start from data volume, compute economics, shuffle behavior, and correctness requirements.
- Treat Spark as a distributed execution system with real storage, network, and scheduling tradeoffs.
- Prefer explicit workload design over vague “big data” assumptions.
- Optimize for predictable cost, reliability, and debuggable pipelines.
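Starting from data volume rather than defaults can be as simple as back-of-envelope arithmetic. The sketch below derives a shuffle partition count from estimated shuffle bytes, assuming a target post-shuffle partition size of roughly 128 MB — a common starting point, not a universal rule, and the value you would then feed into a setting like `spark.sql.shuffle.partitions` (or let adaptive query execution converge on).

```python
# Minimal sketch: derive a shuffle partition count from data volume,
# assuming a ~128 MB target post-shuffle partition size (an assumption,
# tune per workload).

def shuffle_partitions(shuffle_bytes: int,
                       target_bytes: int = 128 * 1024 * 1024) -> int:
    """Round up so no partition exceeds the target size."""
    return max(1, -(-shuffle_bytes // target_bytes))  # ceiling division

# Example: a stage shuffling ~1 TB
print(shuffle_partitions(1_000_000_000_000))  # → 7451
```

The point is the discipline, not the constant: estimate bytes per stage first, then choose parallelism, instead of inheriting whatever the cluster default happens to be.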
Default Standards
- Data layout and partitioning must match workload reality.
- Shuffle-heavy patterns require scrutiny: wide joins, groupBys, and repartitions dominate network and disk cost.
- Memory and executor tuning should follow evidence from the Spark UI and executor metrics, not defaults or folklore.
- Streaming and batch semantics must be separated clearly.
- Platform cost and job performance should be evaluated together.
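Evidence-driven executor sizing also reduces to arithmetic over node resources. The sketch below assumes 5 cores per executor (a widely cited heuristic for I/O throughput, not a Spark requirement), one core and 1 GB reserved for the OS and daemons, and carves roughly 10% off the heap for off-heap overhead, mirroring the floor Spark applies via `spark.executor.memoryOverhead`.

```python
# Minimal sketch: plan executors per node from raw node resources.
# The 5-core and 1 GB reserve figures are assumptions to tune, not rules.

def executor_plan(node_cores: int, node_mem_gb: int,
                  cores_per_executor: int = 5,
                  os_reserve_gb: int = 1) -> tuple[int, int]:
    """Return (executors per node, heap GB per executor)."""
    execs = (node_cores - 1) // cores_per_executor  # leave 1 core for OS/daemons
    usable_gb = node_mem_gb - os_reserve_gb
    per_exec_gb = usable_gb // max(execs, 1)
    heap_gb = int(per_exec_gb / 1.10)  # reserve ~10% for memory overhead
    return execs, heap_gb

# Example: 16-core, 64 GB worker nodes
print(executor_plan(16, 64))  # → (3, 19)
```

Treat the output as a first configuration to validate against GC time, spill, and task skew in the Spark UI, then adjust — which is also where cost enters: fewer, better-sized executors per node usually beat maxing out raw parallelism on a per-dollar basis.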