Walk-Forward Validation
A walk-forward validation framework for trading strategies and ML models. Standard cross-validation (k-fold, random splits) gives misleadingly optimistic results on financial time series because it introduces lookahead bias and ignores autocorrelation. This skill covers proper time-series validation techniques, including rolling and expanding windows, purged cross-validation, combinatorial purged cross-validation (CPCV), and overfit detection metrics.
Why Standard Cross-Validation Fails
Standard k-fold CV assumes data points are independent and identically distributed (IID). Financial time series violate both assumptions:
- Lookahead bias — Random splits let the model train on future data and predict past data, artificially inflating performance.
- Autocorrelation — Adjacent observations are correlated. A random split that puts Monday in test and Tuesday in train leaks information.
- Regime dependence — Markets shift between regimes. A model trained on a bull market and tested on a bull market tells you nothing about bear market performance.
- Label overlap — If labels are computed over windows (e.g., 24h forward return), adjacent train/test samples share label computation periods, leaking information.
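The label-overlap problem can be made concrete with a small purging sketch. It assumes samples are indexed by consecutive integer bars and that each label is computed from the bar itself through `horizon` bars ahead; the function name `purge_train_indices` and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def purge_train_indices(train_idx: np.ndarray, test_idx: np.ndarray, horizon: int) -> np.ndarray:
    """Drop training samples whose label window overlaps any test label window.

    Assumption (not from the source): sample i's label uses bars i..i+horizon,
    and the test block is one contiguous range of integer bar indices.
    """
    test_start = test_idx.min()
    test_end = test_idx.max()
    # Test labels collectively span bars test_start .. test_end + horizon.
    # A training label spanning i .. i + horizon is safe only if it ends
    # before that span begins, or begins after that span ends.
    ends_before_test = train_idx + horizon < test_start
    starts_after_test = train_idx > test_end + horizon
    return train_idx[ends_before_test | starts_after_test]


# Hypothetical layout: 100 bars, bars 50..59 held out for testing,
# labels look 3 bars ahead.
train_idx = np.concatenate([np.arange(0, 50), np.arange(60, 100)])
test_idx = np.arange(50, 60)
purged = purge_train_indices(train_idx, test_idx, horizon=3)
print(f"kept {len(purged)} of {len(train_idx)} training samples")
```

With a 24h forward return on hourly bars, `horizon` would be 24 under the same assumptions, so a full day of training samples on each side of the test block would be dropped.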
Walk-Forward Framework
Rolling Window (Fixed Train Size)
The train window has a fixed size and slides forward in time. This is preferred when you believe older data is less relevant (common in crypto).
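A minimal sketch of a rolling-window splitter, assuming samples are already sorted by time and addressed by integer position. The function name `rolling_window_splits` and the `step` parameter are illustrative assumptions, not part of any library.

```python
from typing import Iterator, Optional, Tuple

import numpy as np


def rolling_window_splits(
    n_samples: int,
    train_size: int,
    test_size: int,
    step: Optional[int] = None,
) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    """Yield (train_idx, test_idx) pairs for a fixed-size train window.

    The test block always starts immediately after the train block, and
    both slide forward by `step` positions (default: `test_size`, so the
    test blocks do not overlap).
    """
    if train_size <= 0 or test_size <= 0:
        raise ValueError("train_size and test_size must be positive")
    if step is None:
        step = test_size
    if step <= 0:
        raise ValueError("step must be positive")

    start = 0
    while start + train_size + test_size <= n_samples:
        train_end = start + train_size
        train_idx = np.arange(start, train_end)
        test_idx = np.arange(train_end, train_end + test_size)
        yield train_idx, test_idx
        start += step


# Example: 10 samples, train on 4, test on the next 2, slide by 2.
for train_idx, test_idx in rolling_window_splits(10, train_size=4, test_size=2):
    print("train", train_idx.tolist(), "test", test_idx.tolist())
```

Every training index precedes every test index in each fold, so the model never sees data from after the period it is evaluated on. Because the train window has a fixed length, the oldest observations drop out as the window advances.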