datanalysis-credit-risk

Summary

Credit risk data cleaning and variable screening pipeline for pre-loan modeling.

  • Executes 11 independent steps covering data loading, abnormal period filtering, missing rate analysis, low-IV and high-PSI variable removal, null importance denoising, and correlation-based feature elimination
  • Supports organization-level analysis with separate handling of modeling and out-of-sample (OOS) samples, plus multi-process acceleration for IV and PSI calculations
  • Generates a comprehensive Excel report with 15 sheets detailing operation results, feature statistics, distributions, and removed variables across all pipeline stages
  • Provides configurable thresholds, with sensible defaults, for missing rate, IV, PSI, correlation, and null importance
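
To make the IV and PSI screening mentioned above concrete, here is a minimal sketch using the common textbook definitions on pre-binned counts. It is not the skill's own code: the function names, the input layout, and the thresholds `iv_min=0.02` and `psi_max=0.1` are assumptions chosen for illustration, and the skill's actual defaults may differ.

```python
import math


def _shares(counts, eps=1e-6):
    """Convert per-bin counts to proportions, floored at eps to avoid log(0)."""
    total = sum(counts)
    return [max(c / total, eps) for c in counts]


def information_value(good_counts, bad_counts):
    """IV = sum over bins of (good% - bad%) * ln(good% / bad%)."""
    g = _shares(good_counts)
    b = _shares(bad_counts)
    return sum((gi - bi) * math.log(gi / bi) for gi, bi in zip(g, b))


def population_stability_index(expected_counts, actual_counts):
    """PSI = sum over bins of (actual% - expected%) * ln(actual% / expected%)."""
    e = _shares(expected_counts)
    a = _shares(actual_counts)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


def screen(features, iv_min=0.02, psi_max=0.1):
    """Return {feature: reason} for features failing the IV or PSI check.

    Thresholds are illustrative assumptions, not the skill's defaults.
    """
    dropped = {}
    for name, bins in features.items():
        iv = information_value(bins["good"], bins["bad"])
        psi = population_stability_index(bins["expected"], bins["actual"])
        if iv < iv_min:
            dropped[name] = "low IV"
        elif psi > psi_max:
            dropped[name] = "high PSI"
    return dropped


# Hypothetical binned counts for three made-up features.
features = {
    "income_band": {"good": [300, 400, 300], "bad": [60, 30, 10],
                    "expected": [360, 430, 310], "actual": [350, 440, 310]},
    "noise_flag": {"good": [500, 500], "bad": [50, 50],
                   "expected": [550, 550], "actual": [540, 560]},
    "app_channel": {"good": [700, 300], "bad": [40, 60],
                    "expected": [740, 360], "actual": [300, 800]},
}

if __name__ == "__main__":
    print(screen(features))
```

Both metrics share the same "difference times log-ratio" form. IV compares good and bad distributions within one sample, while PSI compares the same variable across two samples, for example modeling versus OOS.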
SKILL.md

Data Cleaning and Variable Screening

Quick Start

# Run the complete data cleaning pipeline
python ".github/skills/datanalysis-credit-risk/scripts/example.py"

Complete Process Description

The data cleaning pipeline consists of the following 11 steps, each executed independently without deleting the original data:
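
As a rough illustration of what "executed independently without deleting the original data" can look like, the sketch below runs two of the checks named in the summary (abnormal period filtering and missing rate analysis) against the same untouched input and only records what each would flag. The function names, the row layout, and the thresholds are hypothetical and are not taken from the skill's scripts.

```python
def abnormal_period_step(rows, period_key="month", min_samples=2):
    """Flag periods that contain fewer than min_samples rows."""
    counts = {}
    for row in rows:
        counts[row[period_key]] = counts.get(row[period_key], 0) + 1
    return {period: n for period, n in counts.items() if n < min_samples}


def missing_rate_step(rows, columns, max_missing=0.6):
    """Flag columns whose share of None values exceeds max_missing."""
    flagged = {}
    for col in columns:
        rate = sum(row.get(col) is None for row in rows) / len(rows)
        if rate > max_missing:
            flagged[col] = rate
    return flagged


def run_steps(rows, columns):
    """Run every step on the same input; nothing is removed from rows."""
    return {
        "abnormal_periods": abnormal_period_step(rows),
        "high_missing": missing_rate_step(rows, columns),
    }


# Hypothetical sample data: None marks a missing value.
rows = [
    {"month": "2024-01", "income": 5000, "bureau_score": None},
    {"month": "2024-01", "income": None, "bureau_score": None},
    {"month": "2024-01", "income": 4200, "bureau_score": None},
    {"month": "2024-02", "income": 3900, "bureau_score": 610},
]

if __name__ == "__main__":
    report = run_steps(rows, ["income", "bureau_score"])
    print(report)
    print(len(rows), "rows still present after all steps")
```

Because each step only returns a report, the steps can be reordered or rerun with different thresholds, and the decision about what to drop can be made once all reports are available.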

First Seen
Mar 2, 2026