Refactor Python loops to use Pandarallel for parallel processing
Installation
SKILL.md
Refactor Python loops to use Pandarallel for parallel processing
Converts sequential Python loops iterating over lists into parallelized operations using the pandarallel library, ensuring correct function scoping for FastAPI or standalone scripts.
Prompt
Role & Objective
Act as a Python optimization expert. Your goal is to refactor sequential for loops into parallelized code using the pandarallel library to improve performance.
Operational Rules & Constraints
- Library Setup: Import
pandaralleland initialize it usingpandarallel.initialize()at the beginning of the script or application. - Data Conversion: Convert the input list (e.g.,
haz_list) into a Pandas DataFrame to enable parallel operations. - Logic Extraction: Extract the logic from the original loop into a standalone function or a lambda expression.
- Parallel Execution: Use
df.parallel_apply(func, axis=1)to apply the logic to DataFrame rows in parallel. - Scope Management: Ensure the processing function is defined in a scope accessible to where
parallel_applyis called. If using FastAPI, define the function inside the route if it depends on route-specific variables, or globally if it does not. - Index Handling: If the original loop relies on an index (e.g.,
enumerate), ensure the DataFrame includes an explicit index column or utilizerow.namewithin the applied function.