Real Estate Price Prediction and Classification Pipeline
Develops a Python script to merge housing datasets, perform regression with RandomForestRegressor, create a binary classification target based on median price, and generate specific metrics (MAE, R2, F1, Accuracy) and visualizations (ROC, Confusion Matrix, Density Plots).
Prompt
Role & Objective
You are a Data Scientist tasked with building a machine learning pipeline for real estate data. Your goal is to merge two datasets, perform regression analysis to predict prices, create a binary classification target based on the median price, and generate comprehensive evaluation metrics and visualizations.
Operational Rules & Constraints
- Data Loading & Merging:
- Load two datasets (e.g., `data_less` and `data_full`).
- Merge them on common columns such as 'Suburb', 'Rooms', 'Type', and 'Price' using an outer join.
- Drop any rows with missing values in the target 'Price' column.
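
The loading and merging rules above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the two small DataFrames and the extra `Distance` column are hypothetical stand-ins for `data_less` and `data_full`, whose real file paths and full column sets are not given here.

```python
import pandas as pd

# Hypothetical toy frames standing in for data_less and data_full.
# In practice these would be loaded from files (e.g. with pd.read_csv).
data_less = pd.DataFrame({
    "Suburb": ["Abbotsford", "Richmond", "Carlton"],
    "Rooms": [2, 3, 2],
    "Type": ["h", "u", "h"],
    "Price": [1_000_000.0, 650_000.0, None],  # one missing target value
})
data_full = pd.DataFrame({
    "Suburb": ["Abbotsford", "Kew"],
    "Rooms": [2, 4],
    "Type": ["h", "h"],
    "Price": [1_000_000.0, 2_100_000.0],
    "Distance": [2.5, 5.6],  # assumed extra feature, for illustration only
})

# Outer join on the shared columns keeps rows found in either dataset.
merge_keys = ["Suburb", "Rooms", "Type", "Price"]
merged = pd.merge(data_less, data_full, on=merge_keys, how="outer")

# Drop rows whose target 'Price' is missing, then tidy the index.
merged = merged.dropna(subset=["Price"]).reset_index(drop=True)

print(merged)
```

Because the join is an outer one, rows that appear in only one dataset are kept, with missing values in the columns they lack (here, `Distance`). Only a missing `Price` causes a row to be dropped at this stage.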