data-engineering-storage-remote-access-integrations-pandas
Pandas Integration with Remote Storage
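Pandas uses fsspec under the hood to access cloud storage through URIs such as s3:// and gs://, so the usual read and write functions work directly against remote paths. Each scheme needs its fsspec backend installed (for example, s3fs for s3:// and gcsfs for gs://).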
Auto-Detection (Simplest)
Pandas automatically uses fsspec for cloud URIs:
import pandas as pd
# Read CSV, Parquet, and JSON directly from cloud URIs
df = pd.read_csv("s3://bucket/data.csv")
df = pd.read_parquet("s3://bucket/data.parquet")
df = pd.read_json("gs://bucket/data.json")
# Compression is inferred from the file extension (.gz, .bz2, .zip, .xz, ...)
df = pd.read_csv("s3://bucket/data.csv.gz")  # Decompressed automatically as gzip
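The same URI handling applies to writes. The sketch below is a minimal round trip that should run without cloud credentials, assuming fsspec is installed: it swaps the s3:// scheme for fsspec's built-in in-memory filesystem (memory://). The path memory://demo-bucket/data.csv.gz and the DataFrame contents are made up for illustration. The commented-out line shows how backend options might be passed for a real bucket; anon is an s3fs option, and the bucket name there is hypothetical.

```python
import fsspec
import pandas as pd

# Hypothetical sample data, used only for this illustration.
df = pd.DataFrame({"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]})

# fsspec's in-memory filesystem stands in for a real bucket here.
uri = "memory://demo-bucket/data.csv.gz"

# Writing goes through fsspec just like reading; the .gz suffix
# makes pandas gzip-compress the output (compression="infer").
df.to_csv(uri, index=False)

# Reading back infers gzip from the extension and decompresses.
round_trip = pd.read_csv(uri)

# Peek at the raw bytes via fsspec to confirm the file is gzip-compressed.
with fsspec.open(uri, "rb") as f:
    magic = f.read(2)

# For a real bucket, backend options go through storage_options, e.g.
# (assumes s3fs is installed and the bucket allows anonymous reads):
# pd.read_csv("s3://bucket/data.csv", storage_options={"anon": True})
```

Because pandas only sees a URI, replacing memory:// with s3:// or gs:// should leave the rest of the code unchanged, provided the matching backend and credentials are available.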