accessing-cloud-storage


Accessing Cloud Storage

A guide to accessing cloud storage (S3, GCS, Azure) and remote filesystems from Python. It covers three libraries (fsspec, pyarrow.fs, and obstore) and how they integrate with data engineering tools.

Quick Comparison

| Feature | fsspec | pyarrow.fs | obstore |
|---|---|---|---|
| Best For | Broad compatibility, ecosystem integration | Arrow-native workflows, Parquet | High-throughput, performance-critical |
| Backends | S3, GCS, Azure, HTTP, FTP, 20+ more | S3, GCS, HDFS, local | S3, GCS, Azure, local |
| Performance | Good (with caching) | Excellent for Parquet | 9x faster for concurrent ops |
| Dependencies | Backend-specific (s3fs, gcsfs) | Bundled with PyArrow | Zero Python deps (Rust) |
| Async Support | Yes (aiohttp) | Limited | Native sync/async |
| DataFrame Integration | Universal | PyArrow-native | Via fsspec wrapper |
| Maturity | Very mature (2018+) | Mature | New (2025), rapidly evolving |
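
To make the comparison concrete, here is a minimal sketch that writes and reads the same bytes through each of the three libraries. It is illustrative only: it uses local and in-memory backends (fsspec's `memory` filesystem, PyArrow's `LocalFileSystem`, obstore's `MemoryStore`) as stand-ins for S3, GCS, or Azure, so no credentials or network access are needed. The helper names (`roundtrip_fsspec`, `roundtrip_pyarrow`, `roundtrip_obstore`), the `PAYLOAD` constant, and the file paths are assumptions made up for this example. Each helper imports its library lazily, so the script runs with whichever of the three happens to be installed.

```python
import os
import tempfile

PAYLOAD = b"hello, object storage"  # hypothetical test data


def roundtrip_fsspec(payload: bytes) -> bytes:
    import fsspec

    # "memory" is an in-process backend; a real bucket would use a
    # protocol such as "s3" or "gcs" with the matching backend package.
    fs = fsspec.filesystem("memory")
    with fs.open("/demo/hello.bin", "wb") as f:
        f.write(payload)
    with fs.open("/demo/hello.bin", "rb") as f:
        return f.read()


def roundtrip_pyarrow(payload: bytes) -> bytes:
    from pyarrow import fs as pafs

    # LocalFileSystem stands in for a remote pyarrow filesystem here.
    local = pafs.LocalFileSystem()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "hello.bin")
        with local.open_output_stream(path) as out:
            out.write(payload)
        with local.open_input_stream(path) as src:
            return src.read()


def roundtrip_obstore(payload: bytes) -> bytes:
    import obstore as obs
    from obstore.store import MemoryStore

    # MemoryStore stands in for a cloud-backed store.
    store = MemoryStore()
    obs.put(store, "demo/hello.bin", payload)
    return bytes(obs.get(store, "demo/hello.bin").bytes())


results = {}
for name, fn in [
    ("fsspec", roundtrip_fsspec),
    ("pyarrow.fs", roundtrip_pyarrow),
    ("obstore", roundtrip_obstore),
]:
    try:
        results[name] = fn(PAYLOAD)
    except ImportError:
        results[name] = None  # library not installed, so it is skipped

for name, data in results.items():
    status = "skipped (not installed)" if data is None else f"read {len(data)} bytes"
    print(f"{name}: {status}")
```

The three helpers differ mainly in shape: fsspec exposes file-like objects, pyarrow.fs exposes Arrow input and output streams, and obstore exposes put/get operations on a store object. Pointing any of them at a real bucket would additionally require credentials and, for fsspec, a backend package such as s3fs or gcsfs.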

When to Use Which?

First Seen
Apr 2, 2026
accessing-cloud-storage — legout/data-platform-agent-skills