Parallel sharded data -> cupynumeric load

Why this skill exists. cupynumeric mirrors NumPy's array API, including cupynumeric.load for a single .npy file. Beyond that, file loading lives in Legate, not cupynumeric:

| Format | Built-in loader |
| --- | --- |
| Single `.npy` | `cupynumeric.load(path)` (NumPy-API parity) |
| HDF5 (single file) | `legate.io.hdf5.from_file` / `from_file_batched` |
| Sharded multi-file (any format), Parquet/Arrow, raw binary, custom layouts | No built-in loader (this skill) |

This skill shows the canonical way to fill the gap in the last row: write a Legate Python task that calls the third-party reader the format needs (h5py, pyarrow, np.memmap, ...) inside the task body, and let Legate distribute the reads across GPUs and nodes. For formats that have a built-in loader, prefer that loader unless you need a custom in-task body (for example an mmap-based loader, a format-specific decoder, sidecar metadata, or partial / sharded reads).
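One piece of bookkeeping a sharded-read task body typically needs is a mapping from the slice of the global array that a task instance owns to the shard files, and the row ranges within them, that it must read. The following is a minimal sketch in plain Python, not Legate code. It assumes the shards are concatenated along axis 0 and that per-shard row counts are already known (for example from sidecar metadata). The names `ReadRequest` and `plan_reads` are hypothetical and are not part of Legate or cupynumeric.

```python
from bisect import bisect_right
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class ReadRequest:
    """One contiguous read from one shard (hypothetical helper type)."""
    path: str       # shard file to open
    src_start: int  # first row to read inside the shard
    src_stop: int   # one past the last row to read inside the shard
    dst_start: int  # row offset inside the tile where these rows land


def plan_reads(shards, tile_lo, tile_hi):
    """Return the reads needed to fill global rows [tile_lo, tile_hi).

    `shards` is a list of (path, n_rows) pairs in global order.
    """
    # starts[i] is the global row where shard i begins; starts[-1] is the total.
    starts = [0, *accumulate(n for _, n in shards)]
    if not 0 <= tile_lo <= tile_hi <= starts[-1]:
        raise ValueError("tile is outside the global row range")

    requests = []
    # Index of the last shard that begins at or before tile_lo.
    i = bisect_right(starts, tile_lo) - 1
    while i < len(shards) and starts[i] < tile_hi:
        # Intersect the tile with shard i, in global row coordinates.
        lo = max(tile_lo, starts[i])
        hi = min(tile_hi, starts[i + 1])
        if lo < hi:  # skip empty shards and empty intersections
            requests.append(
                ReadRequest(shards[i][0], lo - starts[i], hi - starts[i], lo - tile_lo)
            )
        i += 1
    return requests
```

Inside a task body, each `ReadRequest` would then turn into one call to the format's reader (for `.npy` shards, something like slicing `np.load(path, mmap_mode="r")`), with the result copied into the matching rows of the tile. How the tile bounds are obtained from the task's output store depends on the Legate version in use and is not shown here.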

