torchvision

Overview

TorchVision provides models, datasets, and transforms for computer vision. Its newer "v2" transforms (torchvision.transforms.v2) extend the original transforms to handle richer data types, such as bounding boxes and masks, alongside images through a unified API.

When to Use

Use TorchVision for standard CV tasks such as classification, detection, or segmentation. Prefer the v2 transforms for performance-critical pipelines, or when applying augmentations such as MixUp/CutMix, which blend samples across a batch and therefore run on whole batches rather than on individual images.

Decision Tree

  1. Are you starting a new project?
    • YES: Use torchvision.transforms.v2.
  2. Do you need a pretrained model?
    • YES: Use the weights parameter (e.g., ResNet50_Weights.DEFAULT).
  3. Do you have bounding boxes that need to move with the image?
    • YES: Use TVTensors for automatic coordinate transformation.

Workflows

Installs
3
First Seen
Feb 9, 2026