Getting Started

This guide walks through the core Vi SDK operations you'll use most often. By the end, you'll have authenticated, listed datasets, uploaded assets, and run inference against a trained model.

Before You Start

The Vi SDK works with models trained on the Datature Vi platform. Follow the quickstart to train your first model, or, if you already have one, install the SDK and continue below.

You'll need:

1. The Vi SDK installed: pip install vi-sdk[all]
2. Your secret key and organization ID
Initialize the client

Create a vi.Client instance to connect to Datature Vi:

import vi

client = vi.Client(
    secret_key="your-secret-key",
    organization_id="your-organization-id"
)

print(f"Connected to organization: {client.organizations.id}")

# See available operations
client.help()
Expected output
Connected to organization: your-organization-id

Store credentials in environment variables rather than hardcoding them:

Terminal
export DATATURE_VI_SECRET_KEY="your-secret-key"
export DATATURE_VI_ORGANIZATION_ID="your-organization-id"

Then load them at runtime:

import os
import vi

client = vi.Client(
    secret_key=os.getenv("DATATURE_VI_SECRET_KEY"),
    organization_id=os.getenv("DATATURE_VI_ORGANIZATION_ID")
)
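
If a variable is unset, os.getenv returns None and the client call fails later with a less obvious error. A small fail-fast check makes the failure explicit up front; this helper is a sketch, not part of the SDK (the demo values are set only so the snippet runs standalone):

```python
import os


def require_env(*names):
    """Return the named environment variables, raising early if any is unset."""
    values = {name: os.getenv(name) for name in names}
    missing = [name for name, value in values.items() if not value]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return values


# Demo values so the example runs standalone; in practice these come from your shell.
os.environ.setdefault("DATATURE_VI_SECRET_KEY", "your-secret-key")
os.environ.setdefault("DATATURE_VI_ORGANIZATION_ID", "your-organization-id")

creds = require_env("DATATURE_VI_SECRET_KEY", "DATATURE_VI_ORGANIZATION_ID")
```

You can then pass creds["DATATURE_VI_SECRET_KEY"] and creds["DATATURE_VI_ORGANIZATION_ID"] to vi.Client as shown above.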

The SDK also has built-in help methods. Call client.help() for a quick reference, or client.datasets.help(), client.assets.help(), and similar methods for resource-specific operations.


List your datasets

Iterate over all datasets in your organization:

for dataset in client.datasets:
    print(f"{dataset.name}")
    print(f"  ID: {dataset.dataset_id}")
    print(f"  Assets: {dataset.statistic.asset_total}")
    print(f"  Annotations: {dataset.statistic.annotation_total}")

For manual page-by-page iteration:

for page in client.datasets.list():
    for dataset in page.items:
        print(f"Dataset: {dataset.name}")

To inspect a single dataset in detail:

dataset = client.datasets.get("your-dataset-id")
dataset.info()  # Prints a formatted summary

Download a dataset

Download a complete dataset with assets and annotations to a local directory:

result = client.get_dataset(
    dataset_id="your-dataset-id",
    save_dir="./data"
)

print(result.summary())
print(f"Total size: {result.size_mb:.1f} MB")
print(f"Splits: {', '.join(result.splits)}")

The dataset saves to ./data/your-dataset-id/ with this structure:

  • your-dataset-id
    • metadata.json
    • dump
      • annotations
        • annotations.jsonl
      • assets
        • image1.jpg
        • image2.jpg
    • training
    • validation
Load dataset for local training

Use ViDataset to load a downloaded dataset and iterate over training pairs for your local model training:

from vi.dataset.loaders import ViDataset

dataset = ViDataset("./data/your-dataset-id")

info = dataset.info()
print(f"Dataset: {info.name}")
print(f"Total assets: {info.total_assets}")
print(f"Total annotations: {info.total_annotations}")
print(f"Training: {info.splits['training'].assets} assets")
print(f"Validation: {info.splits['validation'].assets} assets")

# Iterate training pairs
for asset, annotations in dataset.training.iter_pairs():
    print(f"{asset.filename}: {asset.width}x{asset.height}")

    for ann in annotations:
        if hasattr(ann.contents, 'caption'):
            # Phrase grounding annotation
            print(f"  Caption: {ann.contents.caption}")
            print(f"  Phrases: {len(ann.contents.grounded_phrases)}")
        elif hasattr(ann.contents, 'interactions'):
            # VQA annotation
            print(f"  Q&A pairs: {len(ann.contents.interactions)}")
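
Conceptually, pairing amounts to joining assets with their annotations on a shared asset ID. If you ever need the same behavior over raw data (for example, in a custom loader), a sketch follows; the dict shapes and the "asset_id" key here are illustrative, not the SDK's actual classes:

```python
from collections import defaultdict


def iter_asset_pairs(assets, annotations):
    """Yield (asset, [annotations]) pairs, joining on the 'asset_id' key."""
    by_asset = defaultdict(list)
    for ann in annotations:
        by_asset[ann["asset_id"]].append(ann)
    for asset in assets:
        yield asset, by_asset.get(asset["asset_id"], [])


# Hypothetical records for illustration.
assets = [{"asset_id": "a1", "filename": "image1.jpg"},
          {"asset_id": "a2", "filename": "image2.jpg"}]
annotations = [{"asset_id": "a1", "kind": "caption"},
               {"asset_id": "a1", "kind": "vqa"}]

pairs = list(iter_asset_pairs(assets, annotations))
```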

Upload assets

Upload a single image to an existing dataset:

result = client.assets.upload(
    dataset_id="your-dataset-id",
    paths="path/to/image.jpg"
)
print(f"Uploaded: {result.total_succeeded} assets")
print(result.summary())

To upload a whole directory and wait for processing to finish:

result = client.assets.upload(
    dataset_id="your-dataset-id",
    paths="path/to/images/",
    wait_until_done=True
)
print(f"Uploaded: {result.total_succeeded}/{result.total_files} assets")
print(f"Success rate: {result.success_rate:.1f}%")
print(result.summary())
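
If you want to preview which files a directory upload would pick up, you can enumerate them yourself first. This is a sketch; the extension set below is an assumption, so check the SDK documentation for the formats it actually accepts:

```python
import tempfile
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}  # assumed set, not from the SDK


def collect_images(directory):
    """Return sorted image paths under `directory`, recursing into subfolders."""
    root = Path(directory)
    return sorted(p for p in root.rglob("*")
                  if p.suffix.lower() in IMAGE_EXTENSIONS)


# Build a small throwaway directory so the sketch runs standalone.
root = Path(tempfile.mkdtemp())
(root / "sub").mkdir()
(root / "a.jpg").touch()
(root / "sub" / "b.PNG").touch()
(root / "notes.txt").touch()

found = collect_images(root)
```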

List assets

Iterate over assets in a dataset:

for asset in client.assets(dataset_id="your-dataset-id"):
    print(f"{asset.filename}")
    print(f"  Size: {asset.metadata.width}x{asset.metadata.height}")

# Inspect a single asset
first_asset = next(iter(client.assets(dataset_id="your-dataset-id")))
first_asset.info()

Run inference

Load a trained model and run predictions on images:

from vi.inference import ViModel

model = ViModel(
    secret_key="your-secret-key",
    organization_id="your-organization-id",
    run_id="your-run-id"
)

# Returns a (result, error) tuple
result, error = model(
    source="path/to/test_image.jpg",
    user_prompt="Describe this image in detail"
)

if error is None:
    print(f"Result: {result}")
else:
    print(f"Error: {error}")

Pass a list of paths to run batch inference over multiple images:

images = ["image1.jpg", "image2.jpg", "image3.jpg"]
results = model(
    source=images,
    user_prompt="Describe this image",
    show_progress=True
)

for img, (result, error) in zip(images, results):
    if error is None:
        print(f"{img}: {result}")

To process every image in a directory, pass the directory path; recursive=True descends into subdirectories:

results = model(
    source="./test_images/",
    user_prompt="Describe this image",
    recursive=True,
    show_progress=True
)

for result, error in results:
    if error is None:
        print(f"Success: {result}")
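
Since each batch entry comes back as a (result, error) tuple, it is often convenient to split successes from failures in one pass. This small helper is not part of the SDK, and the batch data below is hypothetical:

```python
def partition_results(results):
    """Split (result, error) tuples into a list of successes and a list of errors."""
    successes, failures = [], []
    for result, error in results:
        if error is None:
            successes.append(result)
        else:
            failures.append(error)
    return successes, failures


# Hypothetical batch output for illustration.
batch = [("a cat on a sofa", None),
         (None, "decode failed"),
         ("two dogs", None)]

ok, bad = partition_results(batch)
```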

Deploy with NVIDIA NIM

Deploy a trained model as a GPU-accelerated container using NVIDIA NIM:

from vi.deployment.nim import NIMDeployer, NIMPredictor, NIMConfig

# Step 1: Configure and deploy
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    run_id="your-run-id",
    port=8000
)

deployer = NIMDeployer(config)
result = deployer.deploy()
print(f"Container deployed on port {result.port}")

# Step 2: Run inference
predictor = NIMPredictor(config=config)

result = predictor(
    source="image.jpg",
    user_prompt="What objects are in this image?",
    stream=False
)
print(f"Result: {result.result}")

# Step 3: Stop when done
NIMDeployer.stop(result.container_name)

NIM containers support GPU-accelerated inference, custom weights from Datature Vi training runs, video processing with Cosmos-Reason2 models, and sampling controls (temperature, top-p). See the NIM deployment guide for full configuration options.


Error handling

The SDK raises typed exceptions so you can handle specific failures:

from vi import (
    ViError,
    ViNotFoundError,
    ViAuthenticationError,
    ViValidationError
)

try:
    dataset = client.datasets.get("invalid-id")
except ViNotFoundError as e:
    print(f"Not found: {e.message}")
    if e.suggestion:
        print(f"Suggestion: {e.suggestion}")
except ViAuthenticationError as e:
    print(f"Auth failed: {e.message}")
    print("Check your API credentials.")
except ViValidationError as e:
    print(f"Validation error: {e.message}")
except ViError as e:
    print(f"Error {e.error_code}: {e.message}")
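
Transient failures (rate limits, network blips) are often worth retrying with backoff. A generic sketch follows; TransientError is a local stand-in defined only so the example runs, and in real code you would pass whichever ViError subclasses you consider retryable:

```python
import time


class TransientError(Exception):
    """Local stand-in for a retryable SDK error (e.g. a rate-limit ViError)."""


def with_retries(fn, attempts=3, base_delay=0.01, retry_on=(TransientError,)):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))


# Simulated flaky call: fails twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("try again")
    return "ok"


value = with_retries(flaky)
```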

Full example

This script combines multiple operations in a single file:

import vi
from vi.dataset.loaders import ViDataset

client = vi.Client(
    secret_key="your-secret-key",
    organization_id="your-organization-id"
)

# 1. List and select a dataset
print("Available datasets:")
datasets = client.datasets.list()
for dataset in datasets.items[:5]:
    print(f"  - {dataset.name} ({dataset.dataset_id})")

# 2. Download the dataset
dataset_id = datasets.items[0].dataset_id
downloaded = client.get_dataset(
    dataset_id=dataset_id,
    save_dir="./data"
)
print(f"Downloaded to: {downloaded.save_dir}")

# 3. Load and inspect
dataset = ViDataset(downloaded.save_dir)
info = dataset.info()
print(f"Dataset: {info.name}")
print(f"Total assets: {info.total_assets}")
print(f"Total annotations: {info.total_annotations}")

# 4. Preview training data
for i, (asset, annotations) in enumerate(dataset.training.iter_pairs()):
    if i >= 3:
        break
    print(f"  {i+1}. {asset.filename}: {len(annotations)} annotations")

# 5. Upload new assets
try:
    upload_result = client.assets.upload(
        dataset_id=dataset_id,
        paths="path/to/new/images/",
        wait_until_done=True
    )
    print(f"Uploaded {upload_result.total_succeeded} assets")
    print(upload_result.summary())
except Exception as e:
    print(f"Upload failed: {e}")

Quick reference

Operation           Code
Initialize client   vi.Client(secret_key, organization_id)
List datasets       client.datasets.list()
Get dataset         client.datasets.get(dataset_id)
Download dataset    client.get_dataset(dataset_id, save_dir)
Upload assets       client.assets.upload(dataset_id, paths)
List assets         client.assets.list(dataset_id)
Run inference       ViModel(secret_key, organization_id, run_id)
List runs           client.runs.list()

Next steps

Vi SDK Overview

Full feature list, resource hierarchy, and pagination patterns for the Vi SDK.

Create A Secret Key

Generate API credentials for authenticating SDK requests.

NIM Deployment

Deploy trained models as GPU-accelerated containers with NVIDIA NIM.