Deploy NIM Container

To deploy an NVIDIA NIM container with Datature Vi, configure a NIMConfig, create a NIMDeployer, and call deploy(). The deployer handles image pulling, container startup, weight mounting, and health-checking automatically.

Before You Start
  • Vi SDK with deployment extras: pip install "vi-sdk[deployment]" (the quotes keep shells such as zsh from expanding the brackets)
  • Docker running with GPU support (NVIDIA Container Toolkit installed)
  • A valid NGC API key starting with nvapi-
  • (Optional) Vi secret key and organization ID to deploy custom weights
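
The checklist above can be partly verified before any deployment is attempted. The following is a minimal sketch using only the Python standard library; check_prerequisites is a hypothetical helper, not part of the Vi SDK, and it only confirms that a docker executable is on PATH and that NGC_API_KEY has the expected prefix. It does not verify GPU support.

```python
import os
import shutil


def check_prerequisites(env=None, which=shutil.which):
    """Return a list of problems; an empty list means the basic checks passed.

    Hypothetical helper for illustration only (not part of the Vi SDK).
    """
    env = os.environ if env is None else env
    problems = []

    # The NGC API key must be present and start with "nvapi-".
    key = env.get("NGC_API_KEY", "")
    if not key:
        problems.append("NGC_API_KEY is not set")
    elif not key.startswith("nvapi-"):
        problems.append("NGC_API_KEY does not start with 'nvapi-'")

    # Docker must be installed; GPU support is NOT checked here.
    if not which("docker"):
        problems.append("docker executable not found on PATH")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"Prerequisite check failed: {problem}")
```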

Basic deployment

Deploy with default settings

from vi.deployment.nim import NIMDeployer, NIMConfig

config = NIMConfig(nvidia_api_key="nvapi-...")

deployer = NIMDeployer(config)
result = deployer.deploy()

print(f"Container ID: {result.container_id}")
print(f"Container name: {result.container_name}")
print(f"Port: {result.port}")
print(f"Available models: {result.available_models}")

Deploy using environment variables

Set your NGC API key in the environment so it never appears in source code:

export NGC_API_KEY="nvapi-..."

from vi.deployment.nim import NIMDeployer

# NGC_API_KEY is read automatically
deployer = NIMDeployer()
result = deployer.deploy()

Deploy on a custom port

from vi.deployment.nim import NIMDeployer, NIMConfig

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    port=8080
)

deployer = NIMDeployer(config)
result = deployer.deploy()

print(f"Service running on port {result.port}")
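
Before choosing a custom port, it can help to confirm that nothing else is already listening on it. The sketch below uses only the standard library; is_port_free is a hypothetical helper, not an SDK function, and it assumes the service will be reachable on the local loopback address.

```python
import socket


def is_port_free(port, host="127.0.0.1"):
    """Return True if nothing is currently listening on host:port.

    Hypothetical helper for illustration only (not part of the Vi SDK).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 only when a listener accepted the connection.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    print(f"Port 8080 free: {is_port_free(8080)}")
```

A check like this is inherently racy (another process may claim the port between the check and the deployment), so treat it as an early warning rather than a guarantee.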

Deploying with custom weights

From Datature Vi

Provide your run ID to deploy a model you trained on Datature Vi:

from vi.deployment.nim import NIMDeployer, NIMConfig

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    secret_key="your-vi-secret-key",
    organization_id="your-org-id",
    run_id="your-run-id"
)

deployer = NIMDeployer(config)
result = deployer.deploy()

You can also load Vi credentials from environment variables and supply only the run ID in code:

export NGC_API_KEY="nvapi-..."
export DATATURE_VI_SECRET_KEY="your-secret-key"
export DATATURE_VI_ORGANIZATION_ID="your-org-id"

from vi.deployment.nim import NIMDeployer, NIMConfig

config = NIMConfig(run_id="your-run-id")
deployer = NIMDeployer(config)
result = deployer.deploy()

Custom save path for model weights

from vi.deployment.nim import NIMDeployer, NIMConfig
from pathlib import Path

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    secret_key="your-vi-secret-key",
    organization_id="your-org-id",
    run_id="your-run-id",
    model_save_path=Path("./my_models")
)

deployer = NIMDeployer(config)
result = deployer.deploy()

Force re-download of weights

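from vi.deployment.nim import NIMDeployer, NIMConfig

# Assumes DATATURE_VI_SECRET_KEY and DATATURE_VI_ORGANIZATION_ID are set in the environment
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    run_id="your-run-id",
    overwrite=True  # Re-download even if weights already exist locally
)

deployer = NIMDeployer(config)
result = deployer.deploy()

LoRA Adapter Limitation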

Models trained with LoRA adapters deploy with full base model weights only. NVIDIA NIM does not support PEFT adapters.

Selecting a NIM image

Three images are available:

# Cosmos-Reason1 7B (image reasoning only)
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason1-7b"
)

# Cosmos-Reason2 2B (supports images and video)
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason2-2b"  # default
)

# Cosmos-Reason2 8B (higher accuracy, supports images and video)
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason2-8b"
)

To pin a specific image version instead of always using latest:

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason2-2b",
    tag="1.0.0"
)
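
The three image names and their capabilities listed above can be checked before a NIMConfig is built, so that a typo or a video workload on an image-only model is caught early. This is a sketch under stated assumptions: the SUPPORTED_IMAGES table and validate_image helper are hypothetical and simply restate the descriptions in this section; they are not part of the Vi SDK.

```python
# Capabilities as described in this section; the table itself is illustrative.
SUPPORTED_IMAGES = {
    "cosmos-reason1-7b": {"video": False},  # image reasoning only
    "cosmos-reason2-2b": {"video": True},   # default
    "cosmos-reason2-8b": {"video": True},   # higher accuracy
}


def validate_image(image_name, needs_video=False):
    """Return image_name if it is known and fits the workload, else raise ValueError.

    Hypothetical helper for illustration only (not part of the Vi SDK).
    """
    if image_name not in SUPPORTED_IMAGES:
        known = ", ".join(sorted(SUPPORTED_IMAGES))
        raise ValueError(f"Unknown image '{image_name}'; expected one of: {known}")
    if needs_video and not SUPPORTED_IMAGES[image_name]["video"]:
        raise ValueError(f"Image '{image_name}' does not support video input")
    return image_name


if __name__ == "__main__":
    print(validate_image("cosmos-reason2-2b", needs_video=True))
```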

Container lifecycle

Reuse an existing container

The SDK reuses a running container with the same name by default, so repeated deploy() calls during development return immediately:

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    use_existing_container=True  # default: True
)

deployer = NIMDeployer(config)
result1 = deployer.deploy()  # Creates container
result2 = deployer.deploy()  # Reuses existing container, returns immediately

Auto-remove an existing container

To always start fresh, set use_existing_container=False and auto_kill_existing_container=True:

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    use_existing_container=False,
    auto_kill_existing_container=True
)

deployer = NIMDeployer(config)
result = deployer.deploy()  # Stops and removes any existing container first

Destructive Operation

auto_kill_existing_container=True stops and removes any running container with the same name without confirmation. Do not use this in shared environments without checking first.
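
One way to "check first" is to ask Docker whether a container with that name already exists. The sketch below shells out to the standard docker ps command; container_exists is a hypothetical helper, not part of the Vi SDK, and it lists stopped containers as well as running ones (-a).

```python
import subprocess


def container_exists(name, run=subprocess.run):
    """Return True if a Docker container named exactly `name` exists.

    Hypothetical helper for illustration only (not part of the Vi SDK).
    The `run` parameter exists so the Docker call can be replaced in tests.
    """
    completed = run(
        ["docker", "ps", "-a", "--filter", f"name={name}", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    # The name filter also matches partial names, so compare each result exactly.
    return name in completed.stdout.split()


if __name__ == "__main__":
    if container_exists("cosmos-reason2-2b"):
        print("A container named cosmos-reason2-2b already exists")
```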

Stop a container

from vi.deployment.nim import NIMDeployer

# Stop by container name
success = NIMDeployer.stop("cosmos-reason2-2b")

if success:
    print("Container stopped")
else:
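    print("Container not found")

To stop the container you deployed, pass the name from the deployment result (config is a NIMConfig as defined earlier):

deployer = NIMDeployer(config)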
result = deployer.deploy()

# ... run inference ...

NIMDeployer.stop(result.container_name)
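
If inference raises an exception, the stop call above is never reached and the container keeps running. A possible pattern, shown here as a sketch, is a small context manager that always stops the container on exit. The deployed helper is hypothetical and not part of the Vi SDK; it only assumes that the deploy callable returns an object with a container_name attribute and that the stop callable accepts that name, as in the snippets above.

```python
from contextlib import contextmanager


@contextmanager
def deployed(deploy, stop):
    """Deploy on entry and stop the container on exit, even if the body raises.

    Hypothetical helper for illustration only (not part of the Vi SDK).
    """
    result = deploy()
    try:
        yield result
    finally:
        stop(result.container_name)


# Intended usage with the SDK calls shown in this section:
#
#     with deployed(deployer.deploy, NIMDeployer.stop) as result:
#         ...  # run inference against result.port
```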

Deployment options

Stream logs during startup

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    stream_logs=True   # default: True (shows real-time container output)
)

To disable streaming:

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    stream_logs=False  # Quieter output
)

Suppress all console output

Useful in automated scripts and CI/CD pipelines:

config = NIMConfig(nvidia_api_key="nvapi-...")
deployer = NIMDeployer(config, quiet=True)
result = deployer.deploy()

Force image pull

Always pull the latest image from the registry, even if a local copy exists:

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    force_pull=True
)

Resource configuration

Shared memory

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    shm_size="32GB"   # default: "32GB"
    # "64GB" for large models or high batch sizes
    # "16GB" for limited hardware
)

Maximum context length

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    max_model_len=8192   # default: 8192
    # 4096: short contexts, faster inference
    # 16384: long contexts, higher memory usage
)

Cache directory

from pathlib import Path

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    local_cache_dir=str(Path.home() / ".cache" / "nim")  # default
)

Reading the deployment result

deploy() returns a NIMDeploymentResult with these fields:

Deployment result fields

Name              Type       Description                              Required  Default
container_id      str        Full Docker container ID                 Optional
container_name    str        Container name (e.g. cosmos-reason2-2b)  Optional
port              int        Port where the service is running        Optional
available_models  list[str]  Model IDs available in the container     Optional

result = deployer.deploy()

print(f"Container ID: {result.container_id}")
print(f"Container name: {result.container_name}")
print(f"Port: {result.port}")
print(f"Models: {', '.join(result.available_models or [])}")
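
The port field is what later code needs in order to reach the service. As a rough sketch, the helper below probes an HTTP readiness endpoint on that port using only the standard library. Both the is_ready helper and the /v1/health/ready path are assumptions made for illustration; confirm the actual endpoint for your NIM image before relying on it.

```python
import urllib.error
import urllib.request


def is_ready(port, path="/v1/health/ready", host="localhost", timeout=2.0):
    """Return True if GET http://host:port/path answers with HTTP 200.

    Hypothetical helper for illustration only (not part of the Vi SDK).
    The default path is an assumption about the container's health endpoint.
    """
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all count as not ready.
        return False


# Intended usage with a deployment result:
#
#     if is_ready(result.port):
#         print("Service is accepting requests")
```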

Error handling

from vi.deployment.nim import NIMDeployer, NIMConfig
from vi.deployment.nim.exceptions import (
    InvalidConfigError,
    ContainerExistsError,
    ModelIncompatibilityError
)

try:
    config = NIMConfig(nvidia_api_key="nvapi-...")
    deployer = NIMDeployer(config)
    result = deployer.deploy()
    print(f"Deployed on port {result.port}")

except InvalidConfigError as e:
    # API key format wrong, image name not recognized, etc.
    print(f"Invalid configuration: {e}")

except ContainerExistsError as e:
    # A container with the same name is already running
    # Fix: set use_existing_container=True or auto_kill_existing_container=True
    print(f"Container '{e.container_name}' already exists")

except ModelIncompatibilityError as e:
    # Custom model architecture not supported by this NIM image
    print(f"Model incompatible with {e.image_name}: {e.details}")

except Exception as e:
    print(f"Deployment failed: {e}")

Full example

from vi.deployment.nim import NIMDeployer, NIMConfig
from pathlib import Path
import os

config = NIMConfig(
    # Credentials
    nvidia_api_key=os.getenv("NGC_API_KEY"),
    secret_key=os.getenv("DATATURE_VI_SECRET_KEY"),
    organization_id=os.getenv("DATATURE_VI_ORGANIZATION_ID"),
    run_id="your-run-id",

    # Image
    image_name="cosmos-reason2-2b",
    tag="latest",

    # Paths
    model_save_path=Path("./models"),
    local_cache_dir=str(Path.home() / ".cache" / "nim"),

    # Container
    port=8000,
    shm_size="32GB",
    max_model_len=8192,
    use_existing_container=True,
    auto_kill_existing_container=False,

    # Output
    stream_logs=True,
    force_pull=False,
    overwrite=False
)

try:
    deployer = NIMDeployer(config, quiet=False)
    result = deployer.deploy()

    print("Deployment successful!")
    print(f"Container ID: {result.container_id}")
    print(f"Container name: {result.container_name}")
    print(f"Port: {result.port}")
    print(f"Available models: {', '.join(result.available_models or [])}")

except Exception as e:
    print(f"Deployment failed: {e}")
    raise

Next steps

Run Inference

Process images and videos with your deployed NIM container using NIMPredictor.

Configuration Reference

Complete parameter reference for NIMConfig and NIMSamplingParams.

Troubleshooting

Fix common deployment errors: image pull failures, GPU issues, and container conflicts.