Deploy NIM Container
Deploy NVIDIA NIM containers for optimized GPU-accelerated inference with vision-language models.
Overview
The NIMDeployer class handles the complete lifecycle of NIM container deployment:
- Image pulling — Automatically pulls NIM images from NVIDIA Container Registry
- Custom weights — Downloads and mounts model weights from Datature Vi
- Container management — Creates, starts, and monitors containers
- Health checking — Waits for service readiness before returning
- Error handling — Graceful handling of deployment failures
Basic deployment
Deploy with default settings
Deploy a NIM container using your NGC API key:
from vi.deployment.nim import NIMDeployer, NIMConfig
# Create config with NGC API key
config = NIMConfig(nvidia_api_key="nvapi-...")
# Deploy container
deployer = NIMDeployer(config)
result = deployer.deploy()
print(f"Container ID: {result.container_id}")
print(f"Container name: {result.container_name}")
print(f"Port: {result.port}")
print(f"Available models: {result.available_models}")
Using environment variables
For better security, set your NGC API key as an environment variable:
export NGC_API_KEY="nvapi-..."
Then deploy without explicit credentials:
from vi.deployment.nim import NIMDeployer
# API key loaded from NGC_API_KEY environment variable
deployer = NIMDeployer()
result = deployer.deploy()
Custom port
Specify a custom port for the NIM service:
config = NIMConfig(
nvidia_api_key="nvapi-...",
port=8080 # Custom port
)
deployer = NIMDeployer(config)
result = deployer.deploy()
print(f"Service running on port {result.port}")
Deploying with custom weights
From Datature Vi
Deploy a model trained on Datature Vi with custom weights:
from vi.deployment.nim import NIMDeployer, NIMConfig
# Configure with Datature Vi credentials
config = NIMConfig(
nvidia_api_key="nvapi-...",
secret_key="your-vi-secret-key",
organization_id="your-org-id",
run_id="your-run-id" # Model from Datature Vi
)
deployer = NIMDeployer(config)
result = deployer.deploy()
Using environment variables for Vi credentials
Set Vi credentials as environment variables for better security:
export NGC_API_KEY="nvapi-..."
export DATATURE_VI_SECRET_KEY="your-secret-key"
export DATATURE_VI_ORGANIZATION_ID="your-org-id"
Then deploy with just the run ID:
config = NIMConfig(run_id="your-run-id")
deployer = NIMDeployer(config)
result = deployer.deploy()
Custom save path
Specify where to save downloaded model weights:
from pathlib import Path
config = NIMConfig(
nvidia_api_key="nvapi-...",
secret_key="your-vi-secret-key",
organization_id="your-org-id",
run_id="your-run-id",
model_save_path=Path("./my_models") # Custom save location
)
deployer = NIMDeployer(config)
result = deployer.deploy()
Force re-download
Force re-download of model weights even if they exist locally:
config = NIMConfig(
nvidia_api_key="nvapi-...",
run_id="your-run-id",
overwrite=True # Force re-download
)
deployer = NIMDeployer(config)
result = deployer.deploy()
LoRA Adapter Limitation
Models trained with LoRA adapters will only use the full base model weights when deployed with NIM. NVIDIA NIM does not currently support PEFT adapters, so LoRA adapter weights are not utilized during inference.
Selecting NIM images
Available images
Choose from supported NIM images:
# Cosmos Reason1 7B (default)
config = NIMConfig(
nvidia_api_key="nvapi-...",
image_name="cosmos-reason1-7b"
)
# Cosmos Reason2 2B (video support)
config = NIMConfig(
nvidia_api_key="nvapi-...",
image_name="cosmos-reason2-2b"
)
# Cosmos Reason2 8B (video support)
config = NIMConfig(
nvidia_api_key="nvapi-...",
image_name="cosmos-reason2-8b"
)
Image tags
Specify image tag (default is latest):
config = NIMConfig(
nvidia_api_key="nvapi-...",
image_name="cosmos-reason2-2b",
tag="1.0.0" # Specific version
)
Container lifecycle management
Reusing existing containers
Reuse an existing container instead of creating a new one:
config = NIMConfig(
nvidia_api_key="nvapi-...",
use_existing_container=True # Default: True
)
# First deployment - creates container
deployer = NIMDeployer(config)
result = deployer.deploy()
# Second deployment - reuses existing container (instant)
result = deployer.deploy()
Benefits:
- Instant deployment (no image pull or startup time)
- Maintains container state
- Saves resources
Auto-kill existing containers
Automatically stop and remove existing containers with the same name:
config = NIMConfig(
nvidia_api_key="nvapi-...",
use_existing_container=False, # Don't reuse
auto_kill_existing_container=True # Auto-remove conflicts
)
deployer = NIMDeployer(config)
result = deployer.deploy() # Removes any existing container first
Destructive operation
Setting auto_kill_existing_container=True will stop and remove any existing container with the same name. Use with caution in shared environments.
Stopping containers
Stop a running NIM container:
from vi.deployment.nim import NIMDeployer
# Stop by container name
success = NIMDeployer.stop("cosmos-reason2-2b")
if success:
    print("Container stopped successfully")
else:
    print("Container not found")
Or use the container name from deployment result:
# Deploy
deployer = NIMDeployer(config)
result = deployer.deploy()
# Stop using result
NIMDeployer.stop(result.container_name)
Deployment options
Streaming logs
Stream container logs to terminal during startup:
config = NIMConfig(
nvidia_api_key="nvapi-...",
stream_logs=True # Default: True
)
deployer = NIMDeployer(config)
result = deployer.deploy()  # Shows real-time logs
Disable log streaming for cleaner output:
config = NIMConfig(
nvidia_api_key="nvapi-...",
stream_logs=False
)
deployer = NIMDeployer(config)
result = deployer.deploy()  # No log output
Quiet mode
Suppress all console output:
config = NIMConfig(nvidia_api_key="nvapi-...")
# Enable quiet mode
deployer = NIMDeployer(config, quiet=True)
result = deployer.deploy()  # No output
Useful for:
- Automated scripts
- CI/CD pipelines
- Logging to files instead of console
Force image pull
Always pull the latest image even if it exists locally:
config = NIMConfig(
nvidia_api_key="nvapi-...",
force_pull=True # Always pull latest
)
deployer = NIMDeployer(config)
result = deployer.deploy()
When to use:
- Testing image updates
- Ensuring latest security patches
- Debugging image issues
Resource configuration
Shared memory size
Configure shared memory for the container:
config = NIMConfig(
nvidia_api_key="nvapi-...",
shm_size="64GB" # Default: 32GB
)
deployer = NIMDeployer(config)
result = deployer.deploy()
Guidelines:
- 32GB — Sufficient for most models
- 64GB — Large models or high batch sizes
- 16GB — Small models on limited hardware
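The shm_size field uses Docker-style size strings like "32GB". If a script needs to sanity-check or compare such values before deploying, a small standalone parser (illustrative only, not part of the vi SDK; it handles the "NNGB"-style strings shown in these docs) can convert them to bytes:

```python
import re

def shm_size_to_bytes(size: str) -> int:
    """Convert a Docker-style size string (e.g. "32GB") to bytes."""
    match = re.fullmatch(r"(\d+)\s*(KB|MB|GB|TB)", size.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError(f"Unrecognized size string: {size!r}")
    value, unit = int(match.group(1)), match.group(2).upper()
    factors = {"KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}
    return value * factors[unit]

print(shm_size_to_bytes("32GB"))  # 34359738368
```

For example, this makes it easy to assert that a configured shm_size does not exceed the host's available memory before calling deploy().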
Maximum model length
Set maximum context length:
config = NIMConfig(
nvidia_api_key="nvapi-...",
max_model_len=16384 # Default: 8192
)
deployer = NIMDeployer(config)
result = deployer.deploy()
Options:
- 4096 — Short contexts, faster inference
- 8192 — Default, balanced performance
- 16384 — Long contexts, higher memory usage
Cache directory
Specify local cache directory for NIM:
from pathlib import Path
config = NIMConfig(
nvidia_api_key="nvapi-...",
local_cache_dir=str(Path.home() / ".cache" / "nim")
)
deployer = NIMDeployer(config)
result = deployer.deploy()
Default location: ~/.cache/nim
Deployment result
The deploy() method returns a NIMDeploymentResult object:
result = deployer.deploy()
# Access deployment information
print(f"Container ID: {result.container_id}")
print(f"Container name: {result.container_name}")
print(f"Port: {result.port}")
print(f"Available models: {result.available_models}")
Result attributes:
- container_id — Full Docker container ID
- container_name — Container name (e.g., cosmos-reason2-2b)
- port — Port where service is running
- available_models — List of model IDs available in the container
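As a quick sanity check after deployment, you can query the container's model listing endpoint (NIM containers expose an OpenAI-compatible HTTP API). The helper below just builds the URL, assuming the service runs on localhost; the actual request is shown commented out since it requires a running container and the `requests` package:

```python
def models_endpoint(port: int, host: str = "localhost") -> str:
    """URL of the OpenAI-compatible model-listing endpoint served by NIM."""
    return f"http://{host}:{port}/v1/models"

# With a live deployment:
# import requests
# resp = requests.get(models_endpoint(result.port), timeout=10)
# print([m["id"] for m in resp.json()["data"]])

print(models_endpoint(8000))  # http://localhost:8000/v1/models
```

The model IDs returned by the endpoint should match result.available_models.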
Error handling
Handle deployment errors
Implement proper error handling:
from vi.deployment.nim import NIMDeployer, NIMConfig
from vi.deployment.nim.exceptions import (
InvalidConfigError,
ContainerExistsError,
ModelIncompatibilityError
)
try:
    config = NIMConfig(nvidia_api_key="nvapi-...")
    deployer = NIMDeployer(config)
    result = deployer.deploy()
    print(f"✓ Deployed successfully on port {result.port}")
except InvalidConfigError as e:
    print(f"✗ Invalid configuration: {e}")
    # Check API key format, image name, etc.
except ContainerExistsError as e:
    print(f"✗ Container '{e.container_name}' already exists")
    # Either reuse with use_existing_container=True
    # or remove with auto_kill_existing_container=True
except ModelIncompatibilityError as e:
    print(f"✗ Model incompatible with {e.image_name}")
    print(f"Details: {e.details}")
    # Check model architecture compatibility
except Exception as e:
    print(f"✗ Deployment failed: {e}")
Common errors
Invalid API key
# Error: InvalidConfigError
# Cause: API key doesn't start with 'nvapi-'
# Fix: Check your NGC API key format
config = NIMConfig(nvidia_api_key="nvapi-...")
Container already exists
# Error: ContainerExistsError
# Solution 1: Reuse existing container
config = NIMConfig(
nvidia_api_key="nvapi-...",
use_existing_container=True
)
# Solution 2: Auto-remove existing
config = NIMConfig(
nvidia_api_key="nvapi-...",
auto_kill_existing_container=True
)
Model incompatibility
# Error: ModelIncompatibilityError
# Cause: Custom model architecture not supported by container
# Fix: Use compatible NIM image or base model
Complete deployment example
Full example with all options:
from vi.deployment.nim import NIMDeployer, NIMConfig
from pathlib import Path
import os
# Configure deployment
config = NIMConfig(
# NGC credentials
nvidia_api_key=os.getenv("NGC_API_KEY"),
# Image selection
image_name="cosmos-reason2-2b",
tag="latest",
# Vi credentials for custom weights
secret_key=os.getenv("DATATURE_VI_SECRET_KEY"),
organization_id=os.getenv("DATATURE_VI_ORGANIZATION_ID"),
run_id="your-run-id",
# Model paths
model_save_path=Path("./models"),
overwrite=False,
# Container configuration
port=8000,
shm_size="32GB",
max_model_len=8192,
local_cache_dir=str(Path.home() / ".cache" / "nim"),
# Lifecycle options
use_existing_container=True,
auto_kill_existing_container=False,
# Output options
stream_logs=True,
force_pull=False
)
# Deploy with error handling
try:
    deployer = NIMDeployer(config, quiet=False)
    result = deployer.deploy()
    print("\n" + "="*50)
    print("Deployment successful!")
    print("="*50)
    print(f"Container ID: {result.container_id}")
    print(f"Container name: {result.container_name}")
    print(f"Port: {result.port}")
    print(f"Available models: {', '.join(result.available_models or [])}")
except Exception as e:
    print(f"\nDeployment failed: {e}")
    raise
Best practices
1. Use environment variables
Store sensitive credentials in environment variables for security:
# .env file
NGC_API_KEY=nvapi-...
DATATURE_VI_SECRET_KEY=your-secret-key
DATATURE_VI_ORGANIZATION_ID=your-org-id
import os
from dotenv import load_dotenv
load_dotenv()
config = NIMConfig(
nvidia_api_key=os.getenv("NGC_API_KEY"),
secret_key=os.getenv("DATATURE_VI_SECRET_KEY"),
organization_id=os.getenv("DATATURE_VI_ORGANIZATION_ID"),
run_id="your-run-id"
)
2. Reuse containers in development
Enable container reuse for faster iteration:
config = NIMConfig(
nvidia_api_key="nvapi-...",
use_existing_container=True # Instant redeployment
)
3. Monitor deployment progress
Stream logs to monitor deployment:
config = NIMConfig(
nvidia_api_key="nvapi-...",
stream_logs=True # Watch container startup
)
deployer = NIMDeployer(config)
result = deployer.deploy()
4. Clean up resources
Stop containers when done to free GPU resources:
# Deploy first, so cleanup only runs once a container actually exists
result = deployer.deploy()
try:
    ...  # run inference
finally:
    # Clean up to free GPU resources
    NIMDeployer.stop(result.container_name)
5. Handle errors gracefully
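The deploy/stop pairing above can also be wrapped in a context manager so cleanup is automatic. This is an illustrative pattern, not a vi SDK feature, so the deployer and stop callable are passed in explicitly:

```python
from contextlib import contextmanager

@contextmanager
def nim_deployment(deployer, stop):
    """Deploy on entry; always stop the container on exit."""
    result = deployer.deploy()
    try:
        yield result
    finally:
        stop(result.container_name)

# Usage with the NIMDeployer shown in the examples above:
# with nim_deployment(NIMDeployer(config), NIMDeployer.stop) as result:
#     ...  # run inference against result.port
```

Because the stop call sits in a finally block, the container is released even if inference raises an exception.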
Always use try-except for deployment error handling:
from vi.deployment.nim.exceptions import NIMDeploymentError
try:
    result = deployer.deploy()
except NIMDeploymentError as e:
    print(f"Deployment failed: {e}")
    # Handle error appropriately
View common deployment errors →
Troubleshooting
Deployment hangs
If deployment appears to hang:
# Check container logs
docker logs cosmos-reason2-2b
# Check GPU availability
nvidia-smi
# Check Docker daemon
docker info
Out of GPU memory
Reduce memory usage:
config = NIMConfig(
nvidia_api_key="nvapi-...",
max_model_len=4096, # Reduce context length
shm_size="16GB" # Reduce shared memory
)
Image pull fails
Check registry authentication:
# Verify NGC API key
echo $NGC_API_KEY
# Test Docker login
docker login nvcr.io
# Username: $oauthtoken
# Password: <your-ngc-api-key>
See also
- NIM Overview — Introduction to NVIDIA NIM deployment
- Run inference — Execute predictions with deployed containers
- Configuration reference — Complete configuration options
- Troubleshooting — Common problems and solutions
Need help?
We're here to support your VLMOps journey. Reach out through our support channels.
