NIM Configuration
The Datature Vi SDK provides two configuration classes for NVIDIA NIM: NIMConfig controls container deployment, and NIMSamplingParams controls inference behavior. This page documents every parameter for both classes.
NIMConfig
NIMConfig specifies how the NIM container is deployed: which image to pull, which port to use, whether to load custom weights, and how to manage the container lifecycle.
from vi.deployment.nim import NIMConfig

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason2-2b",
    port=8000,
    # ... additional options
)
Credentials
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    secret_key="your-secret-key",
    organization_id="your-org-id"
)

# Set in shell:
# export NGC_API_KEY="nvapi-..."
# export DATATURE_VI_SECRET_KEY="your-secret-key"
# export DATATURE_VI_ORGANIZATION_ID="your-org-id"
config = NIMConfig()  # reads from environment
Image selection
# 7B model, image reasoning only
config = NIMConfig(nvidia_api_key="nvapi-...", image_name="cosmos-reason1-7b")
# 2B model with video support (default)
config = NIMConfig(nvidia_api_key="nvapi-...", image_name="cosmos-reason2-2b")
# 8B model with video support
config = NIMConfig(nvidia_api_key="nvapi-...", image_name="cosmos-reason2-8b")
# Pin to a specific version
config = NIMConfig(nvidia_api_key="nvapi-...", image_name="cosmos-reason2-2b", tag="1.0.0")
Network
Resources
Container lifecycle
config = NIMConfig(
    nvidia_api_key="nvapi-...",
    use_existing_container=True  # default (instant on second deploy)
)

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    use_existing_container=False,
    auto_kill_existing_container=True  # removes any existing container first
)
Output
Custom weights
Advanced
NIMSamplingParams
NIMSamplingParams controls how the model generates tokens during inference. Pass an instance to the sampling_params argument of NIMPredictor.__call__().
from vi.deployment.nim import NIMSamplingParams

params = NIMSamplingParams(
    temperature=0.7,
    max_tokens=1024,
    top_p=0.95
)
Sampling
from vi.deployment.nim import NIMSamplingParams
# Deterministic output
params = NIMSamplingParams(temperature=0.0)
# Fast, focused
params = NIMSamplingParams(temperature=0.2, max_tokens=256)
# Balanced (recommended starting point)
params = NIMSamplingParams(temperature=0.7, top_p=0.95, top_k=50)
# Creative
params = NIMSamplingParams(temperature=1.0, max_tokens=2048)
Length control
Repetition control
Stop sequences
params = NIMSamplingParams(stop="END")
params = NIMSamplingParams(stop=["\n\n", "END", "STOP"])
Determinism
Log probabilities
Guided decoding
Guided decoding constrains the model's output to a specific format. Only one guided decoding option can be active at a time.
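Because only one guided decoding option can be active at a time, it can help to check the arguments before building NIMSamplingParams. The following is a minimal sketch, not part of the SDK: the helper names active_guided_options and check_single_guided_option are hypothetical, and the code assumes the four option names used on this page (guided_choice, guided_json, guided_regex, guided_grammar). Whether the SDK performs its own validation is not stated here.

```python
from typing import Any, Dict, List

# Option names as they appear on this page.
GUIDED_OPTIONS = ("guided_choice", "guided_json", "guided_regex", "guided_grammar")


def active_guided_options(kwargs: Dict[str, Any]) -> List[str]:
    """Return the guided decoding keys that are set to a non-None value."""
    return [name for name in GUIDED_OPTIONS if kwargs.get(name) is not None]


def check_single_guided_option(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Raise ValueError if more than one guided decoding option is set.

    Hypothetical helper: returns kwargs unchanged when zero or one option is set.
    """
    active = active_guided_options(kwargs)
    if len(active) > 1:
        raise ValueError(
            "Only one guided decoding option may be set, got: " + ", ".join(active)
        )
    return kwargs


kwargs = check_single_guided_option(
    {"temperature": 0.2, "guided_choice": ["yes", "no", "maybe"]}
)
# The validated arguments could then be passed on, for example:
# params = NIMSamplingParams(**kwargs)
```

Keeping the check on a plain dictionary means it runs without a deployed container and can sit in front of any code path that assembles sampling arguments.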
params = NIMSamplingParams(
    temperature=0.2,
    guided_choice=["yes", "no", "maybe"]
)

schema = {
    "type": "object",
    "properties": {
        "objects": {"type": "array", "items": {"type": "string"}},
        "count": {"type": "integer"}
    },
    "required": ["objects", "count"]
}
params = NIMSamplingParams(temperature=0.3, guided_json=schema)

params = NIMSamplingParams(
    temperature=0.1,
    guided_regex=r"\d{4}-\d{2}-\d{2}"  # YYYY-MM-DD
)

grammar = """
root ::= "The answer is " answer "."
answer ::= "yes" | "no" | "maybe"
"""
params = NIMSamplingParams(temperature=0.2, guided_grammar=grammar)
Video processing
These parameters apply only to Cosmos-Reason2 models (cosmos-reason2-2b, cosmos-reason2-8b).
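A small guard can catch video parameters being sent to a model that does not support them. This is a sketch under stated assumptions: supports_video is a hypothetical helper, not an SDK function, and it assumes image_name is the bare model name as written on this page, with no registry prefix or tag.

```python
# Models this page lists as supporting video input.
VIDEO_CAPABLE_IMAGES = {"cosmos-reason2-2b", "cosmos-reason2-8b"}


def supports_video(image_name: str) -> bool:
    """Return True if the given image name is one of the video-capable models."""
    return image_name in VIDEO_CAPABLE_IMAGES


image_name = "cosmos-reason1-7b"
if not supports_video(image_name):
    # Skip media_io_kwargs and mm_processor_kwargs for image-only models.
    video_kwargs = {}
else:
    video_kwargs = {"media_io_kwargs": {"fps": 2.0}}
```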
params = NIMSamplingParams(
    temperature=0.2,
    max_tokens=4096,
    media_io_kwargs={"fps": 2.0},  # 2 frames per second
    mm_processor_kwargs={"shortest_edge": 336, "longest_edge": 672}
)

params = NIMSamplingParams(
    temperature=0.2,
    media_io_kwargs={"num_frames": 16},  # exactly 16 frames
    mm_processor_kwargs={"shortest_edge": 336, "longest_edge": 672}
)
Use either fps or num_frames in media_io_kwargs, not both. Providing both may cause an error.
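One way to enforce the fps or num_frames rule is to build media_io_kwargs through a function that rejects the combination up front. The sketch below is illustrative only: build_media_io_kwargs is a hypothetical helper, and it assumes that omitting both keys is acceptable and leaves frame sampling to the model's defaults.

```python
from typing import Any, Dict, Optional


def build_media_io_kwargs(
    fps: Optional[float] = None,
    num_frames: Optional[int] = None,
) -> Dict[str, Any]:
    """Build a media_io_kwargs dict containing at most one of fps or num_frames."""
    if fps is not None and num_frames is not None:
        raise ValueError("Set either fps or num_frames, not both")
    kwargs: Dict[str, Any] = {}
    if fps is not None:
        kwargs["fps"] = fps
    if num_frames is not None:
        kwargs["num_frames"] = num_frames
    return kwargs


media_io_kwargs = build_media_io_kwargs(fps=2.0)
# The result could then be passed on, for example:
# params = NIMSamplingParams(temperature=0.2, media_io_kwargs=media_io_kwargs)
```

Failing early with a clear message avoids discovering the conflict only when a request reaches the container.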
Configuration examples
Development (quick iteration)
from vi.deployment.nim import NIMConfig, NIMSamplingParams

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason2-2b",
    port=8000,
    use_existing_container=True,
    stream_logs=True
)

params = NIMSamplingParams(
    temperature=0.7,
    max_tokens=1024,
    top_p=0.95,
    top_k=50
)
Production (custom weights, stable output)
import os
from pathlib import Path

from vi.deployment.nim import NIMConfig, NIMSamplingParams

config = NIMConfig(
    nvidia_api_key=os.getenv("NGC_API_KEY"),
    secret_key=os.getenv("DATATURE_VI_SECRET_KEY"),
    organization_id=os.getenv("DATATURE_VI_ORGANIZATION_ID"),
    run_id="your-run-id",
    image_name="cosmos-reason2-2b",
    port=8000,
    shm_size="64GB",
    max_model_len=8192,
    local_cache_dir="/mnt/ssd/nim_cache",
    model_save_path=Path("/mnt/models"),
    use_existing_container=True,
    auto_kill_existing_container=False,
    stream_logs=False,
    force_pull=False,
    overwrite=False
)

params = NIMSamplingParams(
    temperature=0.3,
    max_tokens=2048,
    top_p=0.95,
    repetition_penalty=1.05,
    seed=42
)
Video analysis
from vi.deployment.nim import NIMConfig, NIMSamplingParams

config = NIMConfig(
    nvidia_api_key="nvapi-...",
    image_name="cosmos-reason2-2b",
    port=8000,
    shm_size="64GB",
    max_model_len=16384
)

params = NIMSamplingParams(
    temperature=0.2,
    max_tokens=4096,
    media_io_kwargs={"fps": 2.0},
    mm_processor_kwargs={"shortest_edge": 336, "longest_edge": 672}
)
Related resources
