Troubleshoot Issues
This page covers the most common errors when running inference with Datature Vi: out-of-memory failures, slow performance, model loading problems, runtime errors, and unexpected outputs.
Before you begin, you should have:
- The Vi SDK installed with inference dependencies
- A trained model loaded with ViModel
- Familiarity with inference basics
Out of memory errors
GPU out of memory
Symptoms:
```
CUDA out of memory. Tried to allocate X GB...
RuntimeError: CUDA error: out of memory
```

Step 1: Switch to 8-bit quantization

```python
from vi.inference import ViModel

model = ViModel(
    run_id="your-run-id",
    load_in_8bit=True,
    device_map="auto"
)
```

Step 2: If still OOM, switch to 4-bit
```python
model = ViModel(
    run_id="your-run-id",
    load_in_4bit=True,
    device_map="auto"
)
```

Step 3: Enable low CPU memory usage
```python
model = ViModel(
    run_id="your-run-id",
    load_in_4bit=True,
    low_cpu_mem_usage=True,
    device_map="auto"
)
```

Step 4: Clear GPU cache before loading
```python
import gc

import torch

torch.cuda.empty_cache()
gc.collect()

model = ViModel(run_id="your-run-id", load_in_8bit=True)
```

Step 5: Process in smaller chunks
```python
import torch

def process_in_chunks(model, images, chunk_size=25):
    results = []
    for i in range(0, len(images), chunk_size):
        chunk = images[i:i + chunk_size]
        batch_results = model(source=chunk)
        results.extend(batch_results)
        torch.cuda.empty_cache()
    return results
```

See improve performance for more memory management strategies.
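As a sanity check, the chunking pattern above can be exercised with plain Python, no model or GPU required. The `fake_model` stub below is hypothetical and stands in for the real model call; the slicing logic is the same, minus the GPU cache clearing:

```python
def process_in_chunks(process_fn, items, chunk_size=25):
    # Same slicing pattern as the snippet above, minus the cache clearing.
    results = []
    for i in range(0, len(items), chunk_size):
        results.extend(process_fn(items[i:i + chunk_size]))
    return results

# A stub that records the size of each batch it receives.
batch_sizes = []
def fake_model(batch):
    batch_sizes.append(len(batch))
    return batch

out = process_in_chunks(fake_model, list(range(60)), chunk_size=25)
print(batch_sizes)  # batches of 25, 25, and 10
```

Note that the last chunk is simply shorter; `range(0, len(items), chunk_size)` never produces an out-of-bounds slice.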
CPU out of memory
Symptoms:

```
MemoryError
Killed (process terminated)
```

```python
# Enable low CPU memory usage during loading
model = ViModel(run_id="your-run-id", low_cpu_mem_usage=True)

# Force the model onto the GPU instead of the CPU
model = ViModel(run_id="your-run-id", device_map="cuda")
```

If your machine has a GPU, make sure the model is loading there rather than staying on CPU.
Model loading issues
Model download fails
Symptoms:

```
ValueError: Failed to download model
ConnectionError: Failed to connect
```

Check your credentials:

```python
import os

print(f"Secret Key set: {bool(os.getenv('DATATURE_VI_SECRET_KEY'))}")
print(f"Org ID set: {bool(os.getenv('DATATURE_VI_ORGANIZATION_ID'))}")
```

Verify that the run ID exists and training is complete:
```python
import vi

client = vi.Client()
run = client.runs.get(run_id="your-run-id")
print(f"Status: {run.status.phase}")
# Status must be "completed" before you can download
```

Test network connectivity:
```python
import requests

try:
    response = requests.get("https://vi.datature.com", timeout=5)
    print(f"Connection OK: {response.status_code}")
except Exception as e:
    print(f"Connection failed: {e}")
```

Model loading hangs
Symptoms: Loading freezes with no progress for an extended period.

Check available disk space:

```python
import shutil

total, used, free = shutil.disk_usage("/")
print(f"Free space: {free // (2**30)} GB")
```

If you have insufficient disk space, the download may stall. Clear old cached models and retry:
```bash
rm -rf ~/.datature/vi/models/
```

```python
model = ViModel(
    run_id="your-run-id",
    overwrite=True
)
```

Slow inference
General slowness
Symptoms: Inference takes much longer than expected, or throughput is low.
Use a GPU:
```python
import torch

print(f"CUDA available: {torch.cuda.is_available()}")
model = ViModel(run_id="your-run-id", device_map="cuda")
```

Enable FP16 and Flash Attention 2:
```python
model = ViModel(
    run_id="your-run-id",
    dtype="float16",
    attn_implementation="flash_attention_2",
    device_map="auto"
)
```

Use batch inference instead of single-image loops:
```python
# Good: batch inference
results = model(source=["img1.jpg", "img2.jpg", "img3.jpg"])

# Slow: sequential single-image calls
for img in ["img1.jpg", "img2.jpg", "img3.jpg"]:
    result, error = model(source=img)
```

First inference is slow
Symptom: The first prediction takes much longer than subsequent ones.
This is expected behavior. The model runs initialization work on the first call. Add a warm-up call before your timed measurements:
```python
model = ViModel(run_id="your-run-id")

# Warm-up (discard result)
model(source="any_image.jpg")

# Subsequent calls will be faster
result, error = model(source="real_image.jpg")
```

Runtime errors
File not found
Symptoms:

```
FileNotFoundError: [Errno 2] No such file or directory: 'image.jpg'
```

```python
import os
from pathlib import Path

image_path = os.path.abspath("image.jpg")
result, error = model(source=image_path)

# Or verify the file exists first
if Path("image.jpg").exists():
    result, error = model(source="image.jpg")
else:
    print("File not found")
```

Unsupported image format
Symptoms:

```
PIL.UnidentifiedImageError: cannot identify image file
```

```python
from PIL import Image

img = Image.open("image.webp")
img = img.convert("RGB")
img.save("image.jpg", "JPEG")

result, error = model(source="image.jpg")
```

Supported formats: .jpg, .jpeg, .png, .bmp, .gif, .tiff, .tif, .webp
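To catch this error up front, you can filter inputs against that list before calling the model. `filter_supported` below is a hypothetical helper for illustration, not part of the SDK:

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff", ".tif", ".webp"}

def filter_supported(paths):
    # Keep only paths whose extension (case-insensitive) is in the supported set.
    return [p for p in paths if Path(p).suffix.lower() in SUPPORTED_EXTENSIONS]

print(filter_supported(["cat.JPG", "diagram.svg", "scan.tiff"]))
# ['cat.JPG', 'scan.tiff']
```

Lower-casing the suffix means uppercase extensions such as `.JPG` pass the check too.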
Prompt length mismatch
Symptoms:

```
ValueError: user_prompt length must match sources length
```

```python
# Good: lengths match
results = model(
    source=["img1.jpg", "img2.jpg"],
    user_prompt=["Prompt 1", "Prompt 2"]
)

# Bad: one prompt for two images
results = model(
    source=["img1.jpg", "img2.jpg"],
    user_prompt=["Prompt 1"]  # raises ValueError
)
```

Result issues
No grounded phrases
Symptoms: result.grounded_phrases attribute is missing or bounding boxes are not returned.
```python
info = ViModel.inspect(run_id="your-run-id")
print(f"Task type: {info.task_type}")
```

Not all models support phrase grounding. If info.task_type is VQA, the model will not return bounding boxes. Load a model trained for phrase grounding, or omit the prompt to let the model use its default detection behavior.
Also check that you are accessing the correct field:
```python
from vi.inference.task_types.phrase_grounding import PhraseGroundingResponse

result, error = model(source="image.jpg")
if error is None and isinstance(result, PhraseGroundingResponse):
    print(f"Found {len(result.result.groundings)} objects")
    for grounding in result.result.groundings:
        print(f"  {grounding.phrase}: {grounding.grounding}")
```

Unexpected output format
Symptom: Accessing result.caption or result.grounded_phrases raises AttributeError.
The response field names depend on the response type. Always use isinstance() and access the correct fields:
```python
from vi.inference.task_types.vqa import VQAResponse
from vi.inference.task_types.phrase_grounding import PhraseGroundingResponse

result, error = model(source="image.jpg")
if error is None:
    if isinstance(result, VQAResponse):
        print(f"Answer: {result.result.answer}")
    elif isinstance(result, PhraseGroundingResponse):
        print(f"Caption: {result.result.sentence}")
    else:
        print(f"Raw output: {result.result}")
```

See complete prediction schemas →
Poor quality results
Symptom: Predictions are inaccurate or vague.
Lower the temperature for factual tasks:
```python
result, error = model(
    source="image.jpg",
    user_prompt="Describe this image",
    generation_config={
        "temperature": 0.0,
        "max_new_tokens": 256,
        "do_sample": False
    }
)
```

Write more specific prompts:
```
# Too vague
"Tell me about this"

# Specific
"What objects are visible in this image and where are they located?"
```

Check training completion:
```python
client = vi.Client()
run = client.runs.get(run_id="your-run-id")
print(f"Status: {run.status.phase}")
# "completed" means training and export are done
```

Permission errors
Access denied
Symptoms:

```
PermissionError: [Errno 13] Permission denied
HTTPError: 403 Forbidden
```

Verify your API access:

```python
import vi

client = vi.Client()
org = client.organizations
print(f"Organization: {org.name}")
```

Check local file and directory permissions:

```python
import os

file_path = "image.jpg"
if os.access(file_path, os.R_OK):
    print("File is readable")
else:
    print("Cannot read file, check permissions")

output_dir = "./results"
os.makedirs(output_dir, exist_ok=True)
```

Debugging tips
Enable detailed logging
```python
import logging

logging.basicConfig(level=logging.DEBUG)

model = ViModel(run_id="your-run-id")
result, error = model(source="image.jpg")
```

Check system resources
```python
import psutil
import torch

cpu_percent = psutil.cpu_percent()
ram_percent = psutil.virtual_memory().percent
print(f"CPU: {cpu_percent}%")
print(f"RAM: {ram_percent}%")

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        allocated = torch.cuda.memory_allocated(i) / 1e9
        reserved = torch.cuda.memory_reserved(i) / 1e9
        print(f"GPU {i}: Allocated: {allocated:.2f} GB, Reserved: {reserved:.2f} GB")
```

Build a minimal reproducible example
When reporting an issue, include a minimal script that shows the problem:
```python
import sys

import torch
import vi
from vi.inference import ViModel

print(f"Vi SDK: {vi.__version__}")
print(f"Python: {sys.version}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.version.cuda if torch.cuda.is_available() else 'N/A'}")

model = ViModel(run_id="your-run-id")
result, error = model(source="test.jpg")
if error:
    print(f"Error: {type(error).__name__}: {error}")
else:
    print(f"Success: {str(result.result)[:50]}...")
```

Getting help
If these steps don't resolve your issue:
- Check result.raw_output for the full unprocessed model output
- Review the Vi SDK changelog for known issues
- Ask in the Datature community
- Contact Datature support
When reaching out, include your Vi SDK version, Python version, GPU details, the complete error message, and a minimal reproducible script.
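The interpreter and OS details can be collected with the standard library alone (SDK, PyTorch, and GPU versions come from the minimal script above). `environment_report` is a hypothetical helper shown for illustration:

```python
import platform
import sys

def environment_report():
    # Basic environment details worth pasting into a support request.
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "os": platform.platform(),
        "executable": sys.executable,
    }

for key, value in environment_report().items():
    print(f"{key}: {value}")
```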