Troubleshoot Issues
Common inference issues and solutions for memory errors, performance problems, and model loading failures.
Prerequisites
- Vi SDK installed with inference dependencies
- A trained model loaded with ViModel
- Understanding of inference basics
- Familiarity with performance optimization
Overview
This guide covers:
- Memory errors - Out of memory (OOM) issues with GPU and CPU
- Performance issues - Slow inference and throughput problems
- Loading problems - Model loading and download failures
- Runtime errors - File and execution issues
- Result issues - Unexpected outputs and poor quality predictions
Out of Memory Errors
GPU Out of Memory
Symptoms:
CUDA out of memory. Tried to allocate X GB...
RuntimeError: CUDA error: out of memory
Solutions:
1. Use Quantization
Learn more about quantization for memory reduction.
# Try 8-bit quantization first
model = ViModel(
    run_id="your-run-id",
    load_in_8bit=True,
    device_map="auto"
)

# If still OOM, try 4-bit
model = ViModel(
    run_id="your-run-id",
    load_in_4bit=True,
    device_map="auto"
)
2. Enable Low CPU Memory Usage
model = ViModel(
    run_id="your-run-id",
    load_in_8bit=True,
    low_cpu_mem_usage=True,
    device_map="auto"
)
3. Clear GPU Cache
import torch
import gc
# Clear cache before loading
torch.cuda.empty_cache()
gc.collect()
# Load model
model = ViModel(run_id="your-run-id", load_in_8bit=True)
4. Process in Smaller Batches
See batch inference guide for more details.
import torch

# Reduce batch size
def process_in_chunks(model, images, chunk_size=25):  # Smaller chunks
    results = []
    for i in range(0, len(images), chunk_size):
        chunk = images[i:i+chunk_size]
        batch_results = model(source=chunk)
        results.extend(batch_results)
        # Clear cache between batches
        torch.cuda.empty_cache()
    return results
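For example, with a hypothetical image_paths list of file paths:
results = process_in_chunks(model, image_paths, chunk_size=10)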
CPU Out of Memory
Symptoms:
MemoryError
Killed (process terminated)
Solutions:
1. Enable Low CPU Memory
model = ViModel(
    run_id="your-run-id",
    low_cpu_mem_usage=True
)
2. Use GPU Instead
# Ensure model loads on GPU, not CPU
model = ViModel(
    run_id="your-run-id",
    device_map="cuda"  # Force GPU
)
3. Close Other Applications
Free up system memory before loading models.
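To confirm how much memory is actually free before loading, a quick sketch (assuming psutil is installed):
import psutil

# Report available system memory before loading the model
available_gb = psutil.virtual_memory().available / (2**30)
print(f"Available RAM: {available_gb:.1f} GB")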
Model Loading Issues
Model Download Fails
Symptoms:
ValueError: Failed to download model
ConnectionError: Failed to connect
Solutions:
1. Check Credentials
# Verify credentials are set
import os

secret = os.getenv("DATATURE_VI_SECRET_KEY")
print(f"Secret Key: {secret[:10] + '...' if secret else 'NOT SET'}")
print(f"Org ID: {os.getenv('DATATURE_VI_ORGANIZATION_ID')}")
2. Verify Run ID
import vi
# List available runs
client = vi.Client()
for run in client.runs:
print(f"Run ID: {run.run_id}, Status: {run.status.phase}")3. Check Network Connection
# Test connection
import requests
try:
    response = requests.get("https://vi.datature.io", timeout=5)
    print(f"Connection OK: {response.status_code}")
except Exception as e:
    print(f"Connection failed: {e}")
4. Check Model Status
Ensure the model has finished training and exporting:
client = vi.Client()
run = client.runs.get(run_id="your-run-id")
print(f"Status: {run.status.phase}")
# Status should be "completed"
if run.status.phase != "completed":
    print("Model is still training or exporting")
Model Loading Hangs
Symptoms:
- Loading process freezes
- No progress for extended period
Solutions:
1. Check Disk Space
import shutil
# Check available space
total, used, free = shutil.disk_usage("/")
print(f"Free space: {free // (2**30)} GB")2. Clear Model Cache
# Remove cached models
rm -rf ~/.datature/vi/models/
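Or from Python, a minimal sketch assuming the default cache location shown above:
import shutil
from pathlib import Path

# Default Vi SDK model cache location (as shown above)
cache_dir = Path.home() / ".datature" / "vi" / "models"
if cache_dir.exists():
    shutil.rmtree(cache_dir)
    print(f"Cleared model cache at {cache_dir}")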
3. Force Re-download
model = ViModel(
    run_id="your-run-id",
    overwrite=True  # Force fresh download
)
Slow Inference
General Slowness
Symptoms:
- Inference takes longer than expected
- Low throughput
Solutions:
1. Use GPU
import torch
# Check if GPU is being used
print(f"CUDA available: {torch.cuda.is_available()}")
# Force GPU usage
model = ViModel(
    run_id="your-run-id",
    device_map="cuda"
)
2. Enable Mixed Precision
model = ViModel(
    run_id="your-run-id",
    dtype="float16",  # Faster than float32
    device_map="auto"
)
3. Use Flash Attention 2
model = ViModel(
    run_id="your-run-id",
    attn_implementation="flash_attention_2",
    dtype="float16"
)
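Flash Attention 2 requires the flash-attn package and a supported GPU. If you are unsure whether your environment provides it, a fallback sketch:
from vi.inference import ViModel

try:
    model = ViModel(
        run_id="your-run-id",
        attn_implementation="flash_attention_2",
        dtype="float16"
    )
except Exception as e:
    # Fall back to the default attention implementation
    print(f"Flash Attention 2 unavailable ({e}); using default attention")
    model = ViModel(run_id="your-run-id", dtype="float16")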
4. Use Batch Processing
# ✅ Good - batch inference
results = model(source=["img1.jpg", "img2.jpg", "img3.jpg"])
# ❌ Bad - sequential processing
for img in ["img1.jpg", "img2.jpg", "img3.jpg"]:
    result, error = model(source=img)
First Inference Slow
Symptoms:
- First inference much slower than subsequent ones
Solution:
This is normal due to model initialization. Perform a warm-up inference:
# Warm-up inference
_ = model(source="dummy_image.jpg")
# Subsequent inferences will be faster
result, error = model(source="real_image.jpg")
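To confirm the effect, you can time the first and second calls (a rough sketch, assuming model is already loaded and the file names are placeholders):
import time

start = time.perf_counter()
_ = model(source="image.jpg")  # Includes one-time initialization
print(f"Cold inference: {time.perf_counter() - start:.2f}s")

start = time.perf_counter()
_ = model(source="image.jpg")
print(f"Warm inference: {time.perf_counter() - start:.2f}s")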
Runtime Errors
File Not Found
Symptoms:
FileNotFoundError: [Errno 2] No such file or directory: 'image.jpg'
Solutions:
1. Use Absolute Paths
import os
# Get absolute path
image_path = os.path.abspath("image.jpg")
result, error = model(source=image_path)
2. Verify File Exists
from pathlib import Path
image_path = "image.jpg"
if Path(image_path).exists():
    result, error = model(source=image_path)
else:
    print(f"File not found: {image_path}")
Unsupported Image Format
Symptoms:
PIL.UnidentifiedImageError: cannot identify image file
Solutions:
1. Check Image Format
from PIL import Image
try:
    img = Image.open("image.jpg")
    print(f"Format: {img.format}, Size: {img.size}")
except Exception as e:
    print(f"Invalid image: {e}")
2. Convert to Supported Format
from PIL import Image
# Convert to JPEG
img = Image.open("image.webp")
img = img.convert("RGB")
img.save("image.jpg", "JPEG")
# Now use converted image
result, error = model(source="image.jpg")
Supported formats: .jpg, .jpeg, .png, .bmp, .gif, .tiff, .tif, .webp
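To pre-filter a folder so only supported files reach the model, a small sketch (the images/ directory is a placeholder):
from pathlib import Path

# Extensions listed as supported above (matched case-insensitively)
SUPPORTED = {".jpg", ".jpeg", ".png", ".bmp", ".gif", ".tiff", ".tif", ".webp"}

images = [
    str(p) for p in Path("images").iterdir()
    if p.suffix.lower() in SUPPORTED
]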
Invalid Prompt Error
Symptoms:
ValueError: user_prompt length must match sources length
Solution:
Ensure prompt list matches image list length:
# ✅ Good - matching lengths
results = model(
    source=["img1.jpg", "img2.jpg"],
    user_prompt=["Prompt 1", "Prompt 2"]
)

# ❌ Bad - length mismatch
results = model(
    source=["img1.jpg", "img2.jpg"],
    user_prompt=["Prompt 1"]  # Wrong!
)
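To apply the same prompt to every image, repeat it so the lengths match:
sources = ["img1.jpg", "img2.jpg", "img3.jpg"]
results = model(
    source=sources,
    user_prompt=["Describe this image"] * len(sources)
)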
Result Issues
No Grounded Phrases
Symptoms:
- result.grounded_phrases attribute missing
- Expected bounding boxes not returned
Solutions:
1. Check Task Type
# For phrase grounding, use appropriate prompt or omit it
result, error = model(
    source="image.jpg",
    user_prompt="Locate all objects"  # Or omit for default
)

# Check if grounded phrases available
if hasattr(result, 'grounded_phrases'):
    print(f"Found {len(result.grounded_phrases)} objects")
else:
    print("No grounded phrases (may be VQA task)")
2. Verify Model Supports Phrase Grounding
Not all models support phrase grounding. Check model capabilities:
info = ViModel.inspect(run_id="your-run-id")
print(f"Task type: {info.task_type}")Unexpected Output Format
Symptoms:
- Result format different than expected
- Attributes missing
Solution:
Always check for attributes before accessing:
result, error = model(source="image.jpg")
if error is None:
    # Always check before accessing
    if hasattr(result, 'caption'):
        print(f"Caption: {result.caption}")
    if hasattr(result, 'grounded_phrases'):
        for phrase in result.grounded_phrases:
            print(f"Phrase: {phrase.phrase}")
Poor Quality Results
Symptoms:
- Inaccurate predictions
- Low confidence outputs
Solutions:
1. Adjust Generation Config
result, error = model(
    source="image.jpg",
    user_prompt="Describe this image",
    generation_config={
        "temperature": 0.0,  # More deterministic
        "max_new_tokens": 256,
        "do_sample": False
    }
)
2. Improve Prompt Quality
# ❌ Bad - vague prompt
"Tell me about this"
# ✅ Good - specific prompt
"What objects are visible in this image and where are they located?"3. Check Model Training
3. Check Model Training
Verify the model was properly trained:
client = vi.Client()
run = client.runs.get(run_id="your-run-id")
print(f"Status: {run.status.phase}")
print(f"Metrics: {run.metrics if hasattr(run, 'metrics') else 'N/A'}")Permission Errors
Access Denied
Symptoms:
PermissionError: [Errno 13] Permission denied
HTTPError: 403 Forbidden
Solutions:
1. Check API Key Permissions
Ensure your secret key has the necessary permissions:
import vi
client = vi.Client()
org = client.organizations
print(f"Organization: {org.name}")
print(f"Access level: {org.role if hasattr(org, 'role') else 'Unknown'}")2. Verify File Permissions
import os
# Check file permissions
file_path = "image.jpg"
if os.access(file_path, os.R_OK):
    print("File is readable")
else:
    print("Permission denied - cannot read file")
3. Check Write Permissions
For saving results:
import os
output_dir = "./results"
# Create the directory if it does not exist yet
os.makedirs(output_dir, exist_ok=True)
if os.access(output_dir, os.W_OK):
    print("Directory is writable")
else:
    print("Permission denied - cannot write to directory")
Debugging Tips
Enable Detailed Logging
import logging
# Set logging level
logging.basicConfig(level=logging.DEBUG)
# Run inference with detailed logs
model = ViModel(run_id="your-run-id")
result, error = model(source="image.jpg")
Check System Resources
import psutil
import torch
# CPU and RAM
cpu_percent = psutil.cpu_percent()
ram_percent = psutil.virtual_memory().percent
print(f"CPU Usage: {cpu_percent}%")
print(f"RAM Usage: {ram_percent}%")
# GPU
if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        allocated = torch.cuda.memory_allocated(i) / 1e9
        reserved = torch.cuda.memory_reserved(i) / 1e9
        print(f"GPU {i} - Allocated: {allocated:.2f}GB, Reserved: {reserved:.2f}GB")
Minimal Reproducible Example
Create a minimal test case:
from vi.inference import ViModel
# Minimal test
model = ViModel(run_id="your-run-id")
result, error = model(source="test.jpg")
if error:
    print(f"Error: {type(error).__name__}: {error}")
else:
    print(f"Success: {result.caption[:50]}...")
Getting Help
If you're still experiencing issues:
1. Check Documentation
2. Review Error Messages
Read error messages carefully - they often contain the solution:
from vi.inference import ViModel

try:
    model = ViModel(run_id="invalid-id")
except Exception as e:
    print(f"Error type: {type(e).__name__}")
    print(f"Error message: {str(e)}")
    # Often contains hints about what went wrong
3. Contact Support
If issues persist:
- Visit Contact Us
- Check Datature Community
- Review Vi SDK Changelog for known issues
4. Provide Details
When reporting issues, include:
- Vi SDK version (vi.__version__)
- Python version
- Operating system
- GPU details (if applicable)
- Complete error message
- Minimal reproducible code
import vi
import sys
import torch
print(f"Vi SDK: {vi.__version__}")
print(f"Python: {sys.version}")
print(f"PyTorch: {torch.__version__}")
print(f"CUDA: {torch.version.cuda if torch.cuda.is_available() else 'N/A'}")Related resources
- Inference overview — Getting started with inference
- Optimize performance — Memory management and GPU utilization
- Load models — Model loading from Datature or HuggingFace
- Run inference — Single and batch inference guide
- Handle results — Process prediction outputs
- Configure generation — Adjust generation parameters
- Vi SDK installation — Install inference dependencies
- Vi SDK getting started — Quick start guide for the SDK
- Contact us — Get help from the Datature team
- Vi SDK changelog — Latest updates and known issues
- API resources — Complete SDK reference