Start a Training Run
Launch training and monitor your VLM's progress in real-time.
Step 3 of 3: Start a Training Run. Part of the training workflow quickstart. Next: Deploy and test.
Launch training with your configured workflow and monitor your VLM's learning progress.
⏱️ Time: ~2 minutes to start (training runs 1-3 hours)
Prerequisites
- Configured workflow with system prompt, dataset, and model settings
- Compute Credits for GPU resources
- Dataset validated and ready for training
- Understanding of GPU types and pricing
Start your training run
From your workflow canvas, click Run Training to open the training configuration dialog.
The dialog walks you through four steps before training starts.
1. Advanced Settings
Configure checkpoint and evaluation settings.
Checkpoint Strategy — Configure frequency of evaluation checkpoints
Advanced Evaluation — Enable to view advanced evaluation metrics and previews during training
For quickstart: Keep the default settings. You can adjust them in future training runs for more control.
Click Next to continue.
2. Hardware Configuration
Select your GPU type and quantity for training.
Choose between:
- Vi Cloud — Train on Vi's GPU infrastructure (recommended for quickstart)
- Custom Runner — Train on your own infrastructure (coming soon)
GPU Type — Select from available GPU models
Number of GPUs — Choose quantity (1, 2, 4, or more depending on GPU type)
For quickstart: Start with 1× NVIDIA T4 GPU. Larger models and batch sizes require more GPU memory (VRAM).
View complete GPU list and pricing →
Click Next to continue.
3. Dataset Validation
The system validates your dataset to ensure it's ready for training.
Wait for validation to complete. If issues are found, review and fix them before proceeding.
Click Next when validation shows "Ready for Training".
4. Review Summary
Review your complete training configuration.
The summary shows:
- System Prompt character count
- Model architecture
- Batch size
- Training epochs
- Usage Multiplier (GPU credit consumption rate)
Click Run Training to start.
Monitor training progress
Once the run starts, you can monitor it in real time. Training typically takes 1-3 hours, depending on dataset size, model architecture, epochs, and GPU type.
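To see how dataset size, batch size, and epochs combine into run length, here is a rough back-of-the-envelope sketch. It is not how the platform computes anything: the function names are hypothetical, it assumes one optimizer step per batch with no gradient accumulation, and `seconds_per_step` is a number you would have to measure from an actual run on your chosen GPU.

```python
import math


def total_steps(num_samples: int, batch_size: int, epochs: int) -> int:
    """Count training steps, assuming one step per batch and a final partial batch."""
    if num_samples < 1 or batch_size < 1 or epochs < 1:
        raise ValueError("num_samples, batch_size, and epochs must all be >= 1")
    steps_per_epoch = math.ceil(num_samples / batch_size)
    return steps_per_epoch * epochs


def estimate_minutes(num_samples: int, batch_size: int, epochs: int,
                     seconds_per_step: float) -> float:
    """Rough duration estimate; seconds_per_step is an assumed, measured value."""
    return total_steps(num_samples, batch_size, epochs) * seconds_per_step / 60


# Illustrative numbers only: 1,000 samples, batch size 8, 3 epochs.
print(total_steps(1000, 8, 3))            # 375
print(estimate_minutes(1000, 8, 3, 2.0))  # 12.5
```

Doubling the epochs or halving the batch size doubles the step count, which is why those settings affect both run time and credit consumption.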
Training runs in the background: You can safely close the browser. Training continues, and you'll be notified when it completes.
Learn about monitoring training →
Common questions
Can I stop training early?
Yes! You can cancel a run from the training dashboard. Saved checkpoints are preserved.
What if training fails?
Check the training logs for error details. Common issues:
- Insufficient GPU memory — Reduce batch size or use larger GPU
- Dataset errors — Verify annotations are correct
- Configuration issues — Review workflow settings
How do I know if my model is good?
After training:
- Check validation metrics
- Compare with baseline performance
- Test on unseen data
- Evaluate model predictions visually
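If you export predictions for a held-out set, comparing against a baseline can be as simple as the sketch below. It uses exact-match accuracy as a stand-in metric, which is an assumption; the platform's own validation metrics may be defined differently. The labels and predictions are made-up illustrative data, not results from a real run.

```python
def accuracy(predictions: list[str], labels: list[str]) -> float:
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return correct / len(labels)


# Illustrative held-out examples the model never saw during training.
labels = ["cat", "dog", "dog", "cat", "bird"]
baseline_preds = ["cat", "cat", "dog", "cat", "cat"]   # e.g. the untuned model
finetuned_preds = ["cat", "dog", "dog", "cat", "cat"]  # e.g. the trained model

baseline_acc = accuracy(baseline_preds, labels)
finetuned_acc = accuracy(finetuned_preds, labels)
print(f"baseline:   {baseline_acc:.2f}")   # baseline:   0.60
print(f"fine-tuned: {finetuned_acc:.2f}")  # fine-tuned: 0.80
```

A gain over the baseline on unseen data is a more reliable signal than training metrics alone, and it pairs well with a visual check of individual predictions.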
Can I run multiple trainings at once?
Yes! You can start multiple runs in parallel. Each run uses separate GPU resources and consumes Compute Credits independently.
How much will training cost?
Training costs depend on GPU type and duration. Each GPU type has a usage multiplier that determines Compute Credit consumption per minute.
Example: 1× NVIDIA T4 for 60 minutes = 60 Compute Credits
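The arithmetic behind that example can be sketched as follows. This is an illustration under stated assumptions, not the platform's billing logic: the function name is hypothetical, the T4 multiplier of 1.0 is inferred from the example above, and the sketch assumes credits scale linearly with the number of GPUs. Check the GPU pricing page for actual multipliers.

```python
def compute_credits(usage_multiplier: float, minutes: float, num_gpus: int = 1) -> float:
    """Estimate Compute Credits as multiplier x GPU count x minutes.

    Assumes the multiplier is per GPU per minute (an assumption, not confirmed).
    """
    if usage_multiplier <= 0:
        raise ValueError("usage_multiplier must be positive")
    if minutes < 0:
        raise ValueError("minutes cannot be negative")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return usage_multiplier * num_gpus * minutes


# Assumed multiplier of 1.0 for a T4, inferred from the example above.
T4_MULTIPLIER = 1.0

print(compute_credits(T4_MULTIPLIER, minutes=60))   # 60.0
print(compute_credits(T4_MULTIPLIER, minutes=180))  # 180.0, a 3-hour run
```

Because parallel runs consume credits independently, the total for several runs is the sum of each run's estimate.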
After training completes
Training complete! Your model is trained and ready for evaluation and deployment.
Once training finishes, you can:
- Evaluate performance — Review metrics and test set results
- Compare runs — Analyze different configurations
- Download the model — Export for deployment
- Start new runs — Experiment with different settings
What's next?
Download your trained model and test it with new images to see how it performs.
Related resources
- Train a model — Complete guide to VLM training
- Configure training settings — Advanced training configuration
- Monitor a run — Track training progress in real-time
- Manage runs — Kill or delete runs
- Evaluate a model — Assess model performance
- Resource usage — Understand GPU pricing and Compute Credits
- Create a workflow — Set up training workflows
- Deploy and test — Download and test your model
- Quickstart overview — Complete quickstart guide
- Configure your model — Select model architecture and settings
- Metrics — Understand training metrics
