Start a Training Run

Launch training and monitor your VLM's progress in real time.

📍

Step 3 of 3: Start a Training Run

Part of the training workflow quickstart. Next: Deploy and test.

Launch training with your configured workflow and monitor your VLM's learning progress.

⏱️ Time: ~2 minutes to start (training runs 1-3 hours)

📋

Prerequisites

A training workflow configured in the previous step.

Learn about training workflows →


Start your training run

From your workflow canvas, click Run Training to open the training configuration dialog.

Run Training dialog

You'll configure four steps before starting training.


1. Advanced Settings

Configure checkpoint and evaluation settings.

Advanced Settings configuration

Checkpoint Strategy — Set how often evaluation checkpoints are saved during training

Advanced Evaluation — Enable to view advanced evaluation metrics and previews during training
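If checkpointing is new to you, the sketch below illustrates the general idea behind a checkpoint strategy: evaluate the model at a fixed interval, save its state each time, and keep track of the best-scoring checkpoint. This is a generic, hypothetical Python illustration of the concept, not Vi's internal implementation; `train_one_epoch` and `evaluate` are placeholder stubs.

```python
import random

# Generic sketch of a checkpoint strategy (not Vi's internal logic):
# evaluate every N epochs, record a checkpoint, and remember the best one.

NUM_EPOCHS = 10
CHECKPOINT_EVERY = 2          # the "checkpoint frequency" setting

def train_one_epoch(epoch):
    pass                       # placeholder for a real training step

def evaluate(epoch):
    return random.random()     # placeholder for a real validation metric

checkpoints = []
best = {"epoch": None, "score": float("-inf")}

for epoch in range(1, NUM_EPOCHS + 1):
    train_one_epoch(epoch)
    if epoch % CHECKPOINT_EVERY == 0:
        score = evaluate(epoch)
        checkpoints.append({"epoch": epoch, "score": score})
        if score > best["score"]:          # keep the best-scoring checkpoint
            best = {"epoch": epoch, "score": score}

print(f"Saved {len(checkpoints)} checkpoints; best was epoch {best['epoch']}")
```

A lower checkpoint frequency saves time and storage; a higher one gives you more recovery points and finer-grained evaluation previews.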

📘

For quickstart

Keep default settings. You can adjust these in future training runs for more control.

Click Next to continue.


2. Hardware Configuration

Select your GPU type and quantity for training.


Choose between:

  • Vi Cloud — Train on Vi's GPU infrastructure (recommended for quickstart)
  • Custom Runner — Train on your own infrastructure (coming soon)

GPU Type — Select from available GPU models

Number of GPUs — Choose quantity (1, 2, 4, or more depending on GPU type)

📘

For quickstart

Start with 1× NVIDIA T4 GPU. Larger models and batch sizes require more GPU memory (VRAM).

View complete GPU list and pricing →
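To make the VRAM note above concrete, here is a rough back-of-envelope estimate of weight memory alone. It is illustrative only: real VRAM use also includes activations, gradients, and optimizer state, which grow with batch size. The parameter counts are examples, not specific Vi models.

```python
# Rough weight-memory estimate: parameters × bytes per parameter.
# Illustrative only; activations, gradients, and optimizer state add more,
# and that overhead scales with batch size.

def weight_memory_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the model weights alone (fp16 ≈ 2 bytes/param)."""
    return num_params * bytes_per_param / 1024**3

for params in (2e9, 7e9):          # e.g. a 2B- vs 7B-parameter model
    print(f"{params / 1e9:.0f}B params ≈ {weight_memory_gb(params):.1f} GB of weights")
```

For reference, an NVIDIA T4 has 16 GB of VRAM, so larger models or batch sizes quickly call for a bigger GPU or more of them.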

Click Next to continue.


3. Dataset Validation

The system validates your dataset to ensure it's ready for training.


Wait for validation to complete. If issues are found, review and fix them before proceeding.
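Validation itself runs on the platform, but if you want to sanity-check a dataset locally before uploading, a quick script along these lines can catch the most common problems. This is a purely illustrative sketch, not the platform's validator, and it assumes a COCO-style `annotations.json` next to an `images/` folder; adapt it to your actual annotation format.

```python
import json
from pathlib import Path

# Local pre-upload sanity check (illustrative only, not the platform's validator).
# Assumes a COCO-style annotations.json and an images/ directory.

data = json.loads(Path("annotations.json").read_text())
image_dir = Path("images")

images_by_id = {img["id"]: img for img in data["images"]}
problems = []

for img in data["images"]:
    if not (image_dir / img["file_name"]).exists():
        problems.append(f"missing file: {img['file_name']}")

for ann in data["annotations"]:
    if ann["image_id"] not in images_by_id:
        problems.append(f"annotation {ann['id']} points to an unknown image")
    x, y, w, h = ann["bbox"]
    if w <= 0 or h <= 0:
        problems.append(f"annotation {ann['id']} has an empty bounding box")

print("\n".join(problems) or "No obvious issues found")
```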

Click Next when validation shows "Ready for Training".


4. Review Summary

Review your complete training configuration.

Training Summary

The summary shows:

  • System Prompt character count
  • Model architecture
  • Batch size
  • Training epochs
  • Usage Multiplier (Compute Credit consumption rate)

Click Run Training to start.


Monitor training progress

Once started, you can monitor your training run in real time. Training typically takes a few hours, depending on dataset size, model architecture, number of epochs, and GPU type.

📘

Training runs in the background

You can safely close the browser. Training continues, and you'll be notified when complete.

Learn about monitoring training →


Common questions

Can I stop training early?

Yes! You can cancel a run from the training dashboard. Saved checkpoints are preserved.

Learn about managing runs →

What if training fails?

Check the training logs for error details. Common issues:

  • Insufficient GPU memory — Reduce the batch size or use a larger GPU
  • Dataset errors — Verify that your annotations are correct
  • Configuration issues — Review your workflow settings

How do I know if my model is good?

After training:

  • Check validation metrics
  • Compare with baseline performance
  • Test on unseen data
  • Evaluate model predictions visually
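For the first two checks in that list, a simple comparison on the same held-out examples is often enough to tell whether fine-tuning helped. Here is a minimal, generic sketch; the label lists are placeholders for your own ground truth and model predictions, and accuracy stands in for whatever metric fits your task.

```python
# Compare a fine-tuned model against a baseline on the same held-out examples.
# The lists below are placeholders for your own ground truth and predictions.

ground_truth    = ["cat", "dog", "dog", "car", "cat", "car"]
baseline_preds  = ["cat", "cat", "dog", "car", "dog", "car"]
finetuned_preds = ["cat", "dog", "dog", "car", "cat", "dog"]

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

base  = accuracy(baseline_preds, ground_truth)
tuned = accuracy(finetuned_preds, ground_truth)
print(f"baseline: {base:.2f}, fine-tuned: {tuned:.2f}, delta: {tuned - base:+.2f}")
```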

Learn about evaluation →

Can I run multiple trainings at once?

Yes! You can start multiple runs in parallel. Each run uses separate GPU resources and consumes Compute Credits independently.

How much will training cost?

Training costs depend on GPU type and duration. Each GPU type has a usage multiplier that determines Compute Credit consumption per minute.

Example: 1× NVIDIA T4 for 60 minutes = 60 Compute Credits
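In other words, credits scale with the usage multiplier, the number of GPUs, and how long the run lasts. A small sketch of that arithmetic, assuming the multiplier applies per GPU per minute; the T4 multiplier of 1 follows from the example above, while other values are illustrative, so check the pricing page linked below for the real ones.

```python
# Compute Credits ≈ usage multiplier × number of GPUs × minutes of training.
# Multiplier values here are illustrative; see the pricing page for real ones.

def training_credits(multiplier: float, num_gpus: int, minutes: float) -> float:
    return multiplier * num_gpus * minutes

print(training_credits(multiplier=1, num_gpus=1, minutes=60))   # 60 credits (T4 example above)
print(training_credits(multiplier=1, num_gpus=4, minutes=90))   # a longer multi-GPU run
```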

View GPU pricing and usage multipliers →


After training completes

Training complete!

Your model is trained and ready for evaluation and deployment.

Once training finishes, you can:

  1. Evaluate performance — Review metrics and test set results
  2. Compare runs — Analyze different configurations
  3. Download the model — Export for deployment
  4. Start new runs — Experiment with different settings

What's next?

Download your trained model and test it with new images to see how it performs.


Related resources