Train a Model
Fine-tune vision-language models on your annotated data using Datature Vi's five-stage training workflow.
Training a vision-language model (VLM) in Datature Vi follows a five-stage workflow: you create a reusable workflow that captures your system prompt, dataset split, and model configuration, then launch training runs against it. Before you begin, you'll need:
- A training project already created in Datature Vi
- A dataset with annotations ready for use
- Compute Credits available for GPU training
New to Datature Vi? Learn what it does or follow the quickstart.
The five-stage workflow
Each stage builds on the previous one. A workflow stores all configuration so you can run the same setup multiple times without reconfiguring.
1. Create A Workflow
Open the workflow canvas and connect the System Prompt, Dataset, and Model nodes into a reusable training configuration.
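A workflow can be thought of as a small graph of configuration nodes, where the System Prompt and Dataset nodes feed into the Model node. The sketch below is purely illustrative; the node names mirror the canvas, but the structure and field names are assumptions, not Datature Vi's actual schema.

```python
# Hypothetical representation of a workflow as a graph of config nodes.
# Node names mirror the canvas; everything else is illustrative.
workflow = {
    "name": "vlm-finetune-v1",
    "nodes": {
        "system_prompt": {"task": "phrase_grounding"},
        "dataset": {"id": "dataset-123", "train_ratio": 0.8},
        "model": {"architecture": "example-vlm-7b"},
    },
    # Both the prompt and the dataset feed the model node.
    "edges": [("system_prompt", "model"), ("dataset", "model")],
}
```

Because the workflow stores all three nodes together, re-running training is just a matter of launching against the same named workflow.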
2. Configure Your System Prompt
Write natural language instructions that tell the model what to detect or answer. Default prompts are provided for phrase grounding and VQA.
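To make the two task types concrete, here are example system prompts in the spirit of the defaults. The wording is hypothetical; Datature Vi supplies its own default prompts for phrase grounding and VQA.

```python
# Illustrative system prompts (hypothetical wording, not Datature Vi's
# actual defaults) for the two supported task types.
SYSTEM_PROMPTS = {
    "phrase_grounding": (
        "Locate every region in the image that matches the given phrase "
        "and return a bounding box for each match."
    ),
    "vqa": (
        "Answer the question about the image concisely, using only "
        "information visible in the image."
    ),
}
```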
3. Configure Your Dataset
Select a dataset and set the train-test split ratio and shuffle settings to control how training data is divided.
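The split ratio and shuffle settings behave like a standard train-test split. A minimal sketch of that logic (not Datature Vi's implementation, and the function name is hypothetical):

```python
import random

def split_dataset(asset_ids, train_ratio=0.8, shuffle=True, seed=42):
    """Divide asset IDs into train and test sets by ratio."""
    ids = list(asset_ids)
    if shuffle:
        # A fixed seed keeps splits reproducible across runs.
        random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]

# An 80/20 split of 100 assets: 80 for training, 20 held out for evaluation.
train, test = split_dataset(range(100), train_ratio=0.8)
```

Shuffling before splitting avoids biased splits when assets were uploaded in a meaningful order (e.g. all images of one class first).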
4. Configure Your Model
Choose a VLM architecture and tune hyperparameters.
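A model configuration typically pairs an architecture choice with a handful of fine-tuning hyperparameters. The dictionary below is a hypothetical example; the parameter names and values are common VLM fine-tuning knobs, not Datature Vi's actual schema or defaults.

```python
# Hypothetical model configuration; names and values are illustrative.
model_config = {
    "architecture": "example-vlm-7b",  # assumed placeholder for a supported VLM
    "hyperparameters": {
        "learning_rate": 1e-4,
        "epochs": 3,
        "batch_size": 8,
        "lora_rank": 16,      # a common parameter-efficient fine-tuning knob
        "warmup_ratio": 0.05,
    },
}
```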
5. Start A Training Run
Configure training settings, select GPU hardware, validate your dataset, and start a training run.
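Dataset validation amounts to pre-flight checks before GPU time is spent. A sketch of what such checks might look like, under assumed data shapes (the function and fields are hypothetical, not the platform's API):

```python
def validate_run(dataset, credits_available, estimated_cost):
    """Hypothetical pre-flight checks before launching a GPU training run."""
    errors = []
    # Every asset should carry at least one annotation.
    unannotated = [a for a in dataset if not a.get("annotations")]
    if unannotated:
        errors.append(f"{len(unannotated)} assets have no annotations")
    # The run should not start if Compute Credits cannot cover it.
    if credits_available < estimated_cost:
        errors.append("insufficient Compute Credits for the selected GPU")
    return errors

dataset = [{"id": 1, "annotations": ["box"]}, {"id": 2, "annotations": []}]
issues = validate_run(dataset, credits_available=10, estimated_cost=25)
```

Surfacing all problems at once, rather than failing on the first one, lets you fix the dataset and credit balance in a single pass.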
