Train a Model

Fine-tune vision-language models on your annotated data using Datature Vi's five-stage training workflow.

Training a vision-language model (VLM) in Datature Vi follows a five-stage workflow. You create a reusable workflow that captures your system prompt, dataset split, and model configuration, then launch training runs against it.

Before You Start

New to Datature Vi? Learn what it does or follow the quickstart.

By the end of this guide

You will be able to configure and launch a fine-tuning run for a vision-language model using the five-stage workflow.

The five-stage workflow

Each stage builds on the output of the previous one. A workflow stores all of this configuration, so you can rerun the same setup multiple times without reconfiguring it.
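The relationship between a workflow and its runs can be sketched as follows. This is a conceptual illustration only: `TrainingWorkflow` and `launch_run` are hypothetical stand-ins, not part of the Datature Vi API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrainingWorkflow:
    """Hypothetical stand-in for a workflow: captures configuration once."""
    system_prompt: str
    dataset_split: dict  # e.g. {"train": 0.8, "validation": 0.2}
    model: str

def launch_run(workflow: TrainingWorkflow, run_name: str) -> dict:
    # Each run references the stored workflow instead of copying settings,
    # so every run launched against it uses an identical configuration.
    return {
        "run": run_name,
        "model": workflow.model,
        "prompt": workflow.system_prompt,
        "split": workflow.dataset_split,
    }

workflow = TrainingWorkflow(
    system_prompt="Describe the defects visible in this image.",
    dataset_split={"train": 0.8, "validation": 0.2},
    model="qwen2.5-vl",  # assumed model identifier for illustration
)

# The same workflow backs multiple runs with no reconfiguration.
run_a = launch_run(workflow, "baseline")
run_b = launch_run(workflow, "longer-schedule")
```

The point of the immutable workflow object is reproducibility: two runs launched against it differ only in their run name, never in their training configuration.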

Next steps

Create a Workflow

Start here if you have a training project and dataset ready.

Model Architectures

Compare Qwen2.5-VL, NVILA-Lite, Cosmos-Reason1, and InternVL3.5 to pick the right one.

Resource Usage

Understand Compute Credits and GPU pricing before you launch a run.