Training and Evaluation
Understand how Datature Vi trains vision-language models, what each setting controls, and how to measure your model's performance.
Datature Vi fine-tunes vision-language models (VLMs) on your annotated data. This section covers each stage of the training-to-inference pipeline: how to write system prompts that guide your model, what training settings control, how LoRA and quantization reduce cost, what evaluation metrics tell you, and how inference generates output.
In this section
What Are System Prompts?
How to instruct your model with role, focus, output format, and hallucination guards.
How Does VLM Training Work?
Epochs, batch size, learning rate, loss curves, and validation splits in plain language.
LoRA and Quantization
How LoRA reduces training cost and quantization shrinks memory. NF4, FP4, QLoRA explained.
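The core of LoRA's cost saving is that the frozen weight matrix is only ever adjusted through a low-rank update, so the number of trainable parameters shrinks dramatically. A minimal sketch of the idea (the dimensions, rank, and scaling value below are illustrative, not Datature Vi defaults):

```python
import numpy as np

d, r = 1024, 8  # hidden size and LoRA rank (illustrative values)
W = np.random.randn(d, d)          # frozen pretrained weight, never updated
A = np.random.randn(r, d) * 0.01   # trainable down-projection
B = np.zeros((d, r))               # trainable up-projection, zero-initialized
alpha = 16                         # LoRA scaling hyperparameter

# Effective weight at inference time: W + (alpha / r) * B @ A.
# Because B starts at zero, training begins from the unmodified model.
W_adapted = W + (alpha / r) * (B @ A)

full_params = W.size               # parameters a full fine-tune would touch
lora_params = A.size + B.size      # parameters LoRA actually trains
print(f"trainable fraction: {lora_params / full_params:.3%}")
```

With these numbers, LoRA trains under 2% of the parameters of a full fine-tune of the same matrix, which is where the memory and cost savings come from; quantization (NF4, FP4) then shrinks the frozen weights themselves.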
How Do I Evaluate My Model?
IoU, F1, BLEU, BERTScore, and what good scores look like for phrase grounding and VQA.
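Of these metrics, IoU (Intersection over Union) is the most mechanical: it scores how well a predicted box overlaps a ground-truth box, from 0 (no overlap) to 1 (exact match). A minimal sketch, assuming boxes in `(x1, y1, x2, y2)` pixel format:

```python
def iou(box_a, box_b):
    """Intersection over Union for two boxes in (x1, y1, x2, y2) format."""
    # Coordinates of the intersection rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp to zero so non-overlapping boxes score 0, not negative.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # partial overlap
```

Two 10×10 boxes offset by half their width share a 5×5 intersection over a 175-unit union, an IoU of about 0.14; grounding benchmarks commonly count a prediction correct when IoU clears a threshold such as 0.5.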
How Does Inference Work?
Token generation, temperature, top-p, top-k, and how to control model output.
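The three sampling controls interact in a fixed order: temperature rescales the logits, top-k discards all but the k most likely tokens, and top-p (nucleus sampling) keeps only the smallest set whose probability mass reaches p. A minimal pure-Python sketch of that pipeline (the default values are illustrative, not Datature Vi's):

```python
import math
import random

def sample_token(logits, temperature=0.8, top_p=0.9, top_k=50):
    """Sample one token id: temperature scaling, then top-k, then top-p."""
    # Temperature: divide logits before softmax; lower values sharpen the
    # distribution toward the most likely token.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    # Sort (probability, token_id) pairs from most to least likely.
    probs = sorted(((e / total, i) for i, e in enumerate(exps)), reverse=True)
    # Top-k: keep only the k most likely tokens.
    probs = probs[:top_k]
    # Top-p (nucleus): keep the smallest prefix whose mass reaches top_p.
    kept, mass = [], 0.0
    for p, i in probs:
        kept.append((p, i))
        mass += p
        if mass >= top_p:
            break
    # Renormalize the surviving tokens and draw one.
    norm = sum(p for p, _ in kept)
    draw = random.random() * norm
    for p, i in kept:
        draw -= p
        if draw <= 0:
            return i
    return kept[-1][1]
```

At very low temperature the distribution collapses onto the highest logit, so output becomes effectively deterministic; raising temperature, top-p, or top-k widens the pool of tokens the model can pick from.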