Quickstart

Get from raw images to training a vision-language model in about 30 minutes with Datature Vi.

Datature Vi is a platform for building vision-language models (VLMs) without managing infrastructure. You prepare a labeled dataset, configure a training workflow, launch a run, and download the trained weights, all in one place.

Choose your starting point


This quickstart covers three focused stages. Each stage has its own step-by-step guide, and the whole process takes about 30 minutes of active work.

You should see the Datature Vi platform dashboard.

The three stages are: prepare a dataset, train a model, and deploy and test the model with the Vi SDK.

What you'll need

  • A Datature Vi account (free sign-up available)
  • 20 or more images for your use case
  • Annotations for those images, or a plan to create them in Vi
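Before uploading, it can help to confirm locally that you meet the 20-image minimum. The following is a minimal sketch using only the Python standard library; it does not call the Vi SDK. The extension set and the function names `count_images` and `has_enough_images` are assumptions made for illustration, so adjust them to the formats your project actually uses.

```python
from pathlib import Path

# Assumption: these extensions are illustrative only; check which
# image formats your Vi project accepts before relying on this list.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}

# Minimum image count taken from the prerequisites listed above.
MIN_IMAGES = 20


def count_images(folder):
    """Count files under `folder` (recursively) with an image-like suffix."""
    return sum(
        1
        for path in Path(folder).rglob("*")
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


def has_enough_images(folder, minimum=MIN_IMAGES):
    """Return True when `folder` holds at least `minimum` images."""
    return count_images(folder) >= minimum
```

For example, `has_enough_images("my_images")` (a hypothetical folder name) returns `True` once the folder contains 20 or more matching files. The check only looks at file suffixes; it does not validate that the files are readable images or that annotations exist for them.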

Next steps

Work through the three stages in order. Start with dataset preparation.