Dataset Types

Explore the different dataset types available in Datature Vi for training vision-language models.

Dataset types define the structure and purpose of your vision AI training data. Datature Vi supports multiple dataset types to accommodate different vision-language model tasks, from object localization to conversational image understanding.


Available dataset types


Choosing the right dataset type

Select your dataset type based on your application requirements:

Dataset TypeBest ForOutputFlexibility
Phrase GroundingObject localization with natural languageBounding boxes with locationsMedium - predefined structure
VQAQuestion-answering about imagesNatural language answersMedium - Q&A format
FreeformCustom annotation requirementsUser-defined formatsHigh - fully customizable

Common use cases by type

Phrase Grounding applications

  • Robotics — Locate objects using natural descriptions
  • Image editing — Select regions with text commands
  • Autonomous vehicles — Identify objects with flexible queries
  • Warehouse automation — Find items using natural language

Explore Phrase Grounding →

Visual Question Answering applications

  • Quality inspection — Ask questions about defects
  • Accessibility — Describe images for visually impaired users
  • Content moderation — Query image content
  • Inventory management — Get information through questions

Explore VQA →

Freeform applications

🚧 Coming soon

  • Research projects — Novel computer vision tasks
  • Medical imaging — Custom diagnostic annotations
  • Scientific imaging — Domain-specific labels
  • Hybrid requirements — Complex multi-modal annotations

Explore Freeform →


Getting started

Ready to create your dataset? Follow these steps:


Learn more