Create a Dataset

Learn how to create a new dataset in Datature Vi, configure settings, and choose your vision task type.

Datasets are the foundation of your computer vision projects. They store your images along with their annotations, enabling you to organize, manage, and prepare data for training.

This document explains how to create a new dataset in Datature Vi, from selecting your vision task type to configuring storage settings.

💡

Quick workflow

Create dataset (you are here) → Upload images → Add annotations → Train a model


📋

Prerequisites

Before creating a dataset, ensure you have:

  • An active Datature Vi account
  • Access to a workspace or organization
  • A clear understanding of your vision task requirements

Create your dataset

Follow these steps to create a new dataset:

  1. In your Datature Vi dashboard, click Dataset in the left sidebar
Datature Vi dashboard with Dataset tab highlighted
  2. Click Create Dataset
Create Dataset button

This opens the dataset creation wizard, which guides you through four configuration steps.


Choose your dataset type

Select the vision task you want to accomplish. Your choice determines how you'll annotate and train models with this dataset.

Dataset type selection screen

Phrase Grounding

Best for: Object detection, defect detection, product identification, inventory management

Phrase grounding enables you to locate and classify multiple objects within images using bounding boxes. This task type is ideal when you need to identify where objects are located and what they are.

Common use cases:

  • Manufacturing defect detection
  • Retail product recognition
  • Traffic monitoring and vehicle detection
  • Medical imaging for region identification

Learn more about Phrase Grounding →

Click Next after selecting your type.

💡

Can I change the dataset type later?

No, the dataset type cannot be changed after creation. If you need a different task type, you'll need to create a new dataset.

💡

Not sure which to choose?

  • Choose Phrase Grounding if you need to find and locate specific objects with bounding boxes
  • Choose VQA if you need to ask questions and get natural language answers
  • Choose Freeform (coming soon) if you need custom annotation schemas for specialized use cases
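
As a planning aid, the guidance above can be written as a small decision helper. The sketch below is illustrative Python only: `suggest_dataset_types` and its parameter names are hypothetical and are not part of Datature Vi or any Datature SDK.

```python
from typing import List


def suggest_dataset_types(locate_objects: bool,
                          ask_questions: bool,
                          custom_schema: bool = False) -> List[str]:
    """Map project needs to the dataset types named on this page.

    Hypothetical planning helper; it does not call Datature Vi.
    """
    suggestions = []
    if locate_objects:
        # Finding and locating objects with bounding boxes.
        suggestions.append("Phrase Grounding")
    if ask_questions:
        # Asking questions and getting natural language answers.
        suggestions.append("VQA")
    if custom_schema:
        # Listed as "coming soon", so it may not be selectable yet.
        suggestions.append("Freeform (coming soon)")
    return suggestions


print(suggest_dataset_types(locate_objects=True, ask_questions=False))
```

Because the dataset type cannot be changed after creation, a result with more than one entry suggests planning a separate dataset for each task.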

Choose your data type

Select your input data format:

Data type selection

Choose Image for:

  • Individual photos or screenshots
  • Extracted frames from videos
  • Scanned documents or medical imagery
  • Any static visual content

Click Next to continue.


Configure settings

Enter your dataset details and configuration:

Dataset configuration screen

Dataset name

Choose a descriptive, memorable name for your dataset.

Best practices:

  • Use clear, descriptive names (e.g., "Factory Defects 2024" instead of "Dataset1")
  • Include version numbers if maintaining multiple iterations
  • Follow your organization's naming conventions
  • You can rename it later if needed
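
If your team wants consistent names, a convention like the one in the example above can be generated rather than typed by hand. This is a minimal sketch assuming a "subject, year, optional version" pattern; the function `build_dataset_name` and the pattern itself are hypothetical, not a Datature Vi requirement.

```python
from typing import Optional


def build_dataset_name(subject: str, year: int,
                       version: Optional[int] = None) -> str:
    """Build a descriptive dataset name such as 'Factory Defects 2024 v2'.

    Hypothetical naming convention for illustration only.
    """
    # Collapse stray whitespace so names stay tidy.
    subject = " ".join(subject.split())
    if not subject:
        raise ValueError("subject must not be empty")
    name = f"{subject} {year}"
    if version is not None:
        # Append a version number when maintaining multiple iterations.
        name += f" v{version}"
    return name


print(build_dataset_name("Factory Defects", 2024, version=2))
```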

Dataset description

Optional. Add context about your dataset's purpose, contents, or specifications.

Recommended information:

  • Data source and collection date
  • Annotation guidelines or standards
  • Expected use cases
  • Any special preprocessing applied

Dataset localization

Select your storage region preference:

  • Multi-Region — Recommended for best performance and reliability. Data is distributed across multiple regions for optimal access speed and redundancy.
  • Single Region — Data is stored in a specific geographic region (useful for compliance requirements)

Recommendation: Choose Multi-Region unless you have specific data sovereignty or compliance requirements.

This setting cannot be changed after dataset creation.

Click Next to review.
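
Since the dataset type and localization cannot be changed after creation, it can help to write down your choices and check them before you click through the wizard. The following is a minimal sketch of such a checklist in Python. The `DatasetPlan` class, its field names, and the allowed-value sets are assumptions drawn from the options described on this page; they are not a Datature Vi API and do not create anything in your workspace.

```python
from dataclasses import dataclass
from typing import List

# Options as described on this page (assumed, not read from Datature Vi).
DATASET_TYPES = {"Phrase Grounding", "VQA"}
DATA_TYPES = {"Image"}
LOCALIZATIONS = {"Multi-Region", "Single Region"}
# Settings this page says cannot be changed once the dataset exists.
LOCKED_AFTER_CREATION = ("dataset_type", "localization")


@dataclass
class DatasetPlan:
    """Hypothetical record of the choices made in the creation wizard."""
    dataset_type: str
    name: str
    data_type: str = "Image"
    description: str = ""  # optional in the wizard
    localization: str = "Multi-Region"

    def problems(self) -> List[str]:
        """Return a list of issues to fix before creating the dataset."""
        issues = []
        if self.dataset_type not in DATASET_TYPES:
            issues.append(f"unknown dataset type: {self.dataset_type!r}")
        if self.data_type not in DATA_TYPES:
            issues.append(f"unknown data type: {self.data_type!r}")
        if not self.name.strip():
            issues.append("dataset name is empty")
        if self.localization not in LOCALIZATIONS:
            issues.append(f"unknown localization: {self.localization!r}")
        return issues


plan = DatasetPlan(dataset_type="Phrase Grounding", name="Factory Defects 2024")
print("Problems:", plan.problems())
print("Locked after creation:", ", ".join(LOCKED_AFTER_CREATION))
```

An empty problem list means the plan matches the options described here; the locked fields are the ones worth double-checking on the summary screen.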


Review and create

Verify all your settings on the summary screen:

Dataset summary screen

Review:

  • Dataset type
  • Data type
  • Dataset name and description
  • Localization settings

If everything looks correct, click Create Dataset to finish.

Dataset created successfully!

Your dataset is now ready. You can start uploading assets and adding annotations.


Next steps

After creating your dataset, you can upload images, add annotations, and then train a model.

