Create a Dataset
Learn how to create a new dataset in Datature Vi, configure settings, and choose your vision task type.
Datasets are the foundation of your computer vision projects. They store your images along with their annotations, enabling you to organize, manage, and prepare data for training.
This document explains how to create a new dataset in Datature Vi, from selecting your vision task type to configuring storage settings.
Quick workflow: Create dataset (you are here) → Upload images → Add annotations → Train a model
Prerequisites
Before creating a dataset, ensure you have:
- An active Datature Vi account
- Access to a workspace or organization
- A clear understanding of your vision task requirements
Create your dataset
Follow these steps to create a new dataset:
- In your Datature Vi dashboard, click Dataset in the left sidebar
- Click Create Dataset
This opens the dataset creation wizard, which guides you through four configuration steps.
Choose your dataset type
Select the vision task you want to accomplish. Your choice determines how you'll annotate and train models with this dataset.
Phrase Grounding
Best for: Object detection, defect detection, product identification, inventory management
Phrase grounding enables you to locate and classify multiple objects within images using bounding boxes. This task type is ideal when you need to identify where objects are located and what they are.
Common use cases:
- Manufacturing defect detection
- Retail product recognition
- Traffic monitoring and vehicle detection
- Medical imaging for region identification
Click Next after selecting your type.
Can I change the dataset type later?
No, the dataset type cannot be changed after creation. If you need a different task type, create a new dataset.
Not sure which to choose?
- Choose Phrase Grounding if you need to find and locate specific objects with bounding boxes
- Choose VQA (visual question answering) if you need to ask questions about images and get natural language answers
- Choose Freeform (coming soon) if you need custom annotation schemas for specialized use cases
Choose your data type
Select your input data format:
Choose Image for:
- Individual photos or screenshots
- Extracted frames from videos
- Scanned documents or medical imagery
- Any static visual content
Click Next to continue.
Configure settings
Enter your dataset details and configuration:
Dataset name
Choose a descriptive, memorable name for your dataset.
Best practices:
- Use clear, descriptive names (e.g., "Factory Defects 2024" instead of "Dataset1")
- Include version numbers if maintaining multiple iterations
- Follow your organization's naming conventions
- You can rename it later if needed
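The naming advice above can be turned into a quick check before you open the wizard. This is a minimal sketch, not a Datature Vi feature: `check_dataset_name`, its rules, and the list of generic words are assumptions chosen to mirror the best practices listed.

```python
import re

# Assumption: names like "Dataset1" or "test 2" count as generic.
GENERIC_NAME = re.compile(r"^(dataset|untitled|test|new)[\s_-]*\d*$", re.IGNORECASE)


def check_dataset_name(name: str) -> list[str]:
    """Return warnings for a proposed dataset name (empty list means no warnings)."""
    warnings = []
    cleaned = name.strip()
    if not cleaned:
        warnings.append("name is empty")
        return warnings
    if GENERIC_NAME.match(cleaned):
        warnings.append("name is generic; describe the contents instead")
    if not re.search(r"\d", cleaned):
        # Only relevant if you keep several iterations of the same dataset.
        warnings.append("no year or version number in the name")
    return warnings


if __name__ == "__main__":
    for candidate in ("Dataset1", "Factory Defects", "Factory Defects 2024"):
        print(candidate, "->", check_dataset_name(candidate))
```

A name such as "Factory Defects 2024" passes both checks, while "Dataset1" is flagged as generic.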
Dataset description (optional)
Add context about your dataset's purpose, contents, or specifications.
Recommended information:
- Data source and collection date
- Annotation guidelines or standards
- Expected use cases
- Any special preprocessing applied
Dataset localization
Select your storage region preference:
- Multi-Region — Recommended for best performance and reliability. Data is distributed across multiple regions for optimal access speed and redundancy.
- Single Region — Data is stored in a specific geographic region (useful for compliance requirements)
Recommendation: Choose Multi-Region unless you have specific data sovereignty or compliance requirements.
This setting cannot be changed after dataset creation.
Click Next to review.
Review and create
Verify the following settings on the summary screen:
- Dataset type
- Data type
- Dataset name and description
- Localization settings
If everything looks correct, click Create Dataset to finish.
Dataset created successfully!
Your dataset is now ready. You can start uploading assets and adding annotations.
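The settings confirmed in the wizard can be pictured as a small record. The sketch below is illustrative only: `DatasetConfig`, `DatasetType`, and `Localization` are hypothetical names, not part of the Vi SDK, and the enum values are placeholders. It encodes the rules stated in this guide (dataset type and localization are fixed at creation, while the name can be changed later) by freezing the record and exposing only a rename helper.

```python
from dataclasses import dataclass, replace
from enum import Enum


class DatasetType(Enum):
    # Freeform is listed as "coming soon" in this guide, so it is omitted here.
    PHRASE_GROUNDING = "phrase_grounding"
    VQA = "vqa"


class Localization(Enum):
    MULTI_REGION = "multi_region"
    SINGLE_REGION = "single_region"


@dataclass(frozen=True)
class DatasetConfig:
    """Hypothetical record of the wizard's settings; not a Vi SDK type."""

    dataset_type: DatasetType
    name: str
    description: str = ""  # optional in the wizard
    data_type: str = "image"
    localization: Localization = Localization.MULTI_REGION  # recommended default

    def renamed(self, new_name: str) -> "DatasetConfig":
        # Returns a copy with a new name; every other setting is carried over.
        return replace(self, name=new_name)


if __name__ == "__main__":
    cfg = DatasetConfig(DatasetType.PHRASE_GROUNDING, "Factory Defects 2024")
    print(cfg.renamed("Factory Defects 2024 v2"))
```

Because the record is frozen, assigning to `cfg.dataset_type` or `cfg.localization` raises an error, which mirrors the fact that changing either setting requires creating a new dataset.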
Next steps
After creating your dataset, you can:
- Upload assets — Add images to your dataset (required next step)
- Upload annotations — Import existing annotations if you have them
- Annotate data — Start labeling your data manually or with AI assistance
- View dataset insights — Explore statistics and analytics about your dataset
- Manage your dataset — Rename, delete, or download your dataset
Related resources
- Manage datasets — Rename, delete, and organize your datasets
- Download data — Export your datasets and annotations
- Phrase grounding concepts — Deep dive into object detection tasks
- Visual question answering concepts — Understanding VQA capabilities
- Training workflows — Train VLMs with your dataset
- Upload data — Add images and annotations to datasets
- Annotate data — Create annotations for training
- View dataset insights — Analyze dataset statistics and quality
- Quickstart — Complete end-to-end workflow
- Team settings — Add members to collaborate
- Vi SDK — Programmatic dataset management
- Create a training project — Set up training environment
