Upload Data
Add images and annotations to your Datature Vi dataset before training. Choose between the web interface and the Vi SDK based on your dataset size.
Before you can train a model in Datature Vi, your dataset needs two things: images (called assets) and annotations that describe what those images contain. This page covers both upload paths and points you to the right guide for each one.
Create a dataset if you don't have one yet.
New to Datature Vi? Learn what it does or follow the quickstart.
Upload images and import annotations in COCO, YOLO, Pascal VOC, CSV, or Vi JSONL formats into your dataset.
What goes into a dataset
Assets
Images and videos are the visual data your model learns from. Datature Vi supports most common image and video formats.
You can upload images through the browser using drag-and-drop, or programmatically using the Vi SDK for large batches and automated pipelines.
Annotations
Annotations (also called labels) tell the model what to look for. Depending on your dataset type, annotations are either bounding boxes linked to text phrases (for phrase grounding) or question-answer pairs (for visual question answering).
You can import existing annotations from COCO, YOLO, Pascal VOC, CSV, or Vi JSONL formats, or create them in the visual annotator.
Upload order matters
Always upload images before annotations. The annotation importer matches labels to images by filename. If the image does not exist in the dataset yet, the annotation has nothing to attach to.
Annotations are matched to images by filename. Upload your images before importing annotation files.
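Because matching is done by filename, it can help to check an annotation file against your local image folder before uploading either one. The following is a minimal sketch, not part of the Vi SDK: it assumes a COCO JSON file with the standard images list and file_name field, and it assumes an exact, case-sensitive comparison on the bare filename. The function name find_unmatched_annotations is hypothetical.

```python
import json
from pathlib import Path


def find_unmatched_annotations(coco_path, image_dir):
    """Return COCO file_name entries with no matching file in image_dir.

    Assumption: filenames are compared exactly (case-sensitive) and any
    directory prefix in file_name is ignored. The platform's own matching
    rules may differ.
    """
    with open(coco_path, encoding="utf-8") as f:
        coco = json.load(f)

    # Filenames that actually exist in the local image folder.
    available = {p.name for p in Path(image_dir).iterdir() if p.is_file()}

    # Filenames the annotation file refers to, reduced to the bare name.
    referenced = {Path(img["file_name"]).name for img in coco.get("images", [])}

    return sorted(referenced - available)
```

An empty result means every annotated image has a local file with the same name. Any names returned are annotations that would have nothing to attach to if those images are not in the dataset.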
Choose your upload method
Web interface
The browser upload is best for small to medium datasets (under 1,000 assets) and one-time or occasional uploads. No setup is required: drag files onto the upload area or use the file browser.
Vi SDK
The SDK is better for large datasets (1,000+ assets), automated pipelines, and repeated uploads. It handles batch processing and gives you control over duplicate file behavior.
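The guidance above can be summarized as a small decision helper. This is only an illustrative sketch: the 1,000-asset threshold comes from this page, while the function name suggest_upload_method and the automated flag are hypothetical and not part of any Datature tool.

```python
WEB_UPLOAD_LIMIT = 1000  # threshold taken from the guidance on this page


def suggest_upload_method(asset_count, automated=False):
    """Suggest an upload path from dataset size and workflow.

    automated=True stands for pipelines or repeated uploads, which the
    guidance assigns to the SDK regardless of dataset size.
    """
    if automated or asset_count >= WEB_UPLOAD_LIMIT:
        return "Vi SDK"
    return "web interface"


print(suggest_upload_method(250))                  # small one-time upload
print(suggest_upload_method(5000))                 # large dataset
print(suggest_upload_method(250, automated=True))  # small but recurring
```

The threshold is a rule of thumb, not a hard limit, so treat the result as a starting point.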
Supported formats
Images: JPEG, PNG, TIFF, BMP, WebP, GIF, and others. For format-specific details and size limits, see Uploading images.
Videos: MP4, AVI, WEBM, MOV, MKV, WMV, and others. For format-specific details and size limits, see Uploading videos.
Annotations: COCO JSON, Pascal VOC XML, YOLO Darknet TXT, YOLO Keras/PyTorch TXT, CSV (four-corner or width/height), and Vi JSONL. VQA datasets support only Vi JSONL. For full format specs, see Uploading annotations.
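Before uploading a mixed folder, it can be useful to sort files into images, videos, and annotation files. The sketch below groups filenames by extension using only the formats named above. The extension spellings (for example .jpg and .jpeg for JPEG, .tif and .tiff for TIFF) are assumptions, and because the lists above end with "and others", a file in the "unknown" group is not necessarily unsupported.

```python
from collections import defaultdict
from pathlib import Path

# Extensions inferred from the format names listed on this page (assumed spellings).
FORMAT_GROUPS = {
    "image": {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp", ".webp", ".gif"},
    "video": {".mp4", ".avi", ".webm", ".mov", ".mkv", ".wmv"},
    "annotation": {".json", ".xml", ".txt", ".csv", ".jsonl"},
}


def group_by_format(filenames):
    """Group filenames into image, video, annotation, or unknown by extension."""
    groups = defaultdict(list)
    for name in filenames:
        suffix = Path(name).suffix.lower()  # compare case-insensitively
        kind = next(
            (k for k, exts in FORMAT_GROUPS.items() if suffix in exts),
            "unknown",
        )
        groups[kind].append(name)
    return dict(groups)
```

Grouping this way also supports the upload order described earlier: send the image and video groups first, then import the annotation group.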