Download Annotations

Downloading annotations exports only your annotation data without the asset files. This includes phrase grounding annotations (captions, bounding boxes, grounded phrases), VQA annotations (question-answer pairs), and freeform annotations (coming soon). This is useful for lightweight backups, annotation analysis, format conversion, or sharing annotations without large image files.

📋
Annotation export formats

Vi JSONL — Datature's JSON Lines format for annotations, one record per line

TFRecord — TensorFlow's binary format optimized for training pipelines

Both formats support phrase grounding and VQA annotations from your datasets. Freeform annotation support is coming soon.

💻
Programmatic access with Vi SDK
You can also download annotations programmatically using the Vi SDK:

Download annotations only — Use the client.annotations.download() convenience method

Multiple formats — Export in COCO, YOLO, Pascal VOC, or Vi formats

Automated exports — Integrate annotation downloads into your workflows

Parse and process — Directly load JSONL files in Python for analysis

Format conversion — Build custom scripts to convert Vi annotations to other formats

Learn more about Vi SDK annotations → | Vi SDK datasets →

Navigate to annotation export

In your organization, navigate to the Explorer section from the sidebar
Select the dataset containing the annotations you want to download
Click the Annotations tab in the Dataset Explorer header
The Annotations page displays the Export Annotations section with an Export button

📘
Dataset Explorer - Annotations tab
Access the Export Annotations section to download annotation data in various formats.

Export Vi JSONL annotations

Vi JSONL (JSON Lines) is Datature's native annotation format, storing each annotation record on a separate line for efficient streaming and processing.

Configure Vi JSONL export

Click the Export button in the Export Annotations section
The Annotation Export dialog opens
In the Export Format dropdown, select Vi JSONL

📘
Vi JSONL export configuration
Configure export settings for Vi JSONL annotation format.

Configure export options

Test Split Ratio (Optional)

Split your annotations into training and testing subsets:

Check the Enabled checkbox to activate test splitting
Enter a decimal value between 0 and 1:
- 0.2 — 20% test set, 80% training set
- 0.1 — 10% test set, 90% training set
- 0.3 — 30% test set, 70% training set
Leave unchecked to export all annotations together
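
To see what a given ratio means in record counts, the arithmetic can be sketched as follows. The helper name `split_counts` and the rounding rule are assumptions made for illustration; the export itself may round differently at the boundary.

```python
def split_counts(total: int, test_ratio: float) -> tuple[int, int]:
    """Return (train_count, test_count) for a given test split ratio.

    Assumption: the test set size is rounded to the nearest whole record.
    The actual export may round differently.
    """
    if not 0 <= test_ratio <= 1:
        raise ValueError("test_ratio must be between 0 and 1")
    test_count = round(total * test_ratio)
    return total - test_count, test_count


# Show the train/test sizes for the ratios listed above, using 1000 records
for ratio in (0.1, 0.2, 0.3):
    train, test = split_counts(1000, ratio)
    print(f"ratio={ratio}: {train} train, {test} test")
```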

Normalization

Select Normalized to organize annotations in a standardized structure:

Normalized — Annotations structured with consistent coordinate systems and formats
Ensures compatibility with training frameworks and processing tools
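
This page does not spell out what Normalized does to coordinates; the Vi JSONL specification covers those details. Purely as an illustration of one common convention, the sketch below scales a pixel bounding box into the 0 to 1 range by dividing by the image size. The `[x_min, y_min, x_max, y_max]` layout and the function name are assumptions, not the documented Vi behavior.

```python
def normalize_bbox(bbox, image_width, image_height):
    """Scale a pixel bbox into the 0..1 range relative to the image size.

    Assumption: bbox is [x_min, y_min, x_max, y_max] in pixels.
    Check the Vi JSONL specification for the layout the export really uses.
    """
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image dimensions must be positive")
    x_min, y_min, x_max, y_max = bbox
    return [
        x_min / image_width,
        y_min / image_height,
        x_max / image_width,
        y_max / image_height,
    ]


print(normalize_bbox([100, 50, 300, 200], image_width=400, image_height=200))
```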

Export preview

Review the export summary before downloading:

You are about to export [N] assets with [M] annotations.

Verify the count matches your expectations.

Download Vi JSONL

Review your configuration:
- Export Format: Vi JSONL
- Test Split Ratio: Your chosen value or disabled
- Normalization: Normalized
Click the Export button to start the download
The export processes and downloads a .jsonl file containing your annotations

Export TFRecord annotations

TFRecord is TensorFlow's binary format optimized for high-performance data loading in training pipelines. Learn more about TFRecord format →

Configure TFRecord export

Click the Export button in the Export Annotations section
In the Annotation Export dialog, select TFRecord from the Export Format dropdown
Configure the same export options:
- Test Split Ratio — Optional train/test split
- Normalization — Select Normalized for standardized output
Review the export preview summary
Click Export to download the TFRecord file

🔧
TFRecord use cases

TensorFlow training — Native format for TensorFlow data pipelines

Performance optimization — Binary format loads faster than JSON

Large datasets — Efficient for datasets with thousands of annotations

Production pipelines — Ideal for production ML workflows

Vi JSONL format specification

Vi JSONL files contain one JSON object per line, making them efficient for streaming and parallel processing.

The complete Vi JSONL format specification covers:

Full record structure and field descriptions
Phrase grounding annotation format with examples
VQA annotation format with examples
Coordinate systems and normalization details

See the complete Vi JSONL specification →
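
Because each line is an independent JSON object, a file can be read lazily instead of being loaded whole. The following is a minimal sketch of that idea using only the standard library; `iter_records` and `peek` are illustrative helper names, not part of the Vi SDK.

```python
import json
from itertools import islice


def iter_records(path):
    """Yield one parsed record per non-empty line, reading the file lazily."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)


def peek(path, n=3):
    """Return the first n records without reading the rest of the file."""
    return list(islice(iter_records(path), n))
```

Calling `peek("annotations.jsonl")` is a quick way to inspect the structure of a large export before writing a full parser.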

Download annotations with Vi SDK

You can download annotations programmatically using the Vi SDK, which is useful for automated workflows and batch processing.

Download annotations only (recommended)

import vi

# Initialize client
client = vi.Client(
    secret_key="your-secret-key",
    organization_id="your-organization-id"
)

# Download annotations using the annotations API (recommended)
result = client.annotations.download(
    dataset_id="your-dataset-id",
    save_dir="./annotations"
)

print(result.summary())
print(f"Downloaded to: {result.save_dir}")

Alternative method:

# Download annotations using the datasets API
result = client.datasets.download(
    dataset_id="your-dataset-id",
    save_dir="./annotations",
    annotations_only=True
)

Both methods produce identical results. The client.annotations.download() method is a convenience wrapper that's more intuitive for annotation-specific downloads.

Download with custom export settings

from vi.api.resources.datasets.types import (
    DatasetExportSettings,
    DatasetExportFormat,
    DatasetExportOptions
)

# Configure JSONL export with train/test split
settings = DatasetExportSettings(
    format=DatasetExportFormat.VI_JSONL,
    options=DatasetExportOptions(
        normalized=True,
        split_ratio=0.2  # 20% test, 80% training
    )
)

# Download annotations with custom settings
result = client.annotations.download(
    dataset_id="your-dataset-id",
    export_settings=settings,
    save_dir="./annotations",
    show_progress=True
)

Download in different formats

from vi.api.resources.datasets.types import DatasetExportFormat

# Download in COCO format (phrase grounding datasets only)
result = client.annotations.download(
    dataset_id="your-dataset-id",
    export_settings={"format": DatasetExportFormat.COCO},
    save_dir="./coco_annotations"
)

# Download in YOLO format (phrase grounding datasets only)
result = client.annotations.download(
    dataset_id="your-dataset-id",
    export_settings={"format": DatasetExportFormat.YOLO_DARKNET},
    save_dir="./yolo_annotations"
)

# Download in TFRecord format
result = client.annotations.download(
    dataset_id="your-dataset-id",
    export_settings={"format": DatasetExportFormat.VI_TFRECORD},
    save_dir="./tfrecord_annotations"
)

Supported formats by dataset type:

Phrase Grounding: COCO, YOLO, Pascal VOC, CSV variants, Vi JSONL, Vi TFRecord
VQA: Vi JSONL, Vi TFRecord
Freeform: Vi JSONL (coming soon)

Parse downloaded annotations

import json
from pathlib import Path

# Load annotations from downloaded JSONL file
annotations_file = Path(result.save_dir) / "dump" / "annotations" / "annotations.jsonl"

with open(annotations_file, 'r', encoding='utf-8') as f:
    for line in f:
        if not line.strip():
            continue  # skip blank lines, such as a trailing newline
        record = json.loads(line)
        print(f"Asset: {record['filename']}")
        print(f"  Annotations: {len(record['annotations'])}")

Complete Vi SDK Annotations API → | Complete Vi SDK Datasets API →

Common use cases

Annotation backup

Create lightweight backups without large asset files:

Export annotations in Vi JSONL format via web UI or Vi SDK
Store in version control or cloud storage
Restore annotations without re-uploading assets
Track annotation changes over time
Automate periodic backups using SDK scripts

Format conversion

Convert annotations for external tools:

Export in Vi JSONL for readable format
Write scripts to transform to target format
Import to other annotation or training platforms
Maintain annotation fidelity across conversions
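
As a rough sketch of such a conversion script, the example below flattens phrase grounding records into one CSV row per grounded phrase. It relies only on the field names used in the parsing examples on this page (`filename`, `annotations`, `caption`, `grounded_phrases`, `phrase`, `bbox`); the function name and the CSV column layout are assumptions chosen for illustration. The bounding box is written as a JSON string so that its exported layout is preserved unchanged.

```python
import csv
import json


def jsonl_to_csv(jsonl_path, csv_path):
    """Write one CSV row per grounded phrase and return the number of rows."""
    rows = 0
    with open(jsonl_path, "r", encoding="utf-8") as src, \
            open(csv_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        writer.writerow(["filename", "caption", "phrase", "bbox"])
        for line in src:
            if not line.strip():
                continue
            record = json.loads(line)
            for ann in record.get("annotations", []):
                # VQA annotations have no grounded phrases and are skipped
                for phrase in ann.get("grounded_phrases", []):
                    writer.writerow([
                        record.get("filename"),
                        ann.get("caption"),
                        phrase.get("phrase"),
                        json.dumps(phrase.get("bbox")),  # keep bbox as exported
                    ])
                    rows += 1
    return rows
```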

Annotation analysis

Analyze annotation statistics and quality:

Export Vi JSONL for easy parsing
Use Python or other tools to process JSON
Calculate class distributions and statistics
Identify annotation errors or gaps
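
A minimal sketch of this kind of analysis is shown below. It counts how often each grounded phrase appears and lists assets that have no annotations at all. The field names follow the parsing examples on this page; treating the phrase text as the "class" and the helper name `annotation_stats` are assumptions for illustration.

```python
import json
from collections import Counter


def annotation_stats(path):
    """Return (phrase_counts, unannotated_filenames) for a Vi JSONL file."""
    phrase_counts = Counter()
    unannotated = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            annotations = record.get("annotations", [])
            if not annotations:
                # Assets without any annotation are potential gaps
                unannotated.append(record.get("filename"))
            for ann in annotations:
                for phrase in ann.get("grounded_phrases", []):
                    phrase_counts[phrase.get("phrase")] += 1
    return phrase_counts, unannotated
```

`phrase_counts.most_common(10)` then gives the ten most frequent phrases, and a long `unannotated` list suggests the dataset needs more labeling before training.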

Training pipeline integration

Integrate annotations into ML pipelines:

Export TFRecord for TensorFlow pipelines
Export Vi JSONL for custom PyTorch dataloaders
Automate periodic exports for continuous training
Sync annotations with training infrastructure

Sharing annotations

Share labels without large asset files:

Export lightweight annotation files
Send to team members or collaborators
Review and validate labels separately
Reimport corrections or updates

Best practices

Choose the right format

Use Vi JSONL for general use, TFRecord for TensorFlow training

Include test splits

Enable test split ratio for training-ready exports

Version your annotations

Include version numbers or dates in export filenames
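
One way to apply this, sketched under the assumption that you rename the file yourself after downloading it, is a small helper that builds a versioned filename. The name `versioned_name` and the `_v<version>_<YYYYMMDD>` pattern are illustrative choices, not a Vi convention.

```python
from datetime import datetime, timezone
from pathlib import Path


def versioned_name(path, version=None, now=None):
    """Return a copy of path with a version and/or UTC date tag before the suffix."""
    path = Path(path)
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d")
    tag = f"v{version}_{stamp}" if version is not None else stamp
    return path.with_name(f"{path.stem}_{tag}{path.suffix}")


# Build a dated name for an export; pass version=... to add a version tag
print(versioned_name("exports/annotations.jsonl", version=3))
```

The returned path can be passed to `Path.rename()` to rename the downloaded file in place.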

Validate exports

Verify annotation counts and sample records after export
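
The count check can be sketched as below for a Vi JSONL export. It assumes each line is one asset record with an `annotations` list, as in the parsing examples on this page; `count_export` is an illustrative helper name. The export summary may count annotations differently (for example, per grounded phrase), so treat a mismatch as a prompt to investigate rather than proof of an error.

```python
import json


def count_export(path):
    """Return (asset_count, annotation_count) for a Vi JSONL file."""
    assets = 0
    annotations = 0
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            assets += 1
            annotations += len(record.get("annotations", []))
    return assets, annotations
```

Compare the returned pair with the "You are about to export [N] assets with [M] annotations" summary shown before the download.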

Automate with SDK

Use Vi SDK to download and process annotations programmatically

Backup regularly

Export annotations at key project milestones

Working with exported annotations

Parse Vi JSONL in Python

import json

with open('annotations.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        record = json.loads(line)
        asset_id = record['asset_id']
        filename = record['filename']
        annotations = record['annotations']

        # Process phrase grounding annotations
        for ann in annotations:
            if 'caption' in ann:
                print(f"Caption: {ann['caption']}")
                for phrase in ann.get('grounded_phrases', []):
                    print(f"  Phrase: {phrase['phrase']}")
                    print(f"  BBox: {phrase['bbox']}")

            # Process VQA annotations
            elif 'interactions' in ann:
                for qa in ann['interactions']:
                    print(f"Q: {qa['question']}")
                    print(f"A: {qa['answer']}")

Load TFRecord in TensorFlow

TFRecord format can be loaded using TensorFlow's data pipeline for efficient training workflows:

import tensorflow as tf

# Load TFRecord dataset
dataset = tf.data.TFRecordDataset('annotations.tfrecord')

# Parse based on your annotation type (phrase grounding, VQA, or freeform)
# Feature descriptions will vary based on export format
# Refer to Vi SDK documentation for complete parsing examples

External resources:

TensorFlow TFRecord tutorial — Complete guide to reading and writing TFRecord files
TensorFlow data pipeline guide — Build efficient input pipelines with tf.data

Monitor export progress

Track your annotation exports in the Annotation Job History section:

Job type — Shows "Export" for download operations
User — Who initiated the export
File count — Number of annotation records
Status — In Progress, Finished, or Failed
Completion time — When the export finished

Troubleshooting

Export file is empty

No annotations — Verify your dataset has completed annotations
Filter active — Check if filters are excluding all assets
Export failed — Look for error messages in job history
Retry export — Try exporting again with correct settings

Cannot parse JSONL file

File encoding — Ensure file is UTF-8 encoded
Line endings — Check for proper newline characters
JSON errors — Validate each line is valid JSON
Truncated download — Redownload if file was interrupted
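
To locate the offending lines, a small checker along these lines may help. It reads the file as bytes so that encoding problems and JSON problems are reported separately, each with its line number. `find_bad_lines` is an illustrative helper name, not part of the Vi SDK.

```python
import json


def find_bad_lines(path):
    """Return a list of (line_number, message) for lines that fail to parse."""
    problems = []
    with open(path, "rb") as f:
        for number, raw in enumerate(f, start=1):
            if not raw.strip():
                continue  # blank lines are harmless
            try:
                json.loads(raw.decode("utf-8"))
            except UnicodeDecodeError as exc:
                problems.append((number, f"not valid UTF-8: {exc}"))
            except json.JSONDecodeError as exc:
                problems.append((number, f"invalid JSON: {exc.msg}"))
    return problems
```

An error reported only on the final line usually points to a truncated download, while errors scattered through the file suggest an encoding or editing problem.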

TFRecord not loading

Format version — Ensure TensorFlow version compatibility
Corrupted file — Redownload the export
Parser mismatch — Verify feature description matches export format
Dependencies — Install required TensorFlow packages

Missing annotations

Incomplete annotations — Some assets may lack annotations
Export filters — Check if annotation types were filtered
Verification needed — Review dataset before export
Partial export — Ensure export completed successfully

Test split issues

Inconsistent splits — Re-exporting with the same split ratio may place assets in different subsets
Small datasets — Very small datasets may not split exactly at the requested ratio
Class imbalance — Rare classes may be missing from the test set
Random seed — Splits use random sampling without a fixed seed, so they are not reproducible across exports
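
If you need a reproducible split, one workaround is to export without a test split and assign subsets locally. The sketch below is a local technique, not platform behavior: it hashes a stable key for each asset (for example its filename) so the same asset always lands in the same subset. The function name `assign_split` is an assumption for illustration.

```python
import hashlib


def assign_split(asset_key: str, test_ratio: float) -> str:
    """Deterministically map an asset key to 'train' or 'test'.

    The key is hashed to a number in [0, 1); keys below test_ratio go to
    the test set. The same key always receives the same assignment.
    """
    digest = hashlib.sha256(asset_key.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return "test" if bucket < test_ratio else "train"


for name in ("image_001.jpg", "image_002.jpg", "image_003.jpg"):
    print(name, assign_split(name, 0.2))
```

Because the assignment depends only on the key, adding new assets later does not move existing ones between subsets. The resulting test fraction is approximate, and this approach does not balance classes.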

Next steps

Vi SDK datasets API

Download and manage annotations programmatically

Download full dataset

Export both assets and annotations together

Upload annotations

Import annotations from external sources

Annotate data

Create and edit annotations in Vi

Train a model

Use your annotations for model training

Dataset insights

Analyze annotation statistics
