Upload Videos

Add videos to a Datature Vi dataset. Covers supported formats, file size limits, processing details, data row consumption, and how to plan your quota usage.

Videos in Datature Vi are processed frame-by-frame for annotation. Once processing is complete, each frame behaves like an individual image. This guide covers the upload steps, supported formats, what happens during processing, and how to manage your data row quota.

Before You Start
  • A dataset created in Datature Vi. Create one now if you haven't yet.
  • A video file in a supported format (see the table below), under 512 GB.
  • An estimate of how many data rows your video will consume (see Data row consumption).

Open your dataset's Explorer tab

Open your dataset and click the Explorer tab. This is where your uploaded videos will appear.

You should see a newly created, empty dataset page in Datature Vi with no images.

Your dataset is created when you see the dataset page with empty statistics.

Supported video formats

| File type | Extension | MIME type |
| --- | --- | --- |
| MP4, F4V | .mp4, .f4v | video/mp4 |
| MOV, MOVIE, QT | .mov, .movie, .qt | video/quicktime |
| AVI | .avi | video/x-msvideo |
| WebM | .webm | video/webm |
| MKV | .mkv | video/matroska |
| M4V | .m4v | video/x-m4v |
| FLV | .flv | video/x-flv |
| OGG, OGV | .ogg, .ogv | video/ogg |
| 3GP | .3gp | video/3gpp |
| ASF, WMV | .asf, .wmv | video/x-ms-asf |
| RM, RMV | .rm, .rmv | application/vnd.rn-realmedia |

For best compatibility and processing speed, use MP4 (H.264) or MOV formats.
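
A quick local check can catch unsupported files before you start an upload. The sketch below copies the extension-to-MIME mapping from the supported formats table. The helper name `supported_mime_type` is illustrative and not part of the Vi SDK, and matching extensions case-insensitively is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extension -> MIME type, copied from the supported formats table.
SUPPORTED_VIDEO_TYPES = {
    ".mp4": "video/mp4",
    ".f4v": "video/mp4",
    ".mov": "video/quicktime",
    ".movie": "video/quicktime",
    ".qt": "video/quicktime",
    ".avi": "video/x-msvideo",
    ".webm": "video/webm",
    ".mkv": "video/matroska",
    ".m4v": "video/x-m4v",
    ".flv": "video/x-flv",
    ".ogg": "video/ogg",
    ".ogv": "video/ogg",
    ".3gp": "video/3gpp",
    ".asf": "video/x-ms-asf",
    ".wmv": "video/x-ms-asf",
    ".rm": "application/vnd.rn-realmedia",
    ".rmv": "application/vnd.rn-realmedia",
}


def supported_mime_type(path: str) -> Optional[str]:
    """Return the listed MIME type for a file, or None if its extension is not listed."""
    # Assumption: extensions are compared case-insensitively.
    return SUPPORTED_VIDEO_TYPES.get(Path(path).suffix.lower())
```

For example, `supported_mime_type("clip.MP4")` returns `"video/mp4"`, while a file such as `notes.txt` returns `None`.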

File size limits

  • Maximum: 512 GB per video
  • Recommended: Under 100 MB for faster uploads and processing

For large videos, consider splitting them into shorter segments before uploading. This improves upload reliability and makes processing faster.
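
The size limits and the splitting advice above can be sketched as a small pre-upload helper. This is a minimal sketch under stated assumptions: the limits are interpreted in binary units (1 GB = 1024³ bytes), which this page does not specify; the function names are illustrative; and `ffmpeg` is one common tool for splitting, not something the platform requires. The command uses stream copy, so segments are cut at keyframes and their lengths are approximate.

```python
from typing import List

MAX_BYTES = 512 * 1024**3          # 512 GB maximum per video (binary units assumed)
RECOMMENDED_BYTES = 100 * 1024**2  # 100 MB recommended ceiling (binary units assumed)


def classify_size(size_bytes: int) -> str:
    """Classify a file size against the documented limits."""
    if size_bytes > MAX_BYTES:
        return "too_large"
    if size_bytes >= RECOMMENDED_BYTES:
        return "consider_splitting"
    return "ok"


def ffmpeg_split_command(src: str, segment_seconds: int, out_pattern: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that splits src into segments.

    out_pattern is a printf-style name such as "part_%03d.mp4".
    """
    return [
        "ffmpeg", "-i", src,
        "-c", "copy",              # copy streams without re-encoding
        "-map", "0",               # keep all streams from the input
        "-f", "segment",
        "-segment_time", str(segment_seconds),
        "-reset_timestamps", "1",  # each segment starts at time zero
        out_pattern,
    ]
```

You could pass `os.path.getsize(path)` to `classify_size`, and hand the command list to `subprocess.run` once you have checked it.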

How Datature Vi processes videos

When you upload a video, the platform does three things before it is available for annotation:

  • Frame extraction: The video is broken into individual frames. Each frame becomes an annotatable asset.
  • Resolution optimization: Videos are resized so the longest dimension is 1024 pixels. Some lossy compression is applied to individual frames. This keeps the annotator fast without affecting annotation accuracy for bounding boxes and text labels.
  • Audio removal: Audio tracks are stripped. The platform focuses on visual content only.

Variable frame rate (VFR) conversion

Some recording devices (screen capture software, smartphones) produce videos with a variable frame rate, where the time gap between frames changes throughout the clip. Datature Vi converts VFR videos to a constant frame rate during processing, using the video's average frame rate. This ensures consistent frame spacing for annotation and training. The conversion happens automatically; you do not need to pre-process your videos.
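
To make the idea of an average frame rate concrete, the sketch below works on a plain list of frame timestamps in seconds. It only illustrates the timing arithmetic. How Datature Vi performs the conversion internally is not described here, and real converters typically duplicate or drop frames to reach the target rate. All function names are illustrative.

```python
from typing import List, Sequence


def average_fps(timestamps: Sequence[float]) -> float:
    """Average frame rate: number of frame intervals divided by elapsed time."""
    if len(timestamps) < 2:
        raise ValueError("need at least two frame timestamps")
    return (len(timestamps) - 1) / (timestamps[-1] - timestamps[0])


def is_variable_frame_rate(timestamps: Sequence[float], tolerance: float = 1e-6) -> bool:
    """True if the gap between consecutive frames changes by more than tolerance."""
    if len(timestamps) < 2:
        return False
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return max(gaps) - min(gaps) > tolerance


def constant_rate_grid(timestamps: Sequence[float]) -> List[float]:
    """Evenly spaced timestamps at the average rate, with the same start and frame count."""
    fps = average_fps(timestamps)
    start = timestamps[0]
    return [start + i / fps for i in range(len(timestamps))]
```

For timestamps `[0.0, 0.25, 0.75, 1.0]`, the gaps are uneven, so the clip counts as VFR, and the average rate is 3 intervals over 1 second.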

Resolution and compression

Frames are resized so the longest edge is 1024 pixels, and light lossy compression is applied. This reduces file size for faster loading in the web annotator. The quality is high enough that bounding box placement, text reading, and object identification are not affected. If you need to reference the original resolution for any reason, keep your source video files.
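
The resize rule can be sketched as follows. Two assumptions are not confirmed by this page: the aspect ratio is preserved with rounding to the nearest pixel, and frames whose longest edge is already 1024 pixels or less are left unchanged. The second helper shows how a point on a processed frame relates to the source resolution, should you ever need to compare against your original files.

```python
from typing import Tuple

TARGET_LONGEST_EDGE = 1024  # pixels, from the processing description


def processed_size(width: int, height: int) -> Tuple[int, int]:
    """Estimate frame dimensions after the longest edge is resized to 1024 px."""
    longest = max(width, height)
    if longest <= TARGET_LONGEST_EDGE:
        return width, height  # assumption: small frames are not upscaled
    new_w = max(1, round(width * TARGET_LONGEST_EDGE / longest))
    new_h = max(1, round(height * TARGET_LONGEST_EDGE / longest))
    return new_w, new_h


def to_original_coords(x: float, y: float,
                       original_width: int, original_height: int) -> Tuple[float, float]:
    """Map a point on the processed frame back to the source resolution."""
    pw, ph = processed_size(original_width, original_height)
    return x * original_width / pw, y * original_height / ph
```

Under these assumptions, a 1920×1080 source becomes 1024×576, and the center of the processed frame, (512, 288), maps back to (960, 540) in the original.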

Data row consumption

Each video frame consumes data rows, the same way a single image does. Frame count determines your total usage.

Formula: frames = duration (seconds) × frame rate (FPS); data rows = frames × 5.
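
The formula translates directly into a small estimator. This is a sketch for planning purposes; rounding the frame count to the nearest whole frame is an assumption, since the exact count depends on how the video is decoded.

```python
DATA_ROWS_PER_FRAME = 5  # from the formula above


def estimate_frames(duration_seconds: float, fps: float) -> int:
    """Estimated frame count: duration times frame rate, rounded (assumption)."""
    return round(duration_seconds * fps)


def estimate_data_rows(duration_seconds: float, fps: float) -> int:
    """Estimated data rows consumed by one video."""
    return estimate_frames(duration_seconds, fps) * DATA_ROWS_PER_FRAME
```

For a 10-second clip at 30 FPS, `estimate_frames(10, 30)` gives 300 frames and `estimate_data_rows(10, 30)` gives 1,500 data rows.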

| Scenario | Duration | Frame rate | Frames | Data rows |
| --- | --- | --- | --- | --- |
| Short clip | 10 s | 30 FPS | 300 | 1,500 |
| One minute | 60 s | 30 FPS | 1,800 | 9,000 |
| Low frame rate | 10 s | 15 FPS | 150 | 750 |

Reduce Data Row Usage

Consider lower frame rates (e.g., 15 FPS) if your use case does not require high temporal resolution. Halving the frame rate halves the data row cost. See resource usage for quota information.
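
When planning against a fixed quota, it can help to work backwards from the rows you have left. The sketch below assumes the 5-rows-per-frame cost stated earlier. The candidate frame rates are arbitrary examples, and `ffmpeg` with its `fps` filter is one common way to re-encode at a lower rate, not a tool this page prescribes.

```python
from typing import List, Optional, Sequence

ROWS_PER_FRAME = 5  # cost per frame, as stated in the formula


def max_duration_seconds(available_rows: int, fps: float) -> float:
    """Longest video (in seconds) that fits within the available data rows."""
    return available_rows / (ROWS_PER_FRAME * fps)


def highest_fps_within_budget(available_rows: int, duration_seconds: float,
                              candidates: Sequence[int] = (30, 15, 10, 5)) -> Optional[int]:
    """Highest candidate frame rate whose estimated cost fits, or None if none fit."""
    for fps in sorted(candidates, reverse=True):
        if round(duration_seconds * fps) * ROWS_PER_FRAME <= available_rows:
            return fps
    return None


def ffmpeg_resample_command(src: str, fps: int, dst: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that re-encodes src at a fixed frame rate."""
    return ["ffmpeg", "-i", src, "-vf", f"fps={fps}", dst]
```

With 9,000 rows available, `max_duration_seconds(9000, 30)` is 60 seconds. With 4,000 rows and a 60-second clip, 30 FPS and 15 FPS both exceed the budget, so the helper falls back to 10 FPS.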

Troubleshooting

  • Processing takes a long time: Processing time scales with video length, resolution, and frame rate, so a 1-minute video at 30 FPS takes longer than a 10-second clip. You can continue working on other tasks until the completion notification arrives. For long videos, split them into shorter segments before uploading.

  • Frames look lower quality than the source video: This is expected. Videos are resized to 1024 pixels on the longest dimension and lightly compressed to keep the web annotator responsive. The quality is sufficient for annotation work.

  • Audio is missing: Audio tracks are removed during processing because the platform is designed for visual annotation. Keep your original video files if you need audio context for reference.

  • The video does not appear after processing: Refresh the page or the Explorer tab. If the video is still missing after processing shows as complete, try re-uploading it.

Do this with the Vi SDK

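import vi

# Authenticate with your Datature Vi credentials.
client = vi.Client(
    secret_key="your-secret-key",
    organization_id="your-organization-id"
)

# Upload the contents of ./videos/ to the dataset.
# wait_until_done=True waits for the operation to finish before returning.
result = client.assets.upload(
    dataset_id="your-dataset-id",
    paths="./videos/",
    wait_until_done=True
)
print(f"Uploaded: {result.total_succeeded} assets")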

For more details, see the full SDK reference.
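
If you want a guard against exceeding your quota, one option is to wrap the upload call in a check. The sketch below is hypothetical: `upload_if_within_quota` is not an SDK function, and the caller supplies both the estimated and the remaining data rows, because this page does not show how to query the quota programmatically. The upload call itself mirrors the example above.

```python
def upload_if_within_quota(client, dataset_id: str, folder: str,
                           estimated_rows: int, remaining_rows: int):
    """Upload a folder only if the estimated data row cost fits the remaining quota.

    client is a vi.Client as created in the example above. Both row counts
    are supplied by the caller (hypothetical inputs, not fetched from the API).
    """
    if estimated_rows > remaining_rows:
        raise ValueError(
            f"Estimated {estimated_rows} data rows exceeds the "
            f"{remaining_rows} remaining; lower the frame rate or trim the videos."
        )
    # Same call as in the SDK example above.
    return client.assets.upload(
        dataset_id=dataset_id,
        paths=folder,
        wait_until_done=True,
    )
```

A one-minute, 30 FPS video costs an estimated 9,000 data rows, so calling this with `estimated_rows=9000` and `remaining_rows=5000` raises an error instead of starting the upload.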

Next steps

Upload Annotations

Import existing frame-level annotations from supported formats.

Annotate Data

Label video frames using the visual annotator.

Resource Usage

Check your data row quota and understand how video consumes it.