AI-Assisted Tools

Speed up phrase grounding and freeform image annotation with IntelliScribe's auto-caption generation and automatic phrase-box linking.

IntelliScribe is Datature Vi's AI-assisted annotation tool for phrase grounding and freeform image annotation. It reduces manual annotation work by generating text automatically: for phrase grounding, it creates captions and links phrases to bounding boxes; for freeform image annotation, it generates text content that you can edit to match your schema. You still review and refine the output, but the repetitive text work goes much faster.

IntelliScribe is available in the phrase grounding and freeform image annotators. It does not apply to VQA or video annotations.

What IntelliScribe does

IntelliScribe provides two AI features: IntelliScribe Caption, available in both the phrase grounding and freeform image annotators, and IntelliScribe Phrases, available in phrase grounding only.

IntelliScribe Caption

When you press C in the annotator, IntelliScribe analyzes the image and generates text automatically.

  • In phrase grounding: The caption appears in the Phrase Grounding panel describing the main objects. You can accept it as-is or edit it with the Text tool (press T).
  • In freeform image annotation: The generated text appears in the Freeform panel. Edit it to match your annotation schema before saving.

Caption generation works well on common objects in clear images. For specialized equipment or domain-specific vocabulary, treat the AI caption as a starting point: review and edit the terminology before continuing.

IntelliScribe Phrases

When you press P, IntelliScribe reads the current caption and your drawn bounding boxes, then creates phrase-box links automatically. Each phrase in the caption gets highlighted and connected to the box that most closely matches it.

You need at least one bounding box and a caption before running IntelliScribe Phrases. Review the output: the AI is accurate on clear, distinct objects, but may mismatch phrases when objects overlap or captions are ambiguous.

IntelliScribe Phrases is available in phrase grounding only, since freeform text annotations do not use bounding boxes.
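
To make the idea of a phrase-box link concrete, here is a minimal Python sketch of how a caption, its boxes, and its links could be represented and checked during review. The `Box` and `PhraseLink` classes, their fields, and the character-offset spans are illustrative assumptions, not Datature Vi's actual annotation schema or export format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box in pixel coordinates (not the Vi schema)."""
    box_id: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float


@dataclass(frozen=True)
class PhraseLink:
    """Hypothetical link from the caption span [start, end) to one box."""
    start: int
    end: int
    box_id: str


def review_notes(caption: str, boxes: list[Box], links: list[PhraseLink]) -> list[str]:
    """Return human-readable items to look at while reviewing links."""
    notes = []
    known_ids = {box.box_id for box in boxes}
    for link in links:
        if not (0 <= link.start < link.end <= len(caption)):
            notes.append(f"span {link.start}-{link.end} is outside the caption")
        elif link.box_id not in known_ids:
            phrase = caption[link.start:link.end]
            notes.append(f"phrase '{phrase}' points to unknown box {link.box_id}")
    linked_ids = {link.box_id for link in links}
    for box in boxes:
        if box.box_id not in linked_ids:
            notes.append(f"box {box.box_id} has no phrase")
    return notes


caption = "A red forklift next to a wooden pallet"
boxes = [Box("b1", 10, 20, 200, 180), Box("b2", 220, 90, 330, 170)]
links = [PhraseLink(0, 14, "b1")]  # span covers "A red forklift"
print(review_notes(caption, boxes, links))  # ['box b2 has no phrase']
```

A box without a phrase is not necessarily an error. The sketch only flags it so that a reviewer can decide whether a link is missing.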


How it fits into annotation

The AI-assisted workflow runs after you draw your boxes:

1. Open the Annotator tab

From the Dataset Overview page, click the Annotator tab to open the labeling interface.

You should see the phrase grounding annotator showing highlighted phrases linked to bounding boxes on an image, with all phrase-box connections verified.

Your AI-assisted annotation is complete when each phrase in the caption is linked to the correct bounding box and all connections are reviewed.

Manual phrase grounding annotation walkthrough


When to use AI assistance

Best for:

  • Large datasets with hundreds or thousands of images
  • Common objects in standard photography conditions
  • Getting started quickly and refining later
  • Maintaining consistent terminology across a team

Not for:

  • Highly specialized domains where generic terms won't work
  • Images with heavy occlusion or unusual angles
  • Small datasets where manual annotation is equally fast
  • Use cases where every phrase link must be exact

Manual vs AI-assisted annotation

|              | Manual                              | AI-assisted                    |
|--------------|-------------------------------------|--------------------------------|
| Speed        | 2-3 min per image                   | 30-45 sec per image            |
| Accuracy     | Depends on annotator                | Good, requires review          |
| Consistency  | Varies by person                    | High, consistent terminology   |
| Domain terms | Correct for specialized needs       | Needs editing                  |
| Best use     | Small datasets, specialized domains | Large datasets, common objects |
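
As a rough worked example, the per-image timings in the comparison above can be scaled to a whole dataset. The function name and the dataset size of 1,000 images are assumptions chosen for illustration; actual times depend on your images and on how much review the AI output needs.

```python
def annotation_hours(num_images: int, seconds_per_image: tuple[int, int]) -> tuple[float, float]:
    """Scale a (low, high) per-image time range in seconds to total hours."""
    low, high = seconds_per_image
    return (num_images * low / 3600, num_images * high / 3600)


MANUAL_SECONDS = (120, 180)   # 2-3 min per image
ASSISTED_SECONDS = (30, 45)   # 30-45 sec per image

manual = annotation_hours(1000, MANUAL_SECONDS)
assisted = annotation_hours(1000, ASSISTED_SECONDS)
print(f"Manual: {manual[0]:.1f}-{manual[1]:.1f} h")          # Manual: 33.3-50.0 h
print(f"AI-assisted: {assisted[0]:.1f}-{assisted[1]:.1f} h")  # AI-assisted: 8.3-12.5 h
```

On these figures the gap grows linearly with dataset size, which is why the difference matters most for datasets with hundreds or thousands of images.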

The recommended approach for most projects: generate a draft with IntelliScribe Caption for speed, edit the caption for domain accuracy, then run IntelliScribe Phrases.


Tips for better AI output

The quality of IntelliScribe's output depends on the image quality and how you use it.

Image quality:

  • Use clear, well-lit photos. Low-contrast or blurry images produce weaker captions.
  • Front-facing angles with minimal occlusion work best.
  • Higher resolution images generally yield more detailed captions.

Caption editing:

  • Replace generic terms with your domain vocabulary before running IntelliScribe Phrases. A caption that says "integrated circuit" when you need "FPGA chip" will produce incorrect phrase links.
  • Keep captions structured so phrases are distinctly separated. Overlapping phrases cannot be linked simultaneously.
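
To illustrate what "overlapping phrases" means, the sketch below finds phrases whose character ranges in a caption share at least one character. The helper names `span_of` and `overlapping_spans` are hypothetical and are not part of the annotator or the Vi SDK; this is only a way to reason about caption structure.

```python
def span_of(caption: str, phrase: str) -> tuple[int, int]:
    """Return the half-open character span of the first occurrence of phrase."""
    start = caption.index(phrase)  # raises ValueError if the phrase is absent
    return (start, start + len(phrase))


def overlapping_spans(spans: list[tuple[int, int]]) -> list[tuple[tuple[int, int], tuple[int, int]]]:
    """Return pairs of half-open [start, end) spans that share characters."""
    ordered = sorted(spans)
    clashes = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if second[0] >= first[1]:
                break  # sorted by start, so later spans cannot overlap `first`
            clashes.append((first, second))
    return clashes


caption = "a red forklift carrying a wooden pallet"
phrases = ["a red forklift", "red forklift carrying", "a wooden pallet"]
spans = [span_of(caption, phrase) for phrase in phrases]

for first, second in overlapping_spans(spans):
    print(repr(caption[first[0]:first[1]]), "overlaps", repr(caption[second[0]:second[1]]))
# 'a red forklift' overlaps 'red forklift carrying'
```

Rewording the caption so that each object gets its own separate phrase avoids this kind of clash.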

After running IntelliScribe Phrases:

  • Check each link. Click any highlighted phrase to confirm it connects to the right box.
  • For incorrect links, press D and click the phrase to unlink it, then press H and manually highlight the correct phrase to relink it.

Frequently asked questions

Can I use IntelliScribe for VQA annotation?

No. IntelliScribe works in the phrase grounding and freeform image annotators only. VQA annotation does not have AI-assisted tools.

What should I do if IntelliScribe does not generate any output?

Check your internet connection. IntelliScribe requires an active connection. If the connection is fine, refresh the page and try again. If the issue persists, try a different image to rule out a format problem with the current one.

How do I handle domain-specific vocabulary?

Edit the caption with the Text tool (press T) before running IntelliScribe Phrases. The phrase-linking step uses your caption, so correcting the vocabulary first produces more accurate links. Domain-specific annotation is faster when you treat IntelliScribe Caption as a first draft rather than a final output.

Why does IntelliScribe Phrases link phrases to the wrong boxes?

Two common causes: the caption has ambiguous or overlapping descriptions for nearby objects, or the boxes and phrases don't correspond to each other. To fix a wrong link, press D and click the highlighted phrase to unlink it. Then press H and manually highlight the correct phrase to relink it to the box.

Does IntelliScribe Phrases work on images with overlapping objects?

It works, but link accuracy drops when objects are stacked or share visual features. For complex scenes, draw boxes for overlapping objects first, then run IntelliScribe Phrases and plan to do more manual corrections than usual. If a scene has more than 15-20 overlapping objects, manual phrase linking may be faster overall.
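
The rule of thumb about scenes with many overlapping objects can be turned into a rough pre-check. The sketch below counts boxes that intersect at least one other box and compares the count with a threshold. The box format, the function names, and the default threshold of 15 (the lower end of the 15-20 range) are assumptions for illustration, not behavior of IntelliScribe itself.

```python
from itertools import combinations

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = tuple[float, float, float, float]


def boxes_intersect(a: Box, b: Box) -> bool:
    """True when the two boxes share a region of non-zero area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def count_overlapping(boxes: list[Box]) -> int:
    """Count boxes that intersect at least one other box."""
    involved = set()
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        if boxes_intersect(a, b):
            involved.update((i, j))
    return len(involved)


def suggest_manual_linking(boxes: list[Box], threshold: int = 15) -> bool:
    """Heuristic only: True when more than `threshold` boxes overlap."""
    return count_overlapping(boxes) > threshold


boxes = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
print(count_overlapping(boxes))       # 2
print(suggest_manual_linking(boxes))  # False
```

The pairwise comparison is quadratic in the number of boxes, which is fine for the box counts found on a single image.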


Do this with the Vi SDK

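import vi

# Authenticate with your Vi secret key and organization ID.
client = vi.Client(
    secret_key="your-secret-key",
    organization_id="your-organization-id"
)

# Upload an annotation file to the dataset and wait for processing to finish.
result = client.annotations.upload(
    dataset_id="your-dataset-id",
    paths="annotations.jsonl",
    wait_until_done=True
)
print(f"Imported: {result.total_annotations}")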

For more details, see the full SDK reference.
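
Because the upload above reads a `.jsonl` file, it can help to confirm that every non-blank line parses as JSON before uploading. The sketch below uses only the Python standard library. The function name `invalid_jsonl_lines` is hypothetical, and the sample records are placeholders rather than Vi's annotation format, so the check says nothing about whether the fields match what the platform expects.

```python
import json
import tempfile
from pathlib import Path


def invalid_jsonl_lines(path: Path) -> list[int]:
    """Return 1-based numbers of non-blank lines that are not valid JSON."""
    bad = []
    with Path(path).open(encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError:
                bad.append(number)
    return bad


# Demonstration on a throwaway file with placeholder records.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "annotations.jsonl"
    sample.write_text('{"ok": true}\nnot json\n', encoding="utf-8")
    print(invalid_jsonl_lines(sample))  # [2]
```

An empty list means every line parsed, so syntax errors are ruled out before the file is uploaded.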

Related resources

Annotate For Phrase Grounding

Full step-by-step guide to phrase grounding annotation, including how IntelliScribe fits into the workflow.

Annotate With Freeform Text

Write freeform text annotations for images, with IntelliScribe for AI-generated starting text.

Dataset Overview

Check annotation coverage and quality after annotating your dataset.