Screenshot to HTML

Frontend development often starts with a design mockup. A designer creates the visual layout, and a developer translates it into HTML and CSS. That translation step is time-consuming, especially when the design uses a custom design system with specific tokens, spacing, and component conventions.

Datature Vi trains a model on your own design system. You pair screenshots of UI components with the corresponding HTML and CSS code, and the model learns to generate code that matches your conventions. New mockups go in, production-ready markup comes out.

This does not replace a frontend developer. It handles the mechanical translation from visual layout to code, freeing developers to focus on interactivity, state management, and business logic.

For an interactive overview of this application, visit the screenshot to HTML use case on vi.datature.com.

Common applications

Task

What the model does

Task type: Freeform Text

This use case relies on the freeform text dataset type. Each training pair consists of a screenshot (the image) and the corresponding HTML/CSS code (the text annotation).
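Before uploading, it can help to confirm that every screenshot actually has its code. The sketch below assumes a hypothetical local layout in which each screenshot sits next to a code file with the same stem (for example nav.png and nav.html). The function name collect_pairs and the folder layout are illustrative only and are not part of the Vi SDK.

```python
from pathlib import Path


def collect_pairs(root):
    """Match each PNG screenshot in `root` to an HTML file with the same stem.

    Returns (pairs, missing): pairs is a list of (image_path, code_text)
    tuples, missing lists the screenshots that have no matching code file.
    """
    root = Path(root)
    pairs, missing = [], []
    for image in sorted(root.glob("*.png")):
        code_file = image.with_suffix(".html")
        if code_file.exists():
            pairs.append((image, code_file.read_text(encoding="utf-8")))
        else:
            missing.append(image.name)
    return pairs, missing
```

Calling collect_pairs("training_data") on such a folder returns the usable pairs plus a list of screenshots that still need an annotation, so gaps surface before training starts.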

Annotation structure

Each image annotation contains the HTML and CSS that faithfully reproduce the screenshot:

<nav class="nav-bar">
  <a href="/" class="nav-logo">Aura Collective</a>
  <ul class="nav-links">
    <li><a href="/collections">Collections</a></li>
    <li><a href="/artisans">Artisans</a></li>
    <li><a href="/stories">Stories</a></li>
  </ul>
</nav>

<style>
.nav-bar {
  display: flex;
  align-items: center;
  justify-content: space-between;
  padding: var(--spacing-4) var(--spacing-8);
  background: var(--color-surface);
}
.nav-links {
  display: flex;
  gap: var(--spacing-6);
  list-style: none;
}
</style>

System prompt

You are a frontend code generator. Given a screenshot of a UI component or page, produce semantic HTML and CSS that reproduce the layout. Use the following design system conventions:
- CSS custom properties for colors (--color-*), spacing (--spacing-*), and typography (--font-*)
- BEM-style class naming
- Flexbox or Grid for layout
- No inline styles

Output only the HTML and CSS. Do not include JavaScript.
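Some of the rules in this prompt can be checked mechanically, either on training annotations or on generated output. The following is a minimal sketch using Python's standard html.parser module. The names ConventionChecker and check_conventions are hypothetical, and only two rules are covered (no inline styles, no JavaScript); class naming and layout rules would need separate checks.

```python
from html.parser import HTMLParser


class ConventionChecker(HTMLParser):
    """Collects violations of the 'no inline styles' and 'no JavaScript' rules."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        # A <script> element means the markup contains JavaScript
        if tag == "script":
            self.problems.append("contains a <script> element")
        # A style="..." attribute is an inline style (a <style> element is fine)
        if any(name == "style" for name, _ in attrs):
            self.problems.append(f"inline style on <{tag}>")


def check_conventions(markup):
    """Return a list of rule violations found in `markup` (empty if none)."""
    checker = ConventionChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems
```

An empty list means the markup passed both checks. Running the same check over training annotations keeps the examples consistent with what the prompt asks the model to produce.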

Deploy and test

from vi.inference import ViModel

model = ViModel(
    run_id="your-run-id",
    secret_key="your-secret-key",
    organization_id="your-organization-id",
)

result, error = model(
    source="mockup_screenshot.png",
    user_prompt="Generate HTML and CSS for this UI.",
    generation_config={"temperature": 0.0, "do_sample": False}
)

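if error is None:
    html_code = result.result
    # Write UTF-8 explicitly so non-ASCII text in the markup survives on any platform
    with open("output.html", "w", encoding="utf-8") as f:
        f.write(html_code)
    print("HTML written to output.html")
else:
    # Surface the failure instead of exiting silently
    print(f"Inference failed: {error}")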

Training tips

Use your own design system: the model should generate code that follows your team's conventions, not generic HTML. Train on screenshots paired with code that uses your actual CSS variables, class names, and component structure.

Start with components, not full pages: train on individual components first (nav bars, cards, forms, footers). Full-page generation works better once the model understands your component vocabulary.

Include multiple viewport sizes: if you want responsive output, include screenshots at mobile, tablet, and desktop widths, each paired with the responsive code.

Clean up training code: the code in your annotations should be the code you want the model to generate. Remove dead CSS, commented-out blocks, and framework boilerplate from training examples.
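Part of this cleanup can be scripted. The sketch below is one possible approach, not a Vi feature: the hypothetical helper clean_training_code strips HTML and CSS comments with regular expressions and collapses runs of blank lines. It assumes comment markers never appear inside attribute values or strings, and it does not detect dead CSS rules or framework boilerplate, which still need manual review.

```python
import re


def clean_training_code(code):
    """Remove HTML/CSS comments and collapse consecutive blank lines."""
    # Strip HTML comments such as <!-- old nav -->
    code = re.sub(r"<!--.*?-->", "", code, flags=re.DOTALL)
    # Strip CSS comments such as /* unused */
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)

    cleaned = []
    for line in code.splitlines():
        line = line.rstrip()
        # Keep at most one blank line in a row
        if line == "" and cleaned and cleaned[-1] == "":
            continue
        cleaned.append(line)
    return "\n".join(cleaned).strip() + "\n"
```

Running every annotation through a helper like this before upload keeps commented-out blocks out of the training set, so the model is less likely to reproduce them.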