Safety Monitoring

Train a VLM to detect PPE violations, exclusion zone breaches, and unsafe behaviors from construction and industrial site camera feeds.

Figure: IP camera view of an industrial site with workers in hard hats and coveralls.

Construction sites and industrial facilities depend on strict safety rules: hard hats on at all times, high-visibility vests in active zones, no one inside exclusion areas during equipment operation. Enforcing those rules with periodic walk-throughs catches some violations but misses many more between audits.

Datature Vi trains an AI model on your site's own camera feeds. You label images showing correct PPE use and images showing violations, and the model learns to tell them apart. Once deployed, it monitors feeds continuously and flags violations as they happen, not hours or days later.

For an interactive overview of this application, visit the construction safety use case on vi.datature.com.


Why continuous monitoring matters

Safety audits are snapshots. An inspector walks the site once or twice a shift, notes what they see, and moves on. Between audits, violations go unrecorded. Camera-based monitoring with Datature Vi fills that gap by watching every feed, every minute.

The value is not replacing safety officers. It is giving them real-time awareness of what is happening across the entire site, so they can focus their time where it matters most.

No ML experience required

This guide is written for safety and operations teams, not data scientists. You need site camera images. Budget about 30 minutes to set up your first training run; after that, GPU training typically runs 1-3 hours.


Common applications

| Task | What the model does |
| --- | --- |
| PPE compliance | Flags workers missing hard hats, vests, gloves, or safety glasses |
| Exclusion zone monitoring | Detects people inside restricted areas during equipment operation |
| Unsafe behavior detection | Identifies workers near moving machinery without proper clearance |
| Incident documentation | Generates timestamped violation records from camera frames |

Choose your task type

| Approach | Best for | Output |
| --- | --- | --- |
| Visual Question Answering (VQA) | PPE compliance checks, yes/no safety questions | Text answer: "No, the worker near the crane is not wearing a hard hat." |
| Phrase Grounding | Locating the person or area of concern in the image | Bounding box around the worker or zone violation |
| Freeform Text (JSON) | Structured violation reports with multiple fields | JSON: {"violation": "missing_hardhat", "zone": "crane_area", "severity": "high"} |

Recommended starting point: VQA for straightforward PPE checks. Use phrase grounding when you need to highlight who or where the violation is in the frame.
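If you choose the Freeform Text (JSON) approach, it can help to parse the output into a typed record before it reaches other systems. The following is a minimal sketch, assuming the three field names from the example JSON above. The `ViolationReport` class, the `parse_violation_report` helper, and the set of allowed severity values are hypothetical and not part of Datature Vi.

```python
import json
from dataclasses import dataclass

# Assumed severity levels; only "high" appears in the example above.
ALLOWED_SEVERITIES = {"low", "medium", "high"}
REQUIRED_FIELDS = {"violation", "zone", "severity"}


@dataclass(frozen=True)
class ViolationReport:
    violation: str
    zone: str
    severity: str


def parse_violation_report(raw: str) -> ViolationReport:
    """Parse model output text into a ViolationReport; raise ValueError on bad input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"model output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    severity = data["severity"]
    if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {severity!r}")
    return ViolationReport(str(data["violation"]), str(data["zone"]), severity)
```

Rejecting malformed output early keeps a half-formed report from being filed as an incident.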


Annotation examples

Use Visual Question Answering with a standard safety question:

| Image | Question | Answer |
| --- | --- | --- |
| Worker with full PPE | Is every worker in this image wearing the required PPE? | Yes. Both workers are wearing hard hats, high-visibility vests, and safety glasses. |
| Worker missing hard hat | Is every worker in this image wearing the required PPE? | No. The worker on the right side of the frame is not wearing a hard hat. |
| Empty exclusion zone | Is anyone inside the crane exclusion zone? | No. The exclusion zone is clear. |
| Person in exclusion zone | Is anyone inside the crane exclusion zone? | Yes. One person is standing inside the marked exclusion zone near the crane base. |

Tips:

  • Use the same question across all images of the same check type
  • Describe locations relative to landmarks visible in the camera view ("near the crane base," "at the east entrance")
  • Include images from different times of day and lighting conditions
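
One way to keep the question wording identical across a check type is to store it once and look it up when building each record. This is a rough sketch only: the record layout, the check-type keys, and the helper names are hypothetical, and the JSONL output is for your own bookkeeping, not a documented Datature Vi import format. The questions themselves are copied from the examples above.

```python
import json

# Hypothetical check-type keys; the questions come from the annotation examples.
QUESTIONS = {
    "ppe": "Is every worker in this image wearing the required PPE?",
    "crane_zone": "Is anyone inside the crane exclusion zone?",
}


def make_annotation(image: str, check: str, answer: str) -> dict:
    """Build one question-answer record using the canonical question for the check."""
    if check not in QUESTIONS:
        raise KeyError(f"unknown check type: {check}")
    # Assumed convention: answers open with "Yes." or "No.", as in the examples.
    if not answer.startswith(("Yes.", "No.")):
        raise ValueError("answer should open with 'Yes.' or 'No.'")
    return {"image": image, "question": QUESTIONS[check], "answer": answer}


def to_jsonl(records: list) -> str:
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record) for record in records)
```

Because the question text lives in one place, a wording change applies to every image of that check type at once.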

Deploy and test

from vi.inference import ViModel

# Load the trained model from a completed training run.
model = ViModel(
    run_id="your-run-id",
    secret_key="your-secret-key",
    organization_id="your-organization-id",
)

# Ask the same question used during annotation.
result, error = model(
    source="site_camera_frame.jpg",
    user_prompt="Is every worker in this image wearing the required PPE?",
)

if error is None:
    print(result.result.answer)
else:
    print(f"Inference failed: {error}")
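
Alerting logic usually needs a boolean rather than a sentence. Here is a minimal sketch, assuming the model's answers open with "Yes" or "No" as in the annotation examples. The `is_compliant` function and its `yes_means_compliant` parameter are hypothetical helpers, not part of the Vi SDK.

```python
import re
from typing import Optional


def is_compliant(answer: str, yes_means_compliant: bool = True) -> Optional[bool]:
    """Map a 'Yes ...' / 'No ...' answer to a compliance flag.

    Returns None when the answer opens with neither word, so the caller
    can route the frame to a person instead of guessing.
    """
    match = re.match(r"\s*([A-Za-z]+)", answer)
    first_word = match.group(1).lower() if match else ""
    if first_word not in ("yes", "no"):
        return None
    said_yes = first_word == "yes"
    return said_yes if yes_means_compliant else not said_yes
```

For the PPE question, "Yes" means compliant. For the exclusion zone question, "Yes" means someone is in the zone, so pass `yes_means_compliant=False`.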

Structured violation reports

For integration with incident management systems, use structured data extraction:

import json
from vi.inference import ViModel

model = ViModel(
    run_id="your-run-id",
    secret_key="your-secret-key",
    organization_id="your-organization-id",
)

# Disable sampling so the output is as repeatable as possible.
result, error = model(
    source="site_camera_frame.jpg",
    user_prompt="Check this frame for safety violations.",
    generation_config={"temperature": 0.0, "do_sample": False},
)

if error is not None:
    print(f"Inference failed: {error}")
else:
    try:
        report = json.loads(result.result)
    except json.JSONDecodeError:
        # The model can return text that is not valid JSON; skip this frame.
        report = None
    # Expected shape:
    # {"violation_found": true, "type": "missing_hardhat", "location": "near crane base", "severity": "high"}
    if report and report.get("violation_found"):
        print(f"VIOLATION: {report['type']} at {report['location']}")
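
An incident management system typically also needs to know when and where a frame was captured. The sketch below turns a parsed report into a timestamped record, assuming the report fields shown in the comment above. The `build_incident` helper and the `camera_id` and `captured_at` fields are hypothetical, so adapt the record layout to whatever your incident system expects.

```python
from datetime import datetime, timezone
from typing import Optional


def build_incident(
    report: dict,
    camera_id: str,
    captured_at: Optional[datetime] = None,
) -> Optional[dict]:
    """Return an incident record for a violation report, or None if no violation was found."""
    if not report.get("violation_found"):
        return None
    # Default to the current UTC time when the frame's capture time is unknown.
    captured_at = captured_at or datetime.now(timezone.utc)
    return {
        "camera_id": camera_id,
        "captured_at": captured_at.isoformat(),
        "type": report.get("type", "unknown"),
        "location": report.get("location", "unknown"),
        "severity": report.get("severity", "unknown"),
    }
```

Passing the frame's own capture time, when available, keeps the record accurate even if inference runs with a delay.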

Training tips

Match your camera views exactly: train on frames from the same cameras, angles, and lighting conditions used in production. A model trained on ground-level photos will perform poorly on elevated camera angles.
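
A quick way to check camera coverage is to count training frames per production camera. This is a sketch under stated assumptions: you already know which camera each training frame came from, and the `min_frames` default is an arbitrary illustration, not a Datature recommendation.

```python
from collections import Counter


def cameras_lacking_coverage(frame_cameras, production_cameras, min_frames=50):
    """List production cameras with fewer than min_frames training frames.

    frame_cameras holds one camera ID per training frame.
    """
    counts = Counter(frame_cameras)
    return sorted(cam for cam in production_cameras if counts[cam] < min_frames)
```

Any camera returned here is one the model has seen little or nothing of, so collect more frames from it before deploying to that feed.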

Cover shift variations: include images from day shifts, night shifts, and transition periods. Lighting changes between shifts affect how PPE appears in the frame.
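
To see whether a dataset covers all shifts, you can bucket frames by capture time and count each bucket. The following is a minimal sketch, assuming a day shift of 07:00 to 18:00 with handover during the 06:00 and 18:00 hours. These hours and the function names are hypothetical, so adjust them to your site's schedule.

```python
from collections import Counter
from datetime import datetime


def shift_for(ts: datetime) -> str:
    """Bucket a capture time into day, transition, or night (assumed hours)."""
    if ts.hour in (6, 18):  # assumed handover hours
        return "transition"
    return "day" if 7 <= ts.hour < 18 else "night"


def count_by_shift(timestamps) -> dict:
    """Count frames per shift bucket, including buckets with zero frames."""
    counts = Counter(shift_for(ts) for ts in timestamps)
    return {shift: counts[shift] for shift in ("day", "transition", "night")}
```

A bucket with a count at or near zero points to lighting conditions the model has not been trained on.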

Include partial compliance: real violations are often partial (hard hat on but chin strap unfastened, vest worn but unzipped). Include these edge cases in training to teach the model the difference between full and partial compliance.

Balance violation and compliant examples: if 95% of your training images show full compliance, the model may under-report violations. Include enough violation examples to teach the distinction.
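
The balance check can be sketched in a few lines. This assumes each training image has a boolean label, with `True` meaning a violation is present. The `min_share` threshold of 0.2 is an arbitrary illustration, not a figure from Datature.

```python
def violation_share(labels) -> float:
    """Fraction of examples labeled as violations (True)."""
    if not labels:
        raise ValueError("no labels provided")
    return sum(1 for is_violation in labels if is_violation) / len(labels)


def is_balanced_enough(labels, min_share: float = 0.2) -> bool:
    """True when violations make up at least min_share of the dataset."""
    return violation_share(labels) >= min_share
```

With 95 compliant images and 5 violations, `violation_share` is 0.05, which matches the imbalance described above.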


Next steps

Structured Data Extraction

Return structured JSON violation reports for integration with incident management systems.

Phrase Grounding

Draw bounding boxes around people or zones for visual violation highlighting.

Chain-of-Thought Reasoning

Multi-step safety checks: PPE first, then zone compliance, then behavior.