Case Study / 01
Latest Project · 2026

The Video Engine.

An AI football documentary engine. Give it a title and a director's brief, and 15 chained agents produce the full package: research, blueprint, storyboard, narrated script, animated graphics, a clip sourcing sheet, and a timeline editor to ship from.

Python · Flask · Groq LLMs · Remotion · ElevenLabs · 15 agents

Overview.

"The engine takes a single title, such as Why Brazil Stopped Producing Playmakers, and a director's brief, and outputs a 10–15 minute football documentary that feels directed, continuous, and visually authored rather than templated together."

The Python side decides what to make. Fifteen agents run in sequence behind a Flask UI: entity & research agents pull background from Wikipedia and Google News, an analysis agent shapes a director's brief, then script, narration, graphics, player_image, motion, music_selector, and production agents build out the storyboard, score, and clip sourcing list — coordinated by orchestrator.py.

The Remotion side decides how it looks. A separate React project renders 30+ animated graphic templates — stat bars, radars, lineups, tactical boards, transfer records, league tables — driven by structured props the engine emits. ElevenLabs handles narration TTS; a centralised WorldStateRoot lets graphics share an infinite spatial canvas so consecutive scenes feel connected instead of resetting. The Flask UI then exposes a 5-step pipeline (title → context → blueprint → storyboard → render) plus a studio grid and a 4-track timeline editor for review and export.
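As a rough illustration of what "structured props" could look like on the Python side, the sketch below serialises one scene's data for a StatBars graphic. The field names (template, sceneId, title, bars) and the helper build_stat_bars_props are assumptions made for this example, not the engine's actual schema, and the values are placeholders.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Bar:
    label: str
    value: float


def build_stat_bars_props(scene_id: str, title: str, bars: list[Bar]) -> str:
    """Serialise one scene's props as JSON for a (hypothetical) StatBars template."""
    payload = {
        "template": "StatBars",   # which Remotion template should render this scene
        "sceneId": scene_id,      # assumed identifier linking props to a storyboard scene
        "title": title,
        "bars": [asdict(bar) for bar in bars],
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    # Placeholder labels and values, not real statistics.
    print(build_stat_bars_props("scene_012", "Example title", [Bar("Club A", 1.0), Bar("Club B", 2.0)]))
```

Keeping the payload as plain JSON means the React side only needs to agree on the shape of the props, not on anything about how the Python agents produced them.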

Real renders.

From the engine
Graphic · Player Radar · Why Brazil Stopped Producing Playmakers
Graphic · StatBars · How money has changed football
Graphic · World Cup · Why Brazil Stopped Producing Playmakers

From title to export.

The pipeline is a 5-step UI on top of a deterministic agent chain, with a sixth stage for review and editing. Each step lands real artifacts on disk under output/<safe_name>/ so a run can be paused, reviewed, and resumed.
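One way to make a run resumable from artifacts on disk is to check which step's output file is still missing. The sketch below assumes a slug-style safe_name and one artifact per step. Only context.md and facts.md are named in this write-up, so the other file names, STEP_ARTIFACTS, and next_step are hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed artifact per step; brief.md, blueprint.json and storyboard.json are illustrative names.
STEP_ARTIFACTS = {
    "title": "brief.md",
    "context": "facts.md",
    "blueprint": "blueprint.json",
    "storyboard": "storyboard.json",
}


def safe_name(title: str) -> str:
    """Turn a title into a filesystem-friendly folder name."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "untitled"


def next_step(output_root: Path, title: str) -> Optional[str]:
    """Return the first step whose artifact is missing, or None if every artifact exists."""
    run_dir = output_root / safe_name(title)
    for step, artifact in STEP_ARTIFACTS.items():
        if not (run_dir / artifact).exists():
            return step
    return None
```

With this shape, resuming a paused run amounts to calling next_step(Path("output"), title) and jumping the UI to whichever step it returns.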

01 · Title

Title + director's brief

User enters a title; entity_agent extracts the subject, research_agent pulls Wikipedia + Google News, and an LLM drafts the director's brief.

02 · Context

Editable fact checklist

The brief is parsed into structured facts the user can tick on or off — checked items become MUST INCLUDE constraints downstream, written to context.md + facts.md.
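A minimal sketch of how ticked facts might become downstream constraints: checked items are rendered as a MUST INCLUDE block for later prompts, and the full checklist is rendered as a Markdown task list of the kind that could be saved to facts.md. The Fact class and both helper names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    text: str
    checked: bool = True


def must_include_block(facts: list[Fact]) -> str:
    """Render only the checked facts as a constraint block for a downstream prompt."""
    kept = [fact.text for fact in facts if fact.checked]
    if not kept:
        return ""
    return "\n".join(["MUST INCLUDE:"] + [f"- {text}" for text in kept])


def facts_markdown(facts: list[Fact]) -> str:
    """Render every fact as a Markdown task list, mirroring the tick state in the UI."""
    return "\n".join(f"- [{'x' if fact.checked else ' '}] {fact.text}" for fact in facts)
```

Unchecked facts stay in the Markdown file, so the user can re-enable them later, but they never reach the constraint block.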

03 · Blueprint

Act-by-act structure

An LLM lays out the 5-act blueprint with required facts injected as constraints. The user adjusts emphasis and act count before storyboard generation runs.

04 · Storyboard

70–90 scenes, drag & drop

The script agent expands the blueprint into a full storyboard. A drag-and-drop editor lets the user reorder, splice, and approve scenes before the agent chain runs end-to-end via orchestrator.py.

05 · Render

Script · narration · graphics · clips

Sequential pipeline: script_agent → narration_agent (ElevenLabs TTS) → graphics_agent (Remotion renders) → production_agent (clip sourcing sheet).
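The sequential hand-off can be sketched as a list of functions that each read and extend a shared state dict. This is a simplified model under stated assumptions, not the contents of orchestrator.py: the four agent bodies are stand-ins that record placeholder artifacts, and run_chain and RENDER_CHAIN are hypothetical names.

```python
from typing import Callable

# An agent is modelled as a function that takes the shared state and returns an extended copy.
Agent = Callable[[dict], dict]


def script_agent(state: dict) -> dict:
    return {**state, "script": f"script for {state['title']}"}


def narration_agent(state: dict) -> dict:
    # The real engine would call ElevenLabs TTS here; this stand-in records a placeholder path.
    return {**state, "narration": "narration.mp3"}


def graphics_agent(state: dict) -> dict:
    # The real engine would trigger Remotion renders here.
    return {**state, "graphics": ["scene_001.mp4"]}


def production_agent(state: dict) -> dict:
    return {**state, "clip_sheet": "clips.csv"}


RENDER_CHAIN: list[tuple[str, Agent]] = [
    ("script_agent", script_agent),
    ("narration_agent", narration_agent),
    ("graphics_agent", graphics_agent),
    ("production_agent", production_agent),
]


def run_chain(agents: list[tuple[str, Agent]], state: dict) -> dict:
    """Run agents strictly in order, passing each one the state left by the previous agent."""
    for name, agent in agents:
        state = agent(state)
        state["completed"] = [*state.get("completed", []), name]
    return state
```

Recording the completed agent names in the state is one simple way to see how far a run got if a later agent fails.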

06 · Studio + Edit

Review grid & 4-track timeline

Two interfaces: a render grid for approve / reject / re-render, and a 4-track timeline editor for splicing graphics, narration, music, and sourced clips before final export.

Architecture.

The system has two halves. The Python engine (Flask + Groq + ElevenLabs) decides what to make — research, script, narration, clip sourcing. The Remotion project decides how it looks — 30+ React graphic templates rendered to MP4.

A central WorldStateRoot in VideoSequence.tsx keeps consecutive graphics on a shared spatial canvas (cameraX = 0 → 1920 → 3840…), so a sequence of stat cards reads as one continuous camera move rather than 8 hard cuts. _break_data_runs in the orchestrator reorders scenes to avoid 3+ consecutive pure-stat blocks.
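Both ideas can be sketched in a few lines of Python. camera_offsets reproduces the 0 → 1920 → 3840 progression, and break_data_runs shows one possible way to keep pure-stat scenes from appearing three or more times in a row by pulling the next non-stat scene forward. This is an illustrative stand-in rather than the orchestrator's actual _break_data_runs, and the "kind" field on each scene is an assumption.

```python
def camera_offsets(scene_count: int, frame_width: int = 1920) -> list[int]:
    """Place each consecutive scene one frame width further along the shared canvas."""
    return [index * frame_width for index in range(scene_count)]


def break_data_runs(scenes: list[dict], max_run: int = 2) -> list[dict]:
    """Reorder scenes so at most max_run 'stat' scenes appear in a row, where possible."""
    pending = list(scenes)
    result: list[dict] = []
    run = 0  # length of the current streak of stat scenes in result
    while pending:
        index = 0
        if run >= max_run and pending[0]["kind"] == "stat":
            # Pull the next non-stat scene forward; if none is left, keep the original order.
            index = next((i for i, s in enumerate(pending) if s["kind"] != "stat"), 0)
        scene = pending.pop(index)
        run = run + 1 if scene["kind"] == "stat" else 0
        result.append(scene)
    return result


if __name__ == "__main__":
    scenes = [
        {"id": "a", "kind": "stat"},
        {"id": "b", "kind": "stat"},
        {"id": "c", "kind": "stat"},
        {"id": "d", "kind": "clip"},
        {"id": "e", "kind": "stat"},
    ]
    print([s["id"] for s in break_data_runs(scenes)])  # ['a', 'b', 'd', 'c', 'e']
    print(camera_offsets(3))  # [0, 1920, 3840]
```

The reordering is greedy and only moves scenes when a run would otherwise exceed the limit, so a storyboard that already alternates well is left untouched.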

title + brief
     │
     ▼
┌─────────────── Python engine (Flask · port 5000) ─────────────┐
│                                                                │
│   entity_agent     →  research_agent   →  analysis_agent       │
│   (subject)           (Wikipedia/News)    (director's brief)   │
│                                ▼                               │
│                  context.md + facts.md  ←── user edits         │
│                                ▼                               │
│                     blueprint (5 acts) ←── user edits          │
│                                ▼                               │
│                     storyboard (70–90 scenes) ←── drag/drop    │
│                                ▼                               │
│   orchestrator.py runs in sequence:                            │
│     script_agent  → narration_agent (ElevenLabs TTS)           │
│     graphics_agent → production_agent (clip sheet)             │
│                                ▼                               │
└────────────────────────────────│───────────────────────────────┘
                                 │ render props (JSON)
                                 ▼
┌─────────────── Remotion (React · /remotiontest) ──────────────┐
│   30+ templates · WorldStateRoot · per-scene cameraX offset    │
│   StatBars · Radar · Lineup · Tactical · TransferRecord …      │
└────────────────────────────────│───────────────────────────────┘
                                 ▼
                renders/*.mp4 + manifest.json
                                 ▼
              /studio (review)  ·  /edit (4-track timeline)
                                 ▼
                          export → final cut