Case Study / 01
Latest Project · 2026

The Video Engine.

An AI football documentary engine. Give it a title and a director's brief, and 15 chained agents produce the full package: research, blueprint, storyboard, narrated script, animated graphics, a clip sourcing sheet, and a timeline editor to ship from.

Python · Flask · Groq LLMs · Remotion · ElevenLabs · 15 agents

Overview.

"The engine takes a single title, such as Why Brazil Stopped Producing Playmakers, and a director's brief, and outputs a 10–15 minute football documentary that feels directed, continuous, and visually authored rather than templated together."

The Python side decides what to make. Fifteen agents run in sequence behind a Flask UI, coordinated by orchestrator.py. Entity and research agents pull background from Wikipedia and Google News, and an analysis agent shapes the director's brief. The script, narration, graphics, player_image, motion, music_selector, and production agents then build out the storyboard, score, and clip sourcing list.

The Remotion side decides how it looks. A separate React project renders 30+ animated graphic templates (stat bars, radars, lineups, tactical boards, transfer records, league tables) driven by structured props the engine emits. ElevenLabs handles narration TTS; a centralised WorldStateRoot lets graphics share an infinite spatial canvas so consecutive scenes feel connected instead of resetting. The Flask UI then exposes a 5-step pipeline (title → context → blueprint → storyboard → render) plus a studio grid and a 4-track timeline editor for review and export.
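As a rough illustration of "structured props the engine emits", the sketch below serialises one graphic scene to JSON. The field names (`template`, `title`, `data`) and the payload shape are assumptions made for this example; the page does not show the engine's actual prop schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class SceneProps:
    """One graphic scene handed from the Python engine to a Remotion template.

    Field names are hypothetical; only the template names come from this page.
    """
    template: str                       # e.g. "StatBars"
    title: str
    data: Dict[str, Any] = field(default_factory=dict)


def to_render_props(scenes: List[SceneProps]) -> str:
    """Serialise scenes into a single JSON document a renderer could read."""
    return json.dumps({"scenes": [asdict(s) for s in scenes]}, indent=2)


# Placeholder values only, not real figures.
example = [
    SceneProps(
        template="StatBars",
        title="How money has changed football",
        data={"bars": [{"label": "Example A", "value": 1}]},
    )
]
print(to_render_props(example))
```

Keeping the hand-off as plain JSON means the React side only needs to agree on the schema, not on anything inside the Python agents.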

Real renders.

From the engine
Graphic · Player Radar · Why Brazil Stopped Producing Playmakers
Graphic · StatBars · How money has changed football
Graphic · World Cup · Why Brazil Stopped Producing Playmakers

From title to export.

The pipeline is a 5-step UI on top of a deterministic agent chain. Each step lands real artifacts on disk under output/<safe_name>/ so a run can be paused, reviewed, and resumed.
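One way the pause-and-resume behaviour could work is to derive a folder name from the title and treat each step's artifact as a checkpoint. The sketch below assumes a simple slug rule for `safe_name` and hypothetical artifact names for the blueprint and storyboard steps; only `context.md` and the `output/<safe_name>/` layout are taken from this page.

```python
import re
from pathlib import Path
from typing import List, Tuple


def safe_name(title: str) -> str:
    """Turn a title into a filesystem-friendly folder name (assumed rule)."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return slug or "untitled"


# (step, artifact) pairs in pipeline order.
# blueprint.json and storyboard.json are hypothetical file names.
STEP_ARTIFACTS: List[Tuple[str, str]] = [
    ("context", "context.md"),
    ("blueprint", "blueprint.json"),
    ("storyboard", "storyboard.json"),
]


def next_step(run_dir: Path) -> str:
    """Return the first step whose artifact is missing, or 'render' if all exist."""
    for step, artifact in STEP_ARTIFACTS:
        if not (run_dir / artifact).exists():
            return step
    return "render"


run_dir = Path("output") / safe_name("Why Brazil Stopped Producing Playmakers")
print(run_dir.name)   # why_brazil_stopped_producing_playmakers
```

Because the resume point is inferred from files on disk, a reviewer can edit or delete an artifact by hand and the next run picks up from that step.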

01 · Title

Title + director's brief

User enters a title; entity_agent extracts the subject, research_agent pulls Wikipedia + Google News, and an LLM drafts the director's brief.

02 · Context

Editable fact checklist

The brief is parsed into structured facts the user can tick on or off. Checked items become MUST INCLUDE constraints downstream, written to context.md + facts.md.
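The checklist step can be pictured as a small data structure with two views: a markdown file for the user and a constraint block for later prompts. This is a minimal sketch under assumptions: the `Fact` class, the checkbox layout of facts.md, and the exact wording of the constraint block are all illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Fact:
    """One entry in the editable checklist (hypothetical structure)."""
    text: str
    checked: bool = True


def render_facts_md(facts: List[Fact]) -> str:
    """Render the checklist as markdown task items, one per fact."""
    lines = ["# Facts", ""]
    for fact in facts:
        mark = "x" if fact.checked else " "
        lines.append(f"- [{mark}] {fact.text}")
    return "\n".join(lines) + "\n"


def must_include_block(facts: List[Fact]) -> str:
    """Build the constraint text from checked facts only."""
    required = [fact.text for fact in facts if fact.checked]
    if not required:
        return ""
    return "MUST INCLUDE:\n" + "\n".join(f"- {text}" for text in required)


# Placeholder facts, not real research output.
facts = [
    Fact("Placeholder fact A"),
    Fact("Placeholder fact B", checked=False),
]
print(render_facts_md(facts))
print(must_include_block(facts))
```

Unchecked facts stay in the file, so the user can re-enable them later without re-running research.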

03 · Blueprint

Act-by-act structure

An LLM lays out the 5-act blueprint with required facts injected as constraints. The user adjusts emphasis and act count before storyboard generation runs.

04 · Storyboard

70–90 scenes, drag & drop

The script agent expands the blueprint into a full storyboard. A drag-and-drop editor lets the user reorder, splice, and approve scenes before the agent chain runs end-to-end via orchestrator.py.

05 · Render

Script · narration · graphics · clips

Sequential pipeline: script_agent → narration_agent (ElevenLabs TTS) → graphics_agent (Remotion renders) → production_agent (clip sourcing sheet).
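The sequential hand-off in the render step can be sketched as a list of functions that each take a shared context and return an extended copy. The agent bodies below are stand-in stubs that make no ElevenLabs or Remotion calls, and the context keys (`scene_count`, `script`, `audio_files`, and so on) are assumptions; only the agent names and their order come from this page.

```python
from typing import Any, Callable, Dict, List, Tuple

Context = Dict[str, Any]


def script_agent(ctx: Context) -> Context:
    # Stub: a real agent would call an LLM to write each voice line.
    lines = [f"Voice line for scene {i}" for i in range(1, ctx["scene_count"] + 1)]
    return {**ctx, "script": lines}


def narration_agent(ctx: Context) -> Context:
    # Stub: a real agent would send each line to a TTS service.
    audio = [f"narration_{i:03d}.mp3" for i in range(1, len(ctx["script"]) + 1)]
    return {**ctx, "audio_files": audio}


def graphics_agent(ctx: Context) -> Context:
    # Stub: a real agent would pick a template and props per scene.
    jobs = [{"scene": i, "template": "StatBars"} for i in range(1, len(ctx["script"]) + 1)]
    return {**ctx, "render_jobs": jobs}


def production_agent(ctx: Context) -> Context:
    # Stub: a real agent would describe the footage to source per scene.
    sheet = [{"scene": i, "clip": "to be sourced"} for i in range(1, len(ctx["script"]) + 1)]
    return {**ctx, "clip_sheet": sheet}


PIPELINE: List[Tuple[str, Callable[[Context], Context]]] = [
    ("script_agent", script_agent),
    ("narration_agent", narration_agent),
    ("graphics_agent", graphics_agent),
    ("production_agent", production_agent),
]


def run_pipeline(ctx: Context) -> Context:
    """Run every agent in order, recording which ones have completed."""
    for name, agent in PIPELINE:
        ctx = agent(ctx)
        ctx["completed"] = [*ctx.get("completed", []), name]
    return ctx


result = run_pipeline({"scene_count": 3})
print(result["completed"])
```

A strictly sequential chain keeps each agent's input explicit: narration cannot start before the script exists, and graphics can rely on the final scene count.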

06 · Studio + Edit

Review grid & 4-track timeline

Two interfaces: a render grid for approve / reject / re-render, and a 4-track timeline editor for splicing graphics, narration, music, and sourced clips before final export.

Architecture.

The system has two halves. The Python engine (Flask + Groq + ElevenLabs) decides what to make: research, script, narration, clip sourcing. The Remotion project decides how it looks: 30+ React graphic templates rendered to MP4.

A central WorldStateRoot in VideoSequence.tsx keeps consecutive graphics on a shared spatial canvas (cameraX = 0 → 1920 → 3840…), so a sequence of stat cards reads as one continuous camera move rather than 8 hard cuts. _break_data_runs in the orchestrator reorders scenes to avoid 3+ consecutive pure-stat blocks.
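Both ideas in this paragraph are small enough to sketch. The first function reproduces the 0 → 1920 → 3840 progression as per-scene offsets. The second is one plausible greedy take on what `_break_data_runs` might do; the real function is not shown on this page. The `kind` key, the `"stat"` label, and the reordering strategy are assumptions.

```python
from typing import Any, Dict, List

FRAME_WIDTH = 1920  # step between scenes, matching the progression above


def camera_offsets(scene_count: int, frame_width: int = FRAME_WIDTH) -> List[int]:
    """cameraX for each consecutive graphic on the shared canvas."""
    return [i * frame_width for i in range(scene_count)]


def _trailing_stat_run(scenes: List[Dict[str, Any]]) -> int:
    """Count how many 'stat' scenes sit at the end of the list."""
    run = 0
    for scene in reversed(scenes):
        if scene["kind"] != "stat":
            break
        run += 1
    return run


def break_data_runs(scenes: List[Dict[str, Any]], max_run: int = 2) -> List[Dict[str, Any]]:
    """Greedy reorder: after max_run stat scenes, pull the next non-stat scene forward."""
    pending = list(scenes)
    ordered: List[Dict[str, Any]] = []
    while pending:
        if _trailing_stat_run(ordered) >= max_run and pending[0]["kind"] == "stat":
            idx = next((i for i, s in enumerate(pending) if s["kind"] != "stat"), None)
            if idx is None:
                # Only stat scenes remain, so the run cannot be broken.
                ordered.extend(pending)
                break
            ordered.append(pending.pop(idx))
        else:
            ordered.append(pending.pop(0))
    return ordered


print(camera_offsets(3))  # [0, 1920, 3840]
```

With `max_run=2`, an input of four stat scenes followed by two clips comes back as stat, stat, clip, stat, stat, clip, so no block of three or more stat scenes survives when enough other scenes are available.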

title + director's brief
Python engine · Flask · port 5000
entity_agent · extracts subject from the title
research_agent · Wikipedia + Google News pull
analysis_agent · drafts the director's brief
context.md + facts.md · user edits
blueprint (5 acts) · user edits
storyboard (70–90 scenes) · drag / drop
orchestrator.py · sequential
script_agent · writes the line-by-line voice script
narration_agent · ElevenLabs TTS render
graphics_agent · picks templates + canonical props
production_agent · builds the clip-sourcing sheet
render props (JSON)
Remotion · React · /remotiontest

30+ templates rendered through a shared WorldStateRoot with per-scene cameraX offsets, so consecutive graphics read as one continuous camera move.

StatBars · Radar · Lineup · Tactical · TransferRecord · …26 more
renders/*.mp4 + manifest.json
/ studio  ·  review grid / edit  ·  4-track timeline
export → final cut