ROLE: pm + builder + operator · IN FLIGHT 03-2026 → 06-2026
A-1 ACTIVE

Anima

animation · agentic · pipeline

◐ ACT 1 SHIPPED · ACT 2 IN FLIGHT

A pencil-test character-design sketch on cream sketchbook paper: a young warrior in a hooded, tattered cloak grips a tall spear with an amber ribbon tied near its tip, braced in a determined stance atop a rocky outcrop. Three smaller pose studies of the same figure run down the left margin.

─ OPENER ─

I drew a pencil sketch of myself on the iPad: me at 33, slightly stooped, stylus in the right hand. Then I taught Gemini Nano Banana 2 to redraw the same guy 220 times in a row, in 16:9, on cream paper, with the same facial scruff in every frame. The model has a list of ways it can fail me now. They have numbers. SF02 is “wrong jaw.”

It could’ve been Midjourney. Midjourney would’ve been faster. Midjourney would’ve also been gauzy, vertical, photo-real, and someone else’s. The pipeline is a pencil test on purpose: construction lines visible, paper grain, hole-punch marks down the left edge. It’s an animator’s idiom, not a content-farm idiom. The discipline is the proof that a human still owns the timing.

I block the motion in Procreate Dreams. The agents fill the frames. Seedance 2.0 interpolates between the approved keys. A critic agent flags identity drift before the render burns budget. The pipeline is called anima. Act 1 plays clean. Act 2 is on the board.

─ INVESTIGATION BOARD ─

A-1 Pipeline rev 3: final cleanup chain

220-frame cycle locks. Move to case-study hero media (not portfolio hero).

Rev 3 is consistent enough to ship; portfolio hero stays with the 94-frame loop per hero-spec §7.5.

[Chart: render time per frame (seconds) by pipeline revision: v1, v1.5, v2, v2a. Surviving value labels: 22, 4]

Production board: first cleaned cycle

A1-014

Land 146-frame raw cycle with consistent line

─ METHODS ─

Tools, agents, and models used on this project
TASK                  | AGENT / TOOL                                     | MODEL / COST
keyframe stills       | Gemini Nano Banana 2                             | ~$0.04/frame
motion interpolation  | Seedance 2.0                                     | ~$0.40/clip (Fast tier)
orchestration         | Code Brain                                       | Claude Sonnet 4.6 (HybridRouter)
vision critic (T2)    | Gemini 3.1 Pro via Anti-Gravity CLI              | $0 incremental (subscription)
multi-CLI critic (T3) | Codex CLI + Anti-Gravity CLI in parallel         | $0 incremental (subscriptions absorb)
planner               | Opus 4.7 (Maya persona) + Sonnet 4.6 adversarial | per-token billing
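
The per-unit rates above are enough for a back-of-envelope estimate of generation spend. The sketch below is illustrative only: the two rates come from the table, while the function name, the clip count, and the redraw allowance are assumptions, and subscription or per-token costs are left out. At the listed rate, 220 stills with no redraws come to roughly $8.80.

```python
# Rates are copied from the methods table; every other input is a made-up example.
KEYFRAME_USD_PER_FRAME = 0.04  # Gemini Nano Banana 2, approximate
INTERP_USD_PER_CLIP = 0.40     # Seedance 2.0 Fast tier, approximate


def estimate_generation_usd(frames: int, clips: int, redraws_per_frame: float = 0.0) -> float:
    """Rough spend on stills plus interpolation clips, in US dollars.

    redraws_per_frame is a hypothetical allowance for frames the critic
    rejects and the generator has to redo (0.5 means one redraw per two frames).
    """
    if frames < 0 or clips < 0 or redraws_per_frame < 0:
        raise ValueError("inputs must be non-negative")
    stills = frames * (1.0 + redraws_per_frame) * KEYFRAME_USD_PER_FRAME
    motion = clips * INTERP_USD_PER_CLIP
    return round(stills + motion, 2)


if __name__ == "__main__":
    # The 220-frame cycle with no redraws and no interpolation clips.
    print(estimate_generation_usd(frames=220, clips=0))
    # Same cycle with an assumed 10 clips and a 50% redraw allowance.
    print(estimate_generation_usd(frames=220, clips=10, redraws_per_frame=0.5))
```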

─ 4Q ─

A-1.Q1 What is this?

anima is a 10-phase pipeline for shipping 2D animated stories made by a human and a fleet of named agents working together. A brief becomes a plan. A plan becomes an animatic. An animatic constrains motion. Motion gets cleaned to the aesthetic. The fleet runs the volume work (a line producer, a character designer, a scriptwriter, a storyboard artist, a frame generator, a script-supervisor critic, a museum writer) while the human owns timing, taste, and the decision to ship. The Pencil Test short is the first reference implementation; anima is the system.

A-1.Q2 Why this approach?

The decision wasn't which model to use. The decision was which working method: a studio budget and a four-year project timeline (out of reach), AI-only generation (fast but someone else's idiom), or a human-author plus AI-fleet partnership. The third one is anima: production-company speed at solo-creator cost, without losing four years of life to a single piece. The model layer is replaceable; the architecture and the human role are not.

A-1.Q3 What would break?

The architecture guards against three named failure modes; none of them is an encoder bug.

Correlated blind spots. When the orchestrator and the vision critic share a model family, 'valid output' and 'acceptable frame' silently align, and bad frames slip through. By construction, anima pairs a Sonnet orchestrator with a Gemini vision critic at the busiest checkpoint.

Local-optimization drift. Every phase ships 'better' output that no longer matches the approved brief. Phase 0's planner emits an immutable acceptance_criteria.json, and every downstream critic cites criteria IDs when it blocks.

Cheap-judge failures. These come with documented base rates: 58% sycophancy, up to 90% self-preference bias, plus length and position bias. T3 runs three peers from three vendors (Codex CLI + Anti-Gravity CLI + Claude SDK), with a separate Opus chairman that never grades its own work.
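
Two of these guards can be sketched in a few lines. This is a minimal illustration, not anima's actual code: the model-family map, the JSON schema (a list of objects with "id" and "text"), the criteria ID format, and every function and class name here are assumptions.

```python
import json
from dataclasses import dataclass
from types import MappingProxyType

# Hypothetical map from model name to vendor family.
MODEL_FAMILY = {
    "claude-sonnet": "claude",
    "claude-opus": "claude",
    "gemini-pro": "gemini",
}


def check_family_separation(orchestrator: str, critic: str) -> None:
    """Refuse a pairing where the critic shares the orchestrator's model family."""
    if MODEL_FAMILY[orchestrator] == MODEL_FAMILY[critic]:
        raise ValueError(f"{orchestrator} and {critic} share a model family")


def load_criteria(raw_json: str) -> MappingProxyType:
    """Parse the planner's criteria into a read-only id -> text mapping."""
    entries = json.loads(raw_json)
    return MappingProxyType({entry["id"]: entry["text"] for entry in entries})


@dataclass(frozen=True)
class BlockDecision:
    phase: str
    criteria_ids: tuple[str, ...]  # IDs the critic cites for the block
    reason: str


def validate_block(block: BlockDecision, criteria: MappingProxyType) -> None:
    """A block counts only if it cites at least one known criteria ID."""
    if not block.criteria_ids:
        raise ValueError("a block must cite at least one criteria ID")
    unknown = [cid for cid in block.criteria_ids if cid not in criteria]
    if unknown:
        raise ValueError(f"unknown criteria IDs: {unknown}")


if __name__ == "__main__":
    check_family_separation("claude-sonnet", "gemini-pro")  # passes silently
    criteria = load_criteria('[{"id": "AC-01", "text": "same facial scruff in every frame"}]')
    validate_block(BlockDecision("cleanup", ("AC-01",), "scruff missing"), criteria)
    print("both guards passed")
```

The read-only mapping stands in for "immutable": downstream phases can look criteria up but cannot rewrite them, so a critic that wants to block has to point at something the planner approved in Phase 0.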

A-1.Q4 What did I learn?

Validators cannot recover taste that was absent at generation time. The pipeline is the artifact. The character is the test the pipeline has to pass.