Three Modes, One Session, Ship Everything
What Needed to Happen
Mio v0.0.3 had a laundry list: implement tiered input validation for the Telegram bot's onboarding flow, deploy it, iterate on the limits based on real usage feedback, update six documentation files, write a changelog, tag a release, run a full codebase audit, fix everything the audit found, and push it all to production.
The traditional workflow would serialize this into: code, test, commit, write docs, run linter, fix lint, open PR, wait for review, fix review comments, merge, tag, release. A full day's work if you're disciplined, two days if you're human.
This session did all of it in about 26 minutes of active time. Not by going faster through the same pipeline — by running three different modes of collaboration simultaneously.
Mode 1: Direct Pair Programming
The session started with a concrete task: add input validation to Mio's onboarding flow. The bot asks users for nicknames, hobbies, personality preferences — all free-text inputs that shared a single 500-character limit. A nickname field accepting 500 characters is absurd.
I told Claude Code what I wanted. It read the onboarding code, identified all the input fields, and implemented a two-tier system: SHORT_INPUT_KEYS with a 50-character limit for names and hobbies, and the existing 500-character limit for long-form fields like backstories.
Then it built the Docker image, pushed to the registry, and deployed to Cloud Run. Revision 44. The whole thing — reading the code, implementing the change, building, pushing, deploying — took about 14 minutes.
This is what I call direct mode: you and Claude Code, working on the same problem, with the AI handling the mechanical parts (Docker build, gcloud run deploy) while you focus on the design decisions.
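The two-tier scheme can be sketched in a few lines. This is a hypothetical reconstruction, not Mio's actual code — the key names and helper functions are assumptions; only the constant `SHORT_INPUT_KEYS` and the 50/500 limits come from the session:

```python
# Hypothetical sketch of the two-tier validation described above.
# SHORT_INPUT_KEYS and the 50/500 limits match the session; the key
# names and function names are illustrative assumptions.
SHORT_INPUT_KEYS = {"nickname", "hobby"}  # short free-text fields
SHORT_MAX = 50
LONG_MAX = 500                            # long-form fields like backstories

def max_len_for(key: str) -> int:
    """Return the character limit for an onboarding field."""
    return SHORT_MAX if key in SHORT_INPUT_KEYS else LONG_MAX

def validate(key: str, text: str) -> tuple[bool, str]:
    """Check an input against its tier; return (ok, error message)."""
    limit = max_len_for(key)
    if len(text) > limit:
        return False, f"Please keep {key} under {limit} characters."
    return True, ""
```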
The Feedback Loop
I tested the deployed bot. 50 characters for a name field was still too much. And blank inputs weren't being rejected.
"50 chars feels too much for name.. doing a one fit all max is bad, also validate blanks"
Claude Code immediately pivoted. Instead of the two-tier system, it built a per-key configuration:
Names: 10 chars
Style/personality: 30 chars
Hobbies: 50 chars
Long-form: 500 chars
Plus blank rejection — empty inputs now got bounced with a friendly message. Build, push, deploy. Revision 45. About two minutes from feedback to production.
Then I looked at the name limit again. 10 characters for a Chinese name? Chinese names are typically 2-4 characters. Even nicknames rarely exceed 6.
"max 10 char seems still a lot for chinese chars?"
Claude Code changed it to 6. Another deploy. The name limit went 500 → 50 → 10 → 6 through three rounds of user feedback, each round taking about two minutes from "this doesn't feel right" to "it's live in production."
This is the part that feels qualitatively different from traditional development. The feedback loop isn't "file a ticket, wait for the next sprint, test in staging." It's "say what's wrong, watch it get fixed, test it live, repeat." Three production deploys driven by a conversation.
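The final shape of the validator, after all three rounds, can be sketched like this. The limits (6/30/50/500) and the blank-rejection behavior are from the session; the key names, function names, and message wording are assumptions:

```python
# Hypothetical sketch of the final per-key configuration. The limits
# mirror the session (names settled at 6 chars after three feedback
# rounds); key names and messages are illustrative assumptions.
from typing import Optional

MAX_LEN = {
    "name": 6,           # Chinese names are typically 2-4 characters
    "style": 30,
    "personality": 30,
    "hobbies": 50,
}
DEFAULT_MAX = 500        # long-form fields like backstories

def validate_input(key: str, text: str) -> Optional[str]:
    """Return an error message, or None if the input is acceptable."""
    text = text.strip()
    if not text:
        return "That looks empty! Could you type something?"
    limit = MAX_LEN.get(key, DEFAULT_MAX)
    if len(text) > limit:
        return f"Let's keep that under {limit} characters."
    return None
```

A per-key table like this is also easier to iterate on than tiers: each round of feedback is a one-line change to the dict, not a restructuring.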
Mode 2: Background Agent
Here's where it gets interesting. While I was giving feedback on the name limits (the 10 → 6 change), I also needed documentation updated. Six docs files, a changelog entry, and a v0.0.3 git tag.
In a traditional workflow, this would be sequential: finish the code changes, then write the docs. Or worse — "I'll update the docs later" (you won't).
Instead, I said:
"spawn subagent to update all docs in docs/, todo, etc, and update changelog and tag v0.0.3 release"
Claude Code spawned a background agent and kept talking to me about the name limits. The background agent was reading docs, updating six files, writing the changelog, creating the git tag — all while I was still iterating on the implementation with the main session.
A few minutes later, the background agent reported back: six docs updated, changelog written, v0.0.3 tagged. One catch — it used the old 10-character name limit in the docs because it ran concurrently with (and slightly before) my final 10 → 6 change. A minor discrepancy, easily noted and correctable.
This is background mode: spawn an agent for work that doesn't need your attention, keep doing your primary task. It's the AI equivalent of kicking off a CI pipeline and going back to coding. The work happens in parallel, you get notified when it's done.
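The pattern maps cleanly onto ordinary async code: spawn a task, keep working on the primary thread of attention, collect the result when it's done. A minimal asyncio analogy — all names here are hypothetical stand-ins, not Claude Code's actual API:

```python
import asyncio

async def update_docs() -> str:
    # Stand-in for the background docs agent's work.
    await asyncio.sleep(0.1)
    return "6 docs updated, changelog written, v0.0.3 tagged"

async def iterate_on_limits() -> None:
    # Stand-in for the foreground pair-programming iteration.
    await asyncio.sleep(0.05)

async def main() -> str:
    docs_task = asyncio.create_task(update_docs())  # spawn and keep going
    await iterate_on_limits()                       # primary work continues
    return await docs_task                          # collect when finished

print(asyncio.run(main()))
```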
Mode 3: Autonomous Review Team
With the implementation done and docs updated, I wanted a full codebase audit. Not just a quick scan — a thorough review with fixes.
"spawn agent team to do full review and audit of current codebase, then fix any issues. Do this iteration until review agent cannot find issues"
Claude Code created a team: a reviewer agent and a fixer agent, chained in a dependency loop.
Round 1
The reviewer scanned the entire codebase and produced 22 findings — security issues, reliability concerns, correctness bugs. The fixer picked up the findings and resolved 21 of them. One was deferred: a LOW-priority performance optimization in the memory accumulator (batch efficiency). Acceptable tech debt.
Round 2
The reviewer ran again on the now-fixed codebase. It found 3 new issues — things that were introduced or exposed by the round 1 fixes. The fixer resolved all 3.
Round 3
Final verification pass. The reviewer found no new issues. Verdict: APPROVED.
Total across all rounds: 24 out of 25 issues resolved. The one remaining item was a performance optimization that wasn't a bug, wasn't a security risk, and wasn't worth the complexity to fix right now.
This is team mode: you define the objective ("review until clean"), Claude Code creates the agents, sets up the dependency chain, and the agents iterate autonomously. I didn't review any of the intermediate findings or fixes — I just read the final summary.
The key difference from Part 5 is the iterative loop. Part 5 was a single pass: four reviewers found issues, three fixers resolved them, done. This session ran three rounds — the reviewer checked the fixer's work, found gaps, and the fixer fixed those too. It's not just parallel review, it's convergent review: iterate until the delta is zero.
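The convergent loop itself is simple to express. A schematic sketch — the reviewer and fixer here are plain functions standing in for the real agents, and the orchestration is Claude Code's, not this code:

```python
# Schematic of the review-until-clean loop. The agents are mocked as
# callables; the finding counts below mirror this session's rounds.
def converge(reviewer, fixer, max_rounds: int = 10) -> int:
    """Alternate review and fix until the reviewer finds nothing."""
    rounds = 0
    while rounds < max_rounds:
        rounds += 1
        findings = reviewer()
        if not findings:
            break            # APPROVED: the delta is zero
        fixer(findings)      # may defer low-priority items as tech debt
    return rounds

# Mock this session's three rounds: 22 findings, then 3, then clean.
batches = iter([["finding"] * 22, ["finding"] * 3, []])
rounds = converge(lambda: next(batches), lambda fs: None)
```

The termination condition is the whole idea: the loop ends when the reviewer, not the fixer, says there is nothing left.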
The Overlap
Here's the timeline of what was actually happening:
0:00 Start implementing input validation
├── Direct mode: reading code, implementing tiered limits
14:00 Deploy revision 44
├── Direct mode: user tests, gives feedback
├── "50 chars too much, validate blanks"
16:00 Deploy revision 45 (per-key limits + blank rejection)
├── Direct mode: more feedback
├── "10 chars still too much for Chinese"
├── Background mode: docs agent spawned ← concurrent
17:00 Deploy revision 46 (names → 6 chars)
├── Background agent still running docs/changelog
19:00 Background agent completes (6 docs, changelog, v0.0.3 tag)
├── Team mode: review team spawned ← concurrent
├── Round 1: 22 findings → 21 fixed
├── Round 2: 3 new findings → 3 fixed
├── Round 3: verified clean → APPROVED
26:00 All committed, pushed, gh release created
Three modes, overlapping. I was pair-programming on the implementation while the docs agent updated documentation in the background. The review team was autonomously iterating while I was handling the release logistics.
In a sequential workflow, this is at minimum five phases: implement → iterate → document → review → release. Here, phases 2-3 overlapped (background docs while iterating on code), and phase 4 was fully autonomous (review team iterated without me).
The Release
After the review team approved, I committed everything and pushed:
git push origin main --tags
gh release create v0.0.3 --title "v0.0.3: Multimodal Input, Enhanced Onboarding, Selfie Generation"
From first line of code to GitHub release, one session. Implementation, three rounds of user feedback with production deploys, documentation, three-round autonomous code review, and a tagged release. The product perspective of this same work is covered separately — this post is purely about the workflow mechanics.
Why This Matters
The insight isn't "AI is fast." We know that. The insight is that the three modes compose.
Direct mode gives you tight feedback loops — say what's wrong, watch it change. But it requires your attention.
Background mode frees your attention — spawn the work, forget about it, get notified. But it can't respond to your feedback.
Team mode gives you autonomous iteration — define the goal, let agents converge. But it's overkill for a quick fix.
No single mode covers everything. The power is in switching between them fluidly, within the same session, sometimes running two or three simultaneously.
This is what Part 4 hinted at but didn't fully explore. Agent Teams showed parallel execution — five agents building different modules. This post shows something different: mixed-mode orchestration, where the human is pair-programming, delegating background work, AND running autonomous loops, all at the same time.
The Old Way vs. The New Way
Traditional workflow:
Code (2h) → Test (30m) → Commit → Docs (1h) → Lint (15m)
→ Fix lint (30m) → PR → Review (wait 1-3 days) → Fix comments (1h)
→ Merge → Tag → Release notes (30m) → Release
Total: 1-4 days
Mixed-mode orchestration:
Direct: Code + test + deploy (14m)
Direct: Iterate on feedback (3 deploys in 12m)
Background: Docs + changelog (concurrent, ~8m)
Team: 3-round review + fix (autonomous, ~15m)
Direct: Tag + release (2m)
Total: ~26 minutes active time
The bottleneck moved. It used to be execution time — writing code, writing docs, waiting for reviews. Now it's decision time — what to build, what limits feel right, when to ship. The mechanical work runs in parallel streams that I orchestrate but don't execute.
That's the shift. Not "AI writes code faster." It's "the entire pipeline from idea to release runs concurrently, and you're the conductor."