Orchestrating AI Agents

Coordinating Claude Code, Codex, Local Models, and MCP with a Persistent Control Plane

A practical guide to operating a fleet of AI coding agents through routing, memory, skills, MCP, guardrails, and a persistent control plane (322 manuscript pages).

Minimum price

$14.99

$24.99

Also available for 1 book credit with a Reader Membership

PDF

About the Book

Orchestrating AI Agents is a practical, architecture-first guide to operating multiple AI coding agents as a coordinated system. The book uses the Hermes control-plane pattern to explain how single-agent tools, local models, MCP servers, memory, skills, queues, budgets, and guardrails fit into one operating layer.

The book does not teach one vendor tool from scratch. It focuses on the missing coordination layer: how work is routed, delegated, remembered, reviewed, scheduled, and escalated when a developer or team uses more than one AI coding harness.

About the Author

Yohan Rodriguez

Yohan is a Senior Full-Stack Software Engineer with extensive experience delivering scalable, end-to-end software solutions across web, enterprise, and cloud-based environments. He specializes in architecting robust platforms, modernizing legacy systems, driving cloud transformation efforts, and building integration-heavy applications that support critical business workflows. He is recognized for translating complex requirements into reliable, maintainable, and high-value solutions across industries such as insurance, cybersecurity, and professional services.

Known for combining strong technical execution with a practical business mindset, he has contributed to projects from concept and design through production delivery and long-term support. His experience includes collaborating with cross-functional teams, improving development workflows, solving complex technical challenges, and helping organizations deliver dependable software products that adapt to changing business needs. He brings a balanced approach to engineering that values quality, efficiency, and continuous improvement.

Table of Contents

  • Preface i
  • 1 The Multi-Tool Problem 2
    • 1.1 The Year Everyone Got a Fleet 2
    • 1.2 A Tuesday With Four Agents 4
    • 1.3 Five Things Nothing Owns 5
    • 1.4 Why "Which Agent Is Best?" Is the Wrong Question 7
    • 1.5 Why the Problem Stayed Invisible 8
    • 1.6 What an Operating Layer Would Do 9
    • 1.7 Hands-On: Audit Your Own Agent Stack 9
    • 1.8 The Operator's Mindset 9
    • 1.9 What This Book Builds 11
  • 2 Orchestrate, Don't Replace 13
    • 2.1 How the Command-Line Agent Got Its Shape 13
    • 2.2 The Limitation That Is the Opening 14
    • 2.3 Agents Have a Back Door: Headless Mode 15
    • 2.4 Two Planes: Control and Execution 17
    • 2.5 A Task, Traced Through Both Planes 19
    • 2.6 Operate, Don't Replace 21
    • 2.7 What the Control Plane Actually Is 21
    • 2.8 Hands-On: The Seed of Delegation 22
    • 2.9 Why This Scales, and Where We Go Next 26
    • 2.10 Implementation Guidance: One Adapter Per Agent 26
    • 2.11 Where the Control Plane Runs 28
    • 2.12 Beyond One Coordinator: Governance and Scale 28
    • 2.13 Mistakes to Avoid 29
  • 3 The Agent Landscape, 2026 32
    • 3.1 How to Read an Agent 34
    • 3.2 Claude Code 36
    • 3.3 Codex CLI 36
    • 3.4 Antigravity CLI 37
    • 3.5 Aider and OpenCode: The Local-First Pair 38
    • 3.6 Local Models as an Execution Tool 38
    • 3.7 The Capability Matrix 39
    • 3.8 Hands-On: The Same Task Through Two Harnesses 41
    • 3.9 A Routing Scenario: One Day, Five Tasks 42
    • 3.10 Onboarding a New Agent 43
    • 3.11 Harness Migration and Version Management 45
    • 3.12 Provider Independence and Exit Strategy 46
    • 3.13 Trade-offs and Routing Implications 48
    • 3.14 Mistakes to Avoid 49
    • 3.15 Where the Landscape Is Going, and Where We Go Next 50
  • 4 Hermes as the Control Plane 52
    • 4.1 Why a Daemon, Not a Command 52
    • 4.2 The Anatomy of a Control Plane 55
    • 4.3 The Agent Loop 55
    • 4.4 Memory as Layers 56
    • 4.5 The Terminal Backend Is a Config Knob 57
    • 4.6 Approvals and the Hardline Blocklist 57
    • 4.7 Gateways and the Scheduler 59
    • 4.8 Configuration: The Knobs That Matter 59
    • 4.9 Hands-On: Reading a Control Plane's State 61
    • 4.10 A Request's Journey Through the Control Plane 63
    • 4.11 Implementation Guidance: Adopt or Build 65
    • 4.12 Availability and Multi-Environment Operation 65
    • 4.13 Change Management and Disaster Recovery 66
    • 4.14 Operational Considerations 67
    • 4.15 Trade-offs 68
    • 4.16 Mistakes to Avoid 68
    • 4.17 Lessons Learned, and Where We Go Next 69
  • 5 Delegation via Headless CLIs 71
    • 5.1 What a Delegation Skill Is 71
    • 5.2 The Brief: Context Assembly 72
    • 5.3 The Delegation Sequence 73
    • 5.4 Capturing and Validating Structured Output 76
    • 5.5 Writing the Outcome Back 76
    • 5.6 Auth Isolation and the Exfiltration Channel 77
    • 5.7 A Delegation, Traced, and One That Fails 78
    • 5.8 Implementation Guidance and Operational Considerations 79
    • 5.9 Patterns of Delegation 81
    • 5.10 Long-Running and Asynchronous Delegation 82
    • 5.11 Advanced Delegation Patterns 83
    • 5.12 The Limits of the Shell-Out Primitive 84
    • 5.13 Trade-offs 85
    • 5.14 Mistakes to Avoid 87
    • 5.15 Lessons Learned, and Where We Go Next 87
  • 6 The Routing Engine 90
    • 6.1 The Four Axes 90
    • 6.2 The Three Outcomes 91
    • 6.3 The Decision and the Hard Rule 92
    • 6.4 The Router in the Loop 94
    • 6.5 Classifying the Task, Choosing the Agent 96
    • 6.6 Routing Under Uncertainty 98
    • 6.7 Worked Routing: Nine Real Workloads 100
    • 6.8 Routing on Evidence 101
    • 6.9 A Week of Routing, Reviewed 102
    • 6.10 Implementation Guidance and Operational Considerations 103
    • 6.11 Routing at Scale: Tables as Code, and Learned Routing 104
    • 6.12 Trade-offs 105
    • 6.13 Mistakes to Avoid 105
    • 6.14 Lessons Learned, and Where We Go Next 106
  • 7 Memory Across Harnesses 107
    • 7.1 Why Memory Is the Moat 107
    • 7.2 The Layered Model, Revisited in Depth 108
    • 7.3 The Bounded Core: What Earns a Permanent Seat 110
    • 7.4 What a Memory Record Contains 110
    • 7.5 Write Policy: Archive, Propose, Curate 111
    • 7.6 Cross-Harness Write-Back 113
    • 7.7 Relevance Retrieval: The Read Path 115
    • 7.8 Measuring Retrieval Quality 117
    • 7.9 Hands-On: Designing a Memory Schema 118
    • 7.10 Portability Across Harnesses 119
    • 7.11 A Bad Memory, Traced and Repaired 120
    • 7.12 Memory Hygiene: Drift, Poisoning, and Pruning 121
    • 7.13 Memory at Scale: Sharding, Federation, and Conflict 122
    • 7.14 Implementation Guidance and Trade-offs 123
    • 7.15 Mistakes to Avoid 124
    • 7.16 Lessons Learned, and Where We Go Next 125
  • 8 Skills as a Capability Registry 127
    • 8.1 A Skill Is a Procedure on Disk 127
    • 8.2 The Shape of a Complete Skill 128
    • 8.3 The Three Loading Levels 129
    • 8.4 Matching and Debugging the Registry 131
    • 8.5 The Three Classes of Skill 131
    • 8.6 Hands-On: Classify Your Procedures 133
    • 8.7 From Repeated Procedure to Trusted Skill 133
    • 8.8 Testing, Versioning, and Deprecation 137
    • 8.9 Permissions and Environment 138
    • 8.10 Agent-Authored Skills, and the Self-Improvement Claim 138
    • 8.11 Community Skills Are Code Paths 140
    • 8.12 Implementation Guidance and Operational Considerations 141
    • 8.13 Trade-offs and Mistakes to Avoid 141
    • 8.14 Lessons Learned, and Where We Go Next 143
  • 9 MCP: The Shared Tool Layer 145
    • 9.1 What MCP Is 145
    • 9.2 MCP as the Device-Driver Layer 146
    • 9.3 The Wiring 146
    • 9.4 Registration: Making a Server Real 148
    • 9.5 The Protocol Shape: Discover, Invoke, Validate 148
    • 9.6 Designing a Tool Surface: Read Versus Write 149
    • 9.7 Hands-On: Map a Server You Would Build 151
    • 9.8 Authentication and Token Scope 151
    • 9.9 Output Validation and Injection Handling 152
    • 9.10 Routing Writes Through One Seam 152
    • 9.11 Coordinator Access Versus Delegated Access 154
    • 9.12 The Server Landscape: Existing and Proposed 154
    • 9.13 A Memory Server, Made Concrete 155
    • 9.14 A Real Workflow: Build, Verify, Publish 155
    • 9.15 The Security Surface 157
    • 9.16 When the Shared Layer Fails: MCP Outages and Degradation 158
    • 9.17 Protocol Versioning, Transport, and Mixed-Version Fleets 159
    • 9.18 Implementation Guidance, Trade-offs, and Mistakes 161
    • 9.19 Lessons Learned, and Where We Go Next 162
  • 10 Local vs Cloud Routing 164
    • 10.1 Why Local-First Is the Default 164
    • 10.2 The Privacy Classes, Made Operational 165
    • 10.3 The Secrets Boundary 166
    • 10.4 The Three Variables: Cost, Latency, Quality 168
    • 10.5 The Budget and Audit Chokepoint 170
    • 10.6 The Local-vs-Cloud Decision Flow 174
    • 10.7 Hands-On: Classify Your Task Mix 175
    • 10.8 A Week of Local-vs-Cloud Routing, Reviewed 177
    • 10.9 Advanced Routing: Cascades, Fallbacks, and Speculation 177
    • 10.10 Capacity Planning for Local Inference 178
    • 10.11 Deployment Scenarios: Where the Boundary Falls 180
    • 10.12 A Routing Failure, Traced 180
    • 10.13 Operating It: Monitoring, Debugging, Scaling 181
    • 10.14 Implementation Guidance, Trade-offs, and Mistakes 182
    • 10.15 Lessons Learned, and Where We Go Next 183
  • 11 Sandboxing and Guardrails 186
    • 11.1 The Threat Model 186
    • 11.2 The Layered Model, Made Precise 190
    • 11.3 The Isolation Spectrum: Terminal Backends 191
    • 11.4 Approval Modes: Manual, Smart, Off 193
    • 11.5 The Hardline Blocklist 194
    • 11.6 The Pre-Exec Scanner 194
    • 11.7 The Cross-Cutting Controls: Kill-Switch, Caps, Audit, Secrets 196
    • 11.8 The Pre-Flight Checklist for Autonomous Operation 197
    • 11.9 Multi-Agent Considerations 198
    • 11.10 A Contained Failure, Traced 199
    • 11.11 Advanced Isolation Architectures 200
    • 11.12 Guardrails at Fleet Scale: Policy as Code 200
    • 11.13 Identity, Secrets, and Credential Lifecycle 201
    • 11.14 Audit Trails and Compliance for Regulated Operation 203
    • 11.15 A Guardrail Failure, Traced 204
    • 11.16 Operating It: Monitoring, Debugging, Maintenance 205
    • 11.17 Trade-offs and Mistakes 206
    • 11.18 Lessons Learned, and Where We Go Next 206
  • 12 Scheduling and the Work Queue 208
    • 12.1 From One-Shot to Standing Operation 208
    • 12.2 The Work Queue as the Coordination Primitive 209
    • 12.3 Claim and Lock: The Heart of Multi-Agent Safety 210
    • 12.4 Status Reporting and Observability 214
    • 12.5 Observability: Tracing a Request Across the Fleet 215
    • 12.6 Human-Escalation Gates 216
    • 12.7 Deduplication and Idempotency 217
    • 12.8 Wiring the Budget Contract 218
    • 12.9 A Multi-Agent Scheduled Workflow, Traced 218
    • 12.10 Backfill, Catch-Up, and the Cold Start 219
    • 12.11 Advanced Queue Architectures 220
    • 12.12 Hierarchical and Supervisor Orchestration 221
    • 12.13 A Scheduling Failure, Traced 222
    • 12.14 Scaling the Fleet: 10 to 1,000 Agents 223
    • 12.15 Capacity Math: Utilization, Queue Depth, and Little's Law 224
    • 12.16 Operating It: Monitoring, Debugging, Scaling 226
    • 12.17 Trade-offs and Mistakes 227
    • 12.18 Lessons Learned, and Where We Go Next 227
  • 13 A Worked Orchestration 229
    • 13.1 The Factory as an Orchestration 229
    • 13.2 Routing a Feature Through the Pipeline 230
    • 13.3 Where Every Mechanism Appears 233
    • 13.4 The Compile-Verify Loop in Depth 233
    • 13.5 Testing the Orchestration: A Test Pyramid for Agent Systems 236
    • 13.6 The Human Gates 237
    • 13.7 Hands-On: Trace Your Own Pipeline 238
    • 13.8 A Run That Went Wrong, and Recovered 238
    • 13.9 The Same Shape, Other Factories 239
    • 13.10 A Second Worked Case: A Software-Release Factory 240
    • 13.11 A Third Worked Case: A Customer-Support Fleet 241
    • 13.12 Cross-Team and Multi-Repo Orchestration 243
    • 13.13 When Orchestration Is the Wrong Choice 245
    • 13.14 Operating the Factory: Monitoring, Debugging, Scaling 245
    • 13.15 Trade-offs, Mistakes, and Lessons 247
    • 13.16 Lessons Learned, and Where We Go Next 247
  • 14 Budgets and Spend Governance 249
    • 14.1 The Accounting Seam 249
    • 14.2 Per-Task-Class Cost Envelopes 251
    • 14.3 The Cloud-versus-Local Break-Even 253
    • 14.4 Fleet-Wide Caps and the Runaway 255
    • 14.5 Cost Curves and the Shape of Fleet Spend 256
    • 14.6 The Cost of Human Review 256
    • 14.7 A Cost Explosion, Traced 257
    • 14.8 FinOps for Agent Fleets: Chargeback and Incentives at Scale 258
    • 14.9 What the Expensive Model Should Cost 259
    • 14.10 Hands-On: Build a Cost Model for Your Task Mix 260
    • 14.11 Showback: Attributing Cost to Projects and Owners 260
    • 14.12 A Month of Spend, Reviewed 261
    • 14.13 Caching, Token Efficiency, and Performance 262
    • 14.14 Reasoning About the Numbers: Cost and Latency Models 263
    • 14.15 Operating It: Monitoring, Forecasting, Maintenance 265
    • 14.16 Trade-offs and Mistakes 266
    • 14.17 Lessons Learned, and Where We Go Next 266
  • 15 Failure and Escalation 268
    • 15.1 Why Standing Fleets Fail Differently 268
    • 15.2 Failure Mode: Collisions 269
    • 15.3 Failure Mode: Compounding Error 270
    • 15.4 Failure Mode: Poisoned Memory 270
    • 15.5 Failure Mode: Prompt Injection 271
    • 15.6 Failure Mode: Runaway 271
    • 15.7 Advanced Failure Modes: Cascades, Correlation, and Drift 272
    • 15.8 A Cascading Incident, Fully Traced 273
    • 15.9 Evaluation: Measuring Agent Quality Over Time 274
    • 15.10 Chaos Testing for Agent Fleets 276
    • 15.11 The Escalation Discipline 277
    • 15.12 Debugging an Agent in the Wild 279
    • 15.13 Reproducibility and Replay 282
    • 15.14 What to Never Automate 283
    • 15.15 Building a Recovery Posture 284
    • 15.16 Reliability Engineering: SLOs, Error Budgets, and On-Call 285
    • 15.17 An Incident, Traced End to End 286
    • 15.18 A Worked System: A Multi-Agent Operations Center 287
    • 15.19 Trade-offs and Mistakes 289
    • 15.20 Lessons Learned, and Where We Go Next 290
  • 16 The Three-Year Horizon 292
    • 16.1 The Maturity Arc 293
    • 16.2 What Each Phase Unlocks and Gates On 294
    • 16.3 Migration Paths Between Phases 296
    • 16.4 The Honest Ceiling 297
    • 16.5 Hands-On: Locate Yourself and Pick the Next Gate 298
    • 16.6 Operational Lessons for the Long Haul 300
    • 16.7 Deployment Archetypes: The Arc in Four Contexts 300
    • 16.8 A Worked System: An Enterprise Research Organization 302
    • 16.9 How the Team Evolves with the Fleet 304
    • 16.10 How Maturity Regresses: Migration Failure Modes 305
    • 16.11 A Three-Year Trajectory, Traced 306
    • 16.12 Component Lifecycle: Versioning, Compatibility, and Retirement 307
    • 16.13 A Dated Snapshot, So You Can Measure the Drift 308
    • 16.14 Standing Review Triggers 309
    • 16.15 Closing: Orchestrate, Don't Replace 310
