Yohan is a Senior Full-Stack Software Engineer with extensive experience delivering scalable, end-to-end software solutions across web, enterprise, and cloud-based environments. He specializes in architecting robust platforms, modernizing legacy systems, driving cloud transformation efforts, and building integration-heavy applications that support critical business workflows. He is recognized for translating complex requirements into reliable, maintainable, and high-value solutions across industries such as insurance, cybersecurity, and professional services.

Known for combining strong technical execution with a practical business mindset, he has contributed to projects from concept and design through production delivery and long-term support. His experience includes collaborating with cross-functional teams, improving development workflows, solving complex technical challenges, and helping organizations deliver dependable software products that adapt to changing business needs. He brings a balanced approach to engineering that values quality, efficiency, and continuous improvement.

Preface i
1 The Multi-Tool Problem 2
- 1.1 The Year Everyone Got a Fleet 2
- 1.2 A Tuesday With Four Agents 4
- 1.3 Five Things Nothing Owns 5
- 1.4 Why "Which Agent Is Best?" Is the Wrong Question 7
- 1.5 Why the Problem Stayed Invisible 8
- 1.6 What an Operating Layer Would Do 9
- 1.7 Hands-On: Audit Your Own Agent Stack 9
- 1.8 The Operator's Mindset 9
- 1.9 What This Book Builds 11
2 Orchestrate, Don't Replace 13
- 2.1 How the Command-Line Agent Got Its Shape 13
- 2.2 The Limitation That Is the Opening 14
- 2.3 Agents Have a Back Door: Headless Mode 15
- 2.4 Two Planes: Control and Execution 17
- 2.5 A Task, Traced Through Both Planes 19
- 2.6 Operate, Don't Replace 21
- 2.7 What the Control Plane Actually Is 21
- 2.8 Hands-On: The Seed of Delegation 22
- 2.9 Why This Scales, and Where We Go Next 26
- 2.10 Implementation Guidance: One Adapter Per Agent 26
- 2.11 Where the Control Plane Runs 28
- 2.12 Beyond One Coordinator: Governance and Scale 28
- 2.13 Mistakes to Avoid 29
3 The Agent Landscape, 2026 32
- 3.1 How to Read an Agent 34
- 3.2 Claude Code 36
- 3.3 Codex CLI 36
- 3.4 Antigravity CLI 37
- 3.5 Aider and OpenCode: The Local-First Pair 38
- 3.6 Local Models as an Execution Tool 38
- 3.7 The Capability Matrix 39
- 3.8 Hands-On: The Same Task Through Two Harnesses 41
- 3.9 A Routing Scenario: One Day, Five Tasks 42
- 3.10 Onboarding a New Agent 43
- 3.11 Harness Migration and Version Management 45
- 3.12 Provider Independence and Exit Strategy 46
- 3.13 Trade-offs and Routing Implications 48
- 3.14 Mistakes to Avoid 49
- 3.15 Where the Landscape Is Going, and Where We Go Next 50
4 Hermes as the Control Plane 52
- 4.1 Why a Daemon, Not a Command 52
- 4.2 The Anatomy of a Control Plane 55
- 4.3 The Agent Loop 55
- 4.4 Memory as Layers 56
- 4.5 The Terminal Backend Is a Config Knob 57
- 4.6 Approvals and the Hardline Blocklist 57
- 4.7 Gateways and the Scheduler 59
- 4.8 Configuration: The Knobs That Matter 59
- 4.9 Hands-On: Reading a Control Plane's State 61
- 4.10 A Request's Journey Through the Control Plane 63
- 4.11 Implementation Guidance: Adopt or Build 65
- 4.12 Availability and Multi-Environment Operation 65
- 4.13 Change Management and Disaster Recovery 66
- 4.14 Operational Considerations 67
- 4.15 Trade-offs 68
- 4.16 Mistakes to Avoid 68
- 4.17 Lessons Learned, and Where We Go Next 69
5 Delegation via Headless CLIs 71
- 5.1 What a Delegation Skill Is 71
- 5.2 The Brief: Context Assembly 72
- 5.3 The Delegation Sequence 73
- 5.4 Capturing and Validating Structured Output 76
- 5.5 Writing the Outcome Back 76
- 5.6 Auth Isolation and the Exfiltration Channel 77
- 5.7 A Delegation, Traced, and One That Fails 78
- 5.8 Implementation Guidance and Operational Considerations 79
- 5.9 Patterns of Delegation 81
- 5.10 Long-Running and Asynchronous Delegation 82
- 5.11 Advanced Delegation Patterns 83
- 5.12 The Limits of the Shell-Out Primitive 84
- 5.13 Trade-offs 85
- 5.14 Mistakes to Avoid 87
- 5.15 Lessons Learned, and Where We Go Next 87
6 The Routing Engine 90
- 6.1 The Four Axes 90
- 6.2 The Three Outcomes 91
- 6.3 The Decision and the Hard Rule 92
- 6.4 The Router in the Loop 94
- 6.5 Classifying the Task, Choosing the Agent 96
- 6.6 Routing Under Uncertainty 98
- 6.7 Worked Routing: Nine Real Workloads 100
- 6.8 Routing on Evidence 101
- 6.9 A Week of Routing, Reviewed 102
- 6.10 Implementation Guidance and Operational Considerations 103
- 6.11 Routing at Scale: Tables as Code, and Learned Routing 104
- 6.12 Trade-offs 105
- 6.13 Mistakes to Avoid 105
- 6.14 Lessons Learned, and Where We Go Next 106
7 Memory Across Harnesses 107
- 7.1 Why Memory Is the Moat 107
- 7.2 The Layered Model, Revisited in Depth 108
- 7.3 The Bounded Core: What Earns a Permanent Seat 110
- 7.4 What a Memory Record Contains 110
- 7.5 Write Policy: Archive, Propose, Curate 111
- 7.6 Cross-Harness Write-Back 113
- 7.7 Relevance Retrieval: The Read Path 115
- 7.8 Measuring Retrieval Quality 117
- 7.9 Hands-On: Designing a Memory Schema 118
- 7.10 Portability Across Harnesses 119
- 7.11 A Bad Memory, Traced and Repaired 120
- 7.12 Memory Hygiene: Drift, Poisoning, and Pruning 121
- 7.13 Memory at Scale: Sharding, Federation, and Conflict 122
- 7.14 Implementation Guidance and Trade-offs 123
- 7.15 Mistakes to Avoid 124
- 7.16 Lessons Learned, and Where We Go Next 125
8 Skills as a Capability Registry 127
- 8.1 A Skill Is a Procedure on Disk 127
- 8.2 The Shape of a Complete Skill 128
- 8.3 The Three Loading Levels 129
- 8.4 Matching and Debugging the Registry 131
- 8.5 The Three Classes of Skill 131
- 8.6 Hands-On: Classify Your Procedures 133
- 8.7 From Repeated Procedure to Trusted Skill 133
- 8.8 Testing, Versioning, and Deprecation 137
- 8.9 Permissions and Environment 138
- 8.10 Agent-Authored Skills, and the Self-Improvement Claim 138
- 8.11 Community Skills Are Code Paths 140
- 8.12 Implementation Guidance and Operational Considerations 141
- 8.13 Trade-offs and Mistakes to Avoid 141
- 8.14 Lessons Learned, and Where We Go Next 143
9 MCP: The Shared Tool Layer 145
- 9.1 What MCP Is 145
- 9.2 MCP as the Device-Driver Layer 146
- 9.3 The Wiring 146
- 9.4 Registration: Making a Server Real 148
- 9.5 The Protocol Shape: Discover, Invoke, Validate 148
- 9.6 Designing a Tool Surface: Read Versus Write 149
- 9.7 Hands-On: Map a Server You Would Build 151
- 9.8 Authentication and Token Scope 151
- 9.9 Output Validation and Injection Handling 152
- 9.10 Routing Writes Through One Seam 152
- 9.11 Coordinator Access Versus Delegated Access 154
- 9.12 The Server Landscape: Existing and Proposed 154
- 9.13 A Memory Server, Made Concrete 155
- 9.14 A Real Workflow: Build, Verify, Publish 155
- 9.15 The Security Surface 157
- 9.16 When the Shared Layer Fails: MCP Outages and Degradation 158
- 9.17 Protocol Versioning, Transport, and Mixed-Version Fleets 159
- 9.18 Implementation Guidance, Trade-offs, and Mistakes 161
- 9.19 Lessons Learned, and Where We Go Next 162
10 Local vs Cloud Routing 164
- 10.1 Why Local-First Is the Default 164
- 10.2 The Privacy Classes, Made Operational 165
- 10.3 The Secrets Boundary 166
- 10.4 The Three Variables: Cost, Latency, Quality 168
- 10.5 The Budget and Audit Chokepoint 170
- 10.6 The Local-vs-Cloud Decision Flow 174
- 10.7 Hands-On: Classify Your Task Mix 175
- 10.8 A Week of Local-vs-Cloud Routing, Reviewed 177
- 10.9 Advanced Routing: Cascades, Fallbacks, and Speculation 177
- 10.10 Capacity Planning for Local Inference 178
- 10.11 Deployment Scenarios: Where the Boundary Falls 180
- 10.12 A Routing Failure, Traced 180
- 10.13 Operating It: Monitoring, Debugging, Scaling 181
- 10.14 Implementation Guidance, Trade-offs, and Mistakes 182
- 10.15 Lessons Learned, and Where We Go Next 183
11 Sandboxing and Guardrails 186
- 11.1 The Threat Model 186
- 11.2 The Layered Model, Made Precise 190
- 11.3 The Isolation Spectrum: Terminal Backends 191
- 11.4 Approval Modes: Manual, Smart, Off 193
- 11.5 The Hardline Blocklist 194
- 11.6 The Pre-Exec Scanner 194
- 11.7 The Cross-Cutting Controls: Kill-Switch, Caps, Audit, Secrets 196
- 11.8 The Pre-Flight Checklist for Autonomous Operation 197
- 11.9 Multi-Agent Considerations 198
- 11.10 A Contained Failure, Traced 199
- 11.11 Advanced Isolation Architectures 200
- 11.12 Guardrails at Fleet Scale: Policy as Code 200
- 11.13 Identity, Secrets, and Credential Lifecycle 201
- 11.14 Audit Trails and Compliance for Regulated Operation 203
- 11.15 A Guardrail Failure, Traced 204
- 11.16 Operating It: Monitoring, Debugging, Maintenance 205
- 11.17 Trade-offs and Mistakes 206
- 11.18 Lessons Learned, and Where We Go Next 206
12 Scheduling and the Work Queue 208
- 12.1 From One-Shot to Standing Operation 208
- 12.2 The Work Queue as the Coordination Primitive 209
- 12.3 Claim and Lock: The Heart of Multi-Agent Safety 210
- 12.4 Status Reporting and Observability 214
- 12.5 Observability: Tracing a Request Across the Fleet 215
- 12.6 Human-Escalation Gates 216
- 12.7 Deduplication and Idempotency 217
- 12.8 Wiring the Budget Contract 218
- 12.9 A Multi-Agent Scheduled Workflow, Traced 218
- 12.10 Backfill, Catch-Up, and the Cold Start 219
- 12.11 Advanced Queue Architectures 220
- 12.12 Hierarchical and Supervisor Orchestration 221
- 12.13 A Scheduling Failure, Traced 222
- 12.14 Scaling the Fleet: 10 to 1,000 Agents 223
- 12.15 Capacity Math: Utilization, Queue Depth, and Little's Law 224
- 12.16 Operating It: Monitoring, Debugging, Scaling 226
- 12.17 Trade-offs and Mistakes 227
- 12.18 Lessons Learned, and Where We Go Next 227
13 A Worked Orchestration 229
- 13.1 The Factory as an Orchestration 229
- 13.2 Routing a Feature Through the Pipeline 230
- 13.3 Where Every Mechanism Appears 233
- 13.4 The Compile-Verify Loop in Depth 233
- 13.5 Testing the Orchestration: A Test Pyramid for Agent Systems 236
- 13.6 The Human Gates 237
- 13.7 Hands-On: Trace Your Own Pipeline 238
- 13.8 A Run That Went Wrong, and Recovered 238
- 13.9 The Same Shape, Other Factories 239
- 13.10 A Second Worked Case: A Software-Release Factory 240
- 13.11 A Third Worked Case: A Customer-Support Fleet 241
- 13.12 Cross-Team and Multi-Repo Orchestration 243
- 13.13 When Orchestration Is the Wrong Choice 245
- 13.14 Operating the Factory: Monitoring, Debugging, Scaling 245
- 13.15 Trade-offs, Mistakes, and Lessons 247
- 13.16 Lessons Learned, and Where We Go Next 247
14 Budgets and Spend Governance 249
- 14.1 The Accounting Seam 249
- 14.2 Per-Task-Class Cost Envelopes 251
- 14.3 The Cloud-versus-Local Break-Even 253
- 14.4 Fleet-Wide Caps and the Runaway 255
- 14.5 Cost Curves and the Shape of Fleet Spend 256
- 14.6 The Cost of Human Review 256
- 14.7 A Cost Explosion, Traced 257
- 14.8 FinOps for Agent Fleets: Chargeback and Incentives at Scale 258
- 14.9 What the Expensive Model Should Cost 259
- 14.10 Hands-On: Build a Cost Model for Your Task Mix 260
- 14.11 Showback: Attributing Cost to Projects and Owners 260
- 14.12 A Month of Spend, Reviewed 261
- 14.13 Caching, Token Efficiency, and Performance 262
- 14.14 Reasoning About the Numbers: Cost and Latency Models 263
- 14.15 Operating It: Monitoring, Forecasting, Maintenance 265
- 14.16 Trade-offs and Mistakes 266
- 14.17 Lessons Learned, and Where We Go Next 266
15 Failure and Escalation 268
- 15.1 Why Standing Fleets Fail Differently 268
- 15.2 Failure Mode: Collisions 269
- 15.3 Failure Mode: Compounding Error 270
- 15.4 Failure Mode: Poisoned Memory 270
- 15.5 Failure Mode: Prompt Injection 271
- 15.6 Failure Mode: Runaway 271
- 15.7 Advanced Failure Modes: Cascades, Correlation, and Drift 272
- 15.8 A Cascading Incident, Fully Traced 273
- 15.9 Evaluation: Measuring Agent Quality Over Time 274
- 15.10 Chaos Testing for Agent Fleets 276
- 15.11 The Escalation Discipline 277
- 15.12 Debugging an Agent in the Wild 279
- 15.13 Reproducibility and Replay 282
- 15.14 What to Never Automate 283
- 15.15 Building a Recovery Posture 284
- 15.16 Reliability Engineering: SLOs, Error Budgets, and On-Call 285
- 15.17 An Incident, Traced End to End 286
- 15.18 A Worked System: A Multi-Agent Operations Center 287
- 15.19 Trade-offs and Mistakes 289
- 15.20 Lessons Learned, and Where We Go Next 290
16 The Three-Year Horizon 292
- 16.1 The Maturity Arc 293
- 16.2 What Each Phase Unlocks and Gates On 294
- 16.3 Migration Paths Between Phases 296
- 16.4 The Honest Ceiling 297
- 16.5 Hands-On: Locate Yourself and Pick the Next Gate 298
- 16.6 Operational Lessons for the Long Haul 300
- 16.7 Deployment Archetypes: The Arc in Four Contexts 300
- 16.8 A Worked System: An Enterprise Research Organization 302
- 16.9 How the Team Evolves with the Fleet 304
- 16.10 How Maturity Regresses: Migration Failure Modes 305
- 16.11 A Three-Year Trajectory, Traced 306
- 16.12 Component Lifecycle: Versioning, Compatibility, and Retirement 307
- 16.13 A Dated Snapshot, So You Can Measure the Drift 308
- 16.14 Standing Review Triggers 309
- 16.15 Closing: Orchestrate, Don't Replace 310

Orchestrating AI Agents
