Codex CLI

Agentic Engineering from First Principles

The definitive guide to agentic software engineering with Codex CLI, from prompting fundamentals to multi-agent orchestration, CI/CD integration, and enterprise deployment across 32 hands-on chapters.

Minimum price: $19.00 (default price shown: $29.00)

Formats: PDF, EPUB, WEB

About the Book

Codex CLI is a comprehensive guide to agentic software engineering with OpenAI's command-line coding agent. Across 32 chapters, you'll move from first principles to advanced orchestration patterns, covering prompting, AGENTS.md configuration, MCP servers, hooks, skills, sub-agents, worktrees, CI/CD integration, security hardening, and enterprise deployment.

Whether you're a solo developer looking to multiply your output or an engineering lead rolling out agentic workflows across a team, this book gives you the mental models and practical techniques to work effectively with AI coding agents.

Written by Daniel Vaughan and drawing on real-world experience and community insights, the book includes learning objectives, worked examples, and hands-on exercises in every chapter.

Table of Contents

Copyright and Trademarks

  1. Trademarks
  2. Disclaimer

Introduction

  1. I Started This on a Train
  2. What Changed My Mind
  3. Who This Book Is For
  4. What You Won’t Find Here
  5. How This Book Is Organised
  6. A Note on Version Currency
  7. How to Use This Book
  8. Notes
  9. Getting Your Bearings

Chapter 1. What Is Codex CLI?

  1. Learning Objectives
  2. Before Codex: A Brief History
  3. The Problem with Autocomplete
  4. Pre-Codex Agentic Landscape
  5. Defining Codex CLI
  6. Why the Terminal Matters
  7. The Core Proposition
  8. Summary
  9. Exercises
  10. Notes

Chapter 2. Getting Started with Codex CLI

  1. Learning Objectives
  2. Prerequisites
  3. Installation
  4. Authentication
  5. Your First Session
  6. Interactive REPL vs. codex exec
  7. Where Configuration Lives
  8. Summary
  9. Exercises
  10. Notes

Chapter 3. What’s New in Codex CLI

  1. Learning Objectives
  2. The Three Themes of 2026
  3. The Model Landscape
  4. Subagents: Generally Available
  5. The Hooks System
  6. New CLI Features
  7. Enterprise Features
  8. Staying Current
  9. Summary
  10. Exercises
  11. Notes

Chapter 4. Benchmarks and Real-World Performance

  1. Learning Objectives
  2. The Benchmark Landscape
  3. SWE-bench: Gold Standard to Cautionary Tale
  4. Terminal-Bench 2.0: CLI-First Benchmarking
  5. The Scaffolding Effect
  6. What the Numbers Actually Mean for Your Team
  7. Running Your Own Benchmarks
  8. The Benchmark Hierarchy
  9. Summary
  10. Exercises
  11. Notes

Chapter 5. Competing Tools and When to Use Each

  1. Learning Objectives
  2. The Two-Tier Landscape
  3. Terminal Tier: The CLI Agents
  4. IDE Tier: The Editor Agents
  5. The Decision Framework
  6. The Multi-Tool Pattern
  7. Summary
  8. Exercises
  9. Notes

Chapter 6. Codex in the Wild: Interfaces and Community

  1. Learning Objectives
  2. Usage Limits: The Hidden Variable
  3. What the Benchmarks Actually Tell You
  4. The Core Personality Difference
  5. Practical Handoff Patterns
  6. The Power Stack: Using Both
  7. Team Adoption Patterns
  8. Interfaces: Desktop, CLI, and IDE
  9. Summary
  10. Exercises
  11. Notes

Part: Foundations

Chapter 7. Prompting Codex CLI Effectively

  1. Learning Objectives
  2. Why Codex CLI Prompting Is Different
  3. The Anatomy of an Effective Prompt
  4. Task Scoping and Reasoning Effort
  5. Iterative Prompting and Mid-Session Corrections
  6. Prompt Patterns for Common Tasks
  7. Moving Durable Context Out of Prompts
  8. Summary
  9. Exercises
  10. Notes

Chapter 8. AGENTS.md: Patterns and Pitfalls

  1. Learning Objectives
  2. What AGENTS.md Is and How Codex CLI Reads It
  3. Essential Sections: Commits, Testing, Style
  4. Project-Specific Context: What to Include and Omit
  5. Common Mistakes and How They Manifest
  6. AGENTS.md for Monorepos and Multi-Service Repos
  7. Testing and Validating Your AGENTS.md
  8. What Not to Do: Common AGENTS.md Mistakes
  9. Starter Template
  10. Summary
  11. Exercises
  12. Notes

Chapter 9. Approval Modes and Trust Boundaries

  1. Learning Objectives
  2. The Trust Model: What Codex Can Touch
  3. The Four Approval Modes
  4. Auto-Approve: When to Use It and When Not To
  5. Sandboxing: Filesystem and Network Restrictions
  6. Kernel-Level vs. Hook-Based Sandboxing
  7. Approval Mode Strategy for Teams
  8. Summary
  9. Exercises
  10. Notes

Chapter 10. Debugging and Diagnosing Agent Failures

  1. Learning Objectives
  2. Reading the Session Transcript
  3. Using Approval Mode as a Diagnostic
  4. Diagnosing AGENTS.md Failures
  5. Context Overflow Symptoms
  6. Recovering a Runaway Session
  7. Structured Logging and --debug
  8. Summary
  9. Exercises
  10. Notes

Chapter 11. Model Selection and Reasoning Effort

  1. Learning Objectives
  2. The Available Models
  3. Reasoning Effort: The Second Knob
  4. Task Taxonomy: Matching Model and Effort to Task
  5. Cost Modelling: Estimating Monthly Spend
  6. Model Selection in Automated Pipelines
  7. Part 2 Summary
  8. Summary
  9. Exercises
  10. Notes

Part: The Extension Stack

Chapter 12. MCP: Consuming and Serving

  1. Learning Objectives
  2. What MCP Is—and What It Isn’t
  3. The Architecture: Hosts, Clients, and Servers
  4. Connecting Codex CLI to MCP Servers
  5. Connecting to Common Servers (GitHub, Browser, Database)
  6. The Context Cost of MCP: What Gets Loaded
  7. Enterprise MCP: Authentication, Scoping, and Restrictions
  8. Building a Simple MCP Server
  9. Serving MCP from Codex
  10. Codex CLI as an MCP Server
  11. Beyond Read-Only: Write-Back Integration
  12. Ticketing System Integration (Jira, Linear)
  13. Communication Platform Integration (Slack, Teams)
  14. Bidirectional Database Patterns
  15. Safety Boundaries for Write-Enabled Agents
  16. Summary
  17. Exercises
  18. Notes

Chapter 13. Hooks: Intercepting the Agent Lifecycle

  1. Learning Objectives
  2. The Hook System: Overview and Events
  3. SessionStart: Configuring the Environment
  4. UserPromptSubmit: Shaping Input Before the Agent Acts
  5. PreToolUse: Intercepting Shell Commands
  6. PostToolUse: Observing Without Blocking
  7. Stop: Teardown and Cleanup
  8. Writing Robust Hooks
  9. Real Hook Patterns: Enforcement, Audit, and Notification
  10. Summary
  11. Exercises
  12. Notes

Chapter 14. The Skills Ecosystem: Using and Writing Skills

  1. Learning Objectives
  2. Part 1: The Consumer’s View — Using and Browsing the Ecosystem
  3. Part 2: The Producer’s View — Writing Your Own Skills
  4. Summary
  5. Exercises
  6. Notes

Part: Scale and Automation

Chapter 15. Context Window Management

  1. Learning Objectives
  2. The Quadratic Growth Problem
  3. Thread Resume and Fork: Preserving Context Without Restarting
  4. What Consumes Context (and What Doesn’t)
  5. The /compact Command and Automatic Summarisation
  6. Sub-Agent Delegation as Context Management
  7. Strategies for Large Codebases
  8. Monitoring Context Usage
  9. The Model Lineage Context Compaction Breakthrough
  10. Prompt Caching: Economics of Long Sessions
  11. Summary
  12. Exercises
  13. Notes

Chapter 16. Sub-Agents and Parallel Execution

  1. Learning Objectives
  2. The Sub-Agent Model
  3. The TOML Subagent Definition Format
  4. Task Decomposition: What to Parallelise
  5. spawn_agents_on_csv: Fan-Out Patterns
  6. Aggregating Results and Handling Failures
  7. Path-Based Sub-Agent Addressing
  8. When Not to Parallelise
  9. Summary
  10. Exercises
  11. Notes

Chapter 17. Cost Management and Quota Strategy

  1. Learning Objectives
  2. How Codex CLI Quota Works
  3. Estimating Team Costs
  4. Configuring Cost Ceilings
  5. Monitoring and Alerting with Hooks
  6. Cost-Quality Decision Matrix
  7. Summary
  8. Exercises
  9. Notes

Chapter 18. Multi-Agent Orchestration Patterns

  1. Learning Objectives
  2. Pattern 1: Sequential Gated Chain
  3. Pattern 2: Parallel Worker Swarm
  4. Pattern 3: Wave-Based Hybrid
  5. Choosing the Right Pattern
  6. Debugging Orchestration Failures
  7. Orchestration Anti-Patterns to Avoid
  8. Summary
  9. Exercises
  10. Notes

Chapter 19. Worktrees and Isolated Execution

  1. Learning Objectives
  2. Git Worktrees: A Brief Recap
  3. Why Worktrees Matter for Agentic Workflows
  4. One Agent, One Worktree: The Isolation Principle
  5. Worktrees in the Codex Desktop App
  6. CLI Worktree Workflows
  7. Merging Agent Work Back to Main
  8. Worktree Lifecycle in CI
  9. Worktree Workflow Patterns by Team Size
  10. Summary
  11. Exercises
  12. Notes

Chapter 20. CI/CD Integration

  1. Learning Objectives
  2. Running Codex in Non-Interactive Mode
  3. The openai/codex-action GitHub Action
  4. Automated Code Review on Every PR
  5. Test Generation on Merge
  6. Dependency Update Agents
  7. Structured Output and Session Resume
  8. Safety Strategies for CI Agents
  9. Summary
  10. Exercises
  11. Notes

Chapter 21. Security Hardening

  1. Learning Objectives
  2. The Threat Model for Agentic Systems
  3. Prompt Injection: Attack Patterns and Defences
  4. Filesystem Restrictions and Sandboxing
  5. Network Allowlisting
  6. Secret Management for Agent Environments
  7. Audit Logging and Observability
  8. Compliance Considerations
  9. Summary
  10. Exercises
  11. Notes

Chapter 22. Enterprise Deployment

  1. Learning Objectives
  2. Distributing config.toml at Scale
  3. RBAC: Three Admin Roles and Access Control
  4. Managed Policies: requirements.toml
  5. AGENTS.override.md: Enforcing Policy Across Teams
  6. Onboarding an Engineering Team
  7. Measuring ROI: Metrics That Work
  8. Codex Cloud vs Self-Hosted
  9. Governance Frameworks for Agentic AI
  10. The Compliance API
  11. Rollout Checklist
  12. Summary
  13. Exercises
  14. Notes

Chapter 23. Testing and Evaluation Strategy for Agentic Workflows

  1. Learning Objectives
  2. What Makes a Test Suite Agent-Friendly
  3. Designing for Agent Execution
  4. The Feedback Signal Problem
  5. Using Codex CLI to Audit Your Test Suite
  6. Evaluation Beyond Unit Tests
  7. Building an Evaluation Harness
  8. TDD as an Agent Feedback Loop
  9. The 4-File Durable Memory Pattern for Long-Horizon Evaluation
  10. Summary
  11. Exercises
  12. Notes

Part: Specialised Workflows

Chapter 24. AI Code Review

  1. Learning Objectives
  2. Why AI Code Review Works (and Where It Doesn’t)
  3. Configuring Codex for Code Review
  4. The /review Command
  5. PR Integration: Automated Review on Every PR
  6. Writing Review Checklists in AGENTS.md
  7. Human-AI Review Collaboration Patterns
  8. Summary
  9. Exercises
  10. Notes

Chapter 25. Frontend Engineering with React and TypeScript

  1. Learning Objectives
  2. Frontend-Specific AGENTS.md Configuration
  3. Component Generation and Scaffolding
  4. Test Generation for React Components
  5. The Explorer/Worker Sub-Agent Pattern
  6. Accessibility Audit Automation
  7. Design-to-Code Workflows
  8. Summary
  9. Exercises
  10. Notes

Chapter 26. Python Team Workflows

  1. Learning Objectives
  2. Python-Specific AGENTS.md
  3. Pytest Integration and Test Generation
  4. Type Hints and Docstring Automation
  5. uv, ruff, and Modern Python Toolchain
  6. Data Pipeline Code Generation
  7. Multi-Service Python Workflows
  8. Summary
  9. Exercises
  10. Notes

Chapter 27. Web Search and Research Agents

  1. Learning Objectives
  2. Enabling Web Search in Codex CLI
  3. Research-to-Code Workflows
  4. Dependency Research and Evaluation
  5. Staying Current with API Changes
  6. Knowledge-Augmented Agents: MCP Knowledge Servers
  7. Combining Web Search with Sub-Agents
  8. Summary
  9. Exercises
  10. Notes

Chapter 28. Codebase Migration

  1. Learning Objectives
  2. Why Codex CLI Excels at Migration Work
  3. Planning the Migration with Codex CLI
  4. Incremental Migration Patterns
  5. Validation Strategies During Migration
  6. Migrating from Claude Code to Codex CLI
  7. Framework and Language Version Migrations
  8. Summary
  9. Exercises
  10. Notes

Part: Architecture and Vision

Chapter 29. The Agents SDK

  1. Learning Objectives
  2. What the Agents SDK Adds
  3. The SDK Architecture: Agents, Handoffs, Tools, and Guardrails
  4. Codex CLI as an MCP Server in SDK Pipelines
  5. Building a Designer-Developer-Tester Pipeline
  6. Tracing and Observability
  7. SDK vs CLI: Choosing the Right Level
  8. TypeScript SDK
  9. Summary
  10. Exercises
  11. Notes

Chapter 30. Agentic Primitives Compared

  1. Learning Objectives
  2. The Four Primitives: Agents, Handoffs, Tools, Guardrails
  3. How Codex CLI Implements Each Primitive
  4. LangChain and LangGraph
  5. AutoGen and CrewAI
  6. Google Gemini Agents and ADK
  7. Choosing a Primitives Model
  8. Codex CLI’s Custom Agent TOML in Practice
  9. Extending the Primitives: Custom Agents in TOML vs Code
  10. Summary
  11. Exercises
  12. Notes

Chapter 31. Harness Engineering for Long-Running Agents

  1. Learning Objectives
  2. What Harness Engineering Is
  3. The WORKFLOW.md Pattern
  4. State Persistence Across Agent Runs
  5. Error Recovery and Resumption
  6. The Proof-of-Work Principle
  7. Symphony: A Harness in Practice
  8. Summary
  9. Exercises
  10. Notes

Chapter 32. The Agentic Engineering Pod

  1. Learning Objectives
  2. Section 1: Why Three Roles?
  3. Section 2: The Context Architect: Human Role
  4. Section 3: The Value Engineer: Human Role
  5. Section 4: The Quality Engineer: Human Role
  6. Section 5: The Pod in Practice: A Feature Lifecycle
  7. Section 6: The Agentic Pod Principles
  8. Section 7: Failure Modes and Pod Anti-Patterns
  9. Closing
  10. Exercises
  11. Notes

Conclusion: The Agentic Engineer

A Note from the Author, Or Rather, the Tool

  1. How This Book Was Made
  2. What This Means for You
  3. The Book as Artefact
  4. An Invitation

Bibliography

  1. Research Papers
  2. OpenAI Sources
  3. Standards and Security
  4. Frameworks and Protocols
  5. Benchmarks and Evaluations
  6. Developer Tooling
  7. Industry Reports and Blogs
