Copyright and Trademarks
- Trademarks
- Disclaimer
Introduction
- I Started This on a Train
- What Changed My Mind
- Who This Book Is For
- What You Won’t Find Here
- How This Book Is Organised
- A Note on Version Currency
- How to Use This Book
- Notes
- Getting Your Bearings
Chapter 1. What Is Codex CLI?
- Learning Objectives
- Before Codex: A Brief History
- The Problem with Autocomplete
- Pre-Codex Agentic Landscape
- Defining Codex CLI
- Why the Terminal Matters
- The Core Proposition
- Summary
- Exercises
- Notes
Chapter 2. Getting Started with Codex CLI
- Learning Objectives
- Prerequisites
- Installation
- Authentication
- Your First Session
- Interactive REPL vs. codex exec
- Where Configuration Lives
- Summary
- Exercises
- Notes
Chapter 3. What’s New in Codex CLI
- Learning Objectives
- The Three Themes of 2026
- The Model Landscape
- Subagents: Generally Available
- The Hooks System
- New CLI Features
- Enterprise Features
- Staying Current
- Summary
- Exercises
- Notes
Chapter 4. Benchmarks and Real-World Performance
- Learning Objectives
- The Benchmark Landscape
- SWE-bench: Gold Standard to Cautionary Tale
- Terminal-Bench 2.0: CLI-First Benchmarking
- The Scaffolding Effect
- What the Numbers Actually Mean for Your Team
- Running Your Own Benchmarks
- The Benchmark Hierarchy
- Summary
- Exercises
- Notes
Chapter 5. Competing Tools and When to Use Each
- Learning Objectives
- The Two-Tier Landscape
- Terminal Tier: The CLI Agents
- IDE Tier: The Editor Agents
- The Decision Framework
- The Multi-Tool Pattern
- Summary
- Exercises
- Notes
Chapter 6. Codex in the Wild: Interfaces and Community
- Learning Objectives
- Usage Limits: The Hidden Variable
- What the Benchmarks Actually Tell You
- The Core Personality Difference
- Practical Handoff Patterns
- The Power Stack: Using Both
- Team Adoption Patterns
- Interfaces: Desktop, CLI, and IDE
- Summary
- Exercises
- Notes
- Foundations
Chapter 7. Prompting Codex CLI Effectively
- Learning Objectives
- Why Codex CLI Prompting Is Different
- The Anatomy of an Effective Prompt
- Task Scoping and Reasoning Effort
- Iterative Prompting and Mid-Session Corrections
- Prompt Patterns for Common Tasks
- Moving Durable Context Out of Prompts
- Summary
- Exercises
- Notes
Chapter 8. AGENTS.md: Patterns and Pitfalls
- Learning Objectives
- What AGENTS.md Is and How Codex CLI Reads It
- Essential Sections: Commits, Testing, Style
- Project-Specific Context: What to Include and Omit
- Common Mistakes and How They Manifest
- AGENTS.md for Monorepos and Multi-Service Repos
- Testing and Validating Your AGENTS.md
- What Not to Do: Common AGENTS.md Mistakes
- Starter Template
- Summary
- Exercises
- Notes
Chapter 9. Approval Modes and Trust Boundaries
- Learning Objectives
- The Trust Model: What Codex Can Touch
- The Four Approval Modes
- Auto-Approve: When to Use It and When Not To
- Sandboxing: Filesystem and Network Restrictions
- Kernel-Level vs. Hook-Based Sandboxing
- Approval Mode Strategy for Teams
- Summary
- Exercises
- Notes
Chapter 10. Debugging and Diagnosing Agent Failures
- Learning Objectives
- Reading the Session Transcript
- Using Approval Mode as a Diagnostic
- Diagnosing AGENTS.md Failures
- Context Overflow Symptoms
- Recovering a Runaway Session
- Structured Logging and --debug
- Summary
- Exercises
- Notes
Chapter 11. Model Selection and Reasoning Effort
- Learning Objectives
- The Available Models
- Reasoning Effort: The Second Knob
- Task Taxonomy: Matching Model and Effort to Task
- Cost Modelling: Estimating Monthly Spend
- Model Selection in Automated Pipelines
- Part 2 Summary
- Summary
- Exercises
- Notes
- The Extension Stack
Chapter 12. MCP: Consuming and Serving
- Learning Objectives
- What MCP Is—and What It Isn’t
- The Architecture: Hosts, Clients, and Servers
- Connecting Codex CLI to MCP Servers
- Connecting to Common Servers (GitHub, Browser, Database)
- The Context Cost of MCP: What Gets Loaded
- Enterprise MCP: Authentication, Scoping, and Restrictions
- Building a Simple MCP Server
- Serving MCP from Codex
- Codex CLI as an MCP Server
- Beyond Read-Only: Write-Back Integration
- Ticketing System Integration (Jira, Linear)
- Communication Platform Integration (Slack, Teams)
- Bidirectional Database Patterns
- Safety Boundaries for Write-Enabled Agents
- Summary
- Exercises
- Notes
Chapter 13. Hooks: Intercepting the Agent Lifecycle
- Learning Objectives
- The Hook System: Overview and Events
- SessionStart: Configuring the Environment
- UserPromptSubmit: Shaping Input Before the Agent Acts
- PreToolUse: Intercepting Shell Commands
- PostToolUse: Observing Without Blocking
- Stop: Teardown and Cleanup
- Writing Robust Hooks
- Real Hook Patterns: Enforcement, Audit, and Notification
- Summary
- Exercises
- Notes
Chapter 14. The Skills Ecosystem: Using and Writing Skills
- Learning Objectives
- Part 1: The Consumer’s View — Using and Browsing the Ecosystem
- Part 2: The Producer’s View — Writing Your Own Skills
- Summary
- Exercises
- Notes
- Scale and Automation
Chapter 15. Context Window Management
- Learning Objectives
- The Quadratic Growth Problem
- Thread Resume and Fork: Preserving Context Without Restarting
- What Consumes Context (and What Doesn’t)
- The /compact Command and Automatic Summarisation
- Sub-Agent Delegation as Context Management
- Strategies for Large Codebases
- Monitoring Context Usage
- The Model Lineage Context Compaction Breakthrough
- Prompt Caching: Economics of Long Sessions
- Summary
- Exercises
- Notes
Chapter 16. Sub-Agents and Parallel Execution
- Learning Objectives
- The Sub-Agent Model
- The TOML Subagent Definition Format
- Task Decomposition: What to Parallelise
- spawn_agents_on_csv: Fan-Out Patterns
- Aggregating Results and Handling Failures
- Path-Based Sub-Agent Addressing
- When Not to Parallelise
- Summary
- Exercises
- Notes
Chapter 17. Cost Management and Quota Strategy
- Learning Objectives
- How Codex CLI Quota Works
- Estimating Team Costs
- Configuring Cost Ceilings
- Monitoring and Alerting with Hooks
- Cost-Quality Decision Matrix
- Summary
- Exercises
- Notes
Chapter 18. Multi-Agent Orchestration Patterns
- Learning Objectives
- Pattern 1: Sequential Gated Chain
- Pattern 2: Parallel Worker Swarm
- Pattern 3: Wave-Based Hybrid
- Choosing the Right Pattern
- Debugging Orchestration Failures
- Orchestration Anti-Patterns to Avoid
- Summary
- Exercises
- Notes
Chapter 19. Worktrees and Isolated Execution
- Learning Objectives
- Git Worktrees: A Brief Recap
- Why Worktrees Matter for Agentic Workflows
- One Agent, One Worktree: The Isolation Principle
- Worktrees in the Codex Desktop App
- CLI Worktree Workflows
- Merging Agent Work Back to Main
- Worktree Lifecycle in CI
- Worktree Workflow Patterns by Team Size
- Summary
- Exercises
- Notes
Chapter 20. CI/CD Integration
- Learning Objectives
- Running Codex in Non-Interactive Mode
- The openai/codex-action GitHub Action
- Automated Code Review on Every PR
- Test Generation on Merge
- Dependency Update Agents
- Structured Output and Session Resume
- Safety Strategies for CI Agents
- Summary
- Exercises
- Notes
Chapter 21. Security Hardening
- Learning Objectives
- The Threat Model for Agentic Systems
- Prompt Injection: Attack Patterns and Defences
- Filesystem Restrictions and Sandboxing
- Network Allowlisting
- Secret Management for Agent Environments
- Audit Logging and Observability
- Compliance Considerations
- Summary
- Exercises
- Notes
Chapter 22. Enterprise Deployment
- Learning Objectives
- Distributing config.toml at Scale
- RBAC: Three Admin Roles and Access Control
- Managed Policies: requirements.toml
- AGENTS.override.md: Enforcing Policy Across Teams
- Onboarding an Engineering Team
- Measuring ROI: Metrics That Work
- Codex Cloud vs Self-Hosted
- Governance Frameworks for Agentic AI
- The Compliance API
- Rollout Checklist
- Summary
- Exercises
- Notes
Chapter 23. Testing and Evaluation Strategy for Agentic Workflows
- Learning Objectives
- What Makes a Test Suite Agent-Friendly
- Designing for Agent Execution
- The Feedback Signal Problem
- Using Codex CLI to Audit Your Test Suite
- Evaluation Beyond Unit Tests
- Building an Evaluation Harness
- TDD as an Agent Feedback Loop
- The 4-File Durable Memory Pattern for Long-Horizon Evaluation
- Summary
- Exercises
- Notes
- Specialised Workflows
Chapter 24. AI Code Review
- Learning Objectives
- Why AI Code Review Works (and Where It Doesn’t)
- Configuring Codex for Code Review
- The /review Command
- PR Integration: Automated Review on Every PR
- Writing Review Checklists in AGENTS.md
- Human-AI Review Collaboration Patterns
- Summary
- Exercises
- Notes
Chapter 25. Frontend Engineering with React and TypeScript
- Learning Objectives
- Frontend-Specific AGENTS.md Configuration
- Component Generation and Scaffolding
- Test Generation for React Components
- The Explorer/Worker Sub-Agent Pattern
- Accessibility Audit Automation
- Design-to-Code Workflows
- Summary
- Exercises
- Notes
Chapter 26. Python Team Workflows
- Learning Objectives
- Python-Specific AGENTS.md
- Pytest Integration and Test Generation
- Type Hints and Docstring Automation
- uv, ruff, and Modern Python Toolchain
- Data Pipeline Code Generation
- Multi-Service Python Workflows
- Summary
- Exercises
- Notes
Chapter 27. Web Search and Research Agents
- Learning Objectives
- Enabling Web Search in Codex CLI
- Research-to-Code Workflows
- Dependency Research and Evaluation
- Staying Current with API Changes
- Knowledge-Augmented Agents: MCP Knowledge Servers
- Combining Web Search with Sub-Agents
- Summary
- Exercises
- Notes
Chapter 28. Codebase Migration
- Learning Objectives
- Why Codex CLI Excels at Migration Work
- Planning the Migration with Codex CLI
- Incremental Migration Patterns
- Validation Strategies During Migration
- Migrating from Claude Code to Codex CLI
- Framework and Language Version Migrations
- Summary
- Exercises
- Notes
- Architecture and Vision
Chapter 29. The Agents SDK
- Learning Objectives
- What the Agents SDK Adds
- The SDK Architecture: Agents, Handoffs, Tools, and Guardrails
- Codex CLI as an MCP Server in SDK Pipelines
- Building a Designer-Developer-Tester Pipeline
- Tracing and Observability
- SDK vs CLI: Choosing the Right Level
- TypeScript SDK
- Summary
- Exercises
- Notes
Chapter 30. Agentic Primitives Compared
- Learning Objectives
- The Four Primitives: Agents, Handoffs, Tools, Guardrails
- How Codex CLI Implements Each Primitive
- LangChain and LangGraph
- AutoGen and CrewAI
- Google Gemini Agents and ADK
- Choosing a Primitives Model
- Codex CLI’s Custom Agent TOML in Practice
- Extending the Primitives: Custom Agents in TOML vs Code
- Summary
- Exercises
- Notes
Chapter 31. Harness Engineering for Long-Running Agents
- Learning Objectives
- What Harness Engineering Is
- The WORKFLOW.md Pattern
- State Persistence Across Agent Runs
- Error Recovery and Resumption
- The Proof-of-Work Principle
- Symphony: A Harness in Practice
- Summary
- Exercises
- Notes
Chapter 32. The Agentic Engineering Pod
- Learning Objectives
- Section 1: Why Three Roles?
- Section 2: The Context Architect: Human Role
- Section 3: The Value Engineer: Human Role
- Section 4: The Quality Engineer: Human Role
- Section 5: The Pod in Practice: A Feature Lifecycle
- Section 6: The Agentic Pod Principles
- Section 7: Failure Modes and Pod Anti-Patterns
- Closing
- Exercises
- Notes
Conclusion: The Agentic Engineer
A Note from the Author, Or Rather, the Tool
- How This Book Was Made
- What This Means for You
- The Book as Artefact
- An Invitation
Bibliography
- 1. Research Papers
- 2. OpenAI Sources
- 3. Standards and Security
- 4. Frameworks and Protocols
- 5. Benchmarks and Evaluations
- 6. Developer Tooling
- 7. Industry Reports and Blogs