Codex CLI

Codex CLI is the most comprehensive guide to agentic software engineering with OpenAI's command-line coding agent. Across 32 chapters, you'll move from first principles to advanced orchestration patterns — covering prompting, AGENTS.md configuration, MCP servers, hooks, skills, sub-agents, worktrees, CI/CD integration, security hardening, and enterprise deployment.

Whether you're a solo developer looking to multiply your output or an engineering lead rolling out agentic workflows across a team, this book gives you the mental models and practical techniques to work effectively with AI coding agents.

Written by Daniel Vaughan and drawing on real-world experience and community insights, the book includes learning objectives, worked examples, and hands-on exercises in every chapter.

Copyright and Trademarks

Trademarks
Disclaimer

Introduction

I Started This on a Train
What Changed My Mind
Who This Book Is For
What You Won’t Find Here
How This Book Is Organised
A Note on Version Currency
How to Use This Book
Notes

Getting Your Bearings

Chapter 1. What Is Codex CLI?

Learning Objectives
Before Codex: A Brief History
The Problem with Autocomplete
Pre-Codex Agentic Landscape
Defining Codex CLI
Why the Terminal Matters
The Core Proposition
Summary
Exercises
Notes

Chapter 2. Getting Started with Codex CLI

Learning Objectives
Prerequisites
Installation
Authentication
Your First Session
Interactive REPL vs. codex exec
Where Configuration Lives
Summary
Exercises
Notes

Chapter 3. What’s New in Codex CLI

Learning Objectives
The Three Themes of 2026
The Model Landscape
Subagents: Generally Available
The Hooks System
New CLI Features
Enterprise Features
Staying Current
Summary
Exercises
Notes

Chapter 4. Benchmarks and Real-World Performance

Learning Objectives
The Benchmark Landscape
SWE-bench: Gold Standard to Cautionary Tale
Terminal-Bench 2.0: CLI-First Benchmarking
The Scaffolding Effect
What the Numbers Actually Mean for Your Team
Running Your Own Benchmarks
The Benchmark Hierarchy
Summary
Exercises
Notes

Chapter 5. Competing Tools and When to Use Each

Learning Objectives
The Two-Tier Landscape
Terminal Tier: The CLI Agents
IDE Tier: The Editor Agents
The Decision Framework
The Multi-Tool Pattern
Summary
Exercises
Notes

Chapter 6. Codex in the Wild: Interfaces and Community

Learning Objectives
Usage Limits: The Hidden Variable
What the Benchmarks Actually Tell You
The Core Personality Difference
Practical Handoff Patterns
The Power Stack: Using Both
Team Adoption Patterns
Interfaces: Desktop, CLI, and IDE
Summary
Exercises
Notes

Foundations

Chapter 7. Prompting Codex CLI Effectively

Learning Objectives
Why Codex CLI Prompting Is Different
The Anatomy of an Effective Prompt
Task Scoping and Reasoning Effort
Iterative Prompting and Mid-Session Corrections
Prompt Patterns for Common Tasks
Moving Durable Context Out of Prompts
Summary
Exercises
Notes

Chapter 8. AGENTS.md: Patterns and Pitfalls

Learning Objectives
What AGENTS.md Is and How Codex CLI Reads It
Essential Sections: Commits, Testing, Style
Project-Specific Context: What to Include and Omit
Common Mistakes and How They Manifest
AGENTS.md for Monorepos and Multi-Service Repos
Testing and Validating Your AGENTS.md
What Not to Do: Common AGENTS.md Mistakes
Starter Template
Summary
Exercises
Notes

Chapter 9. Approval Modes and Trust Boundaries

Learning Objectives
The Trust Model: What Codex Can Touch
The Four Approval Modes
Auto-Approve: When to Use It and When Not To
Sandboxing: Filesystem and Network Restrictions
Kernel-Level vs. Hook-Based Sandboxing
Approval Mode Strategy for Teams
Summary
Exercises
Notes

Chapter 10. Debugging and Diagnosing Agent Failures

Learning Objectives
Reading the Session Transcript
Using Approval Mode as a Diagnostic
Diagnosing AGENTS.md Failures
Context Overflow Symptoms
Recovering a Runaway Session
Structured Logging and --debug
Summary
Exercises
Notes

Chapter 11. Model Selection and Reasoning Effort

Learning Objectives
The Available Models
Reasoning Effort: The Second Knob
Task Taxonomy: Matching Model and Effort to Task
Cost Modelling: Estimating Monthly Spend
Model Selection in Automated Pipelines
Part 2 Summary
Summary
Exercises
Notes

The Extension Stack

Chapter 12. MCP: Consuming and Serving

Learning Objectives
What MCP Is—and What It Isn’t
The Architecture: Hosts, Clients, and Servers
Connecting Codex CLI to MCP Servers
Connecting to Common Servers (GitHub, Browser, Database)
The Context Cost of MCP: What Gets Loaded
Enterprise MCP: Authentication, Scoping, and Restrictions
Building a Simple MCP Server
Serving MCP from Codex
Codex CLI as an MCP Server
Beyond Read-Only: Write-Back Integration
Ticketing System Integration (Jira, Linear)
Communication Platform Integration (Slack, Teams)
Bidirectional Database Patterns
Safety Boundaries for Write-Enabled Agents
Summary
Exercises
Notes

Chapter 13. Hooks: Intercepting the Agent Lifecycle

Learning Objectives
The Hook System: Overview and Events
SessionStart: Configuring the Environment
UserPromptSubmit: Shaping Input Before the Agent Acts
PreToolUse: Intercepting Shell Commands
PostToolUse: Observing Without Blocking
Stop: Teardown and Cleanup
Writing Robust Hooks
Real Hook Patterns: Enforcement, Audit, and Notification
Summary
Exercises
Notes

Chapter 14. The Skills Ecosystem: Using and Writing Skills

Learning Objectives
Part 1: The Consumer’s View — Using and Browsing the Ecosystem
Part 2: The Producer’s View — Writing Your Own Skills
Summary
Exercises
Notes

Scale and Automation

Chapter 15. Context Window Management

Learning Objectives
The Quadratic Growth Problem
Thread Resume and Fork: Preserving Context Without Restarting
What Consumes Context (and What Doesn’t)
The /compact Command and Automatic Summarisation
Sub-Agent Delegation as Context Management
Strategies for Large Codebases
Monitoring Context Usage
The Model Lineage Context Compaction Breakthrough
Prompt Caching: Economics of Long Sessions
Summary
Exercises
Notes

Chapter 16. Sub-Agents and Parallel Execution

Learning Objectives
The Sub-Agent Model
The TOML Subagent Definition Format
Task Decomposition: What to Parallelise
spawn_agents_on_csv: Fan-Out Patterns
Aggregating Results and Handling Failures
Path-Based Sub-Agent Addressing
When Not to Parallelise
Summary
Exercises
Notes

Chapter 17. Cost Management and Quota Strategy

Learning Objectives
How Codex CLI Quota Works
Estimating Team Costs
Configuring Cost Ceilings
Monitoring and Alerting with Hooks
Cost-Quality Decision Matrix
Summary
Exercises
Notes

Chapter 18. Multi-Agent Orchestration Patterns

Learning Objectives
Pattern 1: Sequential Gated Chain
Pattern 2: Parallel Worker Swarm
Pattern 3: Wave-Based Hybrid
Choosing the Right Pattern
Debugging Orchestration Failures
Orchestration Anti-Patterns to Avoid
Summary
Exercises
Notes

Chapter 19. Worktrees and Isolated Execution

Learning Objectives
Git Worktrees: A Brief Recap
Why Worktrees Matter for Agentic Workflows
One Agent, One Worktree: The Isolation Principle
Worktrees in the Codex Desktop App
CLI Worktree Workflows
Merging Agent Work Back to Main
Worktree Lifecycle in CI
Worktree Workflow Patterns by Team Size
Summary
Exercises
Notes

Chapter 20. CI/CD Integration

Learning Objectives
Running Codex in Non-Interactive Mode
The openai/codex-action GitHub Action
Automated Code Review on Every PR
Test Generation on Merge
Dependency Update Agents
Structured Output and Session Resume
Safety Strategies for CI Agents
Summary
Exercises
Notes

Chapter 21. Security Hardening

Learning Objectives
The Threat Model for Agentic Systems
Prompt Injection: Attack Patterns and Defences
Filesystem Restrictions and Sandboxing
Network Allowlisting
Secret Management for Agent Environments
Audit Logging and Observability
Compliance Considerations
Summary
Exercises
Notes

Chapter 22. Enterprise Deployment

Learning Objectives
Distributing config.toml at Scale
RBAC: Three Admin Roles and Access Control
Managed Policies: requirements.toml
AGENTS.override.md: Enforcing Policy Across Teams
Onboarding an Engineering Team
Measuring ROI: Metrics That Work
Codex Cloud vs Self-Hosted
Governance Frameworks for Agentic AI
The Compliance API
Rollout Checklist
Summary
Exercises
Notes

Chapter 23. Testing and Evaluation Strategy for Agentic Workflows

Learning Objectives
What Makes a Test Suite Agent-Friendly
Designing for Agent Execution
The Feedback Signal Problem
Using Codex CLI to Audit Your Test Suite
Evaluation Beyond Unit Tests
Building an Evaluation Harness
TDD as an Agent Feedback Loop
The 4-File Durable Memory Pattern for Long-Horizon Evaluation
Summary
Exercises
Notes

Specialised Workflows

Chapter 24. AI Code Review

Learning Objectives
Why AI Code Review Works (and Where It Doesn’t)
Configuring Codex for Code Review
The /review Command
PR Integration: Automated Review on Every PR
Writing Review Checklists in AGENTS.md
Human-AI Review Collaboration Patterns
Summary
Exercises
Notes

Chapter 25. Frontend Engineering with React and TypeScript

Learning Objectives
Frontend-Specific AGENTS.md Configuration
Component Generation and Scaffolding
Test Generation for React Components
The Explorer/Worker Sub-Agent Pattern
Accessibility Audit Automation
Design-to-Code Workflows
Summary
Exercises
Notes

Chapter 26. Python Team Workflows

Learning Objectives
Python-Specific AGENTS.md
Pytest Integration and Test Generation
Type Hints and Docstring Automation
uv, ruff, and Modern Python Toolchain
Data Pipeline Code Generation
Multi-Service Python Workflows
Summary
Exercises
Notes

Chapter 27. Web Search and Research Agents

Learning Objectives
Enabling Web Search in Codex CLI
Research-to-Code Workflows
Dependency Research and Evaluation
Staying Current with API Changes
Knowledge-Augmented Agents: MCP Knowledge Servers
Combining Web Search with Sub-Agents
Summary
Exercises
Notes

Chapter 28. Codebase Migration

Learning Objectives
Why Codex CLI Excels at Migration Work
Planning the Migration with Codex CLI
Incremental Migration Patterns
Validation Strategies During Migration
Migrating from Claude Code to Codex CLI
Framework and Language Version Migrations
Summary
Exercises
Notes

Architecture and Vision

Chapter 29. The Agents SDK

Learning Objectives
What the Agents SDK Adds
The SDK Architecture: Agents, Handoffs, Tools, and Guardrails
Codex CLI as an MCP Server in SDK Pipelines
Building a Designer-Developer-Tester Pipeline
Tracing and Observability
SDK vs CLI: Choosing the Right Level
TypeScript SDK
Summary
Exercises
Notes

Chapter 30. Agentic Primitives Compared

Learning Objectives
The Four Primitives: Agents, Handoffs, Tools, Guardrails
How Codex CLI Implements Each Primitive
LangChain and LangGraph
AutoGen and CrewAI
Google Gemini Agents and ADK
Choosing a Primitives Model
Codex CLI’s Custom Agent TOML in Practice
Extending the Primitives: Custom Agents in TOML vs Code
Summary
Exercises
Notes

Chapter 31. Harness Engineering for Long-Running Agents

Learning Objectives
What Harness Engineering Is
The WORKFLOW.md Pattern
State Persistence Across Agent Runs
Error Recovery and Resumption
The Proof-of-Work Principle
Symphony: A Harness in Practice
Summary
Exercises
Notes

Chapter 32. The Agentic Engineering Pod

Learning Objectives
Section 1: Why Three Roles?
Section 2: The Context Architect: Human Role
Section 3: The Value Engineer: Human Role
Section 4: The Quality Engineer: Human Role
Section 5: The Pod in Practice: A Feature Lifecycle
Section 6: The Agentic Pod Principles
Section 7: Failure Modes and Pod Anti-Patterns
Closing
Exercises
Notes

Conclusion: The Agentic Engineer

A Note from the Author, Or Rather, the Tool

How This Book Was Made
What This Means for You
The Book as Artefact
An Invitation

Bibliography

1. Research Papers
2. OpenAI Sources
3. Standards and Security
4. Frameworks and Protocols
5. Benchmarks and Evaluations
6. Developer Tooling
7. Industry Reports and Blogs
