Securing Enterprise AI Agents
- For Marilin De Vos
- In memory of my father, Flor De Vos
- What this book is
- Who this book is for, and how to scale the burden
- What you will not find here
- How to read this book
- A note on tools and versions
Part I
- Why agentic AI changes the risk model
Chapter 1: From chatbots to actors
- Learning objectives
- 1.1 Three things people call AI features
- 1.2 The four delegated authorities
- 1.3 Why production risk increases non-linearly
- 1.4 A short worked example: the trade finance “chatbot”
- Exercises
- Summary
- Notes
Chapter 2: The agentic failure modes
- Learning objectives
- 2.1 Tool misuse
- 2.2 Data leakage
- 2.3 Permission escalation
- 2.4 Prompt injection
- 2.5 Silent workflow corruption
- 2.6 Cost explosions
- 2.7 Audit gaps
- 2.8 Model regression
- 2.9 How the failure modes compose
- A worked example: the failure mode register for a know-your-customer agent
- Exercises
- Summary
- Notes
Chapter 3: The enterprise agent operating model
- Learning objectives
- 3.1 The ownership problem
- 3.2 The four artefacts
- 3.3 Human-in-the-loop is a control, not a fallback
- 3.4 The right cadence
- A worked example: the operating model for a sanctions screening copilot
- Exercises
- Summary
- Notes
Part II
- Architecture of production agentic systems
Chapter 4: Reference architecture for secure agents
- Learning objectives
- 4.1 The seven layers
- 4.2 Why each layer needs its own trust boundary
- 4.3 The blast radius rule
- 4.4 The reference deployment
- 4.5 Layer-specific design notes
- A worked example: redrawing the sanctions screening copilot
- Exercises
- Summary
- Notes
Chapter 5: MCP and the tool-using AI stack
- Learning objectives
- 5.1 What MCP actually is
- 5.2 Local versus remote servers, and why the difference matters
- 5.3 Tool poisoning: descriptions as injection surface
- 5.4 The MCP supply chain
- 5.5 Server trust boundaries
- 5.6 Connector lifecycle
- 5.7 When to use MCP and when not to
- A worked example: an MCP server for sanctions screening
- Exercises
- Summary
- Notes
Chapter 6: Identity for agents
- Learning objectives
- 6.1 The confused deputy problem, restated for agents
- 6.2 Why human identity does not transfer
- 6.3 Delegated authority in practice: OAuth 2.1 token exchange
- 6.5 Least privilege for agents
- 6.6 Just-in-time access
- 6.7 Credential handling
- 6.8 Auditability and revocation
- A worked example: identity for the sanctions screening copilot
- Exercises
- Summary
- Notes
Chapter 7: Securing tool calls
- Learning objectives
- 7.1 The capability boundary
- 7.2 Allow lists, deny lists, and the default
- 7.3 Tool classification by blast radius
- 7.4 Argument validation
- 7.5 Sandboxing dangerous tools
- 7.6 Approval workflows
- 7.7 The asymmetric dual-LLM guardrail pattern
- 7.8 Idempotency and rollback
- 7.9 The capability matrix
- 7.10 The Agent Contract: capability matrix as code
- 7.11 Taming primitives: software-engineering recipes for agent security
- A worked example: securing the release-payment tool
- Exercises
- Summary
- Notes
Part III
- Evals, observability, and reliability
Chapter 8: Evals are the new test suite
- Learning objectives
- 8.1 What evals actually are
- 8.2 Building the regression set
- 8.3 LLM-as-judge: when it works and when it does not
- 8.4 Human review
- 8.5 Regression gates as a release control
- 8.6 What to score
- 8.7 The eval-set lifecycle
- A worked example: a starter eval set for the sanctions screening copilot
- Exercises
- Summary
- Notes
Chapter 9: Observability for agents
- Learning objectives
- 9.1 The trace as the unit of observability
- 9.2 The fields that matter most
- 9.3 Prompt and context as observable artefacts
- 9.4 Tool-call observability
- 9.5 Retrieval and memory in the trace
- 9.6 Cost and latency as quality signals
- 9.7 Drift detection
- 9.8 Replay
- A worked example: the sanctions screening incident replay
- Exercises
- Summary
- Notes
Chapter 10: Agent reliability engineering
- Learning objectives
- 10.1 The longer-prompt antipattern
- 10.2 Timeouts
- 10.3 Retries
- 10.4 Idempotency
- 10.5 State machines and durable workflows
- 10.6 Circuit breakers
- 10.7 Algorithmic denial of service: cost as a security vector
- 10.8 Dead-letter queues and human escalation
- 10.9 Failure modes the agent should recognise
- A worked example: the sanctions screening agent as a state machine
- Exercises
- Summary
- Notes
Part IV
- Secure RAG and enterprise knowledge
Chapter 11: RAG as an attack surface
- Learning objectives
- 11.1 Why RAG is an attack surface
- 11.2 Retrieval poisoning
- 11.3 Indirect prompt injection through ingestion
- 11.4 Citation hallucination
- 11.5 Permissions-aware retrieval
- 11.6 Data exfiltration through retrieval
- 11.7 The corpus you trust versus the corpus you ingest
- 11.8 The threat model template
- A worked example: the customer-facing knowledge base RAG
- Exercises
- Summary
- Notes
Chapter 12: Production RAG architecture
- Learning objectives
- 12.1 The pipeline
- 12.2 Chunking is more important than retrieval
- 12.3 Hybrid retrieval
- 12.4 Reranking
- 12.5 Freshness
- 12.6 Evaluation of retrieval
- 12.7 Monitoring retrieval quality in production
- 12.8 The architecture diagram
- A worked example: the customer FAQ RAG, in detail
- Exercises
- Summary
- Notes
Chapter 13: Knowledge governance
- Learning objectives
- 13.1 Classification at the source
- 13.2 Entitlements
- 13.3 PII
- 13.4 Audit trails
- 13.5 Retention for agent-specific artefacts
- 13.6 Legal discovery
- 13.7 Document lifecycle
- 13.8 The governance overlay
- A worked example: the corpus governance for the customer FAQ RAG
- Exercises
- Summary
- Notes
Part V
- Secure agentic coding
Chapter 14: Coding agents in professional teams
- Learning objectives
- 14.1 The unit of agent output
- 14.2 Repo context: what the agent reads
- 14.3 Task planning
- 14.4 The review problem
- 14.5 Test generation as multiplier or blind spot
- 14.6 The platform team’s role
- 14.7 Productivity measurement
- A worked example: the FS team and the Claude Code rollout
- Exercises
- Summary
- Notes
Chapter 15: Security boundaries for AI coding
- Learning objectives
- 15.1 Secrets
- 15.2 Dependency risk
- 15.3 Vulnerability introduction
- 15.4 The human review rule
- 15.5 The CI pipeline as control surface
- 15.6 The repository as security boundary
- 15.7 What happens when an agent introduces an incident
- A worked example: hardening the FS team’s coding agent
- Exercises
- Summary
- Notes
Chapter 16: From vibe coding to governed delivery
- Learning objectives
- 16.1 What “vibe coding” actually is
- 16.2 The gap, in detail
- 16.3 Architecture decision records
- 16.4 The delivery pipeline
- 16.5 Productivity measurement, reframed
- 16.6 The role shift
- 16.7 The vibe-to-secure hardening track
- 16.8 What “governed delivery” looks like in practice
- A worked example: the FS team six months later
- Exercises
- Summary
- Notes
Part VI
- Deployment and governance
Chapter 17: Agent deployment patterns
- Learning objectives
- 17.1 Internal copilots
- 17.2 Workflow agents
- 17.3 Customer-facing agents
- 17.4 Developer agents
- 17.5 Security agents
- 17.6 Choosing the right pattern
- 17.7 Rollout patterns
- 17.8 The post-launch review
- A worked example: classifying a real proposal
- Exercises
- Summary
- Notes
Chapter 18: Policy as code for AI agents
- Learning objectives
- 18.1 Why code, not documentation
- 18.2 The policy plane
- 18.3 Policy engines
- 18.4 Writing policies in Rego
- 18.5 Writing policies in Cedar
- 18.6 Intercepting MCP JSON-RPC calls
- 18.7 Worked scenario: PII tool from outside the VPN
- 18.8 Policy evaluation at runtime
- 18.9 Versioning and rollout
- 18.10 Audit advantage
- 18.11 Patterns for keeping the plane manageable
- A worked example: the policy plane for the sanctions screening copilot
- Exercises
- Summary
- Notes
Chapter 19: Incident response for AI systems
- Learning objectives
- 19.1 Six incident types
- 19.2 Why your playbook does not cover them
- 19.3 Detection
- 19.4 Containment
- 19.5 Forensics
- 19.6 Recovery
- 19.7 Tabletops
- 19.8 The post-mortem
- A worked example: the customer chatbot mortgage refund tabletop
- Exercises
- Summary
- Notes
Chapter 20: Continuous AgentSecOps: build-time and run-time security as one architecture
- Learning objectives
- 20.1 Why one architecture, not two
- 20.2 The build-time pillar
- 20.3 The runtime pillar
- 20.4 The closed loop: from production signal to next-build gate
- 20.5 Automatic vulnerability remediation: where it works
- 20.6 Autonomous attack simulation
- 20.7 What to measure to know it is working
- A worked example: the unified architecture for the sanctions screening copilot
- Exercises
- Summary
- Notes
Chapter 21: The production readiness checklist
- Learning objectives
- 21.1 How to use the checklist
- 21.2 The architecture section
- 21.3 The security section
- 21.4 The evals section
- 21.5 The observability section
- 21.6 The governance section
- 21.7 The rollout section
- 21.8 The conversation patterns
- 21.9 The post-launch review, in one place
- 21.10 Defending a no-go
- 21.11 Closing thoughts
- Exercises
- Summary
- Notes
