This book takes you from your first LLM API call to a deployable, production-ready AI workbench — covering agents, RAG, MCP servers, and multimodal AI, all in TypeScript.

What you'll build:

- A production multi-turn conversation system with context compression and cost tracking
- A hybrid RAG pipeline (BM25 + vector search + reranking) backed by PostgreSQL and pgvector
- A multi-tenant knowledge base where each customer's data stays strictly isolated
- A ReAct agent that reasons over tool results across multiple steps
- A multi-agent system that decomposes goals, executes subtasks in parallel, and synthesizes results
- An autonomous coding agent that writes, runs, and self-repairs code in a sandbox
- An MCP server you can publish to npm and reuse across any MCP-compatible host
- A complete AI workbench integrating RAG, agents, MCP, and multimodal capabilities, deployable to Vercel and Railway

What's covered:

The book is organized into six parts. It starts with full-stack TypeScript foundations — the server patterns and type-safe API design you'll need before touching any LLM. Then it moves into LLM integration: streaming output, prompt engineering, and reliable structured output with Zod.

The RAG section covers the full pipeline from document ingestion to production retrieval — including hybrid search, hallucination detection, and multi-tenant isolation. The Agents section is the heart of the book: from the ReAct loop through tool calling, browser automation, and multi-agent orchestration.

Part Five covers MCP — Model Context Protocol — from building your first server through publishing a reusable tool package to npm. Part Six covers multimodal AI (vision, speech, documents) and the production concerns you can't skip before going live: observability with LangFuse, rate limiting, prompt injection defense, and cost monitoring.

Who this is for:

TypeScript or JavaScript developers who want to build AI applications.

---

**600+ pages. 24 chapters. All code in TypeScript.**

Preface: A Frontend Developer’s Ticket into the AI Era

The moment it clicked
Agents: from talking to doing
What this book is, and isn’t
Six conceptual leaps
Before you begin

How to Read This Book

Prerequisites
Companion Code
Reading Paths
A Note on the Code Examples
Key Terms at a Glance
Appendices

Part One: Full-Stack Foundation

Chapter 1: The Frontend Developer’s Roadmap to Full-Stack

Chapter Goals
1.1 Get an AI Endpoint Running — Now
1.2 Why Full-Stack, Why Now
1.3 The Full-Stack TypeScript Architecture
1.4 Node.js vs. the Browser: Same Language, Different World
1.5 TypeScript on the Server: Best Practices
1.6 Companion Projects Preview
1.7 Summary
Exercises
Further Reading

Chapter 2: Setting Up a Full-Stack TypeScript Development Environment

Chapter Goals
2.1 Why a Monorepo?
2.2 Initializing the Monorepo
2.3 Configuring TypeScript
2.4 The ESM Module System
2.5 Configuring tsx for Hot Reload
2.6 Configuring Biome: Unified Code Style
2.7 Configuring Vitest: Unit Testing
2.8 Building the shared Package
2.9 Building the server Package
2.10 Environment Variable Management
2.11 Complete Project Structure
2.12 Summary
Exercises
Further Reading

Chapter 3: Core Server-Side TypeScript Patterns

Chapter Goals
3.1 The Event Loop: How Node.js Handles Concurrency
3.2 Streams: The Right Way to Handle Large Data
3.3 Error Handling: From Ad Hoc to Systematic
3.4 Type-Safe API Design
3.5 Summary
Exercises
Further Reading

Chapter 4: Building Your First Full-Stack Application

Chapter Goals
Reading Guide
4.1 Project Overview
4.2 Backend Routing vs. Frontend Routing
4.3 Why a Database, and Which One
4.4 Drizzle ORM: TypeScript-First Database Access
4.5 Database Migrations: What They Are and Why They Exist
Extended Path: User Authentication (Optional)
4.6 User Authentication
4.7 Implementing Authentication
4.8 Why zValidator
4.9 The Main Entry Point
4.10 Auth Routes
4.11 Todo Routes
4.12 Frontend: Hono Client
4.13 Deployment
4.14 Summary
Exercises
Further Reading

Part Two: LLM Integration

Chapter 5: How Large Language Models Work

Chapter Goals
5.1 What LLMs Are and Where They Came From
5.2 How Deep Is Deep Enough?
5.3 Transformer: Engineering Intuition
5.4 Tokens: The Basic Unit of LLMs
5.5 Sampling Parameters: Controlling Randomness
5.6 Model Comparison and Selection (2026)
5.7 Hallucination: The Fundamental Limitation
5.8 Summary
Exercises
Further Reading

Chapter 6: Connecting to LLM APIs — From Hello World to Production

Chapter Goals
6.1 The LLM API Model
6.2 Installation and Initialization
6.3 Verifying the Client Setup
6.4 Non-Streaming vs. Streaming
6.5 Implementing Non-Streaming Calls
6.6 Implementing Streaming
6.7 Error Classification and Retry Strategy
6.8 Cost Control
6.9 Complete Service Layer
6.10 Summary
Exercises
Further Reading

Chapter 7: Prompt Engineering and Structured Output

Chapter Goals
7.1 What Prompt Engineering Is
7.2 System Prompt Design Principles
7.3 Few-Shot Examples: Teaching by Example
7.4 XML Tags: Precise Output Structure Control
7.5 Structured JSON Output
7.6 Prompt Version Management
7.7 Prompt Security: Preventing Injection Attacks
7.8 Summary
Exercises
Further Reading

Chapter 8: Building a Multi-Turn Conversation System

Chapter Goals
8.1 Project Architecture
8.2 Conversation Management Service
8.3 Context Window Management and History Compression
8.4 Chatbot Core Service
8.5 Route Layer
8.6 Frontend: Complete Chat Interface
8.7 Multi-Tab Sync with BroadcastChannel
8.8 Summary
Exercises
Further Reading

Part Three: RAG — Give Your AI a Knowledge Base

Chapter 9: Vector Databases and Embeddings

Chapter Goals
9.1 From Keyword Search to Semantic Search
9.2 Embeddings: Compressing Language into Vector Space
9.3 Cosine Similarity: Measuring Directional Similarity Between Vectors
9.4 Vector Databases: Storage Designed for High-Dimensional Vectors
9.5 Implementing the Embedding Service
9.6 Semantic Search API
9.7 Engineering Details: Vector Dimensions and Indexes
9.8 Summary
Exercises
Further Reading

Chapter 10: Document Ingestion Pipeline

Chapter Goals
10.1 What Is a Document Ingestion Pipeline?
10.2 Database Schema: Document Management
10.3 Document Parsing: Extracting Plain Text
10.4 Text Chunking: Four Strategies
10.5 Batch Ingestion Pipeline
10.6 File Upload Route
10.7 Frontend: Document Upload and Status Tracking
10.8 Summary
Exercises
Further Reading

Chapter 11: RAG Core Implementation

Chapter Goals
11.1 How RAG Works
11.2 Hybrid Search: BM25 + Vector
11.3 Reranking: Precision at the Top
11.4 Complete RAG Pipeline
11.5 Citation Display Component
11.6 Summary
Exercises
Further Reading

Chapter 12: Production-Grade RAG System

Chapter Goals
12.1 Project Overview
12.2 Incremental Document Updates
12.3 RAG Quality Evaluation (RAGAS)
12.4 Hallucination Detection
12.5 Multi-Tenant Knowledge Base Isolation
12.6 Embedding Cache
12.7 Three Common Production Pitfalls
12.8 Pre-Launch Checklist
12.9 Summary
Exercises
Further Reading

Part Four: Agents — Tool Calling and Autonomous Decision-Making

Chapter 13: AI Agent Architecture Patterns

Chapter Goals
13.1 What Is an Agent?
13.2 ReAct: The Mainstream Agent Loop
13.3 Plan-and-Execute: Plan First, Then Act
13.4 Agent vs. Chain: When to Use Which
13.5 Agent Failure Modes and Defenses
13.6 Agent Observability
13.7 Factory Functions: Creating Pre-Configured Agents
13.8 A First Agent: Calculator Assistant
13.9 Summary
Exercises
Further Reading

Chapter 14: Tool Calling (Function Calling) in Depth

Chapter Goals
14.1 Tool Schema Design Principles
14.2 Parallel Tool Execution
14.3 Tool Result Validation
14.4 Recursive Call Defense
14.5 Building a Reusable Tool Library
14.6 Tool Call Observability
14.7 Summary
Exercises
Further Reading

Chapter 15: Browser and File System Tools

Chapter Goals
15.1 Connecting the Agent to the Real World
15.2 Browser Tools: Playwright Integration
15.3 File System Tools
15.4 Command and Code Execution Tools
15.5 Tool Registry
15.6 Complete Example: Code Analysis Agent
15.7 Production Deployment Notes
15.8 Summary
Exercises
Further Reading

Chapter 16: Multi-Agent Collaboration Systems

Chapter Goals
16.1 Why Multiple Agents
16.2 Orchestrator/Subagent Pattern
16.3 Pipeline Pattern: Sequential Agent Chains
16.4 Message Passing and Shared State
16.5 Timeout and Interrupt Handling
16.6 Error Path Handling: Timeouts, Conflicting Results, and Loop Detection
16.7 Multi-Agent Routing
16.8 Frontend: Multi-Agent Task Panel
16.9 Summary
Exercises
Further Reading

Chapter 17: Building an Autonomous Coding Agent

Chapter Goals
17.1 Project Goal
17.2 Code Operation Toolset
17.3 Code Review Agent Core
17.4 Test-Driven Auto-Fix
17.5 Code Review Route and SSE
17.6 Frontend: Code Review Interface
17.7 Summary
Exercises
Further Reading

Part Five: MCP — Standardized Tool Ecosystem

Chapter 18: MCP Protocol Specification and Design Philosophy

Chapter Goals
18.1 Why MCP
18.2 JSON-RPC 2.0: MCP’s Protocol Foundation
18.3 MCP Lifecycle
18.4 The Three Primitives in Detail
18.5 MCP vs. Function Calling: Which to Use
18.6 Transport Layer: Three Connection Methods
18.7 MCP Ecosystem Status (2026)
18.8 Summary
Exercises
Further Reading

Chapter 19: Building Your First MCP Server

Chapter Goals
19.1 Project Setup
19.2 MCP SDK Basics
19.3 Implementing Tools
19.4 Implementing Resources
19.5 Implementing Prompts
19.6 Assembling the Complete MCP Server
19.7 Debugging the MCP Server
19.8 Streamable HTTP Transport: Remote Access
19.9 Summary
Exercises
Further Reading

Chapter 20: MCP Client Integration

Chapter Goals
20.1 The MCP Client’s Responsibilities
20.2 Single MCP Client Implementation
20.3 Multi-Server Manager
20.4 Permission Model and User Authorization
20.5 Integrating with the Agent: Permission-Gated Tool Calls
20.6 Version Compatibility Strategy
20.7 MCP Route: Exposing MCP Capabilities to the Application Layer
20.8 Loading MCP Servers at Application Startup
20.9 Frontend: MCP Tool Panel
20.10 Summary
Exercises
Further Reading

Chapter 21: Developing and Publishing Reusable MCP Servers

Chapter Goals
21.1 Project Goal
21.2 Notion MCP Server
21.3 Database Query MCP Server
21.4 Packaging and Publishing to npm
21.5 MCP Server Testing Strategy
21.6 Summary
Exercises
Further Reading

Part Six: Multimodal and Production Deployment

Chapter 22: Multimodal AI Development

Chapter Goals
22.1 What Is Multimodal
22.2 Image Understanding (Vision)
22.3 Speech-to-Text (Whisper)
22.4 Text-to-Speech (TTS)
22.5 Mixed-Content PDF Parsing
22.6 Frontend: Multimodal Upload Component
22.7 Summary
Exercises
Further Reading

Chapter 23: Productionizing AI Applications

Chapter Goals
23.1 The Core Challenges of Production
23.2 LangFuse: Full-Chain LLM Call Tracing
23.3 Rate Limiting and Quotas
23.4 Prompt Injection Defense
23.5 Prompt A/B Testing
23.6 Cost Monitoring and Alerts
23.7 Agent Evaluation System (Evals)
23.8 Production Deployment Checklist
23.9 Summary
Exercises
Further Reading

Chapter 24: Capstone Project — AI Fullstack Workbench

Chapter Goals
24.1 Project Overview
24.2 Architecture
24.3 Backend: Unified AI Gateway
24.4 Frontend: AI Workbench Main Interface
24.5 Real-Time Collaboration: Multi-User Agent
24.6 Deployment: Vercel + Railway
24.7 Monitoring and Maintenance
24.8 Book Knowledge Review
24.9 What Comes Next: Beyond This Book
24.10 Summary
24.11 Closing
Exercises
Further Reading

Appendices

Appendix A: TypeScript Server-Side Development Quick Reference

A.1 Infer Types from Libraries (Don’t Write Them Manually!)
A.2 Common Utility Types
A.3 Conditional Types
A.4 Frontend to Backend: Common Type Errors and Fixes
A.5 Type Patterns Used Throughout This Book
A.6 tsconfig.json Key Settings Quick Reference
Further Reading

Appendix B: AI Application Common Error Troubleshooting Guide

B.1 LLM API Errors
B.2 Database Errors
B.3 RAG Retrieval Issues
B.4 Agent Execution Issues
B.5 MCP Connection Issues
B.6 ESM Module Errors
B.7 Setup and Environment Issues

Appendix C: Recommended Resources and Further Reading

C.1 Official Documentation (Consult Regularly)
C.2 Books
C.3 Papers (Engineering Perspective)
C.4 Blogs and Newsletters
C.5 Video Courses
C.6 Open-Source Projects (Worth Reading the Source)
C.7 Community and Events
C.8 Sustainable Learning Suggestions
C.9 Book Reference Index

Appendix D: AI Engineering Glossary

1. Model Architecture
2. Training
3. Tokens and Context
4. Sampling Parameters
5. Prompting and Output Control
6. RAG — Retrieval-Augmented Generation
7. Agent Architecture
8. MCP and the Protocol Ecosystem
9. Engineering Frameworks and Evaluation

Appendix E: Core AI Engineering Concepts

E.1 The Evolution of Three Engineering Layers
E.2 Prompt Engineering: The Best-Known Layer
E.3 Context Engineering: One Order of Magnitude Larger Than Prompt
E.4 Harness Engineering: Reliability Engineering
E.5 Evals: AI System Testing
E.6 swyx’s IMPACT Framework
E.7 Harness Debt: A New Form of Technical Debt
E.8 Summary: Three Layers of Engineering Practice
Further Reading

Appendix F: Emerging AI Engineering Trends (2025–2026)

F.1 OpenClaw: The Viral Rise of Personal Agents
F.2 The Expansion of the MCP Protocol Stack
F.3 MoE: The Architecture Revolution Reshaping Model Selection
F.4 Relationship to Content Already in This Book
F.5 Recommendations for Readers
Further Reading
