Retrieval-Augmented Generation

An Engineer's Guide to Building RAG Systems with Your Own Data

The engineer's guide to RAG systems that survive a deploy.

Formats: PDF, EPUB, WEB

About the Book

Most teams trying to ship a RAG system stall at the prototype stage. The notebook works, the demo wins the meeting, but the system never reaches users at scale. The gap between "this works on my laptop" and "this runs reliably in production" is wide and full of engineering challenges. This book is about that gap.

It's written for engineers who need to ship something real. Not for researchers writing benchmarks, not for managers picking vendors. For the person at the keyboard who needs to make decisions about chunking strategy, vector store choice, evaluation methodology, and production operations, and who's tired of vendor-shaped blog posts and examples that don't survive a deploy.

Each chapter pairs concept with implementation. Real code on a real corpus, runnable end to end. The seven failure points of a RAG pipeline are introduced in chapter 1 and traced through every subsequent chapter, so you learn to recognize *where* things break, not just patch them when they do.

The book in three parts

Part I — Foundations.

Why standalone LLMs fail on private data, what RAG actually is, and the building blocks underneath: embeddings, chunking strategies, vector storage (FAISS vs pgvector vs Qdrant with measured benchmarks), and a complete ingestion pipeline that handles the messiness of real documents.

Part II — Building and Improving.

Wiring retrieval into generation. Sparse vs dense retrieval, BM25, hybrid search with reciprocal rank fusion, reranking with cross-encoders, query transformation patterns (multi-query, sub-question decomposition, HyDE). Every chapter measures the improvement instead of just describing it.
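
To give a flavor of one technique named above, here is a minimal sketch of reciprocal rank fusion. It is an illustration only, not code from the book: the function name, the document IDs, and the two example result lists are hypothetical, and the constant k=60 is just a commonly used default.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a sparse (BM25) and a dense retriever.
bm25_hits = ["doc_a", "doc_b", "doc_c"]
dense_hits = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
print(fused)
```

Because the fusion uses only ranks, it needs no score normalization between the two retrievers, which is one reason it is a common starting point for hybrid search.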

Part III — Production and Beyond.

Evaluation done right (separate retrieval and generation metrics, RAGAS, ablation testing). Hardening the pipeline (observability, semantic caching, citation systems, embedding staleness, cost optimization, load testing). Advanced retrieval patterns (GraphRAG, Corrective RAG, Self-RAG) with honest takes on when each earns its keep. Then agentic RAG with realistic guardrails for production.
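
As a rough illustration of what measuring retrieval separately from generation can look like, the sketch below computes two standard retrieval metrics, recall@k and reciprocal rank, without involving a language model at all. The function names, chunk IDs, and relevance labels are hypothetical and are not taken from the book or from RAGAS.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant chunks that appear in the top-k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant chunk, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical retriever output and hand-labeled relevant chunks for one query.
retrieved = ["c3", "c7", "c1", "c9"]
relevant = {"c1", "c4"}

print(recall_at_k(retrieved, relevant, k=3))
print(reciprocal_rank(retrieved, relevant))
```

Averaging these over a labeled query set gives a retrieval score that can be tracked independently of answer quality, so a change to chunking or reranking can be judged on its own.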

By the end you'll be able to

  • Choose a chunking strategy on retrieval evidence, not intuition
  • Pick FAISS, pgvector, or Qdrant based on your actual constraints
  • Build a RAG pipeline that handles real PDFs with OCR artifacts, encoding issues, and dirty markdown
  • Evaluate retrieval quality separately from generation quality, and prove your changes help
  • Add reranking, hybrid search, and query transformation when (and only when) they earn it
  • Catch the seven failure points before they reach production
  • Scale, monitor, and cost-optimize a RAG system that survives a deploy

Early access

This book is in active development. Chapters 1-4 are polished and ready (the Foundations sequence through Vector Storage). The full structure exists end-to-end: every chapter has finished prose, code, and examples committed to the repo, and uses the same Acme Corp corpus that runs through the book.

What's still in flight is the polish pass that brings chapters 5-14 to the same standard as 1-4. New polished chapters ship every 2-3 weeks until the full book is complete (target: end of 2026).

Buying now gets you every current chapter plus every future update at no additional cost.

About the Author

Jeroen Herczeg

Hey, I'm Jeroen.

I'm a senior software engineer who builds AI systems for production. Twenty years of engineering experience, now applied to AI agent orchestration and retrieval-augmented generation.

Most recently I built the orchestrator agent for the Google + BBC AI Agents demo at IBC2025, which won the Broadcast Tech Innovation Award. I started studying AI seriously in 2017 with Udacity's Artificial Intelligence Nanodegree, well before the current wave of large language models made it fashionable.

I write about practical AI engineering at herczeg.be/blog and live in Belgium. This book exists because most of what I learned shipping production AI is locked in private codebases, and someone should write it down.
