Under The Hood
- The LLM Engineering Manual
Copyright
Preface: What This Book Is Actually Asking You To Do
Using the Code Repository
- How the repo is organized
- The intended workflow per chapter
- Getting it running
- What the repo is not
- Reporting issues
Preflight: Python, Tensors, and What a Language Model Actually Is
- What a Language Model Actually Is
- How a Model Learns: The Three Moves
- Python You Need to Read This Book
- Numbers as Tensors
- The Shape of the Book
- Quick Reference: What to Do If You Get Stuck
- What You Now Know
Project 1: The Learning Machine
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 2: Predicting The Next Character
- Hook
- The Concept
- Why It Matters
- The Build
- Building the neural character model
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 3: Building A Tokenizer
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 4: Attention From Scratch
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 5: Your GPT From A Blank File
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 6: From Prototype to nanoGPT
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 7: The Details That Matter
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 8: Flash Attention and Tiled Kernels
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 9: Pretraining On The Real Web
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 10: Data Curation and Contamination
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 11: Training Debugging: Spikes, NaNs, and Profiling
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 12: Distributed Training: FSDP and ZeRO (Single-Box Proxy)
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 13: Fast Inference: The KV Cache
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 14: Speculative Decoding
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 15: Grouped Query Attention
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 16: Long-Context Extension (RoPE, YaRN, NTK-Aware)
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 17: Production Serving: Continuous Batching and PagedAttention
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 18: Mixture Of Experts
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 19: Scaling Laws
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 20: Autonomous Experimentation
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 21: Fine-Tuning And Instruction Tuning
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 22: Evaluation Methodology
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 23: Reward Models And RLHF
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 24: DPO and Preference Optimization
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 25: Test-Time Reasoning (CoT, Self-Consistency, Best-of-N)
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 26: Tool Use and Function Calling
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 27: Quantization and Deployment
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 28: Retrieval-Augmented Generation
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 29: Multimodal: A Tiny Vision-Language Model
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 30: Non-Transformer Architectures (Mamba, RWKV)
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 31: Layer Freezing and Transfer
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 32: Fusing Independently Trained Specialists
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Starting Point
Project 33: The Interface Specification
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Research Anchors
- Starting Point
Project 34: Incremental Assembly
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- What You Now Know
- Research Anchors
- Starting Point
Project 35: Your Architecture
- Hook
- The Concept
- Why It Matters
- The Build
- BREAK IT
- Optional Homework
- Questions To Answer
- Go Further
- Research Anchors
- What You Now Know
- Where The Field Is Now
- What To Sound Like In A Strong Interview
- Frontier Reading Map
- What The Book Now Gives You
- Starting Point
Appendix A: Lecture Companions
- How to use this appendix
- Preflight companion: Python and tensor foundations
- Part I companion: learning mechanics, tokenization, attention
- Part II companion: building and training a transformer
- Part III companion: inference, efficiency, and scaling
- Part IV companion: post-training, alignment, deployment
- Part V companion: transfer, modularity, interfaces, research
- Suggested study rhythm
Appendix B: Free Resources
- Reference architecture by topic
- Stability rule
Appendix C: Notes, Sources, And Bibliography
- Project Sources