Leanpub Book LAUNCH 🚀 My Adventures with Large Language Models: Build foundational LLMs from Transformers to DeepSeek, from scratch, in PyTorch by Prathamesh S.

Leanpub

Jun 4, 2026

Welcome to the Leanpub Launch video for My Adventures with Large Language Models: Build foundational LLMs from Transformers to DeepSeek, from scratch, in PyTorch by Prathamesh S.!

About the Book

Most LLM tutorials stop at GPT-2. This book doesn't.

My Adventures with Large Language Models walks you through building five real LLM architectures from scratch in PyTorch, starting from a vanilla encoder-decoder Transformer and ending at DeepSeek's Multi-Head Latent Attention and Mixture-of-Experts.

Every chapter has runnable, end-to-end code. No pseudocode, no hand-waving. You type it, you run it, you understand it.

What you'll build:

Chapter 1: A vanilla encoder-decoder Transformer for English-to-Hindi translation. The fundamentals, implemented from the ground up.

Chapter 2: GPT-2 (124M parameters) from scratch. You then load OpenAI's pretrained weights to verify that your implementation works.

Chapter 3: Llama 3.2-3B, built by swapping exactly four components of your GPT-2. LayerNorm becomes RMSNorm. Learned positional embeddings become RoPE. The GELU feed-forward block becomes SwiGLU. Multi-Head Attention becomes Grouped-Query Attention. You then load Meta's pretrained weights.

Chapter 4: KV cache, Multi-Query Attention, and Grouped-Query Attention for inference optimisation.

Chapter 5: DeepSeek's full architecture. Multi-Head Latent Attention (with the absorption trick and decoupled RoPE), DeepSeekMoE (shared experts, fine-grained segmentation, auxiliary-loss-free load balancing), Multi-Token Prediction, and FP8 quantisation.
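
One way to see why Chapter 4 pairs the KV cache with Multi-Query and Grouped-Query Attention is to estimate the cache size for each variant. The sketch below is not taken from the book or its repository: the function name `kv_cache_bytes` and the configuration values (28 layers, 24 query heads, head dimension 128, 8 KV heads for GQA, an 8,192-token context, 2-byte values) are illustrative assumptions. It relies only on the fact that the cache holds one key vector and one value vector per KV head, per layer, per token.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    """Estimate the bytes needed to cache keys and values for a full context.

    Hypothetical helper for illustration; not part of any library.
    """
    # Factor of 2: one cached tensor for keys and one for values.
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Assumed, illustrative configuration (not quoted from the book).
N_LAYERS = 28
N_QUERY_HEADS = 24
HEAD_DIM = 128
SEQ_LEN = 8192

# The attention variants differ only in how many KV heads they keep.
kv_heads_per_variant = {
    "MHA": N_QUERY_HEADS,  # one KV head per query head
    "GQA": 8,              # query heads share KV heads in groups of 3
    "MQA": 1,              # all query heads share a single KV head
}

sizes = {
    name: kv_cache_bytes(N_LAYERS, n_kv, HEAD_DIM, SEQ_LEN)
    for name, n_kv in kv_heads_per_variant.items()
}

for name, size in sizes.items():
    print(f"{name}: {size / 2**20:.0f} MiB")
# MHA: 2688 MiB
# GQA: 896 MiB
# MQA: 112 MiB
```

Under these assumed numbers, GQA with 8 KV heads needs a third of the MHA cache and MQA needs one twenty-fourth, because the cache grows linearly with the number of KV heads.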

The code repository is open source: https://github.com/S1LV3RJ1NX/mal-code

This book is for ML engineers, researchers, and senior developers who know Python and PyTorch and want to understand modern LLMs at the level of code, not slides or blog posts. If you've read Raschka or watched Karpathy and want to go further, into Llama, GQA, MLA, and MoE, this is the book.

About the Author

Prathamesh is a Senior Forward Deployed Engineer at TrueFoundry, where he helps enterprises and startups solve real problems with LLMs and agents. He wrote this book because he wanted a resource that went past GPT-2 and into the architectures actually running in production. He is based in Bangalore, India. Portfolio: https://psaraf.pages.dev
