A clear, illustrated guide to large language models, covering key concepts and practical applications. Ideal for projects, interviews, or personal learning.
Master language models through mathematics, illustrations, and code―and build your own from scratch!
A practical guide to fine-tuning Large Language Models (LLMs), offering both a high-level overview and detailed instructions on how to train these models for specific tasks.
It's never been easier to build an AI agent—and never been harder to make one that actually works. This book takes you from language model foundations to production-ready multi-agent systems, with the depth to understand what you're building and why it fails.
This book is a concise, illustrated guide for anyone who wants to understand the inner workings of Large Language Models, whether for interviews, projects, or to satisfy their own curiosity.
Build GPT-2, Llama 3, and DeepSeek from scratch in PyTorch. Every chapter has runnable end-to-end code and loads real pretrained weights. Goes well past where most LLM tutorials stop.
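This book is an illustrated guide to the essentials for anyone who wants to understand the internal structure and operating principles of large language models, whether for interview preparation, project work, or pure intellectual curiosity.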
This book is a concise, illustrated guide for those who want to understand the inner workings of large language models, whether for interviews, projects, or to satisfy their own curiosity.
Revised for PyTorch 2.x! In 2019, I published a PyTorch tutorial on Towards Data Science, and I was amazed by the reaction from readers. Their feedback motivated me to write this book to help beginners start their journey into Deep Learning and PyTorch. I hope you enjoy reading this book as much as I enjoyed writing it.
Pedagogical Philosophy of the Book

This book is designed with three guiding principles:

1. Clarity over Formalism: While maintaining mathematical accuracy, the book avoids unnecessary formalism that can confuse beginners. Instead, it uses intuitive explanations, diagrams, and real-world analogies.
2. Integration of Computation: Every mathematical concept is tied to computational practice. Readers are encouraged to implement simple code snippets (in Python, NumPy, or similar tools) to reinforce their understanding.
3. Balance Between Breadth and Depth: The book covers the essential calculus concepts in sufficient depth to support AI applications, without delving into overly abstract branches that have limited relevance to machine learning.

Who Should Read This Book?

· Students of Computer Science, Data Science, and AI who want to strengthen their mathematical foundation for advanced courses and projects.
· Researchers in AI who need a refresher or structured guide to connect calculus with modern algorithms.
· Industry Professionals and Engineers who want to move beyond using libraries like TensorFlow or PyTorch blindly and instead gain an understanding of the mathematics behind the models.
· Educators who seek a resource that connects abstract mathematics with practical AI examples for teaching purposes.

Benefits of Studying This Book

1. Builds Mathematical Confidence: Readers who once found calculus intimidating will discover a fresh, accessible perspective tailored for AI.
2. Enables Deeper Understanding of Algorithms: Going beyond “black box” usage of AI tools, readers will understand why models work.
3. Enhances Problem-Solving Skills: By mastering calculus-driven optimization, readers can design new models and improve existing ones.
4. Supports Academic and Career Growth: Mastery of calculus strengthens research capabilities, technical interviews, and advanced study opportunities.
5. Encourages Critical Thinking: Rather than rote memorization, the book fosters curiosity about the connections between mathematics and intelligent systems.

The Long-Term Vision

Artificial Intelligence is not just a passing trend; it is shaping the future of science, technology, and human society. Calculus, as a timeless branch of mathematics, ensures that learners have the intellectual tools to adapt to new paradigms. As AI expands into quantum computing, neuroscience-inspired architectures, and beyond, the reliance on calculus will remain unshaken.

This book provides readers not just with knowledge, but with intellectual independence: the ability to reason about algorithms, derive insights, and innovate confidently.
This book is a quick foray into deep learning-based computer vision and abnormal equipment sound detection. Readers see how readily powerful equipment and product quality monitoring solutions can be built from sound and visual data.
How does a machine recognize a face? How can AI distinguish speech from noise? Why do modern computer vision systems still rely on mathematical techniques developed decades ago? The answer lies in Fourier and Wavelet Analysis. In Fourier and Wavelet Analysis in Artificial Intelligence, Anshuman Mishra reveals how frequency-domain representations, multi-resolution analysis, and signal-processing techniques continue to shape the future of Machine Learning, Deep Learning, Computer Vision, Speech Recognition, Biomedical AI, and Edge Intelligence. From Fourier Transforms and Fast Fourier Algorithms to Wavelet Scattering Networks and Hybrid CNN Architectures, this book demonstrates how mathematical signal analysis becomes intelligent feature extraction. Discover the mathematics behind perception, representation, and intelligent decision-making.
Can a neural network be viewed as a differential equation? Why does gradient descent behave like a dynamical system? How do biological neurons inspire modern AI architectures? What mathematical principles govern stability, learning, adaptation, and intelligence? In Differential Equations in AI and Neural Dynamics, Anshuman Mishra explores the mathematical framework that underlies modern Artificial Intelligence. From Ordinary Differential Equations and Neural Population Models to Neural ODEs, Stochastic Learning, Reinforcement Learning, and Brain-Inspired Computation, this book reveals how continuous-time mathematics drives intelligent behavior. Discover how equations of change become equations of intelligence.
What do speech recognition systems, computer vision models, autonomous robots, and biomedical AI applications have in common? They all rely on the mathematics of signal transformation. How does a neural network extract meaningful patterns from raw audio? Why are Fourier features becoming increasingly important in machine learning? How can Laplace and Z-Transforms help analyze dynamic systems, sequential data, and intelligent control architectures? In Integral Transforms for Artificial Intelligence, Anshuman Mishra reveals how Fourier, Laplace, and Z-Transform techniques power modern AI systems across machine learning, deep learning, computer vision, speech processing, robotics, and signal analysis. Discover how mathematical transformations convert raw signals into intelligent insights, and how they continue to shape the future of Artificial Intelligence.