The Leanpub 60 Day 100% Happiness Guarantee
Within 60 days of purchase you can get a 100% refund on any Leanpub purchase, in two clicks.
See full terms...

A 7-book GPU collection covering architecture, CUDA, assembly, PTX, SASS, and parallel computing. Learn how to move from high-level programming to low-level execution and optimize performance across modern GPU systems.
Bought separately: $203
With coupon: $60.90
About the Bundle
Achieve full-spectrum mastery of GPU systems with the Modern GPU Architecture and Programming Complete Bundle—a comprehensive 7-book collection covering architecture, programming, and low-level optimization.
This bundle is designed as a complete progression from hardware foundations to instruction-level control, giving you the ability to understand, analyze, and optimize GPU performance across every layer of the stack.
It integrates architectural insight with practical programming models and low-level instruction analysis—bridging the gap between CUDA development, GPU assembly, and real hardware behavior.
Included in this collection:
What this bundle delivers:
This is a full-stack GPU engineering library—built for developers and engineers who want to operate beyond frameworks and gain precise control over performance.
About the Books
GPU Parallel Computing: From Basics to Breakthroughs
A Technical Guide to GPU Programming
If you want to understand how modern GPUs work and how to use them effectively for high-performance workloads, this book provides the technical foundation required.
This book assumes no prior exposure to GPU internals; however, a working knowledge of electronics and general computer architecture is recommended.
It is written for students, engineers, researchers, and data scientists who are new to GPU architecture and parallel programming and want a rigorous introduction before progressing into optimization and large-scale GPU systems.
If you are already an experienced CUDA performance engineer or low-level GPU architect seeking a specialized microarchitectural reference manual, this book is not positioned for that purpose.
What You Will Learn
GPU Architecture Fundamentals: streaming multiprocessors and SIMT execution; warp scheduling and instruction flow; GPU memory hierarchy and bandwidth considerations.
GPU Programming Models: CUDA programming principles; OpenCL fundamentals; kernel structure and execution behavior.
Performance Optimization: memory access patterns and coalescing; warp divergence and latency hiding; occupancy principles and kernel configuration.
Real-World Applications: scientific simulations; machine learning workloads; graphics and visualization pipelines.
Advanced Topics: multi-GPU communication; tensor cores and mixed precision; profiling, debugging, and performance analysis.
The early chapters establish architectural clarity and programming fundamentals. Later chapters address optimization strategies, scalability, and applied GPU workloads.
Who This Book Is For
Students entering GPU computing; engineers transitioning into parallel architecture; researchers and data scientists adopting GPU acceleration.
This is a technical book. It builds understanding from architectural principles upward and focuses on performance-oriented reasoning rather than superficial overview.
Why This Book
Many GPU resources either assume too much prior knowledge or remain overly abstract.
This book emphasizes structured technical understanding: how GPUs execute threads; why performance bottlenecks occur; how architectural constraints shape results; how programming decisions map to hardware behavior.
Clear explanations. Practical code examples. Architectural context.
Modern GPU Architecture Second Edition — Volume One
Graphics Pipeline Design and Hardware Implementation
Modern GPUs are the most complex and efficient parallel processors ever created—and this book shows you exactly how they work at the hardware level.
Unlike typical graphics or programming guides, this volume takes you inside the GPU itself:
how instructions flow through pipelines, how memory hierarchies sustain bandwidth, how shader cores and fixed-function units cooperate to render billions of pixels per second.
You’ll explore every major stage of the graphics pipeline in depth—geometry, rasterization, shading, texturing, and render output—all supported by clear mathematical models and synthesizable Verilog examples. This is not “theory for theory’s sake”; it’s engineering detail you can apply directly in design, simulation, or hardware verification.
By reading this book, you’ll gain:
Every chapter bridges concept and implementation, making it invaluable for anyone designing graphics hardware, studying computer architecture, or seeking mastery of parallel computation systems.
Dense, detailed, and unapologetically technical, this book is written for those who want to understand modern GPUs—not just use them.
⚠️ This isn’t entertainment. It’s engineering.
If that excites you, welcome aboard.
If it intimidates you, this book isn’t for you.
From the Editor at Burst Books — Gareth Thomas
A Smarter Kind of Learning Has Arrived — Thinking on Its Own.
Forget tired textbooks from years past. These AI-crafted STEM editions advance at the speed of discovery. Each page is built by intelligence trained on thousands of trusted sources, delivering crystal-clear explanations, flawless equations, and functional examples — all refreshed through the latest breakthroughs.
Best of all, these editions cost a fraction of traditional texts yet surpass expectations. You’re gaining more than a book — you’re enhancing the mind’s performance.
Explore BurstBooksPublishing on GitHub to find technical samples, infographics, and additional study material — a complete hub that supports deeper, hands-on learning.
In this age of AI, leave the past behind and learn directly from tomorrow.
Uncover the fundamentals of GPU architecture and assembly programming with Advanced GPU Assembly Programming, a resource designed for enthusiasts and professionals who want to explore the intricate workings of modern GPUs. This book is not a step-by-step manual but a gateway to understanding GPU architecture and assembly programming at a foundational level. It’s ideal for readers who are ready to invest their own effort to experiment and grow their expertise.
What You’ll Gain:
1. Deep Insights into GPU Architecture
Who This Book is For:
What This Book is Not:
This is not a hands-on, step-by-step guide. Instead, it provides a conceptual framework and architectural insights to set readers on the right path. It encourages further exploration and learning through personal effort and experimentation.
Whether you’re a developer, researcher, or assembly enthusiast, Advanced GPU Assembly Programming will give you the knowledge needed to deeply understand GPU architecture and programming. Equip yourself with the foundational tools to explore, experiment, and achieve mastery in the fascinating world of GPU assembly.
Order your copy today and take your first step into the realm of GPU programming mastery!
UPDATE: This book now has a GitHub repository with all source code samples, infographics, an exercise manual, and more.
If you’ve ever wondered why your GPU code hits a wall long before the hardware’s limits, this book tells you why—and how to break through it.
Most programmers stop where the compiler starts. They trust nvcc to make the right decisions, to manage registers, to schedule instructions, and to use memory efficiently. But the compiler doesn’t know your problem. It guesses. And in GPU computing, guessing costs performance.
Mastering PTX and SASS – Volume I pulls back the curtain on NVIDIA’s virtual machine—the PTX instruction set that every CUDA kernel becomes before it touches silicon. You’ll learn how threads, warps, and memory really behave at the hardware level, how each instruction interacts with caches and pipelines, and how to read, write, and reason about PTX like an architect, not just a coder.
This isn’t a surface-level “how-to.” It’s a deep, methodical tour through the machinery of modern GPUs—built for professionals who want measurable, repeatable speedups, not guesswork. You’ll discover how the compiler transforms your high-level logic into executable reality, and where you can step in to take control.
By the time you finish, you won’t be relying on compiler magic. You’ll understand it, improve it, and surpass it.
Mastering PTX and SASS – Volume I gives you the foundation; Volume II takes you to the bleeding edge of optimization. Together, they turn GPU performance from a mystery into a science.
You’ve mastered the architecture—now it’s time to own the performance.
Every GPU developer hits the same wall: the profiler says you’re close to peak, but you know there’s still headroom. What’s missing isn’t another compiler flag—it’s visibility into the hardware’s final truth. That truth lives in SASS, the real machine code running on NVIDIA GPUs.
Mastering PTX and SASS – Volume II takes you past theory into the territory where nanoseconds matter. Here you’ll learn how to read, analyze, and tune instruction streams with surgical precision. You’ll uncover how schedulers pair ops, how register pressure throttles throughput, and how to turn your kernels into clock-cycle-balanced engines of pure efficiency.
This book is for engineers who refuse to settle for “good enough.” It turns profiling, disassembly, and optimization into a repeatable process—one grounded in data, not superstition. From tensor cores to warp shuffles, from atomic operations to multi-GPU scaling, you’ll learn how real experts bend hardware to their will.
Volume I built the foundation; Volume II shows you how to weaponize it.
If you’re ready to squeeze every drop of performance from your GPU—and understand exactly how you did it—this is the manual you’ve been waiting for.
NOTICE: All code for this book, and for many others, is available from BurstBooksPublishing on GitHub.
Advanced CUDA Programming: High-Performance Computing with GPUs is the ultimate guide to unlocking the full power of modern GPU computing. Whether you're developing AI models, optimizing scientific simulations, or pushing real-time applications to their limits, this book delivers the advanced techniques and expert insights you need to achieve peak CUDA performance.
GPU programming is no longer optional—it's a necessity in today's world of deep learning, AI acceleration, and high-performance computing. But simply writing CUDA kernels isn’t enough. To truly optimize GPU applications, you need a deep understanding of GPU architecture, memory hierarchies, execution models, and performance tuning strategies. This book takes you beyond the fundamentals and into the world of advanced CUDA programming, where efficiency, scalability, and raw computational power define success.
What You’ll Learn:
Why This Book?
This isn’t just another CUDA guide—it’s a masterclass in performance optimization. Packed with real-world case studies, hands-on techniques, and cutting-edge strategies, it delivers everything you need to develop fast, scalable, and production-ready GPU applications.
If you're ready to take your CUDA skills to the next level and maximize GPU performance like never before, this book is your roadmap. Don't leave performance on the table—start optimizing today.