Understanding The Processor Through Rust
You write Rust. You care about performance. But do you know what happens after the compiler is done?
Your code compiles to machine instructions that run on real silicon: pipelines, caches, branch predictors, and execution units that have their own rules, their own bottlenecks, and their own surprises. A single-line change can make a loop run 10x faster. A struct reordering can cut memory traffic in half. A branchless rewrite can eliminate branch-misprediction stalls that cost millions of cycles per second.
These aren't compiler tricks. They're hardware effects. And most Rust programmers have never learned them.
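To make the struct-reordering claim concrete, here is a minimal sketch. The `Padded` and `Packed` types are hypothetical and do not come from the book. They use `#[repr(C)]` to pin the field order, because with the default Rust representation the compiler is free to reorder fields on its own.

```rust
use std::mem::size_of;

// Hypothetical record types, for illustration only.
// #[repr(C)] keeps fields in declaration order, so the padding is visible.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    flag: u8, // 1 byte, then padding so `id` lands on its alignment boundary
    id: u64,  // 8 bytes
    kind: u8, // 1 byte, then tail padding up to the struct's alignment
}

// Same fields, largest first: the two small fields share one padded slot.
#[allow(dead_code)]
#[repr(C)]
struct Packed {
    id: u64,
    flag: u8,
    kind: u8,
}

fn main() {
    println!("Padded: {} bytes", size_of::<Padded>());
    println!("Packed: {} bytes", size_of::<Packed>());
}
```

On a typical 64-bit target such as x86-64, where `u64` is 8-byte aligned, this should report 24 bytes for `Padded` and 16 for `Packed`. That is a third fewer bytes per element rather than half, but it is the same effect: a `Vec<Packed>` fits more elements into every cache line than a `Vec<Padded>` does.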
What's inside
Part I — The Machine Under Your Code starts at the transistor and builds up. Logic gates, adders, flip-flops, the ALU, registers, the memory hierarchy, caches, instruction pipelines, branch prediction, out-of-order execution, and multi-core parallelism. No electrical engineering required — just clear explanations, diagrams, and Rust examples that connect every hardware concept to code you'd actually write.
Part II — Writing Rust That Respects the Hardware takes what you learned and puts it to work. Data layout for cache performance. Branchless programming. SIMD. Memory access patterns. Concurrency primitives mapped to hardware operations. Profiling with real tools. Four case studies that take naive code, profile it, find the bottleneck, and optimize — with measurements at every step.
21 chapters. 5 appendices. From silicon to cargo bench.
Table of contents
Part I — The Machine Under Your Code
- Why Should You Care?
- Bits, Gates, and How Computers Think
- The ALU — The Calculator Inside
- Registers — The Fastest Memory You'll Never See
- The Memory Hierarchy — A Story of Trade-offs
- How the Cache Actually Works
- The Instruction Cycle — Fetch, Decode, Execute
- The Instruction Pipeline — An Assembly Line for Instructions
- Branch Prediction — Guessing the Future
- Out-of-Order Execution and Superscalar Design
- Multi-Core and Parallelism
Part II — Writing Rust That Respects the Hardware
- From Rust to Machine Code — The Compilation Pipeline
- Data Layout and Cache Performance
- Branching in Practice
- SIMD in Rust
- Memory Access Patterns
- Concurrency and the Hardware
- Alignment, Allocation, and the OS
- Profiling and Measuring
- Case Studies
- Zero-Copy by Default — Rust's Hidden Performance Advantage
Appendices: x86-64 Assembly Cheat Sheet, ARM64 Assembly Cheat Sheet, Glossary, Tools Reference, Further Reading