The Hundred-Page Machine Learning Book (PDF + EPUB + extra PDF formats)
About the Book
Peter Norvig, Research Director at Google, co-author of AIMA, the most popular AI textbook in the world: "Burkov has undertaken a very useful but impossibly hard task in reducing all of machine learning to 100 pages. He succeeds well in choosing the topics — both theory and practice — that will be useful to practitioners, and for the reader who understands that this is the first 100 (or actually 150) pages you will read, not the last, provides a solid introduction to the field."
Aurélien Géron, Senior AI Engineer, author of the bestseller Hands-On Machine Learning with Scikit-Learn and TensorFlow: "The breadth of topics the book covers is amazing for just 100 pages (plus few bonus pages!). Burkov doesn't hesitate to go into the math equations: that's one thing that short books usually drop. I really liked how the author explains the core concepts in just a few words. The book can be very useful for newcomers in the field, as well as for old-timers who can gain from such a broad view of the field."
Gareth James, Professor of Data Sciences and Operations, co-author of the bestseller An Introduction to Statistical Learning, with Applications in R: "This is a compact “how to do data science” manual and I predict it will become a go-to resource for academics and practitioners alike. At 100 pages (or a little more), the book is short enough to read in a single sitting. Yet, despite its length, it covers all the major machine learning approaches, ranging from classical linear and logistic regression, through to modern support vector machines, deep learning, boosting, and random forests. There is also no shortage of details on the various approaches and the interested reader can gain further information on any particular method via the innovative companion book wiki. The book does not assume any high level mathematical or statistical training or even programming experience, so should be accessible to almost anyone willing to invest the time to learn about these methods. It should certainly be required reading for anyone starting a PhD program in this area and will serve as a useful reference as they progress further. Finally, the book illustrates some of the algorithms using Python code, one of the most popular coding languages for machine learning. I would highly recommend “The Hundred-Page Machine Learning Book” for both the beginner looking to learn more about machine learning and the experienced practitioner seeking to extend their knowledge base."
***
As its title says, it's the hundred-page machine learning book. It was written by an expert in machine learning holding a Ph.D. in Artificial Intelligence with almost two decades of industry experience in computer science and hands-on machine learning.
This book is unique in many respects. It is the first successful attempt to write an easy-to-read book on machine learning that isn't afraid of using math. It's also the first attempt to squeeze a wide range of machine learning topics into a book this short in a systematic way and without loss of quality.
The book contains only those parts of the huge body of material on machine learning developed since the 1960s that have proven to have significant practical value. A beginner in machine learning will find in this book just enough detail to reach a comfortable level of understanding of the field and start asking the right questions. Practitioners with experience can use this book as a collection of pointers to directions for further self-improvement.
The book also comes in handy when brainstorming at the beginning of a project, when you are trying to decide whether a given technical or business problem is "machine-learnable" and, if so, which techniques you should try in order to solve it.
The book comes with a wiki whose pages extend some book chapters with additional information: Q&A, code snippets, further reading, tools, and other relevant resources. Thanks to the continuously updated wiki, this book, like a good wine, keeps getting better after you buy it.
Reader Testimonials

Karolis Urbonas
Head of Data Science at Amazon
This book is a great introduction to machine learning from a world-class practitioner and LinkedIn superstar Andriy Burkov. He managed to find a good balance between the math of the algorithms, intuitive visualizations, and easy-to-read explanations. This book will benefit the newcomers to the field as a thorough introduction to the fundamentals of machine learning, while the experienced professionals will definitely enjoy the practical recommendations from Andriy's rich experience in the field.

Chao Han
VP, Head of R&D at Lucidworks
I wish such a book existed when I was a statistics graduate student trying to learn about machine learning. There is the right amount of math, which demystifies the centerpiece of an algorithm with succinct but very clear descriptions. I'm also impressed by the widespread coverage and good choices of important methods as an introductory book (not all machine learning books mention things like learning to rank or metric learning). Highly recommended to STEM major students.

Sujeet Varakhedi
Head of Engineering at eBay
Whether you want to become a machine learning practitioner or are looking for an everyday resource, Andriy's book does a fantastic job of cutting the noise and hitting the tracks at full speed from the first page. It manages to structure all the important concepts from foundations to applications into a relatively quick read and leaves the reader engaged at all times.

Deepak Agarwal
VP of AI at LinkedIn
This book provides a great practical guide to get started and execute on ML within a few days without necessarily knowing much about ML a priori. The first five chapters are enough to get you started and the next few chapters provide you a good feel of more advanced topics to pursue. A wonderful book for engineers who want to incorporate ML in their day-to-day work without necessarily spending an enormous amount of time going through a formal degree program.

Vincent Pollet
Head of Research at Nuance
The Hundred-Page Machine Learning Book is an excellent read to get started with Machine Learning. In his book, Andriy Burkov distills the ubiquitous material on Machine Learning into concise and well-balanced intuitive, theoretical and practical elements that bring beginners, managers, and practitioners many life hacks.
Table of Contents
Preface
1 Introduction
1.1 What is Machine Learning
1.2 Types of Learning
1.2.1 Supervised Learning
1.2.2 Unsupervised Learning
1.2.3 Semi-Supervised Learning
1.2.4 Reinforcement Learning
1.3 How Supervised Learning Works
1.4 Why the Model Works on New Data
2 Notation and Definitions
2.1 Notation
2.1.1 Data Structures
2.1.2 Capital Sigma Notation
2.1.3 Capital Pi Notation
2.1.4 Operations on Sets
2.1.5 Operations on Vectors
2.1.6 Functions
2.1.7 Max and Arg Max
2.1.8 Assignment Operator
2.1.9 Derivative and Gradient
2.2 Random Variable
2.3 Unbiased Estimators
2.4 Bayes’ Rule
2.5 Parameter Estimation
2.6 Parameters vs. Hyperparameters
2.7 Classification vs. Regression
2.8 Model-Based vs. Instance-Based Learning
2.9 Shallow vs. Deep Learning
3 Fundamental Algorithms
3.1 Linear Regression
3.1.1 Problem Statement
3.1.2 Solution
3.2 Logistic Regression
3.2.1 Problem Statement
3.2.2 Solution
3.3 Decision Tree Learning
3.3.1 Problem Statement
3.3.2 Solution
3.4 Support Vector Machine
3.4.1 Dealing with Noise
3.4.2 Dealing with Inherent Non-Linearity
3.5 k-Nearest Neighbors
4 Anatomy of a Learning Algorithm
4.1 Building Blocks of a Learning Algorithm
4.2 Gradient Descent
4.3 How Machine Learning Engineers Work
4.4 Learning Algorithms’ Particularities
5 Basic Practice
5.1 Feature Engineering
5.1.1 One-Hot Encoding
5.1.2 Binning
5.1.3 Normalization
5.1.4 Standardization
5.1.5 Dealing with Missing Features
5.1.6 Data Imputation Techniques
5.2 Learning Algorithm Selection
5.3 Three Sets
5.4 Underfitting and Overfitting
5.5 Regularization
5.6 Model Performance Assessment
5.6.1 Confusion Matrix
5.6.2 Precision/Recall
5.6.3 Accuracy
5.6.4 Cost-Sensitive Accuracy
5.6.5 Area under the ROC Curve (AUC)
5.7 Hyperparameter Tuning
5.7.1 Cross-Validation
6 Neural Networks and Deep Learning
6.1 Neural Networks
6.1.1 Multilayer Perceptron Example
6.1.2 Feed-Forward Neural Network Architecture
6.2 Deep Learning
6.2.1 Convolutional Neural Network
6.2.2 Recurrent Neural Network
7 Problems and Solutions
7.1 Kernel Regression
7.2 Multiclass Classification
7.3 One-Class Classification
7.4 Multi-Label Classification
7.5 Ensemble Learning
7.5.1 Boosting and Bagging
7.5.2 Random Forest
7.5.3 Gradient Boosting
7.6 Learning to Label Sequences
7.7 Sequence-to-Sequence Learning
7.8 Active Learning
7.9 Semi-Supervised Learning
7.10 One-Shot Learning
7.11 Zero-Shot Learning
8 Advanced Practice
8.1 Handling Imbalanced Datasets
8.2 Combining Models
8.3 Training Neural Networks
8.4 Advanced Regularization
8.5 Handling Multiple Inputs
8.6 Handling Multiple Outputs
8.7 Transfer Learning
8.8 Algorithmic Efficiency
9 Unsupervised Learning
9.1 Density Estimation
9.2 Clustering
9.2.1 K-Means
9.2.2 DBSCAN and HDBSCAN
9.2.3 Determining the Number of Clusters
9.2.4 Other Clustering Algorithms
9.3 Dimensionality Reduction
9.3.1 Principal Component Analysis
9.3.2 UMAP
9.4 Outlier Detection
10 Other Forms of Learning
10.1 Metric Learning
10.2 Learning to Rank
10.3 Learning to Recommend
10.3.1 Factorization Machines
10.3.2 Denoising Autoencoders
10.4 Self-Supervised Learning: Word Embeddings
11 Conclusion
11.1 Topic Modeling
11.2 Gaussian Processes
11.3 Generalized Linear Models
11.4 Probabilistic Graphical Models
11.5 Markov Chain Monte Carlo
11.6 Genetic Algorithms
11.7 Reinforcement Learning
Index