Machine Learning Pipeline
This book is 60% complete
Last updated on 2019-06-11
About the Book
By reading this book, you will learn how to build a machine learning pipeline for real-life projects. Whatever has stopped you from mastering machine learning with Python, this book can help you overcome it: its step-by-step, example-oriented approach helps you apply the most straightforward and effective tools to both demonstrative and real-world problems and datasets.
Note: This book is free and always will be, so get your copy. We would be glad if you supported us with your feedback or a donation.
This book will cover the following:
Part one: Introduction
- an introduction to data science tools and how to set them up.
- an introduction to data science pipelines: what they are and how to scale them.
- an introduction to machine learning pipelines and how learning is done.
- building a small project to make sure that you understand the meaning of pipelines.
Part two: Data
- defining data, types of data, and levels of data, because this helps us understand the data.
- understanding and cleaning data, since it is a very important step in the pipeline.
- resampling data to create a training set and a test set, and splitting techniques.
- feature engineering and selection, because the variable we need is not always directly visible to us.
Part three: Supervised learning
An introduction to machine learning algorithms, how they work, and how they are evaluated. This part will cover the following algorithms:
- Linear Regression.
- Logistic Regression.
- Decision Trees.
- Support Vector Machines.
Table of Contents:
Part One: Introduction
- 1 Chapter 1: Introduction
- 1.1 Introduction to data science and python
- 1.2 Installing python
- 1.3 Introducing IPython & Jupyter
- 1.4 Summary
- 2 Chapter 2: A Nice Tour Through Data Science Pipeline
- 2.1 What is Data Science?
- 2.2 A bird’s-eye view of the pipeline
- 2.3 Summary
- 3 Chapter 3: Machine Learning Pipeline
- 3.1 Data
- 3.2 Goals
- 3.3 Models
- 3.4 Features
- 3.5 Model Evaluation
Part Two: Data
- 4 Chapter 4: Defining Data
- 4.1 Defining Data
- 4.2 Why should you read this chapter?
- 4.3 Structured, semi-structured and unstructured data
- 4.4 Quantitative versus Qualitative data
- 4.5 Example – Titanic
- 4.6 Example – world alcohol consumption data
- 4.7 Divide and Conquer
- 4.8 Making a checkpoint
- 4.9 The four levels of data
- 4.10 The nominal level
- 4.11 The ordinal level
- 4.12 Quick recap and check
- 4.13 The Interval Level
- 4.14 The Ratio Level
- 4.15 Summarizing All Levels
- 4.16 Summary
- 5 Chapter 5: Data Cleaning
- 5.1 The data science pipeline revisited
- 5.2 Data loading and preprocessing with pandas
- 5.3 Missing Data
- 5.4 Dealing with big datasets
- 5.5 Accessing other data formats
- 5.6 Data preprocessing
- 5.7 Categorical and Text data
- 5.8 Case Study: Titanic
- 5.9 Summary
- 6 Chapter 6: Data Resampling
- 6.1 Creating training and test sets
- 6.2 Cross-Validation
- 6.2.1 Validation set technique
- 6.2.2 Leave-One-Out Cross-Validation (LOOCV)
- 6.2.3 K-Fold Cross-Validation
- 6.3 Bootstrap (not added yet)
- 6.4 Summary
- 7 Chapter 7: Feature Selection and Feature Engineering
- 7.1 scikit-learn datasets
- 7.2 Feature selection and filtering
- 7.3 Principal component analysis
- 7.3.1 Non-negative matrix factorization
- 7.3.2 Sparse PCA
- 7.3.3 Kernel PCA
- 7.4 Atom extraction and dictionary learning
- 7.5 Summary
Part Three: Supervised Learning Algorithms
- 8 Chapter 8: Introduction to Learnability
- 9 Chapter 9: Linear Regression
- 9.1 Dataset we will use in the chapter
- 9.2 Simple Linear Regression
- 9.3 Example
- 9.4 Estimating the Parameters
- 9.4.1 Our Goal
- 9.4.2 Least Squares Criterion
- 9.4.3 How Does Least Squares Work?
- 9.5 Assessing the Accuracy of the Coefficient Estimates
- 9.6 Assessing the Accuracy of the Model
- 9.7 Multiple Linear Regression
- 9.7.1 Estimating the Regression Coefficients
- 9.8 Linear regression with scikit-learn
- 9.8.1 Regressor analytic expression
- 9.9 Ridge, Lasso, and ElasticNet
- 9.10 Robust regression with random sample consensus
- 9.11 Polynomial regression
- 9.12 Isotonic regression
- 9.13 Summary
- 10 Chapter 10: Logistic Regression