The Hitchhiker's Guide to Linear Models (The Book + Codes + Datasets)
The Hitchhiker's Guide to Linear Models
Based on the famous R programming language
About the Book
This book aims to get straight to the point. The only things I assume are that you have used spreadsheets at some point and that you are motivated to estimate linear models in R. I do not assume that you know how to install R or the basics of the R programming language.
This book contains no proofs. I tried to replace them with multiple examples that analyze my own experiments, such as throwing a tennis ball from different heights and measuring the time it takes to hit the ground, and another where I got two thermometers and measured the temperature outside a building at the same time of day on different days.
ISBN: 978-1-7380675-0-3
Packages
The Book
PDF
English
The Book + Codes + Datasets
All the datasets and code used in the book, presented as RStudio projects with R scripts for a more hands-on experience.
PDF
English
Reader Testimonials
Claudia Negri-Ribalta
Université Paris 1 Panthéon-Sorbonne
I think it's great that you are teaching them how to write in R. My big problem was always that, the syntax.
Catherine Moez
University of Toronto
Grounded book. I like the UofT-related examples.
Badi H. Baltagi
Syracuse University
This is a lot of work.
Table of Contents
- I. Preface
- II. R Setup
- 1. R and RStudio
- 1.1. Windows and Mac
- 1.2. Linux
- 2. Installing R
- 2.1. Windows and Mac
- 2.2. Linux
- 3. Installing RStudio
- 3.1. Windows and Mac
- 3.2. Linux
- 4. Installing R Packages
- 4.1. Windows and Mac
- 4.2. Linux
- 5. Changing RStudio colors and font
- 5.1. Windows and Mac
- 5.2. Linux
- 6. Installing Quarto
- 6.1. Windows and Mac
- 6.2. Linux
- III. Linear algebra review
- 1. Using R as a calculator
- 2. System of linear equations
- 3. Matrix
- 4. Transpose matrix
- 5. Matrix multiplication
- 6. Matrix representation of a system of linear equations
- 7. Identity matrix
- 8. Inverse matrix
- 9. Solving systems of linear equations
- IV. Statistics review
- 1. Using R as a calculator
- 1.1. Mean
- 1.2. Variance
- 1.3. Standard deviation
- 1.4. Covariance
- 1.5. Correlation
- 1.6. Normal distribution
- 1.7. Poisson distribution
- 1.8. Student's t-distribution
- 1.9. Computing probabilities with the normal distribution
- 1.10. Computing probabilities with the Poisson distribution
- 1.11. Computing probabilities with the t-distribution
- 2. Data and dataset
- 3. Summation
- 4. Probability
- 5. Descriptive statistics
- 6. Distributions
- 7. Sample size
- V. Recommended workflow
- 1. Creating projects
- 2. Creating scripts
- 3. Creating notebooks
- 4. Organizing code sections
- 5. Customizing notebooks' output
- VI. Read, Manipulate, and Plot Data
- 1. The datasauRus dataset in R format
- 2. The Quality of Government dataset in CSV format
- 3. The Quality of Government dataset in SAV (SPSS) format
- 4. The Quality of Government dataset in DTA (Stata) format
- 5. The Freedom House dataset in XLSX (Excel) format
- VII. Linear Model with One Explanatory Variable
- 1. Model specification
- 1.1. Linear model as correlation
- 1.2. Linear model as matrix multiplication
- 1.3. Relation between correlation and matrix multiplication
- 1.4. Computational note
- 2. The Galton dataset
- 3. A word of caution about Galton's work
- 4. Loading the Galton dataset
- 5. Estimating linear models' coefficients
- 6. Logarithmic transformations
- 7. Plotting model results
- 8. Linear model does not equal straight line
- 9. Transforming variables
- 10. Regression with weights
- VIII. Linear Model with Multiple Explanatory Variables
- 1. Model specification
- 1.1. Root Mean Squared Error and Mean Absolute Error
- 1.2. RMSE and MAE interpretation
- 1.3. Coefficient's standard error
- 1.4. Coefficient's t-statistic
- 1.5. Coefficient's p-value
- 1.6. Residual standard error
- 1.7. Model's multiple R-squared (or unadjusted R-squared)
- 1.8. Model's adjusted R-squared
- 1.9. Model's F-statistic
- 1.10. Error's normality
- 1.11. Error's homoscedasticity (homogeneous variance)
- 2. Life expectancy, GDP and well-being in the Quality of Government dataset
- 3. Estimating linear models' coefficients
- 4. Model accuracy
- 5. Model summary
- 6. Error's assumptions
- IX. Linear Model with Binary and Categorical Explanatory Variables
- 1. Model specification with binary variables
- 1.1. ANOVA is a particular case of a linear model with binary variables
- 1.2. Corruption and popular vote in the Quality of Government dataset
- 1.3. Estimating a linear model and ANOVA with one predictor and two categories
- 1.4. Corruption and regime type in the Quality of Government dataset
- 1.5. Estimating a linear model and ANOVA with one predictor and multiple categories
- 1.6. Estimating a linear model with continuous and categorical predictors
- 1.7. Corruption and interaction variables in the Quality of Government dataset
- 1.8. Estimating a linear model with binary interactions
- 1.9. Confidence intervals with binary interactions
- 1.10. Estimating a linear model with categorical interactions
- 1.11. Confidence intervals with categorical interactions
- 2. Model specification with binary interactions
- 3. Model specification with categorical interactions
- X. Linear Model with Fixed Effects
- 1. Year fixed effects
- 1.1. Model specification
- 1.2. Corruption and popular vote in the Quality of Government dataset
- 1.3. Estimating year fixed effects' coefficients
- 1.4. Estimating country-time fixed effects' coefficients
- 2. Country fixed effects
- 3. Country-year fixed effects
- XI. Generalized Linear Model with One Explanatory Variable
- 1. Model specification
- 1.1. Gaussian model
- 1.2. Poisson model
- 1.3. Quasi-Poisson model
- 1.4. Binomial model (or logit model)
- 2. Model families
- XII. Generalized Linear Model with Multiple Explanatory Variables
- 1. Obtaining the original codes and data
- 2. Loading the original data
- 3. Ordinary Least Squares
- 4. Poisson Pseudo Maximum Likelihood
- 5. Tobit
- 6. Reporting multiple models