Data Computing

An Introduction to Wrangling and Visualization in R

About the Book

Based on a no-prerequisite short course the author developed at Macalester College, Data Computing uses R, a leading software application for statistics and data analysis that is freely available for all widely used platforms. The book's short chapters and clear notation, supported by Hadley Wickham's hugely popular dplyr and ggplot2 packages, help the reader develop skills at a measured pace.

Praise for Data Computing

No matter what you do, you can use data to do it better. Gaining that superpower requires you to learn some programming tools and some mental tools. This book will teach you both: you'll get the mental building blocks to think about data analysis, and the computational tools to turn those thoughts into code. If you're just learning to swim in the data ocean, Danny's lucid writing and thoughtful approach makes this book a great place to start!  --- Hadley Wickham, Chief Scientist, RStudio

The book covers in a systematic way not only the stuff I've picked up in a fragmentary, bit-at-a-time way, but more important for me, stuff I wanted to know, and stuff I needed to know without realizing it.  Better yet, Kaplan's book makes it all easily accessible. Just what our profession needs! --- George Cobb, Professor Emeritus of Mathematics and Statistics, Mt. Holyoke College

About the Author

Daniel Kaplan

Daniel Kaplan is a Harvard-trained, award-winning teacher. He has two decades of experience teaching statistics and modeling, computing, and applied mathematics. His graduate training in biomedical engineering and economics, as well as extensive consulting, gives him an applied perspective: using data to serve a purpose.

Kaplan is the DeWitt Wallace Professor of Mathematics, Statistics, and Computer Science at Macalester College in Saint Paul, Minnesota, USA. His textbooks include:

  • Data Computing
  • Statistical Modeling: A Fresh Approach 
  • An Introduction to Scientific Computation and Programming in Python
  • Understanding Nonlinear Dynamics

Table of Contents

1. Tidy Data

2. Computing with R

3. R Command Patterns

4. Files and Documents

5. Introduction to Data Graphics

6. Frames, Glyphs, and other Components of Graphics

7. Wrangling and Data Verbs

8. Graphics and their Grammar

9. More Data Verbs

10. Joining Two Tables

11. Wide versus Narrow Data Layouts

12. Ranks and Ordering

13. Networks

14. Collective Properties of Cases

15. Scraping and Cleaning Data

16. Using Regular Expressions

17. Machine Learning


  • Projects
    • Popular Names
    • Bird Species
    • World Cities
    • Stocks and Dividends
    • Statistics of Gene Expression
    • Bicycle Sharing
    • Inspecting Restaurants
    • Street or Road
    • Scraping Nuclear Reactors
  • Solutions to selected exercises
  • Readings and Notes

