This book brings the fundamentals of R programming to you, using the same material developed as part of the industry-leading Johns Hopkins Data Science Specialization. The skills taught in this book will lay the foundation for you to begin your journey learning data science. Printed copies of this book are available through Lulu.
Data analysis is now part of practically every research project in the life sciences. In this book we use data and computer code to teach the necessary statistical concepts and programming skills to become a data analyst. Instead of showing theory first and then applying it to toy examples, we start with actual applications and describe the theory as it becomes necessary to solve specific challenges. The book includes links to computer code that readers can use to follow along as they program.
This book teaches you to use R to effectively visualize and explore complex datasets. Exploratory data analysis is a key part of the data science process because it allows you to sharpen your question and refine your modeling strategies. This book is based on the industry-leading Johns Hopkins Data Science Specialization.
This book covers R software development for building data science tools. It provides rigorous training in the R language and covers modern software development practices for building tools that are highly reusable, modular, and suitable for use in a team-based environment or a community of developers. (Printed copies coming soon!)
This book gives a brief but rigorous treatment of regression models, intended for practicing data scientists.
In this manual I explain, clearly and simply, the basic concepts of descriptive exploratory data analysis and how to put them into practice with the statistical software R and real data. The book is designed so that the reader can advance step by step through self-directed learning, so many examples are provided.
Modern Computational Statistics with R teaches statistics as a disciplined way of thinking: start with the scientific question, design, data, and uncertainty before reaching for formulas. Through real examples, simulations, and R, readers learn how to turn data into defensible evidence and build the statistical foundations needed for modern data science, machine learning, and AI.
The book provides a modern look at introductory biostatistical concepts and the associated computational tools, using the latest developments in computation and visualization in the R language environment. The book includes practical data analysis based on datasets that can be downloaded here: https://github.com/muschellij2/biostatmethods.
Greenhouse gas emissions have caused considerable changes in climate, including increased surface air temperatures and rising sea levels. This e-textbook presents a series of laboratory exercises in R that teach the Earth science and statistical concepts needed for assessing climate-related risks. These exercises are intended for upper-level undergraduates, beginning graduate students, and professionals in other areas who wish to gain insight into academic climate risk analysis.
A rigorous treatment of linear models for self-learning data scientists. This book is only available in PDF form.
This book teaches the fundamental concepts and tools behind reporting modern data analyses in a reproducible manner. As data analyses become increasingly complex, the need for clear and reproducible report writing is greater than ever. The material for this book was developed as part of the industry-leading Johns Hopkins Data Science Specialization. Printed versions are available through Lulu (see link below).
The Working Notes complement Applied Data Science for Credit Risk and Probability of Default Rating Modeling with R, offering practice-oriented insights. Based on the author’s GitHub repository, they address real-world challenges and are regularly updated to reflect ongoing developments.
Biological Data Science with R covers data manipulation with dplyr, visualization with ggplot2, essential statistics, survival analysis, RNA-seq analysis, phylogenetic trees, predictive modeling and infectious disease forecasting, text mining and natural language processing, and more.
Develop insights from data with tidy tools. Import, wrangle, visualize, and model data with the Tidyverse R packages.
This book introduces the topic of Developing Data Products in R. A data product is the ideal output of a data science experiment. This book is based on the Coursera class "Developing Data Products", part of the Data Science Specialization. Particular emphasis is placed on developing Shiny apps and interactive graphics.