Methods in Biostatistics with R
A Rigorous and Practical Treatment of Biostatistics Foundations using R
About the Book
Biostatistics is easy to teach poorly. Too often, books focus on methodology with no emphasis on programming and practical implementation. In contrast, books focused on R programming and visualization rarely discuss the foundational topics that give data analysts the infrastructure they need to make decisions, evaluate analytic tools, and prepare for new and unforeseen challenges. This book bridges that divide, which had no reason to exist in the first place. The book is unapologetic about its focus on Biostatistics, that is, Statistics with Biological, Public Health, and Medical applications, though we think it could be used successfully for large Statistics and Data Science courses. Data and code can be downloaded here: https://github.com/muschellij2/biostatmethods
Table of Contents
1 Introduction
1.1 Biostatistics
1.2 Mathematical prerequisites
1.3 R
2 Introduction to R
2.1 R and RStudio
2.2 Reading R code
2.3 R Syntax and Jargon
2.4 Objects
2.5 Assignment
2.6 Data Types
2.7 Data Containers
2.8 Logical Operations
2.9 Subsetting
2.10 Reassignment
2.11 Libraries and Packages
2.12 dplyr, ggplot2, and the tidyverse
2.13 Problems
3 Probability, random variables, distributions
3.1 Experiments
3.2 An intuitive introduction to the bootstrap
3.3 Probability
3.4 Probability calculus
3.5 Sampling in R
3.6 Random variables
3.7 Probability mass
3.8 Probability density function
3.9 Cumulative distribution function
3.10 Quantiles
3.11 Problems
3.12 Supplementary R training
4 Mean and Variance
4.1 Mean or expected value
4.2 Sample mean and bias
4.3 Variance, standard deviation, coefficient of variation
4.4 Variance interpretation: Chebyshev’s inequality
4.5 Supplementary R training
4.6 Problems
5 Random vectors, independence, covariance, and sample mean
5.1 Random vectors
5.2 Independent events and variables
5.3 Covariance and correlation
5.4 Variance of sums of variables
5.5 Sample variance
5.6 Mixture of distributions
5.7 Problems
6 Conditional distribution, Bayes’ rule, ROC
6.1 Conditional probabilities
6.2 Bayes rule
6.3 ROC and AUC
6.4 Problems
7 Likelihood
7.1 Likelihood definition and interpretation
7.2 Maximum likelihood
7.3 Interpreting likelihood ratios
7.4 Likelihood for multiple parameters
7.5 Profile likelihood
7.6 Problems
8 Data visualization
8.1 Standard visualization tools
8.2 Problems
9 Approximation results and confidence intervals
9.1 Limits
9.2 Law of Large Numbers (LLN)
9.3 Central Limit Theorem (CLT)
9.4 Confidence intervals
9.5 Problems
10 The χ² and t distributions
10.1 The χ² distribution
10.2 Confidence intervals for the variance of a Normal
10.3 Student’s t distribution
10.4 Confidence intervals for Normal means
10.5 Problems
11 t and F tests
11.1 Independent group t confidence intervals
11.2 t intervals for unequal variances
11.3 t-tests and confidence intervals in R
11.4 The F distribution
11.5 Confidence intervals and testing for variance ratios of Normal distributions
11.6 Problems
12 Data Resampling Techniques
12.1 The jackknife
12.2 Bootstrap
12.3 Problems
13 Taking logs of data
13.1 Brief review
13.2 Taking logs of data
13.3 Interpreting logged data
13.4 Inference for the Geometric Mean
13.5 Summary
13.6 Problems
14 Interval estimation for binomial probabilities
14.1 Introduction
14.2 The Wald interval
14.3 Bayesian intervals
14.4 Connections with the Agresti/Coull interval
14.5 Conducting Bayesian inference
14.6 The exact, Clopper-Pearson method
14.7 Confidence intervals in R
14.8 Problems
15 Building a Figure in ggplot2
15.1 The qplot function
15.2 The ggplot function
15.3 Making plots better
15.4 Make the Axes/Labels Bigger
15.5 Make the Labels to be full names
15.6 Making a better legend
15.7 Legend INSIDE the plot
15.8 Saving figures: devices
15.9 Interactive graphics with one function
15.10 Conclusions
15.11 Problems
16 Hypothesis testing
16.1 Introduction
16.2 General hypothesis tests
16.3 Connection with confidence intervals
16.4 Data Example
16.5 P-values
16.6 Discussion
16.7 Problems
17 Power
17.1 Introduction
17.2 Standard normal power calculations
17.3 Power for the t test
17.4 Discussion
17.5 Problems
18 R Programming in the Tidyverse
18.1 Data objects in the tidyverse: tibbles
18.2 dplyr: pliers for manipulating data
18.3 Grouping data
18.4 Summarizing grouped data
18.5 Merging Data Sets
18.6 Left Join
18.7 Right Join
18.8 Right Join: Switching arguments
18.9 Full Join
18.10 Reshaping Data Sets
18.11 Recoding Variables
18.12 Cleaning strings: the stringr package
18.13 Problems
19 Sample size calculations
19.1 Introduction
19.2 Sample size calculation for continuous data
19.3 Sample size calculation for binary data
19.4 Sample size calculations using exact tests
19.5 Sample size calculation with preliminary data
19.6 Problems
20 References