About the Book
Data science has taken the world by storm. In a recent survey by Glassdoor, Data Scientist was named the best job in America. But what is the job of the data scientist? Data science is a process by which we state a question, gather and explore relevant data, conduct formal modeling, interpret results, and communicate findings. There is an art to this process that requires experience and collaboration to master.
The goal of the Data Science Salon is to provide a guided path to learning the conceptual and analytical skills needed to be an effective data scientist. We formulate this path as a group activity that can be done in small chunks on a regular basis. The primary objective is for your salon team to apply The Art of Data Science (TAODS) framework to a data science question of your choice and produce a report or summary of the analysis. After working through your own data science question using the TAODS framework, you will be well equipped to embark on additional data analysis projects of even greater complexity. Unlike didactic coursework with short-answer questions, or data analysis courses where you analyze the instructor’s sample data, your Salon provides a structure to support your team’s analysis of its own data science question. This learning format is especially well suited to those who have had some traditional coursework and are poised to make the leap to applying technical skills to a real-world data science question.
The Data Science Salon series is composed of weekly sessions, which can be completed at any pace. The book summarizes the activities to be conducted each week and presents questions that should be answered by the end of each session. The objective of the Salon is for your group to meet regularly to work out different aspects of your data science problem. Each session is supported by reading materials from The Art of Data Science, lecture videos from the authors, and additional reading material specific to that session.
About the Authors
Roger D. Peng is a Professor of Statistics and Data Sciences at the University of Texas, Austin. Previously, he was Professor of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. His research focuses on the development of statistical methods for addressing environmental health problems and on developing tools for doing better data analysis. He is the author of the popular book R Programming for Data Science and 10 other books on data science and statistics. He is also the co-creator of the Johns Hopkins Data Science Specialization, the Simply Statistics blog, where he writes about statistics for the public, the Not So Standard Deviations podcast with Hilary Parker, and The Effort Report podcast with Elizabeth Matsui. Roger is a Fellow of the American Statistical Association and the recipient of the Mortimer Spiegelman Award from the American Public Health Association, which honors a statistician who has made outstanding contributions to public health. He can be found on Twitter and GitHub at @rdpeng.
Elizabeth Matsui is a Professor of Population Health and Pediatrics at Dell Medical School at UT Austin and an Adjunct Professor of Pediatrics at Johns Hopkins University. She is also a practicing pediatric allergist/immunologist and epidemiologist, and she directs a research program focused on environmental exposures and lung health. Elizabeth can be found on Twitter at @elizabethmatsui.
Corinne Keet, MD, PhD, is an Associate Professor of Pediatrics at the Johns Hopkins School of Medicine. She is an epidemiologist specializing in the causes and treatment of allergic diseases. She is a cofounder of Skybrude Consulting, LLC, a data science consulting firm.