PREFACE
- Why I Love Being a Data Analyst
- Discovering Data Analysis
- A Career Built on Curiosity
- When Data Became Personal
- The Power of Real-World Data
WHY I WROTE THIS BOOK
WHO SHOULD READ THIS BOOK
HOW TO USE THIS BOOK
TOOLS CHANGE. HABITS ARE FOREVER.
THE ELEPHANT IN THE ROOM
- What About Artificial Intelligence?
- Responsible Use of AI
- How I Use AI in Practice
- The Future of the Data Analyst
THE TEN HABITS OF GREAT DATA ANALYSTS
CHAPTER 1
- Understand the Science and the People
- Learning the Disease
- Learning From the Literature
- Case Study: When a Few Minutes of Reading Saves the Project
- Listening to People with Lived Experience
- Case Study: Listening to Lived Experience Averts a Catastrophe
- The Importance of Team Science: The Analyst as a Collaborator
- Curiosity Is the Foundation
CHAPTER 2
- Start With a Clear Statistical Analysis Plan
- What Is a Statistical Analysis Plan?
- Translating Questions into Analysis
- Using Table Shells
- Operationalizing Variables
- Clarifying the Deliverable
- Understanding Deadlines and Timelines
- Including Exploratory Analysis and Data Cleaning in the Data Analysis Plan
- Plans Can Change
- Case Study: The Registry as a Cornucopia Gone Wrong
CHAPTER 3
- Explore the Data Before Modeling
- Why Exploration Is Important
- Start With the Structure
- Examine Missing Data
- Identify Outliers and Inconsistencies
- Use Visual Exploration
- Tools That Support Exploration
- Becoming the Data Expert
- Curiosity Leads to Insight
- Case Study: What Exploratory Data Analysis Reveals
CHAPTER 4
- Prioritize Data Quality
- What Is Data Quality?
- Common Data Quality Problems
- Data Quality as an Analytical Skill
- Case Study: When Missing Data Become Part of the Research Question
- The Missing Data
- Rethinking the Question
- What the Missing Data Revealed
- A Different Kind of Contribution
- Lessons for Data Analysts
- Communicating Data Quality Issues
- Improving Data Quality Over Time
- Data Quality as a Mindset
CHAPTER 5
- Protect Data Security and Privacy
- Understanding Sensitive Data
- Regulatory Frameworks
- Secure Data Access
- Minimizing Data Exposure
- Responsible Use of Analytical Tools
- Transparency and Accountability
- Ethical Responsibility
- Case Study: Data Security Lessons from the Field
- When “De-Identified” Data Are Not Truly De-Identified
- Sharing Data Without Proper Review
- Data Use Agreements and Secondary Data
- Building a Relationship With the IRB
- Protecting Sensitive Information in Documentation
- Case Study: Proprietary Data Models
- Protecting Sensitive Information in Code
- Data Security Is a Shared Responsibility
- Security and Scientific Integrity
CHAPTER 6
- Communicate Clearly and Document Analytical Decisions
- Why Communication Is Important
- Communicating Throughout the Project
- A Practical Communication Tool
- Asynchronous Communication
- The Importance of Documentation
- Data Dictionaries
- Case Study: The Cost of Missing Documentation
- Building Better Documentation
- A More Structured Approach
- Lessons for Data Analysts
- Communication and Documentation Build Collaboration
CHAPTER 7
- Systematically Tackle Complex Data
- Thinking in Domains
- Understanding Data Relationships
- Identifying Unique Identifiers
- Organizing Variables into Domains
- Derived Variables Across Domains
- Identifying Stratifying Variables
- Working Efficiently Across Similar Variables
- Protecting the Original Data
- Understanding the Story Behind the Data
- Helping Investigators Understand Their Data
- Case Study: A Registry with Thousands of Variables
- Looking for Clues in the Data
- Reconstructing the Study Design
- Lessons for Data Analysts
CHAPTER 8
- Respect Qualitative Methods and Unstructured Data
- Qualitative Methods Are Real Methods
- Open-Ended Responses Often Provide the Missing Context
- Coding, Recoding, and Categorization
- Unstructured Data Are a Major Part of Modern Data Science
- Clinical Notes Are Especially Important
- Natural Language Processing and Computational Approaches
- The Role of Manual Review
- Case Study: Finding a Needle in a Haystack
- Large Language Models (LLMs) and New Possibilities
- Use the Right Tool for the Job
- Case Study: What Open-Ended Responses Revealed
- Lessons for Data Analysts
CHAPTER 9
- Know When to Do It the Old-Fashioned Way
- The Most Tedious Solution Is Sometimes the Most Efficient Solution
- Automation Has Costs Too
- This Is Not an Argument Against Technology
- Case Study: The Handwritten Logs
- Why the Manual Approach Won
- Other Old-Fashioned Tools That Still Work Just Fine
- Some Heroes Don’t Wear Capes
- Lessons for Data Analysts
CHAPTER 10
- Keep Learning and Contributing to the Community
- Learning Is Part of the Job
- Learning From Real Projects
- Learning From the Community
- Sharing What You Learn
- From Observation to Discovery
- Case Study: It’s Never Too Late
- Discovering a New Community
- Learning in Small Pieces
- Becoming Part of the Community
- The Power of Curiosity
- Lessons for Data Analysts
KEY TAKEAWAYS
- The Ten Habits
- What Data Analysis Really Is
- The Responsibility
- An Often Twisty Career Path
- The Only Tools You Need
REFERENCES
DATA ANALYST’S TOOLKIT
- Customizable Tools That Build Confidence