Getting Started with Data Journalism
Writing data stories in any size newsroom
About the Book
Getting started with data journalism does not require huge resources. The basic skills can help you develop data stories in any size newsroom.
This book outlines the skills that have helped power award-winning journalism. Most of the examples are real stories that ran in a regional newspaper, showing the kind of content data journalism skills can produce.
This book will teach you how to:
- Source data for your projects
- Clean messy data
- Understand and analyse data
- Find the story in your data
- Create visualisations from maps to social interactives
Table of Contents
- 1. Introduction
- 2. Why Data Journalism
  - Data skills and traditional skills
  - Editorial planning for data
  - Data to story to resource
- 3. Getting Started
  - What is data?
    - The basics of data files and formats
    - The basics of working with spreadsheets
- 4. Sources of Data
  - Government Statistics
  - Written answers
  - Searching
  - Freedom of Information (FOI)
    - Making a request
    - FOI Tips
    - Who is covered in the England and Wales?
    - What if I do not get the information I want?
    - FOI Resources
  - Crowdsourcing
  - Scraping
    - Very simple scraping
    - Using Open Refine as a scraper
    - ScraperWiki
    - Scraping resources
- 5. Understanding what the numbers might mean
  - Who gathered the data?
  - Watch out for small numbers and rare events
  - Consider how reliable your data is
  - What are the long-term trends?
  - Don’t cherry-pick your data
  - Be careful of what the numbers mean
- 6. Cleaning Data
  - Bringing data together
  - Getting data out of PDFs
  - Tidying up messy spreadsheets
  - Data on websites
  - Lots of abbreviations
  - Different words, all meaning the same thing
- 7. Getting the story out of the data
  - Simple maths
  - Spreadsheet Basics
    - Doing Sums
    - Formatting Cells
    - Using Formulas
    - Sorting
    - Filtering
  - Correlations
  - Pivot Tables
  - Data Mashups
- 8. Telling a Story with Visualisations
  - Choosing the right visualisation
    - Tables
    - Graphs and Charts
    - Maps
    - Infographics
    - Some other things to remember
- 9. Visualising Data - Graphs and Charts
  - Google Charts
  - Tableau
    - Getting Started
    - Bar Charts
    - Line Graphs
    - Dashboards
    - Filtering
    - Tooltips
    - Annotations and Reference Lines
    - Pages
  - Live updating graphs
- 10. Visualising Data - Maps
  - Simple maps
  - Mapping with Tableau
  - Heat Maps
    - Openheatmap
    - Google Fusion Tables
  - Layered Maps
  - Real-time Maps
- 11. Visualising Data - More Complex Visualisations
  - News Apps
    - Actions
  - Social interactives
- 12. Conclusion
  - Where Next?
    - Data Analysis
    - Visualisation
    - Coding
- 13. Appendix 1 - Full case studies
  - Bookies in Wales
    - Finding data on bookies
    - Making simple maps
    - Bringing together bookies and deprivation
    - Analysing your data
    - Making a map to show the correlation
  - Children in Care
    - Gathering data
    - Tidier data
    - Making a map
  - Parking Tickets
    - Cleaning untidy data
    - Cleaning untidier data
    - Cleaning untidiest data
    - Geolocating the parking tickets
    - Visualising your data
  - Road accidents and casualties
    - Merging the data
    - Cleaning the data
    - Visualising the data
  - School Spending on Temporary Staff
    - Using pivot tables to clean data
    - Merging data for better analysis
    - Analysing the data
  - South Wales Police Helicopter
    - Scraping
    - Tidying up unstructured data
    - Mapping your data
- 14. Appendix 2 - Useful Links
  - Data journalism
  - Finding Data
  - Cleaning Data
  - Visualisation