Alberto Cairo thinks you can solve almost all of your data journalism problems in two steps.
“Journalists,” he tells me, “in my opinion, tend to oversimplify matters quite a lot.
“Let’s say that you are exploring the average height of people in an area. If you only report the average, that may be wrong or right depending on how spread out the data is. If all people are more or less around that height then reporting that average is correct.
“But if you have a huge range of heights, the average is still the same but you may not be reporting how wide the spread is. Or if your distribution is bimodal, the average will still be the same, but you have a cluster of short people on one end and a cluster of tall people on the other end, that’s a feature of the data that will go unnoticed if you only report the averages.”
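Cairo's point is easy to check for yourself. The short Python sketch below uses three invented sets of heights in centimetres (the numbers are hypothetical, chosen only so that every set averages 170) and reports the mean alongside the spread and a rough text histogram. It is an illustration of the idea under those assumptions, not a method Cairo prescribes.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical heights in centimetres, invented for illustration.
# All three groups have the same average: 170.
TIGHT = [169, 170, 170, 170, 171]            # everyone close to the average
WIDE = [150, 160, 170, 180, 190]             # same average, much wider range
BIMODAL = [150, 151, 152, 188, 189, 190]     # a short cluster and a tall cluster


def summarise(heights):
    """Return the mean together with simple measures of spread."""
    return {
        "mean": mean(heights),
        "std_dev": round(pstdev(heights), 1),  # population standard deviation
        "min": min(heights),
        "max": max(heights),
    }


def text_histogram(heights, bin_width=10):
    """Count heights per bin and draw one '#' per person in each bin."""
    bins = Counter((h // bin_width) * bin_width for h in heights)
    lines = []
    for start in range(min(bins), max(bins) + bin_width, bin_width):
        lines.append(f"{start}-{start + bin_width - 1}: {'#' * bins[start]}")
    return "\n".join(lines)


if __name__ == "__main__":
    for name, data in [("tight", TIGHT), ("wide", WIDE), ("bimodal", BIMODAL)]:
        print(name, summarise(data))
        print(text_histogram(data))
        print()
```

The averages are identical, but the standard deviation and the histogram differ sharply between the three groups, and the gap in the middle of the bimodal histogram is exactly the kind of feature that disappears if only the average is reported.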
His other complaints about journalists attempting to write about data will be familiar to despairing members of data desks in newsrooms around the world. Speculative extrapolation, inferring causation from correlation, a lack of understanding of probability and uncertainty, and many other journalistic foibles all fall under Cairo’s fierce scrutiny.
“The thing is,” he insists, “all the things I’m mentioning are easily solved, so it’s not that you need to take a course on advanced correlation and regression analysis, it’s just a matter of learning Stats 101 and then read two, three books on quantitative thinking and quantitative reasoning.
“That’s how you avoid 80 per cent or 90 per cent of the problems, and the other 10, 20 per cent will be avoided if you consult with experts every time you do a story based on data, which is something we need to systematically do.”
Even most students of data journalism probably can’t say that they’ve read two or three books on quantitative thinking and quantitative reasoning, but if we’re serious about our pursuit of the truth, perhaps next time a disgraced politician publishes their memoirs we should Google ‘best books on statistics’ instead.
Though Cairo is of course prolific in data visualisation in his own right, he is clearly a teacher at heart. After two decades working in infographics and data visualisation, he could be forgiven for losing an ounce of enthusiasm, but the Knight Chair in Visual Journalism and author of two books on data visualisation still has a twinkle in his eye as we discuss his recent work.
He’s just finished teaching an online course on using data visualisation as more than just a method of communication. Instead, he is focusing on using it “to find stories in data”.
“There’s a whole branch of statistics,” he explains, “which was defined around the 60s, 70s and the 80s by a statistician called John Tukey.
“He wrote a book titled Exploratory Data Analysis. The whole field of data visualisation in computer science and statistics focuses mostly not on communication, it focuses on exploration, how to explore data, how to discover features of the data that may remain unnoticed if you don’t visualise them.”
Alberto Cairo is a member of the jury for this year’s Data Journalism Awards. Unfortunately, he won’t give me a direct road map to victory, but for students hoping to enter this year (the first year in which a student category has been included), his advice is surely invaluable.
“Steal from the best.
“This is advice I give my students every semester: we all learn by copying other people. By copying I don’t mean plagiarising, but getting inspiration. Look at work from ProPublica or the Washington Post or The New York Times, and copy their style, copy their structure, copy the way they present information.
“Don’t try to think of graphics as if they were embellishments to your story, but as analytical tools and communication tools within your story. They should never be afterthoughts when you’re developing your story. They’re an integral part of your story and an integral part of its communicative power.”
Even with his experience, Cairo says, he still borrows from the best himself. Many journalists seem addicted to credit and are unlikely ever to admit to anything short of completely original works of genius, but in data journalism collaboration is endemic.
“Nobody works inside of a cocoon,” notes Cairo. “The community of data reporters and investigative reporters is very open. I just came back from the NICAR conference, and some of my students who attended were amazed that they could approach – I don’t know – Scott Klein from ProPublica and ask him questions directly. They believe there’s some sort of hierarchy, but there’s not.”
This lack of hierarchy should lend confidence to aspiring data journalists. To slightly amend Alberto Cairo’s steps to success: all one need do is get a decent grounding in statistics, consult with experts, and join the data journalism ecosystem.