James Ball is a data journalist working for the Guardian investigations team, which he joined after working for Wikileaks and the Bureau of Investigative Journalism. He is the Washington Post Laurence Stern Fellow for 2012 and a visiting lecturer at City University London for the Interactive Journalism course. I interviewed him on his data journalism career, the industry and tools of the trade…
How did you become involved with Wikileaks and the Guardian?
I suppose one of them led to the other. I left City and did a fairly standard first job but worked my way up through the newspapers, which then enabled me to get a job at the Bureau of Investigative Journalism when it opened.
Gavin MacFadyen, Director at the Bureau, was tied to Wikileaks and knew Julian Assange. So after the Guardian did the Afghan war logs, the Bureau managed to talk to Julian about doing some TV on the Iraq ones. I came along to that meeting because I had data skills, basically; I was almost like a nerd to translate between the Bureau and the rest of the people, which is obviously a good place to be.
So we had a very late dinner on Saturday night and at about one o’clock in the morning Julian gives me the USB stick with all of the cables on it – 400,000… You have visions of dropping it down the grate or something! Then, as I’m going to leave, he tells me not to go home, because he didn’t want the two addresses connected, so I was trying to think about what I could do at one o’clock on a Saturday night rather than go home. You sort of think: ‘I can’t really hit a club’. Can you imagine: ‘I got a bit drunk, lost all the cables… Only 400,000 classified military documents!’ I eventually ended up crashing on the floor of a friend’s house.
With Wikileaks, we went through all of the cables and I obviously got to know Julian a bit more during that, and worked out what was coming next. Essentially I got a message on my computer at about two o’clock in the morning informing me that if I wanted to work on the next release of cables it would have to be for Wikileaks.
David Leigh at the Guardian told me not to do it because I’d never work in mainstream media again! But I did it anyway, and I said I’d only do it for a few months. As I was coming up towards leaving, I had a couple of job offers as a result of where I’d been, partly because I’d met a lot of senior editors and so on.
The Guardian was one of the job offers, and for a lot of people, when they get into journalism, the Guardian is the paper they want to work for. That was true for me so I ended up working there.
What’s your dream dataset from any government or organisation?
Having done the embassy cables and so on, obviously I would love the CIA cables or MI6 and MI5. I’m sure there would be a very limited amount you could publish, but in a dream world it would be great, wouldn’t it? It would get you in trouble but it would be so much fun to have access to.
In day to day life I’d like to see a lot more of the actual advice to ministers, and the inside of how decisions are made. We’ve got Freedom of Information but if you try and get anything about a briefing to a minister or how policy is actually decided, you’re never successful, and I’d love to see more of that information. It might not be data, but it’s just opening information up.
Do you think the data comes before the story or the story comes before the data?
Personally, I’m more story than data and I really think that’s how it goes – if the data doesn’t back your story you drop the story. It’s a lot better to have a question in mind and then look for it, so I always go that way around. If you’re trying to regularly explain information, such as GDP releases, the economy or public spending, you’re better off following the data and seeing what comes of it.
But my job is more to dig up news stories, and come at things that way. I think it’s always better to have the story in mind because otherwise you could chase everything at once, as there’s so many exciting bits of data or bits of information. Showing what data’s going on there, and then using it to tell some stories is what I find most interesting.
What’s been your most difficult project to date?
Everything is difficult in different ways. You end up learning a lot whenever anything goes catastrophically wrong, which a lot of the time it does! Logistically, the hardest thing had to be Reading the Riots, because we had sets of very complicated data, we were mindful that we had promised people anonymity, and so we had to set up very careful systems which ensured we knew the material we were being sent was genuine and wasn’t being faked.
We had to sort information from so many sources while running a three-month data-driven project, with some material from field interviews, some from court records, Twitter feeds and so on. Keeping all these things cogent and talking to each other, and then presenting it in such a way that it wasn’t going to be a 292-page academic report that no one would read, was difficult.
We did win a few awards for it in the end. The Riot Rumours interactive was something I thought was really fun, that was the petri-dish type interactive. It was quite light-hearted on one level but it did draw people into the more serious journalism. The whole project won Innovation of the Year at the British Journalism Awards, so thankfully our hard work paid off.
Which data tools do you recommend or use the most?
If I could only ever pick one tool it would always be Microsoft Excel, because you’ve got to start at the most basic level, but it does so much. It’s useless for presentation, but it’s incredibly rare, no matter how big or small the story is, for it not to be heavily involved. It’s the right mix of not freaking new people out when they’re shown things on it, while at the same time having the power to do so many things.
Visualisation tools like Google Fusion Tables and Datawrapper are all really useful and have come a long way, but what I really miss are IBM’s tools. They used to have this brilliant tool for tree maps – the square diagrams which are really simple, useful, nice diagrams. Having to go back and talk to designers to do similar maps is a real shame because it’s so backwards, but I’m sure someone else will come up with something similar. It is getting easier and easier to do lots more with the tools out there.
Which other data journalism work do you admire?
There’s absolutely masses of really good stuff. On the more investigative side, Reuters did a great guide to inequality in America, by a guy called Himanshu Ojha. It was really thorough and joined up the data with a lot of on-the-ground reporting.
The BBC have a wonderful way of doing really quick, grabby interactives, like the Seven Billion People one they did, which was really clever. They basically thought: ‘Why don’t we look at population over history and let people type in their date of birth, see that they are the “xth” person born in the world, and share it?’ It was popular for weeks.
Do you think the Guardian and BBC are data rivals?
Not particularly; the BBC are very good at fun and explanatory stuff but they stay very firmly on the record. They ‘react’ to news whereas I think we use it to be proactive and create the news, and generate stories and get ahead. I think we do different things but I’ve got a huge respect for what they’re doing. They’re brilliant at explanatory stuff, though, and I think we can get better on that. I think we’re quite good at graphics driving a story but their background and explanations are good, neat communication.
What three pieces of advice would you give to aspiring data journalists?
A lot of it comes down to persistence, and people accepting they’re not going to have a simple career path or a simple career route such as going straight into the papers or the TV. The publications that you like may not be around in five years’ time or you might not walk into the national job you want.
However, the more you get used to the idea of meeting lots of people and creating chances for yourself, even if they don’t seem to come to anything at the time, the further you’ll go. Someone that you may have met two years ago and impressed might suddenly call you up and need something. I went virtually overnight, through a phone call, from being in trade journalism to doing investigative stuff on TV through the Bureau. It’s not necessarily a smooth, normal career path, so you’ve got to be persistent. Get out there, meet people, stick with it.
The second thing is knowing that the eye for the stories is as important as anything, and is as hard to teach as it ever was, but don’t let the new techniques distract you from that. If you can come up with good ideas that other people haven’t, you’ll get commissioned and get people’s interest.
Finally, you have got to have skills that other people in the newsroom don’t. At the moment that’s probably still data and social. The bar is getting higher on each – five years ago if you could use an Excel sheet you were a rarity. Now, being able to use it smartly, being able to think about it, and being smart about how you use and curate social is what you have to do to stand out. Tweeting certain things years ago made stars of several people but that won’t happen again.
Maybe the next thing in data is proper use of news content, but either way the more you explore new areas, the more spaces you’re going to find in newsrooms to do good stuff. My answers are almost contradictory between two and three but that’s how the industry is at the moment!
James’ book, The Infographic History of the World, is available now.