Opinion: Data journalism is about more than finding a shocking figure

Ampp3d data journalism

On Tuesday, the Guardian’s Teacher Network hosted an online debate about how schools use data.

From the start, teachers expressed concerns that too much focus on data is getting in the way of teaching.

On the panel of experts was Simon Warburton, deputy headteacher at Hitchin Boys’ School in Hertfordshire. He wrote: “We are awash with data but we don’t always see the interrogation of it that can lead to effective intervention and support.”

Others, such as Rachael Lizzie Harper, agreed. She wrote: “Data is of course important in telling us what is going on but it doesn’t say how or why.”

In other words, data by itself can’t tell the full story.

The discussion was interesting because while it was happening, some similar conversations were taking place in a pub at King’s Cross Station.

A few of the Interhacktives had gone to the journalism.co.uk social to mingle. Some questions were raised over whether the type of data journalism that places a heavy focus on charts and figures, such as that produced by Ampp3d, can sometimes miss the mark in terms of telling the full story.

Of course, Ampp3d’s mantra is to explore key facts and figures from the day’s news agenda, so telling the full story isn’t necessarily its goal. But thinking about the full picture is crucial.

A shocking statistic can often seem like it must be the story, but without finding out the reasons behind it or seeing what effects it might be having on people, it can be meaningless or even misleading.

Missing the point … data journalism from the Daily Mail

As an example, take the Daily Mail’s September announcement that we’re now facing “Global Cooling”. It suggested that because the Arctic ice cap had grown by 29% in a year, global warming predictions were wrong.

But environmental journalist Tom Yulsman wrote a detailed blog post demonstrating how focusing on one big figure at the expense of context completely changes the story. The article had not taken into account the long-term trends, which show year-to-year variation but still an overall decline in sea ice extent.

What we can all take away from this is that we should never rely on the key figure alone to tell a story. We should still be speaking to the right people and providing whatever context is needed.

Data is a powerful tool in a journalist’s arsenal, but it must never be the only one.

Andrew Hill interview: ‘data journalism can make the most honest, impactful news’

Andrew Hill is a data scientist at mapping software company CartoDB.

What is CartoDB and whom is it for?

We built CartoDB to help democratize map-making online. It is a tool for anyone who wants to create a map. We think everyone has the ability to create interesting visualizations and tell important stories with data. We created CartoDB to make it easier, faster, and less expensive.

What are the broad goals of CartoDB?

First is to allow anyone to make beautiful maps from their data. Second is to help people tell stories from data that wasn’t possible before. Finally, it’s our goal to push the boundaries of mapping online through innovative technology and beautiful design. CartoDB is being built to enable the future of maps online.

Why should the world be excited about data mapping?

There are a lot of reasons to be excited about data journalism. I am by no means an expert, but I’d say I’m more a passionate fan. If you start at the outside and work inwards, I think the field is doing a lot to increase data literacy and public perception of what data is and what it can tell us about the world. It has transformed the way that we are able to consume what used to be a very complex topic, by giving us approachable and often beautiful insights into the data that exist. I think that when done correctly, data journalism can be some of the most honest and impactful pieces of news.

Maps are a small piece of that but I often make the argument that they are one of the best tools for telling stories with data. I would argue that, on average, the public is more prepared to interact with a map than they are to interact with other common forms of data visualization.

What is the best thing about CartoDB?

That someone who has never mapped data before can spend five minutes with the tool and already be tweeting me really interesting maps they created. As a bonus, they are excited by that and the power it gives them.

CartoDB Map

Are there any things you wish CartoDB could do that it currently doesn’t?

I’m anxiously awaiting our public map gallery. Right now, so many people are making great maps online, but we don’t have any automated way to show them in a public gallery. When that happens, I’ll probably have to spend 2 hours a day just looking at them all.

What is the most challenging aspect of creating a data mapping service?

In the early days I wrote a lot more code for CartoDB but I do it a lot less now that we have much better software engineers on the team. So for me, the most challenging part is connecting with all the people that could benefit from the service. Every time I demo CartoDB someone is completely shocked that they didn’t know about the service before. I need to reach those people more efficiently.

What does the job of a data scientist entail?

A lot. I do everything from developer advocacy and outreach to exploring new technologies for CartoDB. One of my favorite things I get to do is play with data and try to find stories to tell with visualizations. Storytelling with maps has been my big focus lately, especially trying to communicate how that can be done with all different types of data and working with the team to make that easier with our platform.

What do you think makes a good data map?

It totally depends on your goals and audience. First, it always takes design. Whether that comes early or late in your process of thinking about a map, it has to be in there. Second, it takes some understanding of your data – without understanding the data it is going to be hard to figure out if your map is telling the right story. That is why the filter wizard and the full SQL access in CartoDB help me so much: they allow me to really dig into datasets and see the result right on my map.

Are there any areas of CartoDB that you feel are under-utilised by its users?

I’m the biggest SQL fan there is but we try to reduce how much our users need to use it. In fact, you don’t even need to know what SQL is to start making beautiful maps. But all my favorite maps that I’ve created have come from a little to a lot of SQL use. I’m going to be teaching spatial SQL in the Map Academy (cartodb.com/academy) and can’t wait to see people start using it more for their maps.

How to make a choropleth map with Google Fusion Tables

Choropleth map of cycling in Hackney wards

Data mapping is becoming an increasingly popular way of visualising information. A choropleth map, in which areas are shaded according to a value, is quite straightforward to make, and in this post I’m going to show you how to make one using Google Fusion Tables.

A fortnight ago I made a choropleth map to go alongside this story for the Hackney Post.

Choropleth map of cycling in Hackney wards

The map was designed to show the increase over the last decade in the number of Hackney residents cycling to work. I used a map because it is more visual and tells the story quickly. Stating that “Dalston had a rise of 10.9%, whereas Hoxton’s increase was only 5.3%, unlike Clissold…” would undeniably bore the reader.

I’m briefly going to show you how to create your own choropleth map.

The data that I used is from the Greater London Authority’s website and was published in October.

First of all I created a Google Docs spreadsheet showing the percentage increase of every Hackney ward:

Percentage point increase for cycling in Hackney by ward
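As a rough illustration of what that spreadsheet holds, here is a minimal Python sketch that writes the same kind of two-column table as CSV text, a format most mapping tools can import. It uses only the two figures quoted above; the column name "increase" is my own assumption, while "ward" is the field name used later for the merge.

```python
import csv
import io

# Only the two figures quoted in this post are used here; a real table
# would have one row per Hackney ward.
rows = [
    {"ward": "Dalston", "increase": 10.9},
    {"ward": "Hoxton", "increase": 5.3},
]


def to_csv(rows):
    """Serialise ward rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["ward", "increase"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = to_csv(rows)
print(csv_text)
```

Keeping the ward names spelled exactly as they appear in the boundary data matters, because the two tables are matched on that column.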

Once I had manually entered these data, I put them on to a map using Google Fusion Tables. Once you load up Fusion Tables, all you need to do is import your Google Docs spreadsheet:

Importing a spreadsheet to Google Fusion Tables

Now you need boundary data so that the cycling figures can be drawn on a map. What you’re looking for here is a “KMZ Shapefile”, a .kmz file holding the outline of each area. Thankfully, Ændrew Rininsland (former City Uni hack, now News Developer at The Times) uploaded a shapefile of all Hackney ward map data. If you download any shapefile of the area that you’re trying to map, that should work fine. Next, upload the .kmz file to your Google Drive.

Now, go back to the Google Fusion Table and click:

Merging in Google Fusion Tables

Select the correct .kmz file and merge it based on the field name “ward”, as this will be the same in each table.

This will show you something like this:

merged map in Google Fusion Tables
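The merge on the shared “ward” field is essentially a join between two tables. The Python sketch below shows the idea outside Fusion Tables; it is not how Fusion Tables works internally. The function name, the "increase" and "geometry" column names and the placeholder outline strings are all assumptions made for illustration.

```python
def merge_on_ward(values, boundaries):
    """Inner-join two lists of row dicts on their shared "ward" field.

    Returns the merged rows plus the ward names that found no boundary,
    so spelling mismatches are easy to spot.
    """
    boundary_by_ward = {row["ward"]: row for row in boundaries}
    merged = []
    unmatched = []
    for row in values:
        match = boundary_by_ward.get(row["ward"])
        if match is None:
            unmatched.append(row["ward"])
        else:
            merged.append({**match, **row})
    return merged, unmatched


# The two figures quoted in this post, with placeholder outlines
# standing in for the real boundary data.
values = [
    {"ward": "Dalston", "increase": 10.9},
    {"ward": "Hoxton", "increase": 5.3},
]
boundaries = [
    {"ward": "Dalston", "geometry": "<Dalston outline>"},
    {"ward": "Hoxton", "geometry": "<Hoxton outline>"},
]

merged, unmatched = merge_on_ward(values, boundaries)
print(merged)
print(unmatched)
```

The match is exact, so a ward written as "dalston" in one table and "Dalston" in the other would end up in the unmatched list rather than on the map.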

Now, to shade each ward in a different colour depending on intensity, go to feature map > change feature styles > buckets:

Colour shading on Google Fusion Tables

You will end up with something like this:

Unmarked choropleth map of Hackney Wards
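For anyone curious about what the bucket shading does, here is a minimal Python sketch of equal-width bucketing. It is an illustration under my own assumptions, not Fusion Tables’ actual implementation: the 0 to 12 range, the four buckets and the colour values are all made up for the example.

```python
# Four shades from light to dark; the hex values are illustrative only.
COLOURS = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def bucket_index(value, low, high, n_buckets):
    """Return the 0-based equal-width bucket that value falls into."""
    if not low <= value <= high:
        raise ValueError("value outside the bucket range")
    width = (high - low) / n_buckets
    # The top edge of the range belongs to the last bucket.
    return min(int((value - low) / width), n_buckets - 1)


def colour_for(value, low=0.0, high=12.0):
    """Pick the shade for a value, assuming a 0 to 12 range."""
    return COLOURS[bucket_index(value, low, high, len(COLOURS))]


print(colour_for(10.9))  # Dalston falls in the darkest bucket
print(colour_for(5.3))   # Hoxton falls in the second-lightest bucket
```

With fewer buckets the map is easier to read but hides smaller differences between wards, so it is worth trying a few settings.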

Then all I did was add the ward names using Adobe Photoshop.

Once you get used to using Google Fusion Tables, they are a fast and easy way to visualise your data.

Martin Belam launches Ampp3d, the new ‘socially shareable’ data site

The eagerly awaited data project from Martin Belam and Trinity Mirror, formerly known as ‘Mysterious Project Y’, has officially been named… Ampp3d. Yesterday, the Twitter account was up and running, as were a Google+ page and a Facebook page. Martin Belam, the brains behind popular BuzzFeed-y site UsVsThem, took to Twitter to make the announcement:

Screenshot of Martin Belam’s announcement on Twitter


What do you think about the announcement? Could data-driven journalism possibly be as shareable as the content on UsVsThem? Let us know below.