Avoid these mistakes and save your data visualization in just 10 minutes

The art on Cairo’s opening slide

The New York Times’ most viewed story of 2013 wasn’t actually a story. It was an interactive that predicted where you were born based on the words you use and the way you speak.

As visualization becomes mainstream, it is extremely important to get data right. That was the lesson of University of Miami professor Alberto Cairo’s Saturday morning talk, “Shit I did a decade ago but shouldn’t have.”

So what did Cairo do that he shouldn’t have? Here’s what he told the standing-room-only crowd at the SND conference in San Francisco:

What he did: Choosing forms to represent data based just on how cool they looked
What he should have known: Visualization has certain (flexible) rules based on how the eyes and the brain work

To illustrate this mistake, Cairo used an example of two pie charts showing students’ music preferences in 1994 and 2014. He pointed out that this visualization tells you almost nothing because you can’t easily compare the two years. He replaced the pie charts with a simple line chart.


The data, before and after

As Cairo pointed out, this lesson does not mean that graphics shouldn’t look good, but it is most important that they are useful.

What he did: Receiving data from a source and rushing to design an infographic
What he should have known: Always read the documentation carefully, analyze the data, and consult with other sources

For this mistake, Cairo looked at a Vox graphic comparing medical costs in different countries.

From Vox

Looks pretty straightforward. But when Cairo dove into the data, he found it was inconsistent. The data from the U.S. came from more than 100 million claims, while data from other countries came from one private company in each country. Additionally, the dollar amounts are not directly comparable. For example, $1,000 in the U.S. is different from $1,000 in Spain, where the median salary is about half of what it is in the U.S.
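One way to make that last point concrete is to express a price as a share of the local median salary instead of comparing raw dollar amounts. The sketch below is only an illustration: the function name and the salary figures are hypothetical, chosen so that the Spanish median is half the U.S. one as described above. They are not the figures from the Vox graphic.

```python
def share_of_median_salary(price, median_salary):
    """Return a price as a fraction of the local median annual salary."""
    return price / median_salary

# Hypothetical figures for illustration only.
price = 1_000                # same nominal price in both countries
median_salary_us = 40_000    # assumed U.S. median salary
median_salary_spain = 20_000 # assumed to be half the U.S. figure

us_share = share_of_median_salary(price, median_salary_us)
spain_share = share_of_median_salary(price, median_salary_spain)

print(f"U.S.:  {us_share:.1%} of median salary")
print(f"Spain: {spain_share:.1%} of median salary")
```

With these assumed numbers, the same $1,000 bill takes up twice as large a share of a typical salary in Spain as it does in the U.S.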

What he did: Choosing the data that better fitted a preconceived headline
What he should have known: Your data should lead to your headline, not the other way around

Here, Cairo broke down a simple bar chart that claimed less regulation led to more cable industry investment. When he converted the bar chart to a line chart, which better illustrated the data, he found this was not actually the case at all.


The lesson here is that your data should lead your headline, not the other way around. “You can begin with a preconceived narrative, but you need to be open to changing,” he said. “Interrogate the data. This is the type of thinking graphic designers and visual journalists should apply more when creating a graphic.”

What he did: Thinking that readers dislike complexity and only want to see the simplest graphics
What he should have known: We need to respect our audience’s intelligence

The example here was from a Spanish newspaper that compared the salaries of the highest-paid soccer players of all time. However, the paper did not adjust for inflation because it claimed its audience would not understand the concept. This is the wrong approach, he said. “If you think your audience won’t know what adjusting for inflation is, don’t not adjust for inflation, adjust for inflation and explain what it means.”

Don’t simplify your graphics and underestimate your audience’s intelligence. “If the story is complex, we need to show that complexity to tell the story,” Cairo said.
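For readers who haven’t done it before, adjusting for inflation is a single multiplication by the ratio of two price-index values. The sketch below assumes you have a consumer price index reading for both years; the function name, the salary and the index values are all hypothetical, not data from the newspaper’s graphic.

```python
def adjust_for_inflation(nominal_amount, index_then, index_now):
    """Convert a past amount into today's money using a price-index ratio."""
    return nominal_amount * index_now / index_then

# Hypothetical values for illustration only.
salary_then = 5_000_000  # nominal salary in the earlier year
index_then = 50.0        # assumed price index in the earlier year
index_now = 100.0        # assumed price index today

salary_in_todays_money = adjust_for_inflation(salary_then, index_then, index_now)
print(salary_in_todays_money)  # 10000000.0
```

Because prices doubled in this made-up example, the older salary is worth twice its nominal value in today’s money, which is exactly the kind of difference an unadjusted ranking hides.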

What he did: Assuming that data is precise
What he should have known: Most of the time, data is noisy and uncertain

This section emphasized the importance of the margin of error. A newspaper in Spain claimed that, for the first time ever, a poll showed more people favored Catalonia remaining with Spain than becoming independent. However, the lead was only 0.8%, well inside the poll’s 2.95% margin of error, so the data could not support the claim. The real story is that the poll was inconclusive.
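The check itself takes one line. The sketch below reads the two figures from the talk as percentage points, which is an assumption, and adds the standard approximation for a 95% margin of error so you can see where such a number comes from. The function names and the sample size are hypothetical; the article does not report how many people were polled.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points, for one poll share."""
    return 100 * z * math.sqrt(share * (1 - share) / sample_size)

def lead_is_conclusive(lead, margin):
    """A lead smaller than the margin of error cannot be told apart from a tie."""
    return abs(lead) > margin

# Figures reported in the talk, read here as percentage points.
print(lead_is_conclusive(0.8, 2.95))  # False

# Hypothetical sample size, only to show where a margin comes from.
print(round(margin_of_error(1000), 1))  # 3.1
```

This simple comparison is, if anything, generous to the headline: the reported margin applies to each share on its own, and the uncertainty on the gap between two shares is typically larger.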

What he did: Assuming that two things were related just because they varied together
What he should have known: Correlation doesn’t imply causation

Cairo didn’t include this xkcd comic, but he could have.

Here, Cairo used the “Sports Illustrated jinx” to illustrate regression to the mean. The jinx says that players’ performance declines after they are featured on the cover of the magazine. The reason for this, he explained, is that players are featured at the peak of their performance, then regress to their more average performance after the cover runs.
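You can see regression to the mean in a few lines of simulation. The sketch below is not from Cairo’s talk: it assumes each player’s season is a fixed skill level plus random luck, picks the top performers of one season as stand-ins for cover athletes, and compares their next season. All names and parameters are made up for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

NUM_PLAYERS = 10_000
NUM_COVER_ATHLETES = 100

# Assumed model: performance = stable skill + season-specific luck.
skill = [random.gauss(0, 1) for _ in range(NUM_PLAYERS)]
season_one = [s + random.gauss(0, 1) for s in skill]
season_two = [s + random.gauss(0, 1) for s in skill]

# "Cover athletes" are the best performers of season one.
ranked = sorted(range(NUM_PLAYERS), key=lambda i: season_one[i], reverse=True)
cover = ranked[:NUM_COVER_ATHLETES]

avg_cover_season = sum(season_one[i] for i in cover) / NUM_COVER_ATHLETES
avg_next_season = sum(season_two[i] for i in cover) / NUM_COVER_ATHLETES

print(f"Cover season average: {avg_cover_season:.2f}")
print(f"Next season average:  {avg_next_season:.2f}")
```

No jinx is coded anywhere, yet the selected group’s average drops the following season while staying above the league average of zero, because part of any peak season is luck that does not repeat.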

Arguably the most important example Cairo used was a study published in the New England Journal of Medicine showing a correlation between chocolate consumption and Nobel Prize winners. This led to headlines such as “Eating chocolate may help you win the Nobel Prize.” What the Journal of Nutrition later found was that many factors related to wealth correlated with Nobel Prize winners, including wine consumption and IKEA stores. The common thread was the nation’s gross domestic product.
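A hidden common cause is easy to reproduce with simulated data. The sketch below does not use the studies’ data: it invents a “wealth” value per country and lets it drive both chocolate consumption and Nobel prizes, with no link between the two. All variable names and parameters are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

random.seed(1)  # fixed seed so the run is repeatable
NUM_COUNTRIES = 2_000  # simulated, far more than exist

# Assumed model: wealth drives both variables; they never affect each other.
wealth = [random.gauss(0, 1) for _ in range(NUM_COUNTRIES)]
chocolate = [w + random.gauss(0, 0.5) for w in wealth]
nobels = [w + random.gauss(0, 0.5) for w in wealth]

raw = pearson(chocolate, nobels)

# Remove the wealth component from each variable and correlate what is left.
chocolate_left = [c - w for c, w in zip(chocolate, wealth)]
nobels_left = [n - w for n, w in zip(nobels, wealth)]
controlled = pearson(chocolate_left, nobels_left)

print(f"Raw correlation:          {raw:.2f}")
print(f"After removing wealth:    {controlled:.2f}")
```

The raw correlation is strong even though chocolate has no effect on prizes in this model, and it all but vanishes once the shared wealth component is taken out.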



How he stopped doing shit he never should have done in the first place


So how did he stop? “Think critically about the data and apply a bit of common sense…. Deadlines are an excuse, not a problem. With just ten minutes of thinking, you can avoid all these problems.”

Download Cairo’s presentation here. Cairo’s new book The Truthful Art is now available. Buy it here.