Data or It Didn’t Happen

There’s a story in the news this week about the requested retraction of a study on changing people’s minds on same sex marriage. I always find it interesting when retraction stories are picked up by major news outlets, especially when the article’s data (or lack thereof) is central to the reasons for the retraction.

The likely retraction (currently an expression of concern) in question concerns a study published in Science last year looking at the effect of canvassing on changing people’s minds. Study participants took pre- and post-canvassing online surveys to judge the effect of canvassing on changing opinions. While the canvassing data appears to be real, it looks like the study’s first author, Michael LaCour, made up data for the online surveys.

The fact of the faked data is remarkable enough, but what particularly interests me is how it was discovered. Two graduate students at UC-Berkeley, David Broockman and Joshua Kalla, were interested in extending the study but had trouble reproducing the original study’s high response rate. Upon contacting the agency who supposedly conducted the surveys, they were told that the agency did not actually run or have knowledge of the pre- and post-tests. Evidence of misconduct mounted when Broockman and Kalla were able to access the original data from another researcher who posted it in compliance with a journal’s open data policy. They found anomalies once they started digging into the data.

In my work, I talk a lot about the Reinhart and Rogoff debacle from two years ago where a researcher gaining access to the article’s data led to the fall of one of the central papers supporting economic austerity practices. We’re seeing a similar result here with the LaCour study. But in this case, problems arose due to a common practice in research: using someone else’s study as a starting point for your own study. Building from previous work is a central part of research and bad studies have problematic downstream effects. Unfortunately, such studies aren’t easy to spot without digging into the data, which often isn’t available.

There’s an expression that goes “pictures or it didn’t happen,” suggesting that an event didn’t actually take place unless there is photographic proof. I think this expression needs to be coopted for research to be “data or it didn’t happen.”  Unless you can show me the data, how do I know that you actually did the research and did it correctly?

I’m not saying that all research is bad, just that we need regular access to data if we’re going to be able to do research well. We can’t build a house on a shaky foundation and without examining the foundation (data) in more detail, how will we find the problems or build the house well?

So next time you publish an article, share the data that support that article. Because remember, data or it didn’t happen.

Creative Commons License
This work is licensed under a Creative Commons Attribution 3.0 Unported
This entry was posted in openData, researchMisconduct. Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>