Happy Open Access Week! As usual, I’m celebrating the Open Data portion of Open Access, but my special focus this year is on getting credit for your research data. My library held an in-person workshop on the topic earlier this week and I want to share the idea more widely on the blog because I think it’s important.
The crux of the issue is that more and more researchers are sharing their data (either because their funder/publisher requires it or because the researcher believes in open access to research materials), but not all data sharing venues are created equal. Consider the following sharing pathways[1]:
- Sharing data on personal or lab website
- Sharing data by request via email or Dropbox
- Publishing data as supplemental material for a journal article
- Depositing data into a disciplinary data repository
- Depositing data into a general research repository or institutional repository
All of these methods work to distribute your data and many comply with requirements to share. However, not all of these sharing venues will maximize the credit you receive for your data. For example, the last three sharing options will provide researchers with a stable location and a citation for the data, increasing the data’s citability. The fourth option, depositing into a disciplinary data repository, will probably maximize credit because your research peers are likely to look for your data there.
Getting credit for your work is supremely important in research and this doesn’t get any less true when it comes to data sharing. The good news is that sharing data actually increases citation counts on the corresponding article by roughly 10%, with a higher citation boost for older papers.[2] However, this finding likely holds true only if others can actually find your data. Therefore, I encourage you to think about data sharing venues with respect to maximizing your credit.
As a follow-up to getting credit for your data, I want to touch on how to actually give credit for using another researcher’s data: data citation. Data citation is very similar to article citation in that you cite the data you used in the works cited section of your article. Where data citation differs is in the citation format and the fact that you cite the data separately from the article. Let’s look a little bit at how this works.
At its most basic, a data citation should include the following information:
- Creator
- Publication Year
- Title
- Publisher
- Identifier
The format of your data citation can vary across citation styles (APA, Chicago, etc.) but at a minimum should contain these five components. If you don’t have a recommended data citation format, you can use the following:
Creator (PublicationYear): Title. Publisher. Identifier
- Example: Piwowar HA, Vision TJ (2013) Data from: Data reuse and the open data citation advantage. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.781pv
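If you keep dataset metadata in a script or spreadsheet, the fallback format above is easy to generate automatically. Here is a minimal Python sketch of that idea. The `DataCitation` class and its field names are my own hypothetical choices for illustration, not part of any citation tool or standard. It follows the template literally, including the colon after the year, which the Dryad example above leaves out.

```python
from dataclasses import dataclass


@dataclass
class DataCitation:
    """The five minimum components of a data citation (hypothetical helper)."""
    creator: str
    publication_year: int
    title: str
    publisher: str
    identifier: str

    def format(self) -> str:
        # Refuse to build a citation if any of the five components is blank.
        missing = [name for name, value in vars(self).items()
                   if not str(value).strip()]
        if missing:
            raise ValueError("Missing citation components: " + ", ".join(missing))

        # Drop trailing periods so the template does not produce "..".
        title = self.title.strip().rstrip(".")
        publisher = self.publisher.strip().rstrip(".")

        # Creator (PublicationYear): Title. Publisher. Identifier
        return (f"{self.creator.strip()} ({self.publication_year}): "
                f"{title}. {publisher}. {self.identifier.strip()}")


citation = DataCitation(
    creator="Piwowar HA, Vision TJ",
    publication_year=2013,
    title="Data from: Data reuse and the open data citation advantage",
    publisher="Dryad Digital Repository",
    identifier="http://dx.doi.org/10.5061/dryad.781pv",
)
print(citation.format())
```

The check for blank fields is the useful part: a citation without an identifier or publisher is much harder for a reader to follow back to the data. If your journal prescribes a citation style, use that style's format instead of this fallback.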
It’s often considered good practice to cite the corresponding article whenever you cite the dataset, but it’s not strictly necessary. Use your best judgement and always give credit for the content you do use.
I want to wrap up by saying that as data sharing becomes more and more the norm, I hope you start thinking about this topic with respect to maximizing credit. This means placing your data in a location where you’ll get the most credit as well as giving proper credit to others, via data citation, when using their data. Framing data sharing through the lens of credit means that we’ll do right by our data going forward and properly recognize it as an important scholarly product.
[1] Many thanks to Lisa Johnston (U Minnesota) for inspiration from her pro/con data sharing exercise
[2] Piwowar HA, Vision TJ (2013) Data reuse and the open data citation advantage. PeerJ 1: e175. http://dx.doi.org/10.7717/peerj.175