July is Disability Pride Month. I’ve been learning more about disability during the past year and this has naturally led me to thinking about accessibility of research data. Last year, my friend Abigail co-wrote a really nice call-to-action about making research data more accessible because this is an area where we have to catch up and do better.
As spreadsheets are one of the most common types of shared scientific data, I decided to start my research there. There’s some great content from the U.S. government’s Section 508 website on creating accessible Excel files. Similarly, the Web Accessibility Initiative from W3C has a great tutorial on accessible HTML tables. For research data, I also really like the Data Curation Network’s Accessibility Data Primer as a starting point, though spreadsheets are only a small part of the document.
The hard part about shared data, though, is that such data is meant to be maximally reusable. For spreadsheets, this means thinking about the accessibility of the most reusable tabular file format: the CSV file. In 2017, Karl Broman and Kara Woo published my favorite article on making spreadsheets reusable, but it doesn’t really address accessibility issues. Adding to this problem is the fact that strategies for reusability and accessibility can sometimes be in conflict.
After a fair amount of research, I’ve drafted a checklist for making accessible and reusable CSV files for data sharing. It balances recommendations from the Broman and Woo article with accessibility guidance from several different sources. For example, I based recommendations for variable naming on guidance for hashtags that are screen-reader friendly.
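To make the variable naming idea concrete, here is a minimal Python sketch. It assumes the hashtag guidance in question is the common advice to capitalize the first letter of each word (so a screen reader can announce the words separately); the checklist itself may word this differently. The function names `to_capitalized_words` and `rename_header` are hypothetical and made up for this illustration.

```python
import re


def to_capitalized_words(name: str) -> str:
    """Join the words in a column name, capitalizing the first letter of each.

    Assumption: this mirrors screen-reader-friendly hashtag advice
    (e.g. #DisabilityPrideMonth); it is not taken from the checklist.
    """
    # Split on anything that is not a letter or digit (spaces, underscores, hyphens).
    words = re.split(r"[^A-Za-z0-9]+", name.strip())
    # Uppercase only the first character so existing capitals are preserved.
    return "".join(word[:1].upper() + word[1:] for word in words if word)


def rename_header(header: list[str]) -> list[str]:
    """Apply the naming convention to every column name in a CSV header row."""
    return [to_capitalized_words(column) for column in header]
```

For example, `rename_header(["sample id", "body_mass_g"])` would give `["SampleId", "BodyMassG"]`. The names stay free of spaces, which keeps them easy to reuse in analysis code, while the capital letters mark the word boundaries.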
The checklist is available as exercise 6.4 in edition 1.1 of The Research Data Management Workbook. (It’s currently only in the online edition of the Workbook and eventually I’ll spin off edition 2.0 of the Workbook and give it a citable DOI.)
Please note: I am not disabled. I am a neurodivergent person with a couple of chronic conditions and am not an expert on disability. I am sharing this to request feedback so I can make updates to the checklist.
The goal is to create something that a scientist with zero background in accessibility can use to make their data more accessible and reusable. I don’t know if the checklist will ever be perfect, but it should still help make data better.
Let me know what you think of the checklist!