Indigenous Data and CARE

I just finished reading An Indigenous Peoples’ History of the United States, which was a great overview of the history of indigenous-US relations. Warning: the book is difficult to read at times, as it describes the long history of genocide against indigenous nations in America. That said, it’s important for non-indigenous, especially white, people to recognize this genocide and not look away from this history.

The brutal history of indigenous-US relations (as well as similar relations between other colonial powers and the indigenous people inhabiting the lands colonized) creates a long shadow that impacts many modern topics, including data management and sharing. This is especially the case when managing and sharing data pertaining to indigenous people, their cultures, and their knowledges of the world.

Historically, research about indigenous people, their cultures, and traditional knowledge has been exploitative, with a power differential between the [typically white] researchers and the indigenous people being studied. Researchers extracted knowledge from tribes with little-to-no benefit – and often harm – to the tribes themselves. With the rise of modern indigenous rights movements, tribes have been striving to correct such power imbalances and assert their sovereignty, including in the areas of research and data.

Indigenous data is too broad a topic to fully cover in a single blog post (I’ll instead refer you to the book Indigenous Data Sovereignty and Policy, which is available Open Access), but at the very least we need to discuss “CARE”. We hear a lot about FAIR data (Findable, Accessible, Interoperable, and Reusable), but the CARE Principles for Indigenous Data Governance are equally important.

The CARE principles were developed by a global alliance of indigenous groups and are made up of 12 principles in four areas:

  • Collective benefit,
  • Authority to control,
  • Responsibility,
  • Ethics.

The Principles recognize:

  • that indigenous ways of knowing are different than western scientific knowledge systems;
  • that research must be conducted in partnership with tribes, from the very beginning of a project;
  • that tribes must benefit from research, both in terms of individual development and in using data for tribal governance;
  • that tribes have the authority to control what happens to the data collected;
  • and more.

I encourage you to fully read the CARE Principles to better understand all of the included guidance. In short, research on and about indigenous people, their cultures, and their knowledge systems must be handled very differently from how research methodology is typically taught in U.S. universities. This is because such research must be collaborative with tribes, consider tribal world views, and be disseminated in ways beneficial to tribes.

While the CARE Principles do not apply to all types of research, the ideas behind the CARE Principles are taking root in many institutions that support research using indigenous data. For example, the U.S. National Institutes of Health (NIH) sometimes funds research pertaining to indigenous peoples. When the NIH Data Management and Sharing Policy went into effect in 2023, it came with a separate supplement on the Responsible Management and Sharing of American Indian/Alaska Native Participant Data, as the usual terms of the policy did not apply to this type of data. This is just one instance of how we are starting to treat indigenous data differently, and with a lot more care, than other types of research data.

If you in any way touch indigenous data, I encourage you to read the full CARE Principles and dig deeper into the ways that working with this type of data is different from how other types of data are managed and shared.

Posted in dataManagement, ownership, socialJustice

Rescheduled Data Accessibility Webinar

I unfortunately had a medical event right before November’s scheduled figshare webinar on “Making repositories and data digitally accessible.” I ended up having surgery in December and, now that I’m feeling much better, we have rescheduled the webinar for February.

The webinar is now on Tuesday, February 11, 2025 at 8am PT / 11am ET / 4pm GMT. You can register here for the webinar.

I hope you can join us at the new time.

Posted in accessibility

Upcoming Data Accessibility Webinar

My colleague Megan O’Donnell and I will be speaking at an upcoming free webinar on “Making repositories and data digitally accessible.” The webinar is on Monday, November 18, 2024, at 9am PT / 12pm ET / 5pm GMT.

I’m still travelling down the deep rabbit hole that is the topic of accessible data, but it’s a larger discussion that the whole data community needs to start having. I’m looking forward to sharing what I know about how to make shared data files more accessible.

You can register for the webinar here.

Posted in accessibility, admin

RDAP 2023 Work of the Year Award

I’m extremely pleased to share that The Research Data Management Workbook won the 2023 Work of the Year award from the Research Data Access and Preservation (RDAP) Association!

I’ve been participating in RDAP since I started working as a librarian and have been a member since 2019, when RDAP officially became a professional association. I remember when RDAP was moving from a conference to a professional association, and one of the pipe dreams was to have association awards. It took a few years but we finally got the awards up and running this year. I couldn’t be more humbled to win the very first RDAP Work of the Year award.

I also want to say congratulations to my peers, Ashley Thomas, Shannon Sheridan, and Megan O’Donnell, who shared the 2023 Volunteer of the Year award from RDAP.

Thank you so much RDAP!

Posted in admin

Writing Alt Text for a Scientific Figure

I wrote last month about making spreadsheets more accessible and reusable, and I want to continue that theme this month by discussing accessibility of scientific figures.

I don’t have capacity in this blog post to do justice to the full topic of accessibility for scientific images. If you are interested in the basics of creating accessible data visualizations, I recommend you check out Chartability by Frank Elavsky and the Do No Harm Guide: Centering Accessibility in Data Visualization report by Jonathan Schwabish, Susan J. Popkin, and Alice Feng. There are many more resources available on the internet in this area, but those two links are a good place to start.

In this post, I want to discuss a more foundational requirement for accessibility and visualizations: alt text for scientific figures. Alt text, short for “alternative text”, is a textual description of a digital image. This text is critical for blind and low vision people whose screen reader software cannot otherwise interpret images; without alt text, blind people miss all information or context provided by a figure. Alt text also helps with search engine optimization and is presented instead of an image when the image file cannot load (due to low bandwidth, etc.).

All images on the internet should have accompanying alt text. But alt text isn’t something that comes up regularly when discussing the sharing of research results. So how do you write alt text for a scientific figure?

If you don’t know anything about writing alt text, you should start with Amy Cesal’s guidance on writing alt text for data visualizations. Her alt text formula is as follows:

alt text = *Chart type* of *type of data* where *reason for including chart*. *Link to source data.*

Let’s look at an example using this figure from my most recent publication:

The chart type is “column chart” and the type of data, or y-axis data, is best summarized as “research data availability”. The point of the figure is that “research data on the internet disappears at a rate of 2.6% per year”. Using Cesal’s formula, the alt text for this figure would be:

Column chart of research data availability where research data on the internet disappears at a rate of 2.6% per year. For underlying data, see “Figure2_UnavailableByYear.csv” file at https://doi.org/10.22002/h5e81-spf62.

There are further examples in Cesal’s post to give you a better sense of how the formula works. She also advises repeating the formula for each separate panel in a multi-part scientific figure.
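If you generate many figures, the formula above can be applied programmatically. The following is only a minimal Python sketch, not something from Cesal’s post: the function names `cesal_alt_text` and `multi_panel_alt_text` and their parameters are hypothetical names chosen for illustration.

```python
def cesal_alt_text(chart_type, data_type, takeaway, source=None):
    """Fill in the formula: <chart type> of <type of data> where <reason>. <source>

    All argument names are illustrative; they map onto the four
    slots of the formula quoted earlier in this post.
    """
    chart_type = chart_type.strip()
    # Capitalize only the first letter so acronyms in the chart type survive.
    chart_type = chart_type[:1].upper() + chart_type[1:]
    # Drop a trailing period on the takeaway so the sentence ends with exactly one.
    takeaway = takeaway.strip().rstrip(".")
    text = f"{chart_type} of {data_type.strip()} where {takeaway}."
    if source:
        text += f" {source.strip()}"
    return text


def multi_panel_alt_text(panels):
    """Repeat the formula once per panel and join the results with spaces.

    `panels` is a list of dicts whose keys match cesal_alt_text's arguments.
    """
    return " ".join(cesal_alt_text(**panel) for panel in panels)


if __name__ == "__main__":
    print(
        cesal_alt_text(
            chart_type="column chart",
            data_type="research data availability",
            takeaway="research data on the internet disappears at a rate of 2.6% per year",
            source=(
                "For underlying data, see “Figure2_UnavailableByYear.csv” "
                "file at https://doi.org/10.22002/h5e81-spf62."
            ),
        )
    )
```

Run as a script, this prints the same alt text as the worked example above. A helper like this only assembles the sentence; deciding what the chart’s takeaway actually is still needs a human.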

If you want to go beyond Cesal’s basic alt text for data visualizations, chapter four of the Do No Harm Guide offers a really nice look into several models for writing full alt text descriptions for visualizations. The image description guidelines from the DIAGRAM Center also provide more detailed recommendations for specific types of images, such as chemical elements, Venn diagrams, and line graphs. Cesal’s shorthand formula is useful, but fuller descriptions are preferable when you are able to write them.

I know this post links to a lot of information about alt text but the important thing is to have something written as alt text because so many images on the internet have no alt text at all. If you want to use Cesal’s quick formula instead of a fuller description, that’s 100 times better than having no image description at all.

Hopefully these resources have prompted you to think more about accessibility of your scientific visualizations so that, at the very least, you’ll include good alt text for the scientific images you share on your lab website, social media, or anywhere!

Posted in accessibility, dataVisualization

Making Spreadsheets Accessible and Reusable

July is Disability Pride Month. I’ve been learning more about disability during the past year and this has naturally led me to thinking about accessibility of research data. Last year, my friend Abigail co-wrote a really nice call-to-action about making research data more accessible because this is an area where we have to catch up and do better.

As spreadsheets are one of the most common types of shared scientific data, I decided to start my research there. There’s some great content from the U.S. government’s section 508 website on creating accessible Excel files. Similarly, the Web Accessibility Initiative from W3C has a great tutorial on accessible HTML tables. For research data, I also really like the Data Curation Network’s Accessibility Data Primer as a starting point, though spreadsheets are only a small part of the document.

The hard part about shared data, though, is that such data is meant to be maximally reusable. For spreadsheets, this means thinking about the accessibility of the most reusable tabular file format: the CSV file. Karl Broman and Kara Woo published my favorite article on making spreadsheets reusable in 2017, but it doesn’t really address accessibility issues. Adding to this problem is the fact that strategies for reusability and accessibility can sometimes be in conflict.

After a bunch of research, I’ve taken a try at a checklist for making accessible and reusable CSV files for data sharing. It balances a bunch of stuff from the Broman and Woo article with a bunch of different guidance for accessibility. For example, I based recommendations for variable naming off of guidance for hashtags that are screen-reader friendly.
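To make the variable-naming idea concrete, here is a rough Python sketch. It assumes the common hashtag guidance of capitalizing the first letter of each word (so a screen reader can tell the words apart) and applies it to column names; that rule is an assumption for illustration and may not match the checklist’s exact wording. The function names `to_capitalized_words` and `rename_headers` are hypothetical.

```python
import re


def to_capitalized_words(name):
    """Turn a column name into run-together words with each word capitalized.

    Splits on anything that is not a letter or digit (spaces, underscores,
    hyphens, ...) and uppercases the first character of each piece.
    Existing capitals inside a word are left alone.
    """
    words = re.split(r"[^A-Za-z0-9]+", name.strip())
    return "".join(word[:1].upper() + word[1:] for word in words if word)


def rename_headers(headers):
    """Apply the naming rule to every column name in a CSV header row."""
    return [to_capitalized_words(header) for header in headers]


if __name__ == "__main__":
    # Hypothetical header row, for illustration only.
    print(rename_headers(["sample_id", "unavailable by year", "percent-missing"]))
```

For the made-up header row above, this yields `SampleId`, `UnavailableByYear`, and `PercentMissing`. Names without spaces or special characters also stay friendly to the analysis tools that Broman and Woo have in mind.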

The checklist is available as exercise 6.4 in edition 1.1 of The Research Data Management Workbook. (It’s currently only in the online edition of the Workbook and eventually I’ll spin off edition 2.0 of the Workbook and give it a citable DOI.)

Please note: I am not disabled. I am a neurodivergent person with a couple chronic conditions and am not an expert on disability. I am sharing this because I’m requesting feedback so I can make updates to the checklist.

The goal is to create something that a scientist with zero background in accessibility can use to make their data more accessible and reusable. I don’t know if the checklist will ever be perfect, but it should still help make data better.

Let me know what you think of the checklist!

Posted in accessibility, openData