What is data equity and why does it matter?

Jul 15

Data equity is a growing movement for more responsible data work, from analytics and data visualization to data science and machine learning to data-driven decision making. To start with the basics: equity seeks to ensure fair treatment, equality of opportunity, and fairness in access to information and resources for all, according to the Ford Foundation. Data Equity is a set of principles and practices to guide anyone who works with data (especially data related to people) at every step of a data project, using a lens of justice, equity, and inclusivity. Equity is not just an end goal, but also a framing for all data work from start to finish. As the authors of Data Feminism say, “equity is both an outcome and a process.”

As data increasingly categorizes and influences every individual and informs access to resources, the goal of the Data Equity movement is to build a world where data and its use, with individuals’ knowledge and consent, both support wellbeing and prevent harmful impacts of technology and its expanding reach. In order for each person to be treated fairly according to their needs, anyone who works with data must carefully consider context and thoughtfully approach the collection, use, and analysis of individual and collective data.

Here are five big reasons why data equity matters:

Data is about people

The boundary between people and our data is blurring. Nearly every facet of modern life has been digitized: search history, contact lists, banking and health information, and location history, to name a few. This data is often available for businesses to gather and use as they wish, without consent. There are companies whose entire business model is to aggregate massive amounts of data about individuals, then sell it to private corporations. What businesses do next directly impacts daily life, from whether someone gets approved for a loan to whether they see listings for high-paying jobs or receive a special offer for a new product or service. Due to the rapid expansion of artificial intelligence (AI), each person’s experience can be shaped to look different depending on the data model’s assumptions about that person’s identity and likes and dislikes, and about who deserves access to certain resources.

Those who build AI and the related data tools – who create the algorithms that make real-life decisions about us – are human. People decide what data to collect and use, and how to build the predictive models that make choices and recommendations that shape each person’s experience. Their biases are inevitably baked in. To combat these biases, all data related to people deserves to be approached with a lens of ethics and equity, to both maximize positive impact and minimize unintended consequences.

Data is power

Data alone, in its raw form of rows or blocks of text or audio recordings or any other format, doesn’t say much. When probed more deeply (What direction is this moving in? How does this compare to last year?), data quickly transforms into something much more valuable: information and insights. The ability to extract information from data is a highly in-demand skill in the job market: data analysts, data scientists, and those who can manipulate a spreadsheet enough to glean patterns, trends, and insights from it are sought after in nearly every industry.

Leaders know that information is power, and with more and more data being collected and generated each day, the depth and scope of information available is increasing. If a company has information that its competitors lack, that organization is in a stronger position to launch its product or tune its messaging; the information becomes a competitive advantage. The issue of what information is being formed, and from what data, is at its heart an issue of equity. What questions are being asked of the dataset? Does the dataset have information to answer them? Why or why not? Who gets to ask the questions, who are the questions being asked about, and who gets to learn the answers? Who ‘owns’ what data, and what are the implications? Should people own and control the data they generate – the data about them?

Data equity prompts accountability

Protecting personal information through strong data privacy practices is essential in all responsible data work. At the same time, equity practices pose the question: who is data collected about, and who is data collected for? How can individuals working with data seek to balance power dynamics in those patterns? It's common for an organization to collect data about people, conduct analysis, and never share its findings with the people from whom the data was collected. It's also common for organizations to make equity commitments, such as increasing the diversity of staff leadership, without sharing any data about their progress toward those commitments.

Equity practices require action and accountability, and in many cases that means they demand transparency around how data outputs are shared with those who provided or generated the data in the first place. This includes important considerations about how to be transparent while protecting anonymity; for example, when analysis of survey responses is broken down into demographic subgroups, the responses for the smallest subgroups may become identifiable. Rather than abandoning the effort or excluding that data, an equity-forward approach requires acknowledging contextual factors, such as why a subgroup may be so scarcely represented in the data to begin with.
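One way to picture the small-subgroup problem is a reporting step that keeps every subgroup visible but withholds exact counts below a minimum size. The sketch below is only an illustration under stated assumptions: the function name, the `group` field, and the threshold of 5 are hypothetical choices, not a standard or a method described in this article.

```python
from collections import Counter

# Hypothetical threshold; the right value depends on the survey and its context.
MIN_GROUP_SIZE = 5


def summarize_by_group(responses, min_size=MIN_GROUP_SIZE):
    """Count responses per subgroup, withholding exact counts for small groups.

    Small subgroups stay in the report with a note, rather than being dropped,
    so their low representation can still be acknowledged and discussed.
    """
    counts = Counter(r["group"] for r in responses)
    report = {}
    for group, n in counts.items():
        if n < min_size:
            report[group] = {
                "count": None,
                "note": f"fewer than {min_size} responses; detail withheld",
            }
        else:
            report[group] = {"count": n, "note": ""}
    return report


# Example with made-up responses: group "B" is too small to report in detail.
sample = [{"group": "A"}] * 6 + [{"group": "B"}] * 2
report = summarize_by_group(sample)
```

The note field is where context belongs: a reader who sees that a subgroup was withheld can then ask why it is so scarcely represented in the first place.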

Technological change outpaces regulation

If every byte of data being created on Earth were a cup of water, the world’s oceans would refill every twelve days.[1][2] Much of this data is about people. All this data empowers Artificial Intelligence, which gets more powerful and far-reaching with every data point it gobbles up.
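The arithmetic behind a comparison like this is easy to check. The sketch below is one way to reproduce it; the two input figures are assumptions chosen for illustration (a commonly cited estimate of about 463 exabytes of data created per day, and about 352 quintillion US gallons of water in the ocean), not numbers quoted from the sources cited above, so treat the result as approximate.

```python
# Assumed inputs (not stated in the article; swap in figures from your own sources).
BYTES_PER_DAY = 463e18   # assumption: ~463 exabytes of data created per day
OCEAN_GALLONS = 352e18   # assumption: ~352 quintillion US gallons in the ocean
CUPS_PER_GALLON = 16     # US customary units: 16 cups in a gallon


def days_to_refill_ocean(bytes_per_day, ocean_gallons):
    """Days needed to fill the ocean if each byte created were one cup of water."""
    ocean_cups = ocean_gallons * CUPS_PER_GALLON
    return ocean_cups / bytes_per_day


days = days_to_refill_ocean(BYTES_PER_DAY, OCEAN_GALLONS)
print(f"{days:.1f} days")
```

With these assumed inputs the result lands at roughly twelve days; different estimates of daily data creation would move that number considerably.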

As computer models reinvent old ways of doing things and brand new sectors pop up, the volume and pace of change are tough to comprehend. Technology is advancing at a much faster pace than the laws and protections that ensure equal rights, equitable access, or meaningful privacy. Even organizations with a mission to advance best practices or standards aren’t able to keep up.

Because the rate of change outpaces regulation, it is increasingly the domain of those creating projects based on data to bring a lens of ethics and equity to their work. The primary goal should always be to ensure that people who may be impacted, and whose data is powering the computer models without compensation, are not harmed.

Data equity enables informed action

Efforts around diversity, equity, and inclusion in the workplace commonly hit obstacles in implementation. While many organizations make statements about inclusivity, post supportive social media content, and perhaps even implement more equitable hiring practices, the question of how to actually shift problematic organizational business practices into an equity-centered approach is a difficult one.

Data can be a powerful tool to enable this effort in two areas. The collection and analysis of data around issues of inclusivity, bias, and inequity can be a useful way to understand specific problems and track progress toward improvements. For example, collecting data on hiring and promotions, and breaking it down into demographic subgroups, can be a valuable tool in identifying problematic practices and monitoring whether improvement is being made as new practices are implemented. But data can also be considered as a functional area in itself: a candidate for the implementation of more equitable practices.
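As a rough illustration of the first area, breaking promotion data down by subgroup can be as simple as computing a rate per group. This is a minimal sketch assuming each record is a hypothetical `(group, was_promoted)` pair; the function name and data shape are illustrative and not taken from any particular HR system.

```python
from collections import defaultdict


def promotion_rates_by_group(records):
    """Return the share of people promoted within each subgroup.

    `records` is an iterable of (group, was_promoted) pairs, where
    `was_promoted` is True or False.
    """
    totals = defaultdict(int)
    promoted = defaultdict(int)
    for group, was_promoted in records:
        totals[group] += 1
        if was_promoted:
            promoted[group] += 1
    return {group: promoted[group] / totals[group] for group in totals}


# Made-up example data for two subgroups.
records = [("X", True), ("X", False), ("Y", True), ("Y", True)]
rates = promotion_rates_by_group(records)
```

Running the same calculation before and after a change in practice gives a simple way to monitor whether the gap between subgroups is narrowing. Rates for very small subgroups can swing widely, so they deserve extra care in interpretation.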

Simple questions can help uncover possible equity issues: What data is being captured about people? How is it being captured, and why? A common example is demographic data collection: is it a dropdown menu with constrained options? Do those options feel inclusive and authentic to the audience? Is this data being used to disaggregate analysis of impact to ensure equitable results among subgroups? Is there a way for all respondents to be represented beyond the boundaries of a pre-established list, with the ability to self-identify instead of being limited by labels?

Conclusion

Ethical data practitioners carrying out projects using data about people have an opportunity and an obligation to consider the potential impacts of their project through a lens of Data Equity. Fundamental awareness and a willingness to examine how project managers think about and use data are a good place to start building ethical data muscles. To learn more about specific principles and practices for ethical and equitable data use, register for one of LA Tech4Good’s upcoming Data Equity + Ethics workshops to take your knowledge and practice to the next level.

[1] How Much Data Is Created Every Day?
[2] How much water is in the ocean?

About the authors

**Rachel Whaley**

Rachel serves as the Data Equity Program Manager for LA Tech4Good. Across her experience with nonprofits, higher ed, and public and private sectors, she believes data can be a force for good when handled responsibly. She holds degrees in public policy and computer science from the University of Chicago.

**Jen Holmes**

Jen is a data boss, fundraiser, mountain climber, expert welder, and master home chef. She believes in the power of data to drive meaningful change, and brings a lens of equity to each project she manages in her role as a nonprofit leader. She holds an MBA from the University of Wisconsin.
