When we want to address equity in data science, we often need to talk about power and we sometimes need to talk about money. It can be useful to think about an individual’s data like a raw resource; you could call it something cheesy like “Dataonium”. There’s a reason that our icon for Step #4: Data Collection & Sourcing in the Data Equity Framework has a pickaxe in it. The fact that data science uses words like “mining”, “raw”, “refining”, “cleaning”, “pipeline”, “ETL (extract, transform, load)”, and more is no coincidence.
We’ve got other articles that deal with some of the issues around paying for data or crafting Funding Webs, but what I want to talk about today is the equity relationship that you (and your project) have with the people who are your data source. Well, I want to talk about the one you have versus the one you think you have.
Four intertwined ideas to think about when you’re looking at this:
Price: Who sets what the data is exchanged for.
Ownership: Who retains, gets, or loses control, access, or proprietorship over the data.
Value: Who determines what the data is worth.
Profit: Who gains from this data work, and how much.
When it comes to price, raw data ranges from very expensive to free. Sometimes people will be happy to answer a 2-hour survey for a $5 gift card. Sometimes people wouldn’t tell you their age for a thousand dollars. Like everything else, the price someone is able to charge for their data often depends on how badly they need the money or the relationship. We All Count has attempted surveys at the C-suite or executive level in many large organizations, and the executives not only fiercely protect every bit of their own personal data, but they also want a large and tangible exchange for even a small amount of it. Only when providing the data was required by their well-remunerated job were they willing to give it. The relatively high price for data set by these execs is partially a function of their economic empowerment, but it’s also because they have a particular perspective on the second consideration: the value of data. To be clear, I’m not implying that these executives are bad; I’m saying they’ve got a very good idea of how much their data is worth.
Unlike price, which relates to the cost of the data, value is what the data is worth. It’s usually worth very different things to different people in the transaction. Partly, this is similar to other raw resources. On its own, one individual’s speck of ‘Dataonium’ might not be worth much to statistics, but when gathered in large quantities it’s arguably today’s most valuable resource. Paradoxically, data on the individual level might be quite valuable or even sacred to the individual it’s from. Think of a farmer conned out of his “worthless” land by an oil tycoon: equitably setting prices on our data requires a shared and transparent understanding of what the data is worth to all parties involved. And just as a forest can be seen as a culturally vital hunting ground or as a good spot for a 4-lane highway, equitable data dealings also need to address multiple perspectives on the value of the data and how it can be used.
Ownership can dramatically alter a data-providing relationship. How is the transition of data structured? Is the transfer permanent? Is it a loan? Who gets a say in how the data is used? Who decides who can access the data? Do the original data providers get their data back? In raw form? In a refined and more valuable form? Is the data owned by individuals, groups, or organizations?
Last is profit. One of the most common things that we hear from organizations, especially those in the social sector, is that they aren’t paying respondents or program participants directly for their data, but rather are exchanging it for an improvement to services. We need to ask ourselves: Who evaluates the value of the services we’re going to improve? Is it as much or more than the salary I’m being paid to process this data? Is it as much or more than my organization stands to gain from this data project?
Because treating data like a monetizable resource is in its relative infancy, people don’t expect written contracts specifying the price, value, ownership, and required profit when it comes to giving data. Maybe they should.
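If a full written contract feels like a big leap, a project could start by simply recording the four elements for each group of data providers and checking which ones nobody has discussed yet. Here is a minimal sketch of that idea in Python. The `DataAgreement` class, its field names, and the example values are all hypothetical illustrations, not part of the Data Equity Framework or any existing tool.

```python
from dataclasses import dataclass, fields


@dataclass
class DataAgreement:
    """Hypothetical record of the terms agreed with one data provider."""

    provider: str
    price: str = ""      # what the data is exchanged for, and who set it
    value: str = ""      # what the data is worth to each party
    ownership: str = ""  # who retains, gets, or loses control of the data
    profit: str = ""     # who gains from the data work, and how much

    def missing_terms(self) -> list[str]:
        """Return the names of the four elements that were left blank."""
        return [
            f.name
            for f in fields(self)
            if f.name != "provider" and not getattr(self, f.name).strip()
        ]


# Example values are made up for illustration only.
agreement = DataAgreement(
    provider="survey respondents",
    price="$5 gift card, set by the project team",
    ownership="permanent transfer to the project",
)
print(agreement.missing_terms())  # ['value', 'profit']
```

A blank field doesn’t mean the relationship is inequitable, only that the conversation about that element hasn’t been written down yet.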
Relationship Types:
Let’s look at this another way. Which of these data provider relationship types best describes the ones in your project? We’re not saying one is necessarily better than another – ok, except for the last one – but it’s good to know where you are and what you’re aiming for.
Data Donor
Price: This donor typically provides their data for free or for very little in exchange.
Value: They are aware of the value of their data to both them and you and are happy to give it.
Ownership: They most likely are relinquishing ownership of their data, but not necessarily.
Profit: They expect no direct profit from this transaction.
Equity questions to ask about Data Donors:
Do they think of themselves as a donor?
Could they retain some ownership or control over the data without compromising the project?
Would they want a share of the profit if they knew about it?
Does my perspective on the value of their data match their own?
Am I calling this person a Donor when they are really a Seller?
Data Seller
Price: The data seller sets, demands, or negotiates a price for their data.
Value: The data seller is doing their best to reconcile their immediate need for the price against the data’s value to them, and the data’s value to you.
Ownership: The data seller sets the terms for the ownership of the data and their expectations for the end product. This will likely affect the price.
Profit: They expect their profit to come from the immediate exchange.
Equity questions to ask about Data Sellers:
Would I sell my data for this price?
Does their evaluation of the data’s value match mine?
Does this seller have the ability to negotiate around different types of ownership?
Am I comfortable with how economic and other pressures affect the price this seller is able to charge?
Does this seller have a healthy, diverse market to sell their data in?
Does this Seller want to sell their data?
Data Partner
Price: The main price here is bringing this person to the table as a Partner.
Value: The Data Partner hopes to arrive at the same valuation of the data as you, because you’ll be using it to accomplish a shared goal.
Ownership: This is the main advantage to the data provider in this type of relationship. The Data Partner retains meaningful control over how the data is used, processed, handled and disseminated. They will be part owners in the raw data and also the data product.
Profit: The Data Partner expects to equally or proportionally profit alongside the other stakeholders in the project.
Equity questions to ask about Data Partners:
Is the control over the data being meaningfully shared?
Is being a Data Partner only available to certain individuals or groups?
Are my contributions to the data process equal to the data from my Partners?
Are all four of these elements formalized in our partnership?
Data Investor
Price: The price for the Investor’s data is the expectation of some kind of profit at the end, monetary or otherwise.
Value: The Data Investor weighs the value of their data in its current state against the potential good it might do for everyone involved in the transaction.
Ownership: This might vary greatly across this type of relationship. Being less involved with exactly what happens with the data differentiates them from a Data Partner, but they may still have things like time limits, use restrictions, or formal sovereignty agreements to protect their investment. Almost always they will expect access to the finished data product.
Profit: Sometimes the data investor will get a monetary profit from the new data product. More often they will be investing in the improvement of goods and services they receive, or profiting from the answers provided by the data project.
Equity questions to ask about Data Investors:
Are they taking a risk or are they guaranteed some kind of profit?
Does everyone involved in the project have the power to negotiate an Investment type of relationship?
What is the difference between this and a Data Donor?
Will their investment be a one-time payoff or a long-term, dividend-producing asset?
Are they able to effectively manage price, valuation, and risk vs. profit considerations?
Data Dupe
Price: They may not be able to set prices. Their data may be taken by force, by guile, or in a way that they are unaware of or unfamiliar with.
Value: They may not understand the value of the data to all parties, precluding them from setting fair prices or demanding shares of the profit.
Ownership: They may not know they owned their data in the first place, may not know they have lost it, may agree to a single data transaction that is then resold over and over again in various forms, may contribute to a project that they have no control over, or may never see their collected data or the analysis results.
Profit: They may not have any say in the profits they receive when providing their data. It may be minimal or it may be nothing. They may even contribute data to things that harm, oppress, or cost them.
Equity questions to ask about Data Dupes:
How can we change this relationship into any of the other types?