How to Model Data With Intersectionality

It’s Pride weekend here in Toronto, and the word of the day at The 519, a city organization dedicated to advocacy for the inclusion of LGBTQ communities, is intersectionality.

We’ve talked about the importance of using an intersectional approach to increase equity in your data products. Thanks for all your positive feedback on that story. You’ve asked for practical pointers on how to get started, so let’s look at some concrete ways to do that.

Let’s start with an example: a study published in the New England Journal of Medicine exploring how gender and race influenced the referral of patients for cardiac catheterization. The authors had data on both race and gender and produced a statistical analysis. What they did is called a “main effects” analysis, in which they looked at the influence of gender and then, separately, at the influence of race. They then combined these main effects additively and created a chart that looked like this one.

However, prominent scholars Lisa Bowleg and Greta Bauer use this example to point out that adding together main effects does not produce an intersectional analysis. (Side note: As is often the case in data equity work, this practice is not only non-intersectional, it’s also wrong and bad math. See here.) When the analysis is redone using a quantitative intersectionality-based model, these are the results we find. It quickly becomes clear that the real bias here is against black women. The initial, incorrect results showing lower odds ratios for white women and black men arise because, in the additive main effects model, both of those categories absorb some of the bias directed at black women.

While there is not unanimous agreement among experts, practitioners, or users on how to do quantitative intersectional analysis, in my experience, there are two fundamental elements that need to be considered. Once you get the hang of these two, we can talk about other tips, things to think about, and next steps. These two are the foundation.

Using Multiplication Rather than Addition

Using A Combination of both Individual and Structural Data Points

Multiply, Don’t Add

To build an intersectional model, we need to move beyond addition into multiplication. A model or algorithm that is built as

Outcome = Race + Gender + Sexual Orientation

is not an intersectional model. An additive model like this can be used to understand the effect of one predictor, such as sexual orientation, on the outcome while holding the other predictors constant (another way of saying it assumes a level playing field in the other variables). However, holding gender constant while looking at the impact of race and sexual orientation does not tell us whether the impact of these individual characteristics differs when the others are allowed to vary. In other words, what is the effect of sexual orientation when gender is allowed to be either male or female? This is the nature of the core questions in intersectional analysis. To get a useful answer to this question, we need to add an interaction term. An interaction term is essentially multiplication; it’s also called moderation, or a product of terms. The new model now looks like this:

Outcome = Race * Gender * Sexual Orientation

This model can answer the question of how the outcome changes for different combinations of the variables. How does the outcome change when race, gender, and sexual orientation change together? For example, it can compare the experience of being (simultaneously) a Latino Male Homosexual with that of being a White Male Homosexual or a White Female Heterosexual. It does this by looking at all three predictors together rather than one at a time while holding the other two steady. The multiplication in the model estimates the simultaneous and layered effects of the different variables.
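To make the difference between adding and multiplying concrete, here is a minimal sketch in plain Python. It uses only two predictors (race and gender) and made-up referral rates, not the data from the study, and it assumes equal group sizes so the additive fit has a simple closed form. All names and numbers are hypothetical and chosen only for illustration.

```python
# Hypothetical referral rates, for illustration only (not the study's data).
# Keys are (race, gender); equal group sizes are assumed.
cell_means = {
    ("white", "man"): 0.90,
    ("white", "woman"): 0.90,
    ("black", "man"): 0.90,
    ("black", "woman"): 0.70,
}
races = ["white", "black"]
genders = ["man", "woman"]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


grand_mean = mean(cell_means.values())
race_mean = {r: mean(cell_means[(r, g)] for g in genders) for r in races}
gender_mean = {g: mean(cell_means[(r, g)] for r in races) for g in genders}

# Additive (main effects) model: Outcome = Race + Gender.
# With equal group sizes, the least-squares fit for each cell is
# race mean + gender mean - grand mean.
additive_fit = {
    (r, g): race_mean[r] + gender_mean[g] - grand_mean
    for r in races
    for g in genders
}

# Interaction model: Outcome = Race * Gender, using 0/1 indicator variables
# with white men as the reference group.
intercept = cell_means[("white", "man")]
b_race = cell_means[("black", "man")] - intercept
b_gender = cell_means[("white", "woman")] - intercept
# The interaction coefficient is a "difference of differences".
b_interaction = (
    cell_means[("black", "woman")]
    - cell_means[("black", "man")]
    - cell_means[("white", "woman")]
    + cell_means[("white", "man")]
)


def interaction_fit(race, gender):
    is_black = 1 if race == "black" else 0
    is_woman = 1 if gender == "woman" else 0
    # The last term is the literal multiplication of the two indicators.
    return (
        intercept
        + b_race * is_black
        + b_gender * is_woman
        + b_interaction * is_black * is_woman
    )


for r in races:
    for g in genders:
        print(
            r, g,
            "observed:", round(cell_means[(r, g)], 2),
            "additive:", round(additive_fit[(r, g)], 2),
            "interaction:", round(interaction_fit(r, g), 2),
        )
```

With these made-up numbers, the additive fit reports 0.85 for both white women and black men (their observed rate is 0.90) and 0.75 for black women (observed 0.70), spreading the gap across groups that do not actually experience it. The interaction model reproduces each group's observed rate, and the interaction coefficient of about -0.20 isolates the effect that only appears for the combination.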

Use a Combination of both individual and structural data points

One of the key tenets of equity in data generally, and intersectional analysis specifically, is that looking only at individual-level data frequently produces biased and incorrect results. It is a bit like taking a bug out of the field, putting it into a glass jar, and trying to study it there. Without any context, you’re going to get a lot wrong. To build an intersectional model, it helps to include variables and data that measure the context and the communities the individuals are in. For example, in a model about the effects of age, gender, and refugee status on educational outcomes, it is important to include measures of how accepting each community is of refugees. It would also be useful to include variables measuring the availability of education in a variety of languages, systemic regulations on gender and education, and more. Luckily, there are several statistical methods for doing this. One of the most common is multilevel models, which are designed to include variables measured at the individual level and at several broader levels of aggregation, such as community and country. Just like traditional regression, these models can include multiplication, not just addition, as discussed above.
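As a rough sketch of what combining the two kinds of data looks like in practice, the plain-Python example below attaches community-level (structural) measures to individual-level records and builds one cross-level product term. It only prepares the data; fitting the multilevel model itself would be done in a statistics package. Every field name and value here is hypothetical.

```python
# Hypothetical individual-level records (one row per person).
individuals = [
    {"id": 1, "community": "A", "age": 15, "gender": "female", "refugee": 1},
    {"id": 2, "community": "A", "age": 16, "gender": "male", "refugee": 0},
    {"id": 3, "community": "B", "age": 15, "gender": "female", "refugee": 1},
    {"id": 4, "community": "B", "age": 17, "gender": "male", "refugee": 0},
]

# Hypothetical community-level (structural) measures, one entry per community.
communities = {
    "A": {"acceptance_score": 0.8, "languages_offered": 3},
    "B": {"acceptance_score": 0.3, "languages_offered": 1},
}


def attach_context(person, community_table):
    """Return the person's record with their community's measures added."""
    context = community_table[person["community"]]
    row = {**person, **context}
    # Cross-level interaction: an individual characteristic multiplied by
    # a structural one, so the effect of refugee status can vary by context.
    row["refugee_x_acceptance"] = person["refugee"] * context["acceptance_score"]
    return row


model_rows = [attach_context(p, communities) for p in individuals]

for row in model_rows:
    print(row)
```

Each resulting row carries both levels: who the person is and what their community looks like. A multilevel model fit on rows like these can then estimate, for instance, whether the effect of refugee status differs between more and less accepting communities.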

Setareh Rouhani’s primer on quantitative intersectional analysis includes a great example of this. The study explores the intersectional effects of race, education and urban area. The different urban areas have different policies in place so this variable acts as a measure of structural level influence in the model.

“Researchers could conduct cross-contextual comparisons that would evaluate the impact of this policy (through comparison of cumulative years before and after its introduction) across urban settings (cities in states that enacted the legislation versus those that did not) to empirically investigate how policy constructs the relative power and privileges….” A multi-level analysis better illuminates how policy constructs the relative power and privileges experienced by individuals with respect to their status, health and well-being (Bauer, 2014; Hankivsky et al., 2012).

There are packages and examples of both multiplicative models and multilevel models available for most common statistical packages, and you can use them to add an intersectional analysis to your work. Here is a nice one for R on multiplication or interactions and one on multilevel modeling. For SPSS users, here’s one on interactions and a nice one on multilevel modeling. Honestly, there are hundreds out there. Find someone who teaches the way you learn best and go for it. I would really love to hear your updates on how your models and data products are changing. Keep the feedback, suggestions, comments, and questions coming.
