
Sex trouble: Sex/gender slippage, sex confusion, and sex obsession in machine learning using electronic health records.

Abstract

False assumptions that sex and gender are binary, static, and concordant are deeply embedded in the medical system. As machine learning researchers use medical data to build tools to solve novel problems, understanding how existing systems represent sex/gender incorrectly is necessary to avoid perpetuating harm. In this perspective, we identify and discuss three factors to consider when working with sex/gender in research: "sex/gender slippage," the frequent substitution of sex and sex-related terms for gender and vice versa; "sex confusion," the fact that any given sex variable holds many different potential meanings; and "sex obsession," the idea that the relevant variable for most inquiries related to sex/gender is sex assigned at birth. We then explore how these phenomena show up in medical machine learning research using electronic health records, with a specific focus on HIV risk prediction. Finally, we offer recommendations about how machine learning researchers can engage more carefully with questions of sex/gender.


Keywords: electronic health records; gender; healthcare; machine learning; non-binary; sex; sex/gender; transgender

Year: 2022 PMID: 36033589 PMCID: PMC9403398 DOI: 10.1016/j.patter.2022.100534

Source DB: PubMed Journal: Patterns (N Y) ISSN: 2666-3899

Introduction

Your health insurance company does not believe you have a cervix, so good luck getting that pap smear covered. Your testosterone is not in the “correct” range for people assigned male at birth, so a bunch of your blood test results were flagged, and you are not sure whether they used the appropriate reference range for you or not. And it’s unclear whether the fancy new system that the hospital introduced to monitor you during your upcoming surgery will accidentally pull the wrong information from your medical record, and if it does, what that would do to your surgery outcomes. In other words, when it comes to medical systems, sex is full of trouble, and not the good kind. The above examples illustrate how the sex marker in electronic health record (EHR) systems shows up in the experiences of patients who are transgender and ground the necessity of critical interrogation of what it means to use sex/gender in medical care. But before we interrogate “sex” and the medical system, it is helpful to situate this discussion at the intersections of data-driven healthcare, expectations of increased diversity in clinical research, and changes in EHRs. It is perhaps a cliché by now to point out that, like many industries, healthcare has become increasingly data heavy and data dependent. Many actors within health systems see great opportunity both in data gathered explicitly for research and in the exhaust produced by everyday interactions between patients and doctors. This shift to data-driven healthcare has taken place at the same time as funding agencies are expecting researchers to consider sex and gender in their work, including the experiences of transgender people.
Since the early 1990s, funding agencies such as the National Institutes of Health (NIH) in the United States have required the inclusion of women and minority groups in all funded clinical research and that clinical trials be designed to provide information about differences by sex/gender, race, and/or ethnicity. This has also recently been expanded to using “sex as a biological variable” for earlier stage biological research. In 2015, the NIH also formed a Sexual & Gender Minority Research Office, in part because the prior focus on the inclusion of women involved an overemphasis on “sex differences” that was not inclusive of transgender people. The move toward inclusion in clinical research has also included advocacy for better representation in healthcare systems for clinical care. In particular, providers of EHR systems have begun including explicit fields to document information on gender identity and sexual orientation. In 2018, EHR systems that were certified for use in particular medical contexts in the United States were required to have the capacity to collect sexual orientation and gender identity (SOGI) data. In July 2021, the United States Department of Health and Human Services announced that SOGI data, along with broader data on social determinants of health, would be incorporated into the United States Core Data for Interoperability (USCDI), a national standard used as a baseline across medical record systems. Machine learning researchers who work with more modern EHR data will have access to a wider variety of information about patients, and that, combined with the pressure to engage with sex differences, means that more and more machine learning researchers will confront questions about how to incorporate sex/gender data in their work.
Previous literature about EHRs and transgender people advocates for the importance of collecting SOGI information and applauds its increased integration into new and existing systems.¹²⁻²² However, most of these works also assume that sex is a coherent concept with a consistent meaning, even for transgender people. Work across many disciplines, including transgender studies, suggests otherwise and casts doubt upon the legitimacy of assumptions around what sex is, let alone how EHR systems record it.²³⁻²⁵ Despite this, there is little advice for machine learning researchers about how to engage with questions of sex in ways that represent the concept’s complexity. This is because much of the literature critiquing assumptions around gender in machine learning looks at applications where sex characteristics are irrelevant, such as “automated gender recognition” in computer vision, and so has not tackled questions of how to deal with settings where sex characteristics might be relevant, like in clinical care for transgender people.²⁶⁻³¹ The stakes are high in machine learning, where the EHR “sex” field is used widely and at scale. In medical contexts, machine learning researchers encounter the sex field in EHR systems and assume that it corresponds to a coherent concept and that this concept is the correct one for engaging with the question of sex/gender differences. In order for medical machine learning researchers to develop models that appropriately account for the experiences of transgender people, they need to move away from an emphasis on sex assigned at birth toward richer representations that account for potential differences in physiology and experiences of gender. In this perspective, we look at how EHR systems in the United States encode sex and gender and discuss the implications this has for medical machine learning.
We begin by introducing three terms that capture the assumptions and issues around sex and gender: sex/gender slippage, sex confusion, and sex obsession. We discuss how these terms fit in with historical assumptions about sex and gender and how those assumptions have influenced the gathering of sex and gender identity information. We then turn directly to machine learning, showing how studies rarely engage with sex as the complex variable that it is, even in areas like HIV prevention, where gender minorities are at higher risk than other populations. Finally, we provide recommendations for moving the field toward richer representations.

Coming to terms

Sex and gender

Sex and gender are terms that refer to a variety of aspects of a person’s physiology, how they see themselves, and how they interact with others and their environment. Sex typically refers to a person’s gonads, anatomy, chromosomes, and hormone levels. Gender, meanwhile, typically refers to a person’s gender identity (how they see themselves or experience their own gender) but also involves other factors such as how a person is perceived by others or experiences differential treatment related to their perceived gender.³⁴⁻³⁶ While it is often assumed that sex and gender are distinct, with sex being “in the body” and gender being “in the mind,” the two are actually deeply entwined, and both are social. Social and cultural factors attributed to gender can have impacts on the body to the point where sex cannot be meaningfully considered as separate from gender. We will use the term sex/gender as used in Springer et al. to refer to the entwinement of these phenomena as appropriate and “sex” or “gender identity” to refer to the specific use of these terms in EHR systems. Components of sex/gender and how a person interacts and is treated in a social world have significant impacts on their health and wellbeing.³⁸⁻⁴¹ Additionally, although a full treatment is outside the scope of this work, gender is also entwined with race, with race mediating how people experience their own genders and how others gender them.⁴²⁻⁴⁵ The interaction of race and gender means that gendered understandings within the United States reflect the same types of white supremacist biases that all other institutions do, and although we do not have the expertise to fully unpack how sex and gender identity data gathering is mediated by race, we hope that others are able to build on this work to do so. For a recent overview of race and medical research, see Lett et al.
Both sex and gender are different from sexual orientation, which refers to “the desire [or lack of desire] to have sexual relations with someone of the same or different gender identity and/or anatomical sex.” Although we occasionally mention sexual orientation, our emphasis will be on sex and gender. It may be that similar critiques apply, as sexual orientation is often defined in relationship to one’s own gender as well as that of others. Historically, sex/gender has been assumed to be binary, static, and concordant. Binary means that there were two options for a person’s sex (male or female) and two options for a person’s gender (man or woman). Once assigned male or female sex at birth, typically by measurement of a newborn’s external genitalia, a person’s gender is presumed not to change, thus static. Anyone assigned male at birth is also presumed to be a man, and anyone assigned female at birth is presumed to be a woman (concordant). In practice, sex and gender are not binary, static over time, or necessarily aligned. There are people who have chromosomal, gonadal, anatomical, and/or secondary sex characteristics that do not align with the defaults for male or female sex. Some people with such characteristics may use the term intersex for themselves, but some do not. Estimates suggest intersex people make up between 1.7% and 4% of the population (though an exact number is difficult to quantify). Similarly, there are people with genders other than man or woman (e.g., non-binary, genderqueer, genderfluid), and some people have no gender (agender). Some people’s gender(s) change over short or long periods of time. A person’s sex and gender also need not be aligned; transgender people are assigned a sex at birth that does not align with their gender identity. This is in contrast to cisgender people, whose assigned sex at birth does align with their gender identity. Some non-binary people identify as transgender (and vice versa), but not all do.
In this perspective, we will use transgender as an umbrella term to refer to a person whose sex assigned at birth does not align with their gender identity. Transgender is both an umbrella term and an identity, but it is also constructed in part by the systems in place that approve gender-affirming care. Because gatekeepers have always been a part of defining transgender experiences in the United States, transgender people who take certain kinds of transition steps can be viewed as “more legitimate” than those who do not through what Austin Johnson refers to as “transnormativity.” For the purpose of this perspective, it is perhaps enough to say that some transgender people choose to pursue “medical transition” such as taking hormones or having surgery, but many do not, either because of lack of access/appropriate medical care or because they do not want to. Transgender people who do take medical transition steps may then have sex-related characteristics such as secondary sex characteristics, hormone levels, or genitals that are different than what would be presumed based on their sex assigned at birth. A detailed understanding of a transgender person’s medical history is therefore important in providing clinical care. However, similarities and differences between transgender and cisgender people are not always well understood in the medical literature or by individual providers.

Sex/gender slippage, sex confusion, and sex obsession

Despite the complexities of sex and gender, the false assumptions of binary, static, and concordant sex/gender are deeply ingrained in our medical system. This not only negatively impacts transgender and intersex people but also fails to serve anyone who doesn’t exactly fit the average.⁵⁶⁻⁵⁸ We have coined three terms that further articulate the false assumptions and their implications in the medical system: sex/gender slippage, sex confusion, and sex obsession. Shortened definitions of these terms and related ones can be found in Table 1.

Table 1

Definitions of sex, gender, and sex/gender-related terms used to characterize the use of sex/gender in medical machine learning and possible improvements

Term	Definition	Source
Sex/gender slippage	the frequent substitution of sex and sex-related terms for gender, and vice versa, often reflecting an underlying assumption of the concordance of sex and gender	⁵⁹
Sex confusion	the fact that any given sex variable holds many different potential meanings, from sex assigned at birth to current sex for purposes of health insurance, and may or may not correspond to the presence of any particular body part or hormonal status	present work
Sex obsession	the idea that the relevant variable for most inquiries related to sex/gender is sex, and specifically sex assigned at birth	present work
Organ/anatomic inventory	a record of what organs a patient may or may not have	²⁰
Phenotyping	the identification of particular patients who might have certain “characteristics of interest”	⁶⁰
Data richness	richness moves beyond the binary presence or absence of a condition to timing, degree, severity, cause, and relationship to factors like behavior, etc.	⁶¹

Sex/gender slippage is the frequent substitution of sex and sex-related terms for gender, and vice versa, often reflecting an underlying assumption of the concordance of sex and gender. The most common example of this is conflating gendered terms (man or woman) with sexed terms (male or female) or referring to a person’s gender identity as male/female. (Individual people may choose to refer to their anatomies or gender identities as male/female, and this would not necessarily be an example of sex/gender slippage.) Slippage can also occur when inferring experiences based on someone’s gender identity from their sex assigned at birth, or vice versa. We introduced the term sex/gender slippage in an earlier work to reflect the way some researchers switched back and forth between the terms sex and gender within their research articles and equations. However, as we will explore in this perspective, the presumed concordance of sex and gender as reflected by sex/gender slippage is embedded in the medical system and society at large, even if not explicitly discussed in these terms. One parallel in feminist writing is Judith Butler’s idea of a “stable gender” in the “heterosexual matrix.” To some extent, the modern fields of gender and transgender studies exist to unpack, critique, or complicate the easy switch between sex and gender, in addition to trying to figure out what sex and gender refer to. Sex confusion refers to the fact that any given sex variable holds many different potential meanings, from sex assigned at birth to current sex for purposes of health insurance, and may or may not correspond to the presence of any particular body part or hormonal status.
Clinicians often associate this challenge with only having a single sex variable from which to assess a person’s sex and gender identity. It is true that sex confusion is more common as sex markers have become easier to change and transition care has been more widely covered by insurance, but as Paisley Currah has argued, sex has always been a context-dependent variable; as he puts it, “sex is whatever an entity whose decisions are backed by the force of law says it is.” While one variable was never enough to capture the complexities of sex/gender, if one assumes that sex/gender are binary, static, and concordant, one variable can be assumed to cover all of these. It is these assumptions, not the existence of transgender people, that cause sex confusion. Sex obsession is the idea that the relevant variable for most inquiries related to sex/gender is sex, and specifically sex assigned at birth. It can be compared with the interpersonal dynamic of asking a transgender person “what they really are” or “what’s in their pants.” A clear example shows up in the way that EHR systems attempt to address sex confusion, namely the introduction of additional fields (gender identity, sexual orientation) but with the continued reliance on sex assigned at birth as a “ground truth” for a person’s physiology. Although others do not use the term sex obsession, this phenomenon has been documented across contexts, including by activists such as Riki Wilchins in the early 2000s and more recently by users on Tumblr.

The gathering of sex/gender data in EHRs: Sex/gender slippage, sex confusion, and sex obsession

In this section, we discuss the role that sex/gender slippage, sex confusion, and sex obsession have played in the collection of sex/gender data in EHR systems. Our history and analysis focus on the United States healthcare system, but we suspect similar trajectories may have happened elsewhere.

Pre-2010s: Sex/gender slippage

Before the 2010s, sex/gender information was captured exclusively using a single sex field included in EHR systems. Options for this field were binary, with an option for either male or female. Sometimes a third option of either “unknown” or “did not disclose” was included. (Intersex inclusion in the sex field continues to be an issue and is not as simple as adding a third category for it.) Reliance on a single variable to capture information not just about sex assigned at birth but also about gender identity is an example of sex/gender slippage. The assumption was that information about a person’s physiology, gender, and experience of gender in the world could all be extrapolated from one single variable of sex. If gender identity was recorded in medical records, it was included in the social history section or in progress notes but may or may not have been actively used in providing care.

2010s to present: A decade of sex confusion and sex obsession

As Paisley Currah has articulated, binary sex identifiers have long been tied to distribution of rights and resources. In particular, sex markers provided an external method of determining who was entitled to particular benefits and, prior to Obergefell v. Hodges, who could marry each other. Transgender people have always troubled binary sex categories and demanded (or created) ways to change sex classifications, both in physical architecture and in computer systems. For many transgender people, updating their legal documents was and is an important way to avoid misgendering and to be treated appropriately by their healthcare providers. This leads to significant sex confusion when trying to interpret the sex field. Does the sex in the EHR system correspond to the sex that a person was assigned at birth, or is it the updated sex on their driver’s license? What about the sex on their health insurance? This confusion has clinical implications, as we illustrate in the introduction, because existing EHR systems relied heavily on sex assigned at birth for uses as varied as reference ranges for laboratory tests, screening notifications, and as part of billing for health insurance purposes.⁶⁹⁻⁷² One way the medical field began to address sex confusion was through the inclusion of a separate gender identity field in EHR systems. Most resources recommend a “two-step” questionnaire to determine whether a patient or participant is transgender, including questions about current gender identity and sex assigned at birth. However, many resources suggest using sexed terms for gender identity (e.g., male, female, transgender male), demonstrating sex/gender slippage and leading to confusion for transgender and non-binary people about how to answer. Kronk et al. provide an alternative that is likely clearer for transgender people.
While the goal of adding a gender identity field to help identify transgender patients and provide them with better care is important, a close reading of articles that advocate for or document the implementation of such systems shows that the primary purpose of adding a gender identity field was as a tool to maintain information about a transgender patient’s “real” sex, with gender identity used along with preferred name and pronouns to avoid misgendering patients. Providers and EHR systems developers had an opportunity to re-configure clinical care around a more nuanced understanding of physiology inclusive of not just transgender people but all people, but they instead doubled down on the importance of sex assigned at birth as the most relevant measure for a patient’s clinical care. Sex obsession manifests not only in how researchers and clinicians discuss changes to EHR systems but also in how healthcare systems communicate the importance of sex assigned at birth to patients. In an article about implementation of SOGI data for the Veterans Health Administration (VHA), the authors discuss a campaign to encourage transgender people to not change their sex assigned at birth due to concerns about negative implications in doing so. The VHA even distributed flyers to patients emphasizing the importance of birth sex for clinical care, with the use of the gender identity field primarily for providers to “know your identity and use respectful terms during interactions when delivering personalized care.” In other words, sex assigned at birth is still considered the most clinically relevant variable for all people regardless of their gender identity.
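The two-step questionnaire described in this section can be sketched as a small data structure that stores the two answers separately instead of collapsing them into one field. This is only an illustration: the class name, field names, answer strings, and the `ALIGNED` pairing are assumptions made for the sketch and are not drawn from any EHR vendor, from Kronk et al., or from any other cited resource. The function reports only whether the two recorded answers differ. It says nothing about anatomy, hormones, or whether a person uses the word transgender for themselves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TwoStepResponse:
    # Hypothetical field names and answer strings, for illustration only.
    gender_identity: Optional[str]        # step 1: self-described current gender identity
    sex_assigned_at_birth: Optional[str]  # step 2: "female" or "male"; None if not answered


# Assumed pairing used only in this sketch: the gendered term treated as
# aligned with each birth assignment. Gendered terms (woman/man) are used for
# identity so that sexed terms are not reused for gender.
ALIGNED = {"female": "woman", "male": "man"}


def identity_differs_from_assignment(resp: TwoStepResponse) -> Optional[bool]:
    """Return True if the two answers do not align, False if they do,
    and None if either answer is missing or unrecognized (never guess)."""
    if resp.gender_identity is None or resp.sex_assigned_at_birth is None:
        return None
    expected = ALIGNED.get(resp.sex_assigned_at_birth.strip().lower())
    if expected is None:
        return None
    return resp.gender_identity.strip().lower() != expected
```

Returning `None` for missing answers, rather than defaulting to `False`, keeps "not recorded" distinct from "recorded and aligned", which matters when the record is later used to select patients for research.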

Toward richer representations

Even if sex assigned at birth and gender identity are accurately reflected in a patient’s medical record, continuing to make assumptions based on those values is giving too much weight to sex assigned at birth. But the field is in this position precisely because sex assigned at birth is a simple and convenient default for most cisgender people. A move to a system that includes many more variables will be demanding both for EHR system manufacturers and care teams, as data entry into EHRs is already very time consuming and prone to errors. However, even if improvements are challenging, they are important for improving clinical care for all patients. The current approach of designing for “default” users systematically excludes people at the margins, in this case transgender patients, along with other LGBTQIA+ patients, non-white patients, low-income patients, and patients with disabilities, as well as those at the intersections of those groups. If we instead center the people at the margins, we can create systems that work better not just for those at the margins but for everyone. The most commonly cited method of moving toward richer representations is the use of organ/anatomic inventories. There are also design approaches that establish strong defaults while keeping the flexibility to confirm or update variables as needed, easing some of the burden of data entry. We will discuss increasing data richness in more detail in the recommendations section.
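As a rough illustration of what an organ/anatomic inventory might look like as data, the sketch below stores organ status separately from any sex or gender field and makes a screening decision from the inventory alone. Every name here (`PatientRecord`, `OrganStatus`, `cervical_screening_prompt`, the field names, and the returned strings) is hypothetical and does not correspond to a real EHR schema or clinical guideline. The cervix example simply mirrors the scenario from the introduction.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Dict, Optional


class OrganStatus(Enum):
    PRESENT = "present"
    ABSENT = "absent"
    UNKNOWN = "unknown"


@dataclass
class PatientRecord:
    # All field names are hypothetical, not taken from any real EHR schema.
    gender_identity: Optional[str] = None        # self-reported
    sex_assigned_at_birth: Optional[str] = None  # recorded, but not used as a proxy for anatomy
    administrative_sex: Optional[str] = None     # e.g., the marker on file with an insurer
    organ_inventory: Dict[str, OrganStatus] = field(default_factory=dict)
    inventory_confirmed_on: Optional[date] = None  # supports "confirm or update as needed"

    def organ_status(self, organ: str) -> OrganStatus:
        """Return the recorded status, or UNKNOWN when nothing was recorded."""
        return self.organ_inventory.get(organ, OrganStatus.UNKNOWN)


def cervical_screening_prompt(record: PatientRecord) -> str:
    """Decide whether a screening prompt applies from the inventory only,
    never from a sex or gender field."""
    status = record.organ_status("cervix")
    if status is OrganStatus.PRESENT:
        return "applicable"
    if status is OrganStatus.ABSENT:
        return "not applicable"
    return "ask patient"
```

In this design an unrecorded organ resolves to `UNKNOWN` rather than being inferred from another field, so the fallback is a question to the patient instead of an assumption.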

Machine learning, EHRs, and sex/gender

The use of EHRs for machine learning

EHR data have been used by machine learning researchers in a variety of contexts, including for retrospective analysis to better understand health phenomena, to identify patients for potential treatments or interventions, and/or to guide clinical care in real time. While training data for medical machine learning does not have to come from EHRs, collecting information at scale makes EHR data especially appealing for machine learning purposes. Additionally, whether a dataset is derived from EHR data or not, the way that sex and gender identity are encoded in EHR systems is consistent with broader trends about how these data tend to be gathered in other contexts, and many potential improvements to EHR systems discussed here would also serve as improvements in those other contexts. When considering applications of medical machine learning, we can divide types of machine learning into those that generate sex- and gender-identity-related data (for example, the use of natural language processing to identify transgender patients) and those that use previously collected sex, gender, and sexual orientation data. In the following sections, we will focus on the use of sex- and gender-identity-related data.

Use of sex/gender in medical machine learning

In this section, we discuss a few examples of the ways that sex/gender slippage, sex confusion, and sex obsession manifest in machine learning research. Our exploration of medical machine learning more generally reveals that authors include so little information about where the sex parameter came from or what it does in their analysis that sex confusion is all but certain. We show how, when medical machine learning researchers create or use datasets that lean on EHR data with limited information about sex/gender, they embed any assumptions about sex/gender that are present in those data and potentially introduce their own. Then, in the next section, we present a case study that takes a closer look at HIV and prescription of pre-exposure prophylaxis (PrEP). We chose this area because one of the inaugural predictive studies included transgender people explicitly, but subsequent studies still demonstrate sex obsession, despite the fact that HIV prevention is an area where one might expect a much more nuanced understanding of sex/gender. As part of this work, we used Google Scholar and snowball sampling to select 28 medical machine learning articles published since 2016 that involve EHR data from adults and mention the words “sex” and/or “gender.” We analyzed which sex/gender variables each article used and what (if any) analysis they performed using these data (for a list of papers, please see the supplemental appendix). Almost all the papers we reviewed use a binary male/female variable from the EHR system. Most studies also did not meaningfully distinguish between sex and gender, often using the terms interchangeably in ways that represent both slippage and confusion. For example, Ancochea et al. used the unstructured free text fields from the EHR of a hospital system in Spain to look at sex differences in management and treatment for COVID-19.
The paper seems to pull sex data from the structured part of the EHR, but the results and discussion theorize a number of different potential explanations for differences between male and female patients. Some vary based on gender (e.g., women being more likely to have primary caregiving responsibilities), and some vary based on things like estrogen levels. Sex/gender slippage and presumption of concordance can have significant negative consequences in areas where transgender people face higher risks than cisgender people, like mental health problems. For example, Walsh et al. used EHR data to analyze suicide risk, incorporating gender data because “demographics such as age and gender are known risk factors for suicidal behavior.” In the study, gender was found to be a significant factor in prediction of suicide attempts. However, given that that paper was published in 2017 and Vanderbilt Medical system did not start collecting gender identity data until that year, it seems difficult to imagine that gender information was actually used. What seems more likely is that studies like Walsh et al. that study gender with male/female as options (and sometimes unknown or did not disclose) are actually using the sex variable from the EHR and calling it gender, engaging in sex/gender slippage throughout the work. Such a slippage is especially significant for papers like Walsh et al. because gender identity and gendered experiences likely do matter for suicidality, as transgender and gender-nonconforming people have significantly higher risk of suicide. A failure to understand and incorporate sex/gender may lead to inaccurate conclusions about suicide attempts as a gendered phenomenon and may specifically fail to recognize patterns around transgender suicidality. Sex confusion is most concerning in circumstances where machine learning is used to directly guide clinical care. For example, Lundberg et al. 
use explainable machine learning (ML) models to predict the risk of hypoxaemia during surgery, allowing anesthesiologists to adjust administration of anesthesia accordingly. In general, explainable models that have been thoroughly vetted and reviewed by working clinicians, like the one produced by Lundberg et al., reduce some of the risks that can be involved in black box models where sex/gender may end up playing a significant role as a feature without the knowledge of researchers, doctors, or patients. However, even this study only uses a binary sex variable and does not discuss any potential sex differences (let alone differences that incorporate gender). Additionally, this study incorporates data from an external anesthesia-monitoring system and other hardware that could have their own embedded sex/gender-related biases. Sex/gender may also be relevant if related laboratory testing or vital signs use sex/gender-based reference intervals or sex/gender offsets. If the model is built such that the EHR sex field is the relevant one for determining such thresholds/offsets, it could be wrong for whole subsections of the population without any clinician being the wiser. To our knowledge, none of the key features in Lundberg et al. relies on backend sex/gender offsets. However, it is only the interpretability of their algorithm that allows us to analyze it. To the extent that ML systems may at some point be used to produce so-called “dimensionless parameters,” the failure to specify how questions of sex/gender might affect these numbers could have deleterious effects on transgender patients. The closer that ML based on EHR data gets to prescribing clinical care without allowing for a care team to determine whether the assumptions made as part of algorithmic development are correct for the patient, the more significant the risks are to individual patients who may “deviate” from the expectations of the people who built the systems.
A more robust analysis of what the sex variable means in this context could allow for a more detailed analysis of how these variables interact with the population in question and support further research. Unfortunately, few of the papers we reviewed consider model performance based on sex/gender, and those that do did so only because their data were taken from a center that serves a significant population of transgender people. Even analyses that do consider sex or gender identity still rely on those variables to “stand in” for many underlying variables that could matter depending on the research question. As ML is likely to be used in the future to assist in “phenotyping” as a means of identifying patients for further research and treatment, a more robust understanding of sex/gender is needed to ensure that transgender people are not left out of research and helpful medical care.
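Considering model performance by sex/gender, as discussed above, can be as simple as computing a metric separately for each recorded group. The following is a minimal sketch under stated assumptions: the function name `recall_by_group` and the row layout are hypothetical, recall is chosen only as an example metric, and the group label is whatever the dataset actually records, which may itself be subject to sex confusion. It is not the evaluation method of any paper reviewed here.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple


def recall_by_group(rows: Iterable[Tuple[Hashable, int, int]]) -> Dict[Hashable, float]:
    """Compute recall (true positives / actual positives) per group.

    Each row is (group label, true label, predicted label), with labels 0 or 1.
    Groups that have no positive cases are omitted rather than reported as 0.
    """
    true_positives = defaultdict(int)
    positives = defaultdict(int)
    for group, y_true, y_pred in rows:
        if y_true == 1:
            positives[group] += 1
            if y_pred == 1:
                true_positives[group] += 1
    return {group: true_positives[group] / positives[group] for group in positives}
```

A large gap between groups, or a group that is missing from the result because it has too few positive cases, is a signal that the variable and the sample need closer inspection before any conclusion about sex/gender differences is drawn.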

Case study: HIV and PrEP

Prediction of HIV risk for the purposes of administering pre-exposure prophylaxis (PrEP, medicine that can reduce the risk of HIV infection) has been a common topic among researchers for decades. Recent work has used ML to develop prediction models of patient risk of HIV infection [95, 96, 97, 98]. This area is notable because it contains the only papers we reviewed that use sex data that are not binary male/female, as well as because researchers who work on HIV prevention are often explicitly focused on LGBTQ+ populations. Krakower et al., the earliest work we reviewed that suggests using ML to predict HIV risk and guide PrEP distribution, does not use a binary sex variable, as the EHR data used were from Fenway Health, an organization specializing in LGBTQ healthcare. However, it is unclear which of the 168 variables extracted from the EHR system were most predictive of HIV risk, so it is difficult to draw any conclusions about how sex/gender impacted model predictions. Of course, there is still much to improve in the study’s representation of sex data, as it replaces a binary with three options: male, female, and transgender; as we discuss in the section 2010s to present: A decade of sex confusion and sex obsession, this is not best practice. The authors of that work later built on their original poster to publish a full-length paper using two different sets of EHR data, one from Fenway, which does not employ a binary sex field, and one from a different local practice that used a binary sex field. Notably, the appendix table describing these data calls the combined variable “sex or gender,” and the authors use both words to describe these data in the paper. The Krakower et al. study also extracted EHR information for “trans-sexualism” and for “gender identity disorder” (based on diagnosis codes), although neither of these variables was included in the final model. 
Most of the ML studies that predict risk of HIV infection begin with hundreds of variables and then narrow to a smaller number that can be used without significant sacrifices to predictive power, using methods like least absolute shrinkage and selection operator (LASSO). What this means in practice is that almost all of the ML models to predict HIV risk that we reviewed rely heavily on an “M” sex marker as a predictive factor. Because it is unclear what that variable means, both in general and especially for any given patient, sex confusion creates quite significant problems for HIV prediction algorithms. HIV is often transmitted through sexual activity, where questions of sex/gender play a large role. It might be tempting to reduce risk variance in HIV transmission to either sex assigned at birth or genitals, but such a framing is inaccurate. A 2017 discussion paper from Stardust et al. highlights a number of potential differences in how transgender people might experience risk related to HIV transmission compared with cisgender people. Potential differences include the differential properties of cis women’s vaginas and transfeminine people’s neo-vaginas and potential effects of testosterone on genital lubrication for transmasculine people with vaginas. And, of course, sex practices themselves are highly gendered. Slippages and confusion continue elsewhere. Another study used natural language processing to analyze unstructured EHR data and incorporated sex data but called it gender, demonstrating sex/gender slippage. Marcus et al. also used EHR sex as an input variable, specifically whether a patient had a male sex marker, as well as a variable for “transgender-related diagnoses.” The combination of these variables could create a fascinating opportunity to better understand HIV risks for some transgender people, but the paper does not deliver on this promise. 
The authors note that the algorithm produced “did not identify cases among women.” Because what the authors have is sex data, which may or may not correspond to gender, it is impossible to know if some of the cases identified were patients who were women. Nor does the existence of a variable for transgender-related diagnosis counteract that issue because without more information on how exactly people were identified in terms of gender, the presence of a diagnosis code for gender identity disorder does not tell you whether a particular patient is a trans woman or a trans man, let alone whether they were non-binary. The algorithms that the HIV risk prediction papers produced show the intertwined relationship between sex confusion and sex obsession, combining a failure to deeply engage with the construction of the sex/gender data that are present, with the assurance that sex data are the right type to use. And in the context of HIV transmission, a focus on a presumed set of body parts rather than the nature of gender expression and sexual practices is likely to produce predictive models that fail for those on the margins.
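
The variable-narrowing step described in this case study can be made concrete with a small sketch. The code below is not taken from any of the reviewed papers: it is a minimal coordinate-descent LASSO for a linear model (the reviewed studies predict a binary outcome, so this is a simplification), run on four invented rows. The feature names `sex_marker_M` and `other_feature`, the data, and the penalty value are all assumptions made for illustration. It shows how an L1 penalty can drop every feature except a single binary marker, at which point the model's behavior depends entirely on whatever that marker happens to mean for each patient.

```python
# Toy, synthetic data: column 0 is a hypothetical binary "M" sex marker,
# column 1 is a hypothetical second feature. All values are invented.
FEATURE_NAMES = ["sex_marker_M", "other_feature"]
ROWS = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
OUTCOMES = [2.0, 0.0, 2.0, 0.0]


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(rows, outcomes, penalty, sweeps=100):
    """Minimize (1/2n) * squared error + penalty * sum(|coef|), no intercept."""
    n, p = len(rows), len(rows[0])
    coefs = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for x, y in zip(rows, outcomes):
                partial = sum(x[k] * coefs[k] for k in range(p) if k != j)
                rho += x[j] * (y - partial)
                z += x[j] * x[j]
            rho /= n
            z /= n
            coefs[j] = soft_threshold(rho, penalty) / z if z > 0 else 0.0
    return coefs


def selected_features(coefs, names):
    """Names of the features whose coefficient survived the penalty."""
    return [name for name, c in zip(names, coefs) if c != 0.0]


coefs = lasso_coordinate_descent(ROWS, OUTCOMES, penalty=0.1)
kept = selected_features(coefs, FEATURE_NAMES)
print(kept)  # prints ['sex_marker_M']
```

Nothing in the selection step records what the surviving marker represents (sex assigned at birth, legal sex, an insurance field, or something else), which is why the documentation and auditing steps recommended later in this piece matter.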

Additional challenges for ML researchers using EHRs

Our trio of sex troubles is not the only problem with sex/gender and medical ML on EHR data. EHR systems are not designed first for research purposes, and researchers will face challenges related to data quality and validation, lack of complete data capture, heterogeneity among systems, and a lack of system knowledge. With respect to sex/gender specifically, there are a few challenges ML researchers leveraging EHR data should be aware of. First off, collection of gender identity information by healthcare systems and other sources is inconsistent at best. Research from Reisner et al. suggests that there is a trade-off between the number of patients/individuals and the availability of gender identity information, as many large databases used for population-level studies (e.g., datasets using driver’s licenses for demographics) do not include gender identity. Even in healthcare systems considered to be leading in this area, gender identity data may be missing for upwards of 75% of patients. When the data are present, information about how the data were collected may not be available (e.g., self-reported, taken from health insurance records, written down by intake staff, etc.). This can lead to inconsistencies between reported sex and gender information that require manual review to address. Transgender people may also be underrepresented in many datasets due to lack of disclosure and disenfranchisement. The availability of gender identity and other sex/gender-related variables has likely shifted and will continue to shift over time, which can pose challenges for analysis. Additionally, gender identity itself can change over time.
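
A first-pass check for the missingness and provenance problems described above might look like the following sketch. The field names (`ehr_sex`, `gender_identity`, `gi_source`), the source labels, and the four records are hypothetical and invented for illustration; a real EHR extract will use different names and codes. Note that the check deliberately does not treat a difference between the sex field and the gender identity field as an error, since the two need not be concordant. It only flags records whose gender identity is absent or whose collection method is undocumented or not self-reported.

```python
# Values treated as "no information"; an assumption that depends on the extract.
MISSING = {None, "", "unknown"}

# Invented example records with hypothetical field names.
RECORDS = [
    {"patient_id": "p1", "ehr_sex": "F", "gender_identity": "woman",
     "gi_source": "self_reported"},
    {"patient_id": "p2", "ehr_sex": "M", "gender_identity": None,
     "gi_source": None},
    {"patient_id": "p3", "ehr_sex": "M", "gender_identity": "woman",
     "gi_source": "intake_staff"},
    {"patient_id": "p4", "ehr_sex": "F", "gender_identity": "",
     "gi_source": None},
]


def is_missing(value):
    """True when a field carries no usable information."""
    return value in MISSING


def missing_fraction(records, field_name):
    """Share of records in which field_name is missing."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if is_missing(r.get(field_name)))
    return missing / len(records)


def review_reasons(record):
    """Reasons a record should go to manual review; empty list means none."""
    if is_missing(record.get("gender_identity")):
        return ["gender_identity_missing"]
    source = record.get("gi_source")
    if is_missing(source):
        return ["collection_method_undocumented"]
    if source != "self_reported":
        return ["not_self_reported"]
    return []


gi_missing = missing_fraction(RECORDS, "gender_identity")
review_queue = {r["patient_id"]: review_reasons(r)
                for r in RECORDS if review_reasons(r)}
print(gi_missing)
print(review_queue)
```

Reporting the missing fraction alongside results, rather than silently dropping incomplete records, makes it easier for readers to judge who the model was and was not built on.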

Recommendations

While assumptions about sex/gender are deeply embedded in EHR systems and medical ML, we believe that there are concrete steps ML researchers can take to better incorporate sex/gender in their work. In this section, we provide recommendations and further reading to help researchers better integrate sex/gender information into their research. We also discuss increasing data richness in more detail below. Unfortunately, even some of the papers we recommend here that are otherwise helpful resources may occasionally exhibit sex/gender slippage or overemphasize sex differences; we recommend keeping the concepts of sex/gender slippage, sex confusion, and sex obsession in mind as you interpret their recommendations.
(1) Educate yourself about sex/gender and the experiences of transgender people inside and outside medical ML contexts [102, 103, 104, 105].
(2) Work in teams with a range of competencies in addition to ML, including clinicians, patients, and EHR data experts.
(3) Determine whether and which sex/gender data are relevant to the research, focusing on increasing data richness, and accounting for sex obsession.
(4) Document use of sex/gender variables clearly, including specific fields used, how they were gathered, how usage of those fields may have changed over time, and possible values.
(5) Conduct data quality checks for sex/gender data and have strategies for handling missing data.
(6) Consider bias from sources other than sex and gender-identity fields.
(7) Audit model performance for subgroups without presuming or essentializing differences [108, 109, 110, 111].
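
Recommendation (7) can be sketched as a per-subgroup sensitivity report. This is one possible shape for such an audit, not a prescribed method: the group labels, the toy predictions, and the `MIN_GROUP_SIZE` threshold are all assumptions made for illustration. Subgroups with too few positive cases are reported with their count and no estimate, so that they stay visible in the output instead of being dropped or summarized by an unstable number.

```python
from collections import defaultdict

# Hypothetical minimum number of positive cases before reporting an estimate.
MIN_GROUP_SIZE = 3


def sensitivity_by_group(rows, min_group_size=MIN_GROUP_SIZE):
    """rows: iterable of (group_label, true_label, predicted_label), labels 0/1."""
    positives = defaultdict(int)  # true positive cases seen per group
    hits = defaultdict(int)       # of those, how many the model flagged
    for group, truth, predicted in rows:
        if truth == 1:
            positives[group] += 1
            if predicted == 1:
                hits[group] += 1
    report = {}
    for group, n_positive in positives.items():
        if n_positive < min_group_size:
            # Keep the group visible, but do not report an unstable estimate.
            report[group] = {"n_positive": n_positive, "sensitivity": None}
        else:
            report[group] = {"n_positive": n_positive,
                             "sensitivity": hits[group] / n_positive}
    return report


# Invented predictions; group labels stand in for self-reported gender identity.
TOY_ROWS = [
    ("cis man", 1, 1), ("cis man", 1, 1), ("cis man", 1, 1), ("cis man", 1, 0),
    ("trans woman", 1, 1), ("trans woman", 1, 0), ("trans woman", 1, 0),
    ("non-binary", 1, 0),
    ("cis man", 0, 0), ("trans woman", 0, 1),
]

audit = sensitivity_by_group(TOY_ROWS)
for group, stats in audit.items():
    print(group, stats)
```

A gap between subgroups in a report like this is a prompt to investigate data collection and feature construction; it is not by itself evidence of an underlying difference between the groups.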

Avoid sex obsession by increasing data richness

We recommend focusing on data richness whenever possible instead of relying primarily on binary sex assigned at birth as a proxy for other variables. Data richness is a term from Hripcsak and Albers that involves moving “beyond the binary presence or absence of a condition to timing, degree, severity, cause, and relationship to factors like behavior, etc.” Hripcsak and Albers use diabetes as a case study, but similar ideas can also apply to conceptions of sex/gender. With “sex” (i.e., physiological variables), focusing on data richness includes considering using and consulting organ/anatomic inventories, especially ones that have information that accounts for the physiology of transgender people (for an in-depth discussion about implementation, see Kronk et al.). It also involves considering other potentially “sexed” variables like hormone levels, laboratory test results, and genomic data. With “gender” (i.e., sociocultural factors that may influence physiology), researchers can focus on selecting variables or indices related to gender identity, relations, roles, and institutionalized gender. Care should be taken not to be exclusive of transgender people when considering gender. We do not recommend, for example, the use of sex assigned at birth as a dependent variable to evaluate the creation of gender indices (see, e.g., Pelletier et al.). A focus on data richness is also a chance to consider the intersections between sex/gender and other demographic factors like race. Focusing on data richness will likely mean additional work in generating datasets rather than relying on existing ones that lack necessary sex/gender-related information. One possibility to address this is to choose representative subsets of patients to survey for additional information. However, even if it is additional work, we want to emphasize that this work is necessary as existing systems rely on flawed assumptions related to sex/gender that do not serve anyone.
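
To illustrate the difference between a sex marker used as a proxy and a richer record, the sketch below compares the two for a single anatomical question. The `PatientRecord` structure, its field names, the organ labels, and the three patients are hypothetical; they are not drawn from any EHR standard or from the implementation discussed by Kronk et al., and the example is not clinical guidance. It only shows how often a proxy rule and an organ inventory can disagree, and for whom.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PatientRecord:
    """Hypothetical record carrying more than a single sex marker."""
    patient_id: str
    ehr_sex_marker: str                      # whatever the "sex" field holds
    gender_identity: Optional[str] = None    # self-reported, if collected
    organ_inventory: frozenset = frozenset() # organs documented as present


def proxy_has_organ(record, marker_value="F"):
    """Proxy rule: infer anatomy from the sex marker alone."""
    return record.ehr_sex_marker == marker_value


def inventory_has_organ(record, organ):
    """Direct rule: consult the documented organ inventory."""
    return organ in record.organ_inventory


def proxy_disagreements(records, organ, marker_value="F"):
    """IDs of patients for whom the proxy and the inventory disagree."""
    return [r.patient_id for r in records
            if proxy_has_organ(r, marker_value) != inventory_has_organ(r, organ)]


# Invented patients for illustration only.
PATIENTS = [
    PatientRecord("p1", "F", "woman", frozenset({"cervix", "uterus"})),
    PatientRecord("p2", "M", "man", frozenset({"cervix", "uterus", "ovaries"})),
    PatientRecord("p3", "F", "woman", frozenset({"prostate"})),
]

mismatched = proxy_disagreements(PATIENTS, "cervix")
print(mismatched)  # prints ['p2', 'p3']
```

Where an inventory is unavailable, counting how many records would need one to answer the research question is itself a useful measure of how much the analysis leans on the proxy.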

Conclusion

Machine learning is only as good as the data it is built on, and in many medical contexts, that means EHRs. EHR systems were originally built with the assumption that sex and gender are binary, static, and concordant, translating to a number of common issues when data from them are used, including sex/gender slippage, sex confusion, and sex obsession. Many of the problems we have described are not unique to the phenomenon of sex/gender or to medical machine learning contexts. However, even researchers who are relatively savvy and thoughtful about the origins of their data and its (dis)contents, to borrow the phrasing of Paullada et al., can make the mistake of thinking that sex is one of the things in a medical dataset that does not need further unpacking. Sex is more like “family history” than it is like “age.” The same can, to some extent, be said for race and ethnicity. The nature of modern machine learning systems and the lack of interpretability to clinicians can make the dynamics described above even more concerning. As machine learning researchers look to the low-hanging fruit of EHR data, it is vital that they pay full attention to the contextual knowledge required to avoid sex trouble.

59 in total

1. Inclusion of Sexual Orientation and Gender Identity in Stage 3 Meaningful Use Guidelines: A Huge Step Forward for LGBT Health.

Authors: Sean R Cahill; Kellan Baker; Madeline B Deutsch; Joanne Keatley; Harvey J Makadon
Journal: LGBT Health Date: 2015-12-24 Impact factor: 4.151

2. Beyond a catalogue of differences: a theoretical frame and good practice guidelines for researching sex/gender in human health.

Authors: Kristen W Springer; Jeanne Mager Stellman; Rebecca M Jordan-Young
Journal: Soc Sci Med Date: 2011-06-15 Impact factor: 4.634

3. Providing Inclusive Care for Transgender Patients: Capturing Sex and Gender in the Electronic Medical Record.

Authors: Khushbu Patel; Martha E Lyon; Hung S Luu
Journal: J Appl Lab Med Date: 2020-12-17

4. Evolving Sex and Gender in Electronic Health Records.

Authors: Claire Burgess; Michael R Kauth; Caroline Klemt; Hasan Shanawani; Jillian C Shipherd
Journal: Fed Pract Date: 2019-06

5. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery.

Authors: Scott M Lundberg; Bala Nair; Monica S Vavilala; Mayumi Horibe; Michael J Eisses; Trevor Adams; David E Liston; Daniel King-Wai Low; Shu-Fang Newman; Jerry Kim; Su-In Lee
Journal: Nat Biomed Eng Date: 2018-10-10 Impact factor: 25.671

6. Electronic Health Records as Biased Tools or Tools Against Bias: A Conceptual Model.

Authors: Michael D Rozier; Kavita K Patel; Dori A Cross
Journal: Milbank Q Date: 2021-11-23 Impact factor: 4.911

7. Ethical Machine Learning in Healthcare.

Authors: Irene Y Chen; Emma Pierson; Sherri Rose; Shalmali Joshi; Kadija Ferryman; Marzyeh Ghassemi
Journal: Annu Rev Biomed Data Sci Date: 2021-05-06

8. Approach to Interpreting Common Laboratory Pathology Tests in Transgender Individuals.

Authors: Ada S Cheung; Hui Yin Lim; Teddy Cook; Sav Zwickl; Ariel Ginger; Cherie Chiang; Jeffrey D Zajac
Journal: J Clin Endocrinol Metab Date: 2021-03-08 Impact factor: 5.958

9. The Imperative for Transgender and Gender Nonbinary Inclusion: Beyond Women's Health.

Authors: Heidi Moseson; Noah Zazanis; Eli Goldberg; Laura Fix; Mary Durden; Ari Stoeffler; Jen Hastings; Lyndon Cudlitz; Bori Lesser-Lee; Laz Letcher; Aneidys Reyes; Juno Obedin-Maliver
Journal: Obstet Gynecol Date: 2020-05 Impact factor: 7.623

10. What Sexual and Gender Minority People Want Researchers to Know About Sexual Orientation and Gender Identity Questions: A Qualitative Study.

Authors: Leslie W Suen; Mitchell R Lunn; Katie Katuzny; Sacha Finn; Laura Duncan; Jae Sevelius; Annesa Flentje; Matthew R Capriotti; Micah E Lubensky; Carolyn Hunt; Shannon Weber; Kirsten Bibbins-Domingo; Juno Obedin-Maliver
Journal: Arch Sex Behav Date: 2020-09-01