Context

Infant race and ethnicity are used ubiquitously in research and reporting, though inconsistent approaches to data collection and definitions yield variable results. The consistency of these data has an impact on reported findings and outcomes.

Objective

To systematically review and examine concordance among differing race and ethnicity data collection techniques presented in perinatal health care literature.

Data Sources

PubMed, CINAHL, and Ovid were searched on June 17, 2021.

Study Selection

English language articles published between 1980 and 2021 were included if they reported on the United States’ infant population and compared 2 or more methods of capturing race and/or ethnicity.

Data Extraction

Two authors independently evaluated articles for inclusion and quality, with disagreements resolved by a third reviewer.

Results

Our initial search identified 4329 unique citations. Forty articles passed title/abstract review and were reviewed in full text. Nineteen were considered relevant and assessed for quality and bias, from which 12 studies were ultimately included. Discordance in infant race and ethnicity data was common among multiple data collection methods, including those frequently used in perinatal health outcomes research. Infants of color and those born to racially and/or ethnically discordant parents were the most likely to be misclassified across data sources.

Limitations

Studies were heterogeneous in methodology and study populations, so data could not be pooled for analysis.

Conclusions

Racial and ethnic misclassification of infants leads to inaccurate measurement and reporting of infant morbidity and mortality, often underestimating burden in minoritized populations while overestimating it in the non-Hispanic/Latinx white population.

In the United States, the historical purpose of racial and ethnic categorization was an explicit and implicit effort to establish and perpetuate a hierarchy in which certain groups were advantaged over others.1  Resultant structural and institutional racism has led to inequities across all societal arenas, including health and health care.1  Scientific reporting has historically described racial and ethnic disparities in health outcomes as resulting from biological differences between groups. However, empirical evidence has shown more genetic diversity within racial groups than among them, demonstrating that racial categories are constructed to serve social, political, and cultural aims.2 

Although racial and ethnic categorization has its roots in perpetuating inequity, its current consideration in health care and public health settings is (ideally) for purposes of measuring and understanding disparities between and across populations so that inequitable practices and outcomes can be addressed. In this systematic review, we adopted the widely accepted definitions of health disparity and health inequity in which a health disparity is “a particular type of health difference that is closely linked with economic, social, or environmental disadvantage.”3,4  Health equity, as defined by Dr Paula Braveman, “means social justice in health (ie, no one is denied the possibility to be healthy for belonging to a group that has historically been economically/socially disadvantaged)” and “health disparities are the metric we use to measure progress toward achieving health equity.”4 

Vital statistics data and hospital-based medical records are used to identify and measure disparities in health care receipt and outcomes by race and ethnicity for population health reporting and research of the perinatal infant and birthing person population. The appropriate capture of these data is imperative to accurately report disparities and make progress toward health equity.

Several barriers to accurate reporting of perinatal health outcomes by race and ethnicity exist. Missing race and ethnicity data with subsequent exclusion of these individuals from reporting and research may lead to miscalculation of perinatal health disparities. The common practice of excluding those with missing race and/or ethnicity data from analysis, known as Complete Case Analysis, functions under the assumption that these data are missing at random. However, racial and ethnic minorities make up the majority of those with missing data.5  This practice leads to systematically excluding minoritized groups from research, resulting in inaccurate findings and conclusions, and an underestimation of health disparities.

The gold-standard method of collecting race and ethnicity data is self-report.2  This is complicated when considering the infant population, who are unable to self-report and for whom there is a lack of uniform policies and practices regarding collection of these data. In general, for U.S.-born infants, birth and death certificates are the most common way to collect infant race and ethnicity data and report them to vital records. Both are critical in calculating and reporting population-level perinatal health outcomes such as infant mortality; however, these data have limitations.

Government records (eg, standard U.S. birth and death certificates) have changed the way race and ethnicity are captured over time.1,6  Currently, infant race and ethnicity are not directly recorded on the standard birth certificate but are derived in central databases from the birthing parent’s self-reported identification.6  This practice excludes nonbirthing parents whose race and/or ethnicity is different from that of the birthing parent as well as nonbiological parents who may be listed on birth certificates in cases of same-sex parents or surrogacy. Race and/or ethnicity reporting practices on death certificates are also problematic. National guidelines recommend that funeral directors ask deceased individuals’ family members or friends to describe the individual’s race or indicate that no one is available to provide this information.7  However, adherence to these recommendations is not required and approximately one-half of funeral directors receive no training on this approach.7  Forty-three percent of funeral directors indicate they “sometimes” or “often” determine a deceased individual’s race by personal observation.8 

If data reflecting infant race and ethnicity are inaccurate or inconsistent across sources, the metrics used to assess overall infant population health and disparities are likely incorrect. Without proper policies and procedures in place ensuring the appropriate representation of infants in our data systems, we could be missing the detection of even more inequities than we currently estimate.

Given both the importance of collecting accurate race and ethnicity data as well as the tremendous variability in approaches, the objectives of this systematic review were to (1) identify methods for capturing and recording infant race and ethnicity in health care and public health systems and (2) assess concordance between approaches.

We searched PubMed, Ovid, and CINAHL on June 17, 2021. The search strategy was developed by a pediatric medical librarian for PubMed and transcribed for the other health care–focused literature databases and included articles published from 1980 through the date of search. Search terms were developed to capture methods of race and ethnicity data collection and included: “population surveillance,” “data collection,” “clinical coding,” “format and records control,” “nursing records,” “medical records,” “hospital records,” “birth certificates,” “death certificates,” “electronic health records,” “vital statistics,” “censuses,” “hospital information systems,” “Asian American,” “African American,” “Hispanic,” “Latino,” “Caucasian,” “race,” “ethnicity,” “population groups,” “ethnic groups.”

Publications from 1980 through 2021 were included if they (1) assessed an infant population, (2) reported on a U.S. population, (3) were published in English, and (4) compared 2 or more approaches of capturing race and/or ethnicity data.

Our study screening process used a multidisciplinary team including experts in medicine (pediatrics and neonatology), social work, library sciences, and epidemiology/analytics. Publications were deduplicated. Title/abstract screenings and full-text reviews were completed by B.W. and 1 other reviewer, with conflicts resolved by a third reviewer of a different discipline when necessary. Each study considered for inclusion after full-text review was assessed for quality and bias with 2 reviewers of different disciplines completing a Joanna Briggs Institute’s Critical Appraisal Tools Checklist.9  When conflicts arose, the decision was discussed with the entire screening team until consensus was reached. All included studies were determined to have low risk of bias. Interrater reliability for the first 2 stages was measured with Cohen κ.
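For readers unfamiliar with the interrater statistic named above, the following is a minimal sketch of how Cohen κ can be computed for two reviewers' screening decisions. The function name `cohen_kappa` and the include/exclude decisions shown are hypothetical and for illustration only; they are not the review's screening data, and the review does not describe how its κ values were computed.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical decisions on the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of
    agreement and p_e is the agreement expected by chance, computed from each
    rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must rate the same, non-empty set of items.")

    n = len(rater_a)
    # Observed agreement: share of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: sum over categories of the product of marginal shares.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical category; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical title/abstract decisions for 10 citations (illustration only).
reviewer_1 = ["include", "include", "exclude", "exclude", "exclude",
              "exclude", "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude", "exclude",
              "exclude", "include", "include", "exclude", "exclude"]

kappa = cohen_kappa(reviewer_1, reviewer_2)
print(f"kappa = {kappa:.2f}")
```

In this hypothetical example the reviewers agree on 8 of 10 citations, but because most citations are excluded by both, much of that agreement is expected by chance, so κ is considerably lower than the raw agreement.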

Data were extracted by B.W. and one other coauthor using a data extraction form (Supplemental Information) developed by the research team. Data extracted included study details (funding sources, authors, institutions), methods (study design, data sources, sampling procedure, analytical approach), population of study (setting, inclusion/exclusion criteria), and outcomes. When possible, raw data were extracted from studies by B.W. to conduct independent descriptive statistical analysis.

The study periods of included articles spanned a time during which revisions were made to the U.S. standard birth and death certificates, including variations in terms used to describe racial and ethnic groups. We could not completely map older terminology (eg, Indian) onto current racial groups (eg, American Indian/Alaska Native, Native Hawaiian/Other Pacific Islander). Additionally, older versions of birth and death certificates included nationalities as distinct racial categories (eg, Chinese, Japanese, Filipino) and some studies explored the concordance of these specific groups between data sources, which varies from current practice. To accurately represent study conclusions, original language of racial and ethnic categories is used throughout this manuscript.

We categorized studies based on the racial and/or ethnic data collection approaches compared, resulting in 4 categories: studies comparing birth certificates to (1) death certificates (n = 5), (2) self-reported surveys (n = 3), (3) government/health administrative records (n = 3), and (4) alternative race and/or ethnicity classification algorithms (n = 4). Three studies conducted more than 1 of these comparisons.

Data were not combined because of methodological and statistical heterogeneity between studies and overlapping study populations.

The screening process was managed with Covidence.10  All supporting data analyses were conducted in SAS 9.4 (Cary, NC).

The flow of study identification and inclusion is presented in Figure 1. Overall, 4329 unique articles were identified through searches of PubMed, CINAHL, and Ovid. Of these, 12 studies were relevant to our objectives. The average Cohen κ for the abstract/title review was 0.29, and for full-text review was 0.34. All 12 assessed articles were included with full team consensus.

FIGURE 1

PRISMA flow diagram.


Details for each study are summarized in Table 1. Across the 12 included studies, more than 116 million infants born over 46 years were analyzed. Two studies included the entire U.S. population, and 10 studies presented data from 6 states (California, Florida, New Jersey, North Carolina, Oklahoma, and Washington). All studies included analysis of birth certificate data. A timeline of relevant changes to the U.S. standard birth certificate is presented in Figure 2.

FIGURE 2

Timeline of relevant changes to the US standard birth certificate.

TABLE 1

Summary of Studies

Author(s), Publication Year | Population | Data Sources and Race and/or Ethnicity Definitions | Methods | Results
Floyd Frost and Kirkwood K. Shy, 1980 Infants born to Washington state residents during 1968–1977 who died within 1 y and had linkable birth-death certificates
(n = 8390) 
1. Birth certificates
• 1978 NCHS algorithm^a
• Infant race determined from recorded parental races
2. Death certificates
• 1978 NCHS algorithm^a
• Infant race recorded as single entry; no parental demographics 
Analyzed race-specific infant mortality by race on birth certificate and race on death certificate
Cross-tabulations of infant deaths by:
1. Infant race at death with infant race at birth
2. Race of mother by race of father
Racial discordance assessed by age at infant death, 5-y interval of death, and cause of death 
• 4.2% (355) of infants had discordant races between birth and death certificates
• Calculating infant mortality with race on birth certificates resulted in increased nonwhite infant deaths (39% higher for Indians, 56% for Filipinos, 121% for Japanese, 117% for Chinese)
• White deaths decreased by 2.9% when coded by race at birth
• Discordance higher among infants with discordant-race parents (59% vs. 1%)
• Overall discordance among nonwhite infants increased from 48% to 56% between the two 5-y intervals of the study (P > .01)
• Cause of death not associated with rates of discordance 
Richard D. Kennedy and Roger E. Deapen, 1991 Infants born in Oklahoma during 1975–1988 who died within 1 y and had linkable birth-death certificates
(n = 7631) 
1. Birth certificates
• 1978 NCHS algorithm^a
2. Death certificates
• Infant race based on observation/information obtained by funeral director 
Cross-tabulation of race at birth vs race at death
• Indian born were misclassified at death 28% of the time (usually as white)
• Infants born white or Black had <1% chance of being misclassified at death
• Calculating Indian infant mortality from death certificates (5.8/1000) results in lower rates than when calculating from birth certificates (10.4/1000)
• Over study period, racial misclassification of Indians increased to a high of 49% 
Robert A. Hahn, Joseph Mulinare, and Steven M. Teutsch, 1992 Race analysis: All U.S. infants born 1983–1985 who died within 1 y
(n = 117 188)
Ethnicity analysis: U.S. infants born 1983–1985 in states that reported parental Hispanic origin on birth certificates in compliance with NCHS criteria
(n = 30 244) 
1. Birth certificates
Race:
• NCHS pre-1989 algorithm^b
• NCHS post-1989 algorithm^c
Ethnicity:
Infant ethnicity derived from maternal ethnicity
2. Death certificates
• Race and ethnicity determined by funeral director based on information from kin/“knowledgeable party”
• Unknown race (0.2%) decedent assigned white if preceding record is white; otherwise assigned Black 
Cross-tabulation of race at birth vs race at death
Estimated infant mortality rates based on birth certificate and death certificate race classifications 
• 3.7% of all infants had discordant race at death vs birth
• 69.7% of infants had concordant ethnicity
• Racial discordance lowest for white (1.2%), 4.3% for Blacks, and highest (43.2%) for all other races
• 87.3% of infants with discordant race at death were classified as white
• Calculating infant mortality based on birth certificate results in 2.1% lower rate for whites, 3.2% increase for Blacks, 46.9% increase for American Indians, 33.3% increase for Chinese, 48.8% increase for Japanese, 78.7% increase for Filipinos, and 8.9% increase for Hispanics 
Donna O. Farley, Toni Richard, and Robert M. Bell, 1995 All live births in California 1985–1987
(n = 1 500 000)
Medical records abstracted from 1986 live California hospital births (n = 466 964) 
1. Death certificate
• Race and ethnicity classifications not defined
2. Birth certificate
• 6 collapsed racial and ethnic categories derived: white non-Hispanic, white Hispanic, African American, Asian, Native American, “Other”
Coding algorithms:
• Vital Statistics Rule^a derived from paternal ethnicity
• Mother dominant: maternal race dominant, white race secondary
• Hierarchical rule: equal wt to parental races, assigned by hierarchy of African American, Native American, Asian, white Hispanic, and white non-Hispanic
• All-mother rule: maternal race assigned unless missing
• All-father rule: paternal race assigned unless missing
3. Hospital discharge records
• Race and ethnicity classifications not defined 
Compared distributions of race and ethnicity across 3 data sources using several birth certificate racial coding algorithms
• Nonlinked 1986 comparison of birth certificates to hospital records
• Linked birth-death certificates for those who died within 1 y
• Analyzed infants of racially discordant parents to determine effect of coding algorithms on distribution of races and ethnicities
Examined rates of racial and ethnic discordance by birth weight and age at death 
Hospital records vs birth certificates
• Hospital records underreported all except white and unknown/other race
• Greatest underreporting for Native Americans (1.1% birth certificate, 0.2% hospital records)
• Whites were overrepresented in hospital data (46.9% white on birth certificates, 52.7% in hospital records)
Birth certificates vs death certificates
• 42.9% white at birth, 47.3% white at death
• Fewer Asian, Native American and “other” race at death
• Findings directionally consistent across all 6 infant race algorithms
• Discordance between birth and death race higher among discordant race parents 
Lisa Baumeister, Kristen Marchi, Michelle Pearl, Ronald Williams, and Paula Braveman, 2000 California mothers from 16 hospitals who delivered between August 1994 and July 1995 and agreed to be interviewed, spoke English or Spanish, were ≥15 y old/emancipated minor/self-sufficient, and not incarcerated at any time during pregnancy/birth
(n = 7428) 
1. Birth certificates
• 1990 U.S. OMB Directive 15^d
• Authors recreated race/ethnicity combined construct including African American, Asian/Pacific Islander, European/Middle Eastern, Latina, Native American, and Other
2. Face-to-face postpartum maternal interview
• “What racial or ethnic group do you consider yourself?” with responses converted to ≥1 of the 8 racial/ethnic categories used for birth certificate data
• Mothers asked to choose primary racial identification, if unable to do so coded as missing (n = 57) 
• Sensitivity: % of mothers in each racial/ethnic category in survey who were classified the same on birth certificate
• PPV: % of mothers classified as a race/ethnicity on birth certificate who were identified the same in interview
• Difference in proportions (%, 95% CI) 
• Sensitivity of birth certificate high (94%–99%) for African American, Asian/Pacific Islander, Europeans/Middle Easterners, and Latinas (Hispanics)
• Sensitivity 54% for Native Americans
• PPV high for all race/ethnic groups (96%–97%) 
Nancy E. Reichman and Erinn M. Hade, 2001 Singleton live births to New Jersey mothers enrolled in HealthStart (Medicaid only) 1989–1992
(n = 46 437) 
1. Birth certificate
• 1990 US OMB Directive 15^d
2. HealthStart program MSSD form
• Maternal self-report of race and ethnicity 
• Sensitivity: proportion of cases in MSSD that are correctly identified as same race on birth certificate
• Specificity: proportion of cases identified as not a race in the MSSD who are not assigned to that race in the birth certificate
• PPV: proportion of birth certificates identified as a race who actually belong to that race (per MSSD)
• NPV: proportion of individuals not identified as a race on the birth certificate who are truly not that race (per MSSD) 
• High concordance for Hispanic ethnicity and Black race (91%–98%)
• Hispanic ethnicity had 90.9% sensitivity, 97.2% specificity, 93.7% PPV, and 95.9% NPV
• Black race had 96.1% sensitivity, 97.6% specificity, 95.9% PPV, and 97.7% NPV 
Jennifer D. Parker and Jennifer H. Madan, 2002 All recorded US births 1968–1998
(n = 113 818 502)
US-born NHIS respondents from 1990 to 1998 with valid race data (n = 339 342) 
1. US natality data
• 1977 OMB-15 Standard^e
2. NHIS data
• Self-report coded to 1977 OMB-15 Standard^e
Compared racial distribution in birth data vs survey data by birth year
Calculated estimated race distribution for the NHIS using the age distribution of the survey respondents and their corresponding birth year–specific race proportions from natality files 
• Pre-1980, percentage of multirace births higher in natality files than NHIS
• Post-1980, percentage of multirace birth higher in NHIS than natality files
• Concordance among multirace groups was inconsistent over time between data sources 
Paul A. Buescher, Ziya Gizlice, and Kathleen A. Jones-Vessey, 2005 Live 2002 births in North Carolina (n = 117 949)
1. Birth certificate^c
• Free response of maternal race
• Race of infant derived directly from race of mother^c
2. NCHS coding in national birth statistics registry^f
Compared race reported by mother on birth certificate with racial categories used in NCHS birth statistics
Examined effect of misclassification on maternal and child health indicators 
• 2/3 of Hispanic mothers identified as “other” race but classified as “white” in official birth statistics
• The minority/white ratios are affected by misclassification: ratio of low birth weight is 1.55 for self-reported race and 1.81 for NCHS races 
Ning Smith, Rajan L. Iyer, Annette Langer-Gould, Darios T. Getahun, Daniel Strickland, Steven J. Jacobsen, Wansu Chen, Stephen F. Derose, and Corinna Koebnick, 2010 Infants delivered in Kaiser Permanente Southern California hospitals January 1, 1998–December 21, 2008, with nonmissing maternal race data on birth certificate
(n = 325 810) 
1. Birth certificate
• Collected by clerks based on parental self-report
• If parental races differed, infant assigned multiple race
• If paternal race missing, infant classified based on maternal race
• Infant classified as Hispanic if either parent Hispanic. If either parent’s Hispanic ethnicity missing, known was used. If both missing, infant classified as unknown
2. Health plan administrative records (Kaiser Foundation System, EMR HealthConnect, hospital inpatient record before EMRs)
• Asian/Pacific Islander if any Asian language identified as preferred language
• Hispanic if any Spanish language listed as primary language
• If 3 sources of information contradicted on race, infant race classified as “multiple” 
Calculated racial/ethnic distributions of birth certificates compared with administrative records (unlinked data)
Sensitivity and PPVs were calculated with birth certificates considered gold standard; sensitivity was calculated with and without missing race data
Multivariable logistic regression used to estimate correct classification by length of health insurance coverage and number of medical encounters 
• Misclassification of ethnicity in administrative records 23.1% of the time (48.2% because of missing administrative data)
• Misclassification of race in administrative records found 33.6% of the time (40.9% because of missing data)
• Misclassification most common for minority groups
• PPV for white, Black, Asian/Pacific Islander, American Indian/Alaskan Native, “multiple,” and “other” were 89%, 87%, 74%, 18%, 52%, and 1%, respectively
• PPV for Hispanic ethnicity was 96%
• Concordance of racial data in administrative records with birth certificates increased with more medical encounters 
Erin K. Sauber-Schatz, William Sappenfield, Leticia Hernandez, Karen M. Freeman, Wanda Barfield, and Diana M. Bensyl, 2011 Infants born in Florida to residents who died within 1 y, 2004–2007
(n = 917 005) 
1. Birth certificates
• Pre-March 2004 infant race and ethnicity based on parental demographics
• Post-March 2004 infant race and ethnicity reflects maternal race and ethnicity only
2. Death certificates
• Pre-2005: single choice for race and ethnicity
• Post-2005: more than 1 race choice permitted 
Calculated traditional HIMR and nontraditional HIMR
• Traditional HIMR: based on death certificate ethnicity, compared with births to Hispanic women from birth certificate
• Nontraditional HIMR: linked birth-death data to provide consistent Hispanic ethnicity from birth certificate (maternal ethnicity = infant ethnicity)
Calculated and compared κ statistic for Hispanic agreement in 2004 and 2005 (after death certificate revision)
Compared infant’s ethnicity (death) to mother’s and father’s (linked birth)
Analyzed missingness by year to see if revision to collection affected this
• Logistic regression to assess if HIMR increase between 2004 and 2007 was “real” 
• Change in death certificate did not significantly change concordance (κ 0.7–0.76)
• Including paternal ethnicity insignificantly increased κ from 0.72 to 0.80
• Traditional HIMR increased 55% from 4/1000 in 2004 to 6.2/1000 in 2007 (P < .01)
• Nontraditional HIMR increased 20% from 4.5/1000 in 2004 to 5.4/1000 in 2007 (P = .03)
• 55% of HIMR increase likely artifactual, attributed to implementation of revised death certificate in 2005, leaves 45% of HIMR increase assumed to be real 
Lisa Reyes Mason, Yunju Nam, and Youngmi Kim, 2014 Live Oklahoma births April–June and August–October 2007 linkable to SEED OK survey
(n = 2663) 
1. Birth certificates
• Conventional: infant’s race and ethnicity same as maternal; if maternal ethnicity missing paternal used
• Alternative: infant classified based on both parents, if parental races or ethnicities differ, infant is classified as biracial
2. SEED for Oklahoma Kids baseline survey
• Conventional: maternal report of infant race with first mentioned race assigned
• Alternative: maternal report of infant race and ethnicity; Hispanic a distinct group 
Sensitivity and PPV to examine consistency of race/ethnicity across data sources; SEED OK treated as gold standard
• Sensitivity of conventional measures highest for white and Blacks, lowest for Hispanics
• PPV highest for Hispanics and Blacks, lowest for American Indians
• Alternative measure improved values among whites, not as effective in other groups 
Rachel E. Rutkowski, Jason L. Salemi, Jean Paul Tanner, Jennifer L. Matas, and Russell S. Kirby, 2017 Infants born with birth defects to non-Hispanic mothers in Florida 2005–2014
(n = 1 580 052) 
1. Florida Office of Vital Statistics
• Racial classification not defined
2. Florida Birth Defects Registry
• Racial classification not defined 
Compared differences in racial distribution (white, Black, American Indian/Alaska Native, Asian/Pacific Islander) using 6 race-bridging algorithms for multiracial mothers
Percent difference in racial groups of algorithm compared with common birth certificate references
Subanalysis of only multiple race infants 
• 1.2% of non-Hispanic Florida mothers reported more than 1 race on their infant’s birth certificate 
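Several studies in Table 1 report sensitivity and positive predictive value (PPV) for a comparison source measured against a reference source. A minimal sketch of those two measures, following the definitions given in the Methods column, is shown here. The function name `sensitivity_ppv`, the group labels, and the paired records are hypothetical and for illustration only; they are not drawn from any included study.

```python
def sensitivity_ppv(reference, comparison, category):
    """Sensitivity and PPV of `comparison` against `reference` for one category.

    Sensitivity: of records in `category` per the reference source, the share
    classified the same way in the comparison source.
    PPV: of records classified as `category` in the comparison source, the
    share that are in `category` per the reference source.
    Returns None for a measure whose denominator is zero.
    """
    if len(reference) != len(comparison):
        raise ValueError("Sources must contain the same linked records.")

    pairs = list(zip(reference, comparison))
    true_pos = sum(r == category and c == category for r, c in pairs)
    ref_total = sum(r == category for r, _ in pairs)
    cmp_total = sum(c == category for _, c in pairs)

    sensitivity = true_pos / ref_total if ref_total else None
    ppv = true_pos / cmp_total if cmp_total else None
    return sensitivity, ppv


# Hypothetical linked records (illustration only): the reference source could
# be a self-report survey and the comparison source a birth certificate.
reference_source = ["A", "A", "A", "A", "B", "B", "B", "C", "C", "C"]
comparison_source = ["A", "A", "A", "B", "B", "B", "B", "C", "A", "A"]

for group in ("A", "B", "C"):
    sens, ppv = sensitivity_ppv(reference_source, comparison_source, group)
    print(group, sens, ppv)
```

In this hypothetical data, group C has a PPV of 1.0 but a sensitivity of only one-third: every record labeled C in the comparison source is correct, yet most reference-source C records are labeled as something else. A high PPV can therefore coexist with substantial misclassification of a small group.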
• Specificity: proportion of cases identified as not a race in the MSSD who are not assigned to that race in the birth certificate
• PPV: proportion of birth certificates identified as a race who actually belong to that race (per MSSD)
• NPV: proportion of individuals not identified as a race on the birth certificate who are truly not that race (per MSSD) 
• High concordance for Hispanic ethnicity and Black race (91%–98%)
• Hispanic ethnicity had 90.9% sensitivity, 97.2% specificity, 93.7% PPV, and 95.9% NPV
• Black race had 96.1% sensitivity, 97.6% specificity, 95.9% PPV, and 97.7% NPV 
Jennifer D. Parker and Jennifer H. Madans, 2002 All recorded US births 1968–1998
(n = 113 818 502)
US-born NHIS respondents from 1990 to 1998 with valid race data (n = 339 342) 
1. US natality data
• 1977 OMB-15 Standarde
2. NHIS data
• Self-report coded to 1977 OMB-15 Standarde 
Compared racial distribution in birth data vs survey data by birth year
Calculated estimated race distribution for the NHIS using the age distribution of the survey respondents and their corresponding birth year–specific race proportions from natality files 
• Pre-1980, percentage of multirace births higher in natality files than NHIS
• Post-1980, percentage of multirace birth higher in NHIS than natality files
• Concordance among multirace groups was inconsistent over time between data sources 
Paul A Buescher, Ziya Gizlice, and Kathleen A. Jones-Vessey, 2005 Live 2002 births in North Carolina (n = 117 949) 1. Birth certificatec
• Free response of maternal race
• Race of infant derived directly from race of motherc
2. NCHS coding in national birth statistics registryf 
Compared race reported by mother on birth certificate with racial categories used in NCHS birth statistics
Examined effect of misclassification on maternal and child health indicators 
• 2/3 of Hispanic mothers identified as “other” race but classified as “white” in official birth statistics
• The minority/white ratios are affected by misclassification: ratio of low birth weight is 1.55 for self-reported race and 1.81 for NCHS races 
Ning Smith, Rajan L. Iyer, Annette Langer-Gould, Darios T. Getahun, Daniel Strickland, Steven J. Jacobsen, Wansu Chen, Stephen F. Derose, and Corinna Koebnick, 2010 Infants delivered in Kaiser Permanente Southern California hospitals January 1, 1998–December 21, 2008, with nonmissing maternal race data on birth certificate
(n = 325 810) 
1. Birth certificate
• Collected by clerks based on parental self-report
• If parental races differed, infant assigned multiple race
• If paternal race missing, infant classified based on maternal race
• Infant classified as Hispanic if either parent Hispanic. If either parent’s Hispanic ethnicity missing, known was used. If both missing, infant classified as unknown
2. Health plan administrative records (Kaiser Foundation System, EMR HealthConnect, hospital inpatient record before EMRs)
• Asian/Pacific Islander if any Asian language identified as preferred language
• Hispanic if any Spanish language listed as primary language
• If 3 sources of information contradicted on race, infant race classified as “multiple” 
Calculated racial/ethnic distributions of birth certificates compared with administrative records (unlinked data)
Sensitivity and PPVs were calculated with birth certificates considered gold standard; sensitivity was calculated with and without missing race data
Multivariable logistic regression used to estimate correct classification by length of health insurance coverage and number of medical encounters 
• Misclassification of ethnicity in administrative records 23.1% of the time (48.2% because of missing administrative data)
• Misclassification of race in administrative records found 33.6% of the time (40.9% because of missing data)
• Misclassification most common for minority groups
• PPV for white, Black, Asian/Pacific Islander, American Indian/Alaskan Native, “multiple,” and “other” were 89%, 87%, 74%, 18%, 52%, and 1%, respectively
• PPV for Hispanic ethnicity was 96%
• Concordance of racial data in administrative records with birth certificates increased with more medical encounters 
Erin K. Sauber-Schatz, William Sappenfield, Leticia Hernandez, Karen M. Freeman, Wanda Barfield, and Diana M. Bensyl, 2011 Infants born in Florida to residents who died within 1 y between 2004 and 2007
(n = 917 005) 
1. Birth certificates
• Pre-March 2004 infant race and ethnicity based on parental demographics
• Post-March 2004 infant race and ethnicity reflects maternal race and ethnicity only
2. Death certificates
• Pre-2005: single choice for race and ethnicity
• Post-2005: more than 1 race choice permitted 
Calculated traditional HIMR and nontraditional HIMR
• Traditional HIMR: based on death certificate ethnicity, compared with births to Hispanic women from birth certificate
• Nontraditional HIMR: linked birth-death data to provide consistent Hispanic ethnicity from birth certificate (maternal ethnicity = infant ethnicity)
Calculated and compared κ statistic for Hispanic agreement in 2004 and 2005 (after death certificate revision)
Compared infant’s ethnicity (death) to mother’s and father’s (linked birth)
Analyzed missingness by year to see if revision to collection affected this
• Logistic regression to assess if HIMR increase between 2004 and 2007 was “real” 
• Change in death certificate did not significantly change concordance (κ 0.7–0.76)
• Including paternal ethnicity increased κ from 0.72 to 0.80, a nonsignificant change
• Traditional HIMR increased 55% from 4/1000 in 2004 to 6.2/1000 in 2007 (P < .01)
• Nontraditional HIMR increased 20% from 4.5/1000 in 2004 to 5.4/1000 in 2007 (P = .03)
• 55% of HIMR increase likely artifactual, attributed to implementation of revised death certificate in 2005, leaves 45% of HIMR increase assumed to be real 
Lisa Reyes Mason, Yunju Nam, and Youngmi Kim, 2014 Live Oklahoma births April–June and August–October 2007 linkable to SEED OK survey
(n = 2663) 
1. Birth certificates
• Conventional: infant’s race and ethnicity same as maternal; if maternal ethnicity missing paternal used
• Alternative: infant classified based on both parents, if parental races or ethnicities differ, infant is classified as biracial
2. SEED for Oklahoma Kids baseline survey
• Conventional: maternal report of infant race with first mentioned race assigned
• Alternative: maternal report of infant race and ethnicity; Hispanic a distinct group 
Sensitivity and PPV to examine consistency of race/ethnicity across data sources; SEED OK treated as gold standard 
• Sensitivity of conventional measures highest for white and Blacks, lowest for Hispanics
• PPV highest for Hispanics and Blacks, lowest for American Indians
• Alternative measure improved values among whites, not as effective in other groups 
Rachel E. Rutkowski, Jason L. Salemi, Jean Paul Tanner, Jennifer L. Matas, and Russel S Kirby, 2017 Infants born with birth defects to non-Hispanic mothers in Florida 2005-2014
(n = 1 580 052) 
1. Florida Office of Vital Statistics
• Racial classification not defined
2. Florida Birth Defects Registry
• Racial classification not defined 
Compared differences in racial distribution (white, Black, American Indian/Alaska Native, Asian/Pacific Islander) using 6 race-bridging algorithms for multiracial mothers
Percent difference in racial groups of algorithm compared with common birth certificate references
Subanalysis of only multiple race infants 
• 1.2% of non-Hispanic Florida mothers reported more than 1 race on their infant’s birth certificate 

Abbreviations: EMR, electronic medical record; HIMR, Hispanic Infant Mortality Rate; MSSD, Maternity Services Summary Data; NPV, negative predictive value; PPV, positive predictive value

a. For infants with 1 white and 1 nonwhite biological parent, nonwhite race assigned. For infants of 2 nonwhite biological parents, paternal race is assigned, except if maternal race is Hawaiian.

b. Same as in footnote a. If race data are missing for either parent, infant assigned known race. If there is no information on either parent’s race, infant assigned is race of preceding record.

c. Infant is assigned mother’s race.

d. Four racial categories (American Indian or Alaska Native, Asian or Pacific Islander, Black and white) and 2 ethnicities (Hispanic or non-Hispanic). If more than 1 race is selected, infant is assigned first reported race.

e. White, Black, American Indian or Alaska Native, Asian or Pacific Islander, and “other race.” Infant assigned “other race” if parental races differed.

f. National Center for Health Statistics allowed for only 1 race. Text entries converted to 1 of 10 racial categories (white, Black, Indian, Chinese, Japanese, Hawaiian, Filipino, Other Asian or Pacific Islander, Other Entries, and not reported) based on: if Hawaiian reported, code Hawaiian; if more than 1 race (other than Hawaiian), choose first race listed; if multiple races listed with percentages (except for Hawaiian), code as race with highest percentage; multiracial/biracial/mixed coded as Other Entries.

Although both birth and death certificate data sources are imperfect, the self-reported nature of birth certificates makes their data superior to those captured by the death certificate; the authors of each article in this section treated birth certificates as the gold standard.

Each study exploring racial and/or ethnic concordance between linked birth and death certificates found more infants reported as white on death certificates than on birth certificates. Nationally, Hahn et al found that 87.3% of misclassified infants of color were classified as white at death; Frost and Shy found a rate of 73% in Washington.11,12  Farley et al’s California investigation found a 9% increase in the number of white infants at death versus the number at birth, with fewer Asian, Native American, and Other Race infants being captured on death versus birth certificates.13 

Hahn et al’s, Kennedy and Deapen’s, and Frost and Shy’s studies each found significantly different frequencies of racial misclassification at death by race at birth (P < .0001).11,12,14  Across studies, white, Black, and American Indian individuals were misclassified 1%, 3% to 10%, and 33% of the time, respectively. Kennedy and Deapen, investigating the misclassification of American Indian infants in Oklahoma from 1975 to 1988, found that misclassification increased over time, with average misclassification surpassing 40% from 1983 to 1988.14 

Farley et al’s and Frost and Shy’s studies showed racial discordance to be more common for infants born to racially discordant parents than for those born to racially concordant parents.12,13  In California, 35% of Black infants with racially discordant parents were misclassified as non-Black at death versus less than 3% of Black infants with racially concordant parents.13  For Asian infants the difference was 73% versus 8%.13  In Washington, discordance among all infants with racially discordant parents was 59% versus 1% for infants born to concordant-race parents.12 

Sauber-Schatz et al explored capture of Hispanic/Latinx ethnicity in Florida before and after a revision in death certificate data.15  In 2005, Florida made revisions to include more specific questions about the decedent’s race and ethnicity, asking about race before ethnicity to avoid Hispanic being entered as the race, and training vital statistics reporters to never leave Hispanic origin blank. Subsequent analysis found no significant change in concordance between maternal ethnicity at birth and infant ethnicity at death after revisions.
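Agreement between ethnicity at birth and at death in that study was summarized with the κ statistic. As a minimal sketch of how Cohen's κ is computed for two sets of linked labels, the following uses hypothetical records; the counts and the function name `cohen_kappa` are illustrative assumptions and are not drawn from Sauber-Schatz et al's data or code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    # Observed agreement: share of linked records with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal shares, summed over categories.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both sources use a single identical category
    return (observed - expected) / (1.0 - expected)


# Hypothetical linked records (100 infants), not study data.
birth = ["Hispanic"] * 40 + ["Non-Hispanic"] * 60
death = (["Hispanic"] * 30 + ["Non-Hispanic"] * 10      # born Hispanic
         + ["Non-Hispanic"] * 55 + ["Hispanic"] * 5)    # born non-Hispanic

kappa = cohen_kappa(birth, death)
print(f"kappa = {kappa:.2f}")
```

Because κ discounts agreement expected by chance, it is lower than the raw percentage of matching records when one category dominates.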

In studies comparing birth certificates to self-reported survey data, demographics identified in surveys were universally treated as the gold standard.

Baumeister et al’s comparison of birth certificates to maternal surveys and Mason et al’s comparison to Oklahoma’s SEED OK survey found the sensitivity of racial classification on the birth certificate to be between 82% and 94% for African Americans, whites or European/Middle Eastern infants, and Asian or Asian/Pacific Islanders.16,17  These studies had disparate findings for Hispanics/Latinx and American Indians/Native Americans, with Baumeister et al finding sensitivities of 94% and 54%, respectively, and Mason et al finding sensitivities of 72% and 77%, respectively.16,17  Overall racial discordance was 3% in Baumeister et al’s study, with significant variation by survey-indicated race (P < .0001). Nineteen percent of discordant cases were among mothers who identified with more than 1 race in the survey.16  The Oklahoma study showed 7% discordance for infants of racially concordant parents and 37% among those of racially discordant parents.17 
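Sensitivity and positive predictive value in these comparisons follow the definitions given in the summary table: sensitivity is the share of individuals in a category per the reference source (here, the survey) who are classified the same way on the birth certificate, and PPV is the share of individuals in a category on the birth certificate who are classified the same way in the reference source. A minimal sketch with hypothetical paired records follows; the labels, counts, and the function name `sensitivity_and_ppv` are illustrative assumptions.

```python
def sensitivity_and_ppv(reference, test, category):
    """Return (sensitivity, PPV) of `test` labels against `reference` labels
    for one category. Either value is None when its denominator is zero."""
    pairs = list(zip(reference, test))
    both = sum(r == category and t == category for r, t in pairs)
    in_reference = sum(r == category for r, _ in pairs)
    in_test = sum(t == category for _, t in pairs)
    sensitivity = both / in_reference if in_reference else None
    ppv = both / in_test if in_test else None
    return sensitivity, ppv


# Hypothetical paired records (100 mothers), not study data.
survey = ["Native American"] * 10 + ["white"] * 90
certificate = (["Native American"] * 6 + ["white"] * 4      # survey: Native American
               + ["white"] * 89 + ["Native American"] * 1)  # survey: white

sens, ppv = sensitivity_and_ppv(survey, certificate, "Native American")
print(f"sensitivity = {sens:.2f}, PPV = {ppv:.2f}")
```

In this made-up example a small group can show low sensitivity while PPV stays relatively high, the same pattern Baumeister et al report for Native Americans.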

Parker and Madans compared proportions of racial groups between birth certificates and the National Health Interview Survey (NHIS), specifically exploring the concordance of multiple-race reporting, finding that rates of discordance ranged from 9% to 67% depending on time period and specific multiple-race group.18 

Studies exploring racial and ethnic concordance between birth certificates and hospital/government records all considered birth certificates to be the gold standard because of their self-reported nature.

Farley et al compared proportional distributions of infant race between unlinked birth certificates and hospital discharge records, finding that hospital records reported fewer infants in all racial groups except for white, “Other,” and “Unknown.”13  Hospitals overcategorized infants as white, with 11% more infants being recorded as white in discharge records than on birth certificates (52.7% vs 46.9%). Underreporting in hospital records was greatest for Native American infants, with hospital records identifying 82% fewer Native American infants than birth certificates (0.2% vs 1.1%).
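The relative differences quoted for Farley et al's comparison depend on which share is taken as the baseline. The short calculation below, offered as an arithmetic check rather than as the authors' stated method, shows that the 11% and 82% figures are reproduced when the larger of the two shares is the denominator. The function name `relative_difference` is an illustrative assumption.

```python
def relative_difference(larger, smaller, baseline):
    """Difference between two shares expressed as a fraction of `baseline`."""
    return (larger - smaller) / baseline


# Shares (percent of infants) quoted in the text.
white_hospital, white_birth = 52.7, 46.9
native_birth, native_hospital = 1.1, 0.2

# White infants: excess in hospital records.
white_vs_hospital = relative_difference(white_hospital, white_birth, white_hospital)
white_vs_birth = relative_difference(white_hospital, white_birth, white_birth)

# Native American infants: shortfall in hospital records.
native_vs_birth = relative_difference(native_birth, native_hospital, native_birth)
native_vs_hospital = relative_difference(native_birth, native_hospital, native_hospital)

print(f"white, relative to hospital share:  {white_vs_hospital:.1%}")
print(f"white, relative to birth share:     {white_vs_birth:.1%}")
print(f"Native American, relative to birth: {native_vs_birth:.1%}")
print(f"Native American, relative to hosp.: {native_vs_hospital:.0%}")
```

With the birth certificate as baseline, the white excess is about 12% rather than 11%; with the hospital record as baseline, the birth certificate share for Native American infants is 450% higher.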

In comparing birth certificates with hospital administration records, Smith et al found the sensitivity of correctly identifying Hispanic ethnicity in administrative records to be 76.9% and the positive predictive value to be 95.6%.19  Most cases of ethnic misclassification were due to missing ethnicity data in hospital records. Concordance of Hispanic ethnicity between data sources was significantly higher for infants born to concordant ethnicity parents than those born to parents with different ethnicities (P < .001). Sensitivity was 66.4% for the white, Black, and Asian/Pacific Islander races, with 40.9% of discordance accounted for by missing hospital data. Racial sensitivity and specificity were lowest among children with multiple racial identities selected on the birth certificate.

In comparing birth certificates with records of a HealthStart program for Medicaid-enrolled families, Reichman and Hade found high levels of agreement for Black race and Hispanic ethnicity with sensitivity, specificity, and positive and negative predictive values all greater than 90%.20 

Four studies explored the impact of alternative race classification schemes for categorizing infant race from the birth certificate, each in comparison with standard categories required for federal reporting.

Buescher et al compared self-reported race data from birth certificates to the 10 National Center for Health Statistics race categories.21  Of 15 074 birth certificates with Hispanic ethnicity indicated by the parent, 69% selected Other for their race, with 9945 (66%) specifying Hispanic as their race and ethnicity. The National Center for Health Statistics’ coding conversion categorized 98.5% of these Hispanic births as being of white race, indicating that these federally required categories do not align with self-reported birth certificate race and ethnicity.

Farley et al and Rutkowski et al explored various algorithms for classifying infant race from birth certificates compared with the standard race categories used by the federal government.13,22  Among non-Hispanic Florida births, Rutkowski et al compared a reference race-bridging algorithm, in which multiple-race mothers were counted once for each race they selected, to 6 other bridging algorithms.22  When comparing this reference with an algorithm that allocated mothers to the largest of the racial groups selected, the total number of white, Black, and Asian/Pacific Islander individuals increased. The methods categorizing mothers into the smallest racial group selected and the largest nonwhite group both resulted in increased numbers of Black, Asian/Pacific Islander, and American Indian/Alaska Native individuals. Rutkowski et al’s exploration points to the lost granularity that occurs when reducing race and ethnicity to a limited set of options, no matter the approach.
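To make the mechanics of these allocation rules concrete, the sketch below applies 4 of the rules named above to a handful of hypothetical multiple-race records. The group sizes, records, and function names are assumptions made for illustration; they do not reproduce Rutkowski et al's data, code, or results.

```python
from collections import Counter

# Hypothetical single-race population sizes, used only to rank groups.
GROUP_SIZE = {
    "white": 700,
    "Black": 200,
    "Asian/Pacific Islander": 80,
    "American Indian/Alaska Native": 20,
}


def count_each(races):
    """Count the mother once for each race she selected."""
    return list(races)


def largest_group(races):
    """Allocate to the largest of the selected groups."""
    return [max(races, key=GROUP_SIZE.get)]


def smallest_group(races):
    """Allocate to the smallest of the selected groups."""
    return [min(races, key=GROUP_SIZE.get)]


def largest_nonwhite(races):
    """Allocate to the largest selected group other than white, if any."""
    nonwhite = [r for r in races if r != "white"]
    return [max(nonwhite or races, key=GROUP_SIZE.get)]


def tabulate(records, rule):
    """Apply one allocation rule to every record and count the results."""
    counts = Counter()
    for races in records:
        counts.update(rule(races))
    return counts


# Hypothetical maternal race selections.
records = [
    ("white", "Black"),
    ("white", "American Indian/Alaska Native"),
    ("Black", "Asian/Pacific Islander"),
    ("white",),
]

for rule in (count_each, largest_group, smallest_group, largest_nonwhite):
    print(rule.__name__, dict(tabulate(records, rule)))
```

The same 4 records yield 4 different racial distributions, which is the sensitivity to algorithm choice that the comparison above describes.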

Farley et al similarly explored 4 algorithms for categorizing infant race from the birth certificate.13  When determining infant race from maternal and paternal race equally with a racial hierarchy rule prioritizing nonwhite races by population frequency, non-Hispanic white infants were shifted into the Hispanic white group. Algorithms using maternal or paternal race only resulted in more infants being classified as non-Hispanic white than the algorithms that consider both parents’ races. Across all explored algorithms, misclassification of race and/or ethnicity occurred 10.9% to 12.8% of the time, indicating that any system of reclassifying race from raw data results in a loss of accuracy.
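The parental coding rules summarized in the table for Farley et al (all-mother, all-father, and hierarchical) can be sketched as follows. This is an illustrative reading of those one-line descriptions, not the authors' implementation; the function names, the handling of missing or unlisted races, and the example parents are assumptions.

```python
# Priority order for the hierarchical rule, as listed in the summary table.
HIERARCHY = [
    "African American",
    "Native American",
    "Asian",
    "white Hispanic",
    "white non-Hispanic",
]


def all_mother(mother, father):
    """Maternal race assigned unless missing."""
    return mother if mother is not None else father


def all_father(mother, father):
    """Paternal race assigned unless missing."""
    return father if father is not None else mother


def hierarchical(mother, father):
    """Both parents weighted equally; the race ranked highest in HIERARCHY wins.
    Races not in HIERARCHY rank last (an assumption for this sketch)."""
    known = [r for r in (mother, father) if r is not None]
    if not known:
        return None

    def rank(race):
        return HIERARCHY.index(race) if race in HIERARCHY else len(HIERARCHY)

    return min(known, key=rank)


# Hypothetical infant with racially discordant parents.
mother, father = "white non-Hispanic", "Asian"
print(all_mother(mother, father))    # white non-Hispanic
print(all_father(mother, father))    # Asian
print(hierarchical(mother, father))  # Asian
```

For infants of racially concordant parents all 3 rules return the same category, so differences among algorithms arise only from infants with discordant or missing parental race.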

Mason et al tested alternative coding schema for race and ethnicity on birth certificates and SEED OK survey results that allowed for a biracial classification not permitted on the traditional versions of these tools.17  The positive predictive value between the survey and birth certificate using alternative coding methods with a biracial category was greater than 80% for all single-race groups except for American Indians. More than 50% of infants who were classified as biracial on the birth certificate were classified as only 1 race and/or ethnicity on the survey. Concordance between conventional racial categories of the birth certificate and surveys was greater than that between the alternative coding schema of both data sources, demonstrating that current classification systems are better, though still imperfect.

In this systematic review, we found that discordance in infant race and ethnicity data was common among multiple data collection methods, including those frequently used in perinatal health research. In general, infants of color have greater rates of discordance than their non-Hispanic white counterparts. Infants born to racially and/or ethnically discordant parents were the most likely to be misclassified across data sources.

These studies highlight the impact of racial and ethnic misclassification on common perinatal outcomes used to represent overall population health. Farley et al found that mortality rates for infants of color were underestimated because of infants being misclassified as white at death, with underestimates most pronounced for Native Americans and East Asians.13  In Kennedy and Deapen’s study, American Indian infant mortality was underestimated by almost 43% when based on the race documented in death certificates compared with survey self-reported race (5.9/1000 vs 10.4/1000).14  Generally, administrative records overclassify infants as white, resulting in underestimates of outcomes among infants of color and overestimates among white infants.

In general, misclassification of infant race and ethnicity resulted in the overestimation of the number of white infants because infants of color were misidentified as white. When considering adverse outcomes such as preterm birth, neonatal morbidities, or infant mortality, this leads to a systematic underestimation of the racial and ethnic inequity. Underestimation of inequities has the potential to affect funding for research and community-based programming designed to address inequities and may decrease institutional awareness, focus, and efforts to close the equity gap through the use of equity dashboards or equity-focused quality improvement.

The 2021 March of Dimes report card demonstrated that Hispanic, American Indian/Alaska Native, and Black infants had higher mortality rates in 2018 than their non-Hispanic white counterparts.22  A leading cause of infant mortality in the United States is preterm birth, for which Black birthing people experienced a 50% higher rate in 2020 than their non-Hispanic white or Hispanic counterparts.23  These disparities have been pervasive throughout U.S. history and persist even though overall preterm birth and infant mortality rates have decreased with advances in neonatal care that improve survival (eg, antenatal steroids and surfactant). Importantly, although the U.S. preterm birth rate has decreased over the past 50 years, the equity gap in neonatal mortality and morbidities has worsened. Our findings lead us to believe that these inequities are likely greater than have previously been reported because of racial and ethnic misclassification.

To accurately measure inequities in infant morbidity and mortality data, infants’ race and ethnicity should be captured systematically, allowing for granularity and consistency. In 2009, an Agency for Healthcare Research and Quality (AHRQ) report outlined standardized practices for the collection of race, ethnicity, and language data that allow for more granularity than current federally mandated categories.2  In addition to the Office of Management and Budget (OMB)-required ethnicity categories (Hispanic or non-Hispanic), the AHRQ recommends an additional question be asked about ethnic origin, descent, heritage, or place of birth of the person or their ancestors. Many Hispanic/Latinx individuals report not identifying with the current 5 OMB-provided race categories, often reporting Hispanic/Latinx as their race as well as ethnicity.2  The report recommends that an “other race” category become universal to allow for all racial identities to be represented. The subcommittee additionally recommends that people be given the opportunity to select all identities they wish.

Although AHRQ’s recommendations seek to improve the accuracy of race and ethnicity data, their report specifies that institutions should offer additional race categories that are relevant to their population and must choose such categories from an expanded standard list.2  It is important to consider the process by which institutions determine their own relevant minoritized populations and the privilege of those who make this decision. A legacy of institutional exclusion of minoritized people and a history of inaccurate data collection make this process fraught. The inclusion of community stakeholders and individuals from minoritized groups to help define institutional race and ethnicity categories that are organizationally relevant is imperative. Given AHRQ’s requirement that any granular racial and ethnic categories be collapsible into the 5 standard OMB race and 2 ethnicity categories, we are again left with a process that can fundamentally ignore the heterogeneity of racial and ethnic groups. In addition, the reduction of racial and ethnic identity into 1 category obscures the social construction of race and fails to acknowledge the intersectional process of identity formation.23 

Although we have highlighted significant discordance in race and ethnicity data collection and reporting across various data sources, we also recognize that potentially better practices do exist. To date, the most reliable source of race and ethnicity data available at the population level is the birth certificate. In all U.S. states, birthing individuals who have delivered a live-born infant can self-report their own race and ethnicity. Some states also collect data on the race and ethnicity of other nonbirthing parent(s) and that of the infant. For states that collect infant race and ethnicity data as a separate data element from maternal race and ethnicity, this data source is even more reliable. Given that nearly every infant born in the United States has a birth certificate, we must identify innovative ways to link birth certificate data to other health records to allow for accurate perinatal health disparities reporting. The Pregnancy to Early Life Longitudinal (PELL) data system in Massachusetts provides 1 example.24  At its core, the PELL data system consists of birth certificates of infants born to Massachusetts residents that are linked to maternal and infant death records, birth hospitalization records, subsequent hospitalizations, and social service use. Race and ethnicity data in electronic medical health records and medical billing data, such as Medicaid data or the All Payors Claims Database, are laden with missingness or misclassification. With the linkage of birth certificate data to other health records, the PELL data system allows for accurate race and ethnicity data reporting beyond the immediate birth period.

At the local birthing hospital level, we must clearly understand how birth certificates are completed (when, by whom, and how) and develop workflows that transfer accurate race and ethnicity data from birth certificates to health records for birthing individuals and infants so that hospitals can measure perinatal health care processes and outcomes stratified accurately by race and ethnicity. Institutional policies should reflect an expectation of giving all patients the opportunity to self-identify their race and ethnicity, and staff should receive training on how to ask and record this information.25  A growing number of birthing hospitals are engaging in health equity efforts, and it is only with reliable disaggregated data that health care systems can assess whether their care delivery and outcomes are, indeed, equitable.

Strengths of this systematic review include a literature search that included multiple databases and spanned 4 decades to garner the highest number of potential articles for inclusion. We used a multidisciplinary team including expertise in medicine, social work, health services research, library sciences, and epidemiology. The review process, designed to be inclusive and comprehensive, included multiple reviewers from different disciplines at each level of abstract and article review. Limitations of this work include the overall paucity of data for the subject, with a total of 12 studies meeting full inclusion criteria, 7 of which were published more than 20 years ago. Classifications of race and ethnicity terminology have evolved, resulting in challenges in comparing data over time. Because of the heterogeneity of data on this topic, a meta-analysis of these data was not feasible.

This systematic review provides valuable evidence to the public health, clinical, and perinatal research communities, demonstrating critical deficits in the collection and reporting of race and ethnicity data. These data have broad implications for measuring perinatal health inequity and are essential to identifying the appropriate scope of interventions to address these inequities at the hospital, community, and national levels.

Ms Weikel conceptualized and designed the study, reviewed studies for inclusion, designed data extraction tool, extracted data, conducted additional analyses, drafted the initial manuscript, and reviewed and revised the manuscript. Drs Klawetter and Bourque conceptualized the study, reviewed studies for inclusion, extracted data, drafted the initial manuscript, and reviewed and revised the manuscript. Ms St. Pierre conceptualized the study, developed and conducted the search strategy, and reviewed the manuscript for important content. Dr Hannan reviewed studies for inclusion, extracted data, and reviewed and revised the manuscript. Dr Roybal and Ms Soondarotok reviewed studies for inclusion and reviewed and revised the manuscript. Dr Hwang conceptualized the study, reviewed studies for inclusion, drafted the initial manuscript, reviewed and revised the manuscript, and provided supervisory guidance and direction. Dr Fraiman drafted the initial manuscript, reviewed and revised the manuscript, and provided content expertise. All authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.

FUNDING: No external funding.

CONFLICT OF INTEREST DISCLOSURES: The authors have indicated they have no potential conflicts of interest to disclose.

COMPANION PAPER: A companion to this article can be found online at www.pediatrics.org/cgi/doi/10.1542/peds.2022-059540.

AHRQ, Agency for Healthcare Research and Quality
NHIS, National Health Interview Survey
OMB, Office of Management and Budget
PELL, Pregnancy to Early Life Longitudinal

1
Pearson
SJ
.
The Birth Certificate: An American History
.
Chapel Hill, NC
:
The University of North Carolina Press
;
2021
:
189
224
2
Institute of Medicine (US) Subcommittee on Standardized Collection of Race/Ethnicity Data for Healthcare Quality Improvement
.
Race, ethnicity, and language data: standardization for health care quality improvement
.
4
Braveman
P
.
What are health disparities and health equity? We need to be clear
.
Public Health Rep
.
2014
;
129
(
suppl 2
):
5
8
5
Yi
SS
,
Kwon
SC
,
Suss
R
, et al
.
The mutually reinforcing cycle of poor data quality and racialized stereotypes that shapes Asian American health
.
Health Aff (Millwood)
.
2022
;
41
(
2
):
296
303
6
Brumberg
HL
,
Dozor
D
,
Golombek
SG
.
History of the birth certificate: from inception to the future of electronic data
.
J Perinatol
.
2012
;
32
(
6
):
407
411
7
National Center for Health Statistics
.
Death edit specifications for the 2003 revision of the US Standard Certificate of Death
.
Available at: https://www.cdc.gov/nchs/data/dvs/death_edit_specifications.pdf. Accessed November 15, 2022
8. Hahn RA, Wetterhall SF, Gay GA, et al. The recording of demographic information on death certificates: a national survey of funeral directors. Public Health Rep. 2002;117(1):37–43
9. JBI. Critical appraisal tools. Available at: https://jbi.global/critical-appraisal-tools. Accessed March 17, 2022
10. Covidence systematic review software. Available at: https://www.covidence.org. Accessed October 20, 2022
11. Hahn RA, Mulinare J, Teutsch SM. Inconsistencies in coding of race and ethnicity between birth and death in US infants. A new look at infant mortality, 1983 through 1985. JAMA. 1992;267(2):259–263
12. Frost F, Shy KK. Racial differences between linked birth and infant death records in Washington State. Am J Public Health. 1980;70(9):974–976
13. Farley DO, Richards T, Bell RM. Effects of reporting methods on infant mortality rate estimates for racial and ethnic subgroups. J Health Care Poor Underserved. 1995;6(1):60–75
14. Kennedy RD, Deapen RE. Differences between Oklahoma Indian infant mortality and other races. Public Health Rep. 1991;106(1):97–99
15. Sauber-Schatz EK, Sappenfield W, Hernandez L, Freeman KM, Barfield W, Bensyl DM. Reasons for the increasing Hispanic infant mortality rate: Florida, 2004-2007. Matern Child Health J. 2012;16(6):1188–1196
16. Baumeister L, Marchi K, Pearl M, Williams R, Braveman P. The validity of information on “race” and “Hispanic ethnicity” in California birth certificate data. Health Serv Res. 2000;35(4):869–883
17. Mason LR, Nam Y, Kim Y. Validity of infant race/ethnicity from birth certificates in the context of U.S. demographic change. Health Serv Res. 2014;49(1):249–267
18. Parker JD, Madans JH. The correspondence between interracial births and multiple-race reporting. Am J Public Health. 2002;92(12):1976–1981
19. Smith N, Iyer RL, Langer-Gould A, et al. Health plan administrative records versus birth certificate records: quality of race and ethnicity information in children. BMC Health Serv Res. 2010;10:316
20. Reichman NE, Hade EM. Validation of birth certificate data. A study of women in New Jersey’s HealthStart program. Ann Epidemiol. 2001;11(3):186–193
21. Buescher PA, Gizlice Z, Jones-Vessey KA. Discrepancies between published data on racial classification and self-reported race: evidence from the 2002 North Carolina live birth records. Public Health Rep. 2005;120(4):393–398
22. Rutkowski RE, Salemi JL, Tanner JP, Matas JL, Kirby RS. Assessing the impact of different race-bridging algorithms on the reported rate of birth defects. J Registry Manag. 2017;44(4):146–156
23. Crenshaw KW. On Intersectionality: Essential Writings. New York, NY: The New Press; 2017
24. Kotelchuck M. Pregnancy to Early Life Longitudinal (PELL) data system: research into child health policy.
25. Stanford Medicine. We ask because we care. Available at: https://med.stanford.edu/health equity/WABWC.html. Accessed May 17, 2022
