A Novel Machine Learning Approach to Identify COVID-19 Deaths Among Excess Deaths Reported to Non-COVID-19 Causes

Mathew Kiang, Stanford University
Andrew Stokes, Boston University

The total number of deaths caused by the SARS-CoV-2 virus in the United States has been heavily debated since the start of the COVID-19 pandemic. Researchers have previously developed excess mortality models to estimate the number of deaths that would have occurred in the absence of the pandemic and the number of COVID-19 deaths that were not reported as COVID-19 deaths. However, estimates of excess deaths represent an upper bound on total COVID-19 deaths, because some of these deaths were likely related to health care interruptions and others to the pandemic’s social and economic effects. We use a machine learning approach that leverages death certificate data; county characteristics related to population, health systems, and the death investigation system; and county-month trends in excess deaths and reported COVID-19 deaths to produce refined estimates of the total number of COVID-19 deaths throughout the United States from 2020 through 2022.
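
As a rough illustration of the excess-death accounting described above, the following minimal Python sketch computes excess deaths for a county-month (observed minus expected) and the portion of that excess not already reported as COVID-19. It is not the authors' model: the class name, field names, function names, and all numbers are hypothetical and chosen only for illustration, and the expected-death counts are assumed to come from some separate baseline model. The second quantity is the pool within which unreported COVID-19 deaths would have to be identified, and it bounds them from above rather than estimating them.

```python
from dataclasses import dataclass


@dataclass
class CountyMonth:
    """One county-month record (hypothetical structure, illustrative only)."""
    county: str
    month: str
    observed_deaths: float        # all-cause deaths actually recorded
    expected_deaths: float        # assumed output of a separate baseline model
    reported_covid_deaths: float  # deaths assigned to COVID-19


def excess_deaths(record: CountyMonth) -> float:
    """Excess deaths: observed all-cause deaths minus the expected baseline."""
    return record.observed_deaths - record.expected_deaths


def excess_not_reported_as_covid(record: CountyMonth) -> float:
    """Excess deaths left over after subtracting reported COVID-19 deaths.

    Floored at zero. This is an upper bound on unreported COVID-19 deaths,
    since some of these deaths may stem from other pandemic effects.
    """
    return max(excess_deaths(record) - record.reported_covid_deaths, 0.0)


# Made-up numbers, not real data.
records = [
    CountyMonth("County A", "2020-04", 130.0, 100.0, 20.0),
    CountyMonth("County B", "2020-04", 95.0, 100.0, 2.0),
]

for r in records:
    print(r.county, r.month, excess_deaths(r), excess_not_reported_as_covid(r))
```

In this toy example, County A has 30 excess deaths, 20 of them reported as COVID-19, leaving 10 excess deaths assigned to other causes. Deciding how many of those 10 were in fact COVID-19 deaths is the question the abstract's machine learning approach is meant to address.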


Presented in Session 2. Machine Learning Approaches for Population Research