(Some) COVID-19 deaths are probably vastly underreported pending

Update 07/18/2020: “Unclassified” includes “pending”

The mystery of the excess “unclassified” deaths has been resolved, thanks to here and here.

Long story short, the deaths in R00-R99 include those with a “pending” cause of death. So the changes in the R00-R99 deaths are not due to fraudulent data, but rather to a very, very slow reporting process. States are required to report deaths in a timely manner, but because determining the cause of death can take months (yes, months), the cases that are still pending are put in the “unclassified” category. This includes potentially sensitive and complicated deaths like drug overdoses. Over time these extra unclassified deaths get resolved.

Here is a chart from the CDC (Figure 3), which shows how the percent of “unclassified deaths” at a certain time decreases over a typical (read: non-COVID-19) year as the deaths get re-classified and put into appropriate categories.

For the sake of intellectual honesty, I’m leaving the post unedited to record my thought process. But I’m glad it appears that the “unclassified deaths” are not due to fraudulent COVID-19 reporting. (Though the pace of death classification can be strikingly slow!)


The CDC gives the number of deaths reported and coded for a given week, giving insight into how many people died of what cause.

The data is not perfect: It counts only the deaths received in that period, so it might not represent the deaths that actually occurred in that period. And some counts may be provisional or incomplete due to the lag between the time of death and the completion of the death certificate. This is especially true of the most recent weeks, which are often incomplete (especially due to COVID-19). Regardless, it is helpful to see historical trends.

One category is especially interesting to explore: the “mysterious” deaths due to “Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified (R00-R99)”. We will just refer to these as “Unclassified”.

Basically it is the junk drawer of deaths, but it remains fairly constant from year to year.

Here’s a plot of Unclassified deaths in 2020 in the USA, compared with the previous 5 years as a baseline. I’ve also plotted COVID-19 deaths, where COVID-19 has been marked as the underlying cause.

You can see the dark blue line continuing to increase, even as the total COVID-19 deaths fall. The unclassified deaths have climbed to several times the baseline.

Of course, the USA is big, so it helps to compare with some individual states. Here’s New York, New Jersey, and Massachusetts:

These Northeast states saw (or at least reported) huge numbers of COVID-19 deaths in April. The unclassified deaths are increasing in all three, though they are a small proportion relative to those coded as due to COVID-19. Even so, the unclassified deaths remain several standard deviations away from the baseline. It’s hard to see above, but the baseline is shaded light blue by plus-or-minus one standard deviation from the mean. (It’s easier to see below.)
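For anyone who wants to reproduce the “standard deviations from the baseline” comparison, here is a rough Python sketch. It is not the code behind the charts: the function names are hypothetical, the numbers are made up purely for illustration, and it assumes the weekly counts for each prior year have already been lined up week by week.

```python
from statistics import mean, stdev


def baseline_band(prior_years):
    """Per-week mean and sample standard deviation across prior years.

    prior_years: list of lists, one inner list of weekly counts per year,
    all the same length (an assumption of this sketch).
    """
    by_week = list(zip(*prior_years))  # regroup: one tuple of counts per week
    means = [mean(week) for week in by_week]
    sds = [stdev(week) for week in by_week]
    return means, sds


def deviations_from_baseline(current, means, sds):
    """How many standard deviations each current week sits from the baseline mean."""
    return [
        (c - m) / s if s > 0 else float("nan")  # undefined if the baseline never varies
        for c, m, s in zip(current, means, sds)
    ]


# Made-up weekly counts for five prior years (two weeks each), NOT CDC data.
prior = [
    [100, 110],
    [102, 108],
    [98, 112],
    [101, 109],
    [99, 111],
]
current = [103, 150]  # made-up counts for the current year

means, sds = baseline_band(prior)
z = deviations_from_baseline(current, means, sds)
for week, score in enumerate(z, start=1):
    print(f"week {week}: {score:.1f} standard deviations from baseline")
```

The light blue band in the charts corresponds to `means` plus or minus `sds`. With only five prior years, the standard deviation is itself a noisy estimate, so a score like this is best read as a rough gauge rather than a formal test.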

Compare that to some of the southern states that saw a huge resurgence in COVID-19 cases in May and June, like Florida, Arizona, and Texas:

These states have reported far fewer deaths in total than the Northeastern states. But look at the relative numbers. In Florida and Texas, the unclassified deaths have now outpaced the COVID-19 deaths. In Arizona, unclassified and COVID-19 deaths are neck-and-neck. In other words, unclassified deaths in these states are equal to or greater than the deaths coded as COVID-19. In all cases (just like in the Northeast) the unclassified deaths are several (> 5) standard deviations (the light blue shading) away from the 5-year baseline. These death rates never fell; at best they plateaued.

No reporting is perfect and in every state we see the “mysterious” unclassified deaths increasing several times beyond the baseline — we are talking several standard deviations beyond the mean. This is not normal. And for some states, these deaths are more than, or nearly equal to, the number of deaths due to COVID-19.

Every state wants to re-open now that lockdown has drop-kicked the economy. I want to go back to normal (whatever that means) more than anything. But re-opening based on “lower death rates” is more complicated than it seems if the data is indeed this complicated.

I’m not saying that all these unclassified deaths are COVID-19. But it would be very strange if most of them were not. The worst interpretation is that these data are being fudged to justify re-opening. Best case would be a strained medical reporting system. Maybe there is another explanation that I can’t think of – after all, it’s been a strange year.