Explained: Counting India’s Covid deaths

The World Health Organization (WHO) report on excess mortality due to Covid-19 is the latest in a series of exercises over the past year or more that suggest India’s official death toll is an undercount. The WHO report has pegged India’s excess mortality (people who probably would not have died had there been no pandemic) for 2020 and 2021 at 47.4 lakh. Several other studies have put India’s Covid-related death count at anywhere between 25 lakh and 60 lakh. In fact, the upper bound for the WHO study is even higher.

While these figures are being debated, many important nuances are being missed.

The official count

Any discussion on the undercount as of now is premature because India has not yet stopped counting its Covid deaths. The 5.24 lakh deaths counted until now is not the final official toll. The number is under constant revision, almost daily, and is likely to remain so for several months, if not years. Kerala, for example, is updating its death toll almost every day, and many other states have been doing it periodically. Last week, Assam added 1,300 deaths on a single day. Other states have made similar adjustments in the past.

Even the death numbers reported in 2020 and 2021 are not final. The more than 21,000 deaths Kerala has reported in the last four months did not all happen this year. Most of them pertain to last year. The 1,300 deaths Assam added on April 25 did not all happen that day, or that month, or this year. They most likely happened the previous year. Several hundred, possibly thousands, of the deaths that states adjusted in 2021 would have actually happened in 2020. The additions to the overall tally are made on the day these deaths are confirmed, not the day they might have happened.

That would mean that even though the death count for 2020 still shows up as 1.49 lakh, chances are that it has already been substantially corrected, and may be revised even further at a later stage. Many of the deaths that would have happened in 2020 but are not included in the 1.49 lakh tally would have been accounted for at a later stage. Those deaths have not been missed; they will be reflected in the statistics. The same is true for 2021.

It is difficult to measure the scale of the undercount in a situation like this, particularly when the counting exercise is still on. A physical count, and verification, of the dead in a country as vast as India during such chaotic times is bound to take a little more time than running some equations in a computer model.

The WHO report does not get into calculating the scale of the undercount, for India or any other country. It has done a more straightforward exercise of calculating excess mortality. It has estimated the total number of people who likely died in India in 2020 due to all causes and, from that, has subtracted the number of all-cause deaths that would have been expected had there been no Covid. These ‘excess’ deaths are considered to be a direct or indirect result of Covid-19.
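The subtraction behind an excess mortality figure can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the WHO's model: the function name is hypothetical, and the input figures are placeholders borrowed from the round numbers used later in this article, not actual estimates.

```python
LAKH = 100_000  # 1 lakh = 100,000


def excess_deaths(observed_all_cause, expected_all_cause):
    """Excess mortality: all-cause deaths observed minus deaths expected without Covid."""
    return observed_all_cause - expected_all_cause


# Placeholder inputs for illustration only (not WHO or SRS estimates).
observed = 90 * LAKH   # assumed all-cause deaths in a pandemic year
expected = 83 * LAKH   # assumed baseline deaths in a normal year

excess = excess_deaths(observed, expected)
print(f"Excess deaths: {excess / LAKH:.1f} lakh")
```

Both inputs are themselves estimates, so any error in either one carries straight through to the result.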

Multiple studies and estimates

It is often argued that, because multiple studies have been pointing to similar estimates, they must be reflective of the true death toll in the country. What is being overlooked is that these studies have been throwing up similar estimates probably because they have all been using similar mathematical models and statistical methods. The researchers, and those doing peer reviews, belong to an overlapping set of people.

What these studies certainly show is that there is general agreement in academic circles on the usefulness of these models in the current situation, possibly based on their ability to simulate reality in some earlier situations. That, however, is no guarantee that these models have an unquestionable ability to accurately mimic the dynamics of the current pandemic, whose nature and behavior are far from fully understood.

Computer modeling is routinely used by academics to simulate real-life situations. The accuracy of the results depends on the quality of the underlying data and assumptions. Because multiple layers of extrapolation are involved, slight changes in assumptions or input data can significantly alter results.

To estimate India’s Covid toll, the WHO study has relied, among various sources, on death registration data from the Civil Registration System (CRS). Several media houses had published monthly CRS data for a few states last year, and these numbers did not always match. Depending on which publication’s numbers were used, the output of the mathematical model would have been different.


Also, the monthly CRS death data, even if obtained through an RTI application, were only ‘provisional’ and subject to change. Only the numbers mentioned in the annual CRS report, released last week, are final.

The quality of assumptions can also induce large uncertainties in the final result. In March 2020, a widely quoted computer modeling study led by Ramanan Laxminarayan had predicted 1 to 3 million Covid-related deaths in India by the middle of April, roughly three weeks from that time. When nothing like that happened, Laxminarayan acknowledged that the risk of dying from Covid was actually a lot lower than he had originally assumed.

Scientists admit that they still do not fully understand the nature and behavior of this virus. It is thus difficult to assume that these processes and behaviors that epidemiologists have not fully understood have somehow found accurate description in modeling or machine learning algorithms.


As discussed in The Indian Express earlier, the CRS data released last week do not throw any fresh light on this debate on their own. That is because the CRS only has death registration data, and not every death in the country is registered. The estimate of actual deaths comes from the Sample Registration System (SRS), whose report for 2020 has not yet been released.

CRS and SRS are annual exercises that complement each other. The SRS uses a door-to-door survey in a few thousand sample towns and villages to produce an estimate of the total number of births and deaths in the country every year. This exercise is repeated after a few months to avoid duplication.

The CRS is a database of all births and deaths that get registered. The CRS count is therefore a subset of the total that the SRS estimates. Over the last few years, as more and more births and deaths are registered, the CRS numbers have been converging towards the SRS estimates.

These two systems might still not be perfect, but they are extremely robust sources of birth and death data. These data are consistent with the findings of the Census and a vast array of other data-collection exercises that together make up all the demographic, social and economic indicators that everyone agrees on.

SRS numbers from the past 15 years have established that about 83 lakh people die in India every year on average. If the SRS for 2020, whenever it comes out, reveals that 90 lakh or more people died in the country that year, instead of the expected 82 to 84 lakh, it would suggest that the computer models used by the WHO and other studies were accurate in estimating 8 lakh excess deaths due to Covid-19 in India in 2020. If the SRS numbers are not close to that, it can be inferred that excess deaths on that scale did not occur.
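The check described above can be written out as a rough sketch. The function names and the simple range test are assumptions made for illustration; the 82 to 84 lakh baseline and the 8 lakh modelled figure are the ones quoted in this article, and a real comparison would also have to account for the sampling error in the SRS itself.

```python
LAKH = 100_000  # 1 lakh = 100,000


def implied_excess_range(srs_deaths, baseline_low, baseline_high):
    """Range of excess deaths implied by an SRS total and a baseline range."""
    # The smallest excess comes from the highest baseline, and vice versa.
    return srs_deaths - baseline_high, srs_deaths - baseline_low


def consistent_with_model(srs_deaths, modelled_excess,
                          baseline_low=82 * LAKH, baseline_high=84 * LAKH):
    """True if the modelled excess falls inside the range implied by the SRS total."""
    low, high = implied_excess_range(srs_deaths, baseline_low, baseline_high)
    return low <= modelled_excess <= high


# Hypothetical SRS outcomes for 2020, tested against a modelled excess of 8 lakh.
for srs_total in (90 * LAKH, 84 * LAKH):
    verdict = consistent_with_model(srs_total, modelled_excess=8 * LAKH)
    print(f"SRS total {srs_total // LAKH} lakh -> consistent with model: {verdict}")
```

An SRS total of 90 lakh implies 6 to 8 lakh excess deaths, which covers the modelled figure; a total of 84 lakh implies at most 2 lakh, which does not.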

