Saturday, April 2, 2022

The Danger of Comparing Proportions

In Ellenberg's book How Not to Be Wrong, Chapter 4, How Much Is That in Dead Americans?, the author writes about a common device for conveying the scale of a tragedy: calculate the proportion of a country's population killed in some accident or terrorist bombing, then report the number of deaths that same percentage would represent in the US. Purportedly this is done so Americans can grasp the equivalent tragedy if it happened here.

It makes sense on the face of it, and I'd personally never questioned it. If the proportion is high, presumably one feels the tragedy at a personal level by imagining it happening in one's own country.

So in this chapter, Ellenberg takes us on a mathematical tour of the methods of determining the 'equivalences' of such calamities and I found it quite illuminating.

I will walk through his reasoning below to retrace that journey in this blog post:

1. While using proportions per country seems to make sense at first glance, he notes that if you can compute these numbers in multiple ways and get different answers, that is an indicator that this may not be the best method to use.

Example: If a bombing in Tegucigalpa, Honduras killed 200 people, you might see news articles stating that the equivalent death toll, had a similar bombing happened in NYC, would be close to 1,400. If you're in New York, you'd of course be shocked, and that is the desired effect.

To get to 1,400 deaths, you do this:

              Population   Pct killed (200 / population)
Tegucigalpa    1,158,000   0.01727%
Honduras       9,905,000   0.00202%

So using 0.01727% and multiplying by NYC's population gives you:
Equivalent victims

        Population   Tegucigalpa-basis   Honduras-basis
NYC      8,000,000             1,381.7            161.5
USA    330,000,000            56,994.8          6,663.3

1,381.7 is ~1,400 victims. Shocking!
But as you can see, there are other numbers you could compute, for instance by using the entire country as the basis instead (Honduras vs. USA). There you get 6,663 victims. Shocking as well. One can dial the shock up or down as one sees fit.
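The arithmetic above can be sketched in a few lines; the population figures are the approximate ones from the tables:

```python
# The 'method of proportions': scale one city/country's death rate up
# to another place's population.
deaths = 200

basis_pops = {"Tegucigalpa": 1_158_000, "Honduras": 9_905_000}
target_pops = {"NYC": 8_000_000, "USA": 330_000_000}

equivalents = {}
for basis, base_pop in basis_pops.items():
    rate = deaths / base_pop  # proportion of the basis population killed
    for place, pop in target_pops.items():
        equivalents[(basis, place)] = rate * pop
        print(f"{basis}-basis -> {place}: {rate * pop:,.1f} equivalent victims")
```

Four defensible-looking calculations, four very different headlines.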

So Ellenberg states: "This multiplicity of conclusions should be a red flag. Something is fishy with the method of proportions."

Of course, we can't just throw out proportions! They can be useful, depending on the circumstances. So Ellenberg continues his tour by diving deeper.

2. He next goes into deaths by brain cancer as calculated by state. We do this on a proportional basis, as that is preferable to absolute numbers. Absolute numbers by state aren't useful because populations differ widely per state; thus we are better off "computing the proportion of each state's population that dies of brain cancer each year". Calculating this by state showed that South Dakota came in first with 5.7 brain cancer deaths per 100,000 people per year, well above the national rate of 3.4. However, North Dakota was at the bottom of the list! Why is one Dakota at the top and the other at the bottom?! Similarly with Vermont and Maine: in Vermont the rate is low, but Maine is in the top 5!

What these states have in common is that the populations are low! "Living in a small state, apparently, makes it either much more or much less likely you'll get brain cancer."

This sounds ridiculous, so there must be something else going on. To get to the bottom of it, the author turns to coin-flipping and demonstrates with a random simulation that small samples show a lot more variability. This is at the heart of the Law of Large Numbers, which states that as the sample size grows, the sample mean approaches the mean of the underlying population. So with small samples you'll get more variability, but as the sample sizes grow, you'll get more stability in the long run.
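A coin-flipping simulation in the spirit of the one Ellenberg describes (the batch sizes and trial counts here are my own choices) makes the point concrete:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Flip a fair coin in batches of a given size, many times over, and
# record the spread of the observed heads-rate across the batches.
def spread(n_flips, n_trials=1_000):
    rates = [sum(random.random() < 0.5 for _ in range(n_flips)) / n_flips
             for _ in range(n_trials)]
    return min(rates), max(rates)

small = spread(10)      # tiny batches: wildly varying heads-rates
large = spread(2_000)   # big batches: heads-rates hug 0.5

print(f"  10 flips per batch: heads-rate ranged over {small}")
print(f"2000 flips per batch: heads-rate ranged over {large}")
```

The range of observed rates for the small batches is far wider than for the large ones, even though every coin is the same fair coin.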

Thus, this explains what's happening with brain cancer. "Measuring the absolute number of brain cancer deaths is biased towards the big states; but measuring the highest rates - or the lowest ones!- put the smallest states in the lead. That's how South Dakota can have one of the highest rates of brain cancer death while North Dakota claims one of the lowest... it's because smaller populations are inherently more variable."

He then goes on to give examples from (1) the NBA, where players you've never heard of have the highest shooting averages simply because they made a couple of shots in the limited playing time they had, and (2) schools in North Carolina doing well (and badly) on standardized tests; in the latter example he states "the reason small schools dominate the top 25 isn't because small schools are better, but because small schools have more variable test scores. A few child prodigies or a few third-grade slackers can swing a small school's average wildly". He continues "So how are we supposed to know which school is best, or which state is most cancer-prone, if taking simple averages doesn't work? If you're an executive managing a lot of teams, how can you accurately assess performance when the smaller teams are more likely to predominate at both the top and bottom tiers of your rankings?"
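You can watch small schools crowd both ends of a ranking with a quick simulation. Here every student's score comes from the same distribution; the school sizes and the score distribution are invented for illustration, not taken from the book:

```python
import random

random.seed(2)  # fixed seed for a reproducible run

# Every student draws a score from the SAME distribution; only the
# school sizes differ (25 vs. 2,000 students, both illustrative).
def school_average(n_students):
    return sum(random.gauss(500, 100) for _ in range(n_students)) / n_students

schools = [("small", 25)] * 50 + [("large", 2_000)] * 50
ranked = sorted(((school_average(n), kind) for kind, n in schools),
                reverse=True)

print("top 5:   ", [kind for _, kind in ranked[:5]])
print("bottom 5:", [kind for _, kind in ranked[-5:]])
```

Despite identical "true" quality, the small schools dominate both the top and bottom of the ranking purely through sampling noise.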

"There is, unfortunately, no easy answer... You could accomplish this by taking some kind of weighted average [of the state rate] with the national rate. But how to weigh the two numbers? That's a bit of an art, involving a fair amount of technical labor."
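One common way to realize that weighted average is population-based shrinkage toward the national rate (an empirical-Bayes flavor). This is my own sketch of the idea, not Ellenberg's specific recipe, and the `PRIOR_STRENGTH` pseudo-population is an illustrative tuning knob:

```python
# Shrink each state's raw rate toward the national rate; the weight on
# the raw rate grows with the state's population.
NATIONAL_RATE = 3.4       # brain cancer deaths per 100,000 (from the chapter)
PRIOR_STRENGTH = 500_000  # hypothetical pseudo-population controlling the pull

def shrunk_rate(deaths, population):
    raw = deaths / population * 100_000
    w = population / (population + PRIOR_STRENGTH)
    return w * raw + (1 - w) * NATIONAL_RATE

# A tiny state with a scary raw rate of 10 gets pulled most of the way back...
print(shrunk_rate(10, 100_000))        # -> 4.5
# ...while a big state with the same raw rate mostly keeps it.
print(shrunk_rate(1_000, 10_000_000))  # -> ~9.69
```

Choosing `PRIOR_STRENGTH` is exactly the "bit of an art" the quote alludes to: it decides how much evidence a state must accumulate before its own data outvotes the national figure.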

Incidentally, this variability-in-small-samples explanation rang a bell, so I looked up why. It was because I had previously read another excellent book: The Flaw of Averages by Sam L. Savage. In Chapter 17, The Flaw of Extremes, he characterizes the same phenomenon, asking "Did you know that localities whose residents have the largest average earlobe size tend to be small towns?" This seems baffling until he explains it: "The sizes of towns and earlobes have nothing to do with each other; it's just that averages with small samples have more variability than averages over large samples". This applies not just to earlobes but to prevalence of diseases, crime rates, educational test scores, and anything else you may care to average. "To summarize, the flaw of extremes results from focusing on abnormal outcomes such as 90th percentiles, worse than average cancer rates, or above average test scores. Combining or comparing such extreme outcomes can yield misleading results."

To bring this back to calculating which tragedies are worse, there is no one-size-fits-all rule. If you use the proportion rule, then for the 20th century you get: (1) Massacre of the Herero of Namibia by German colonists, (2) slaughter of Cambodians by Pol Pot, (3) King Leopold's war in the Congo. Hitler, Stalin, Mao and the big populations they killed don't make the list. So obviously the proportions ranking is not a good measure. "How much distress should we experience when we read about the deaths of people in Israel, Palestine, Nicaragua, or Spain?"

Ellenberg gives a rule of thumb: "If the magnitude of a disaster is so great that it feels [right] to talk about 'survivors', then it makes sense to measure the death toll as a proportion of total population." He gives the example of the Rwandan genocide: there we can talk about survivors, so we use proportions; in this case, 75% of the Tutsi population was wiped out. Then we can describe other disasters that wiped out a similar ratio as "the equivalent of the Rwandan genocide" (say, if 75% of the Swiss population were wiped out in a catastrophe).

On the other hand, in some cases, you would stay away from using proportions (and stop equating tragedies in one country to a supposed tragedy in another) when there's no need to talk about survivors. You wouldn't call someone who lives in Seattle a 'survivor' of the World Trade Center attack for example. "So it's probably not useful to think of deaths at the WTC as a proportion of all Americans. Only about one in a hundred thousand Americans, or 0.001%, died at the WTC that day. That number is too close to zero for your intuition to grasp hold of it; you have no feeling for what that proportion means."

I also like what Ellenberg says at the beginning of the chapter. "When there are two men left in the bar at closing time, and one of them coldcocks the other, it is not equivalent in context to 150 million Americans getting simultaneously punched in the face"; yet this is the approach that's taken when 'equating' tragedies across different countries. (Or perhaps it's done primarily with Americans when some political party wants to push a talking point in favor of their agenda).

At the end of the chapter, he poses the question: "So how are we supposed to rank atrocities, if not by absolute numbers and not by proportion? Some comparisons are clear. The Rwanda genocide was worse than 9/11 and 9/11 was worse than Columbine and Columbine was worse than one person getting killed in a drunk-driving accident. Others, separated by vast differences in time and space, are harder to compare....the question of whether one war was worse than another is fundamentally unlike the question of whether one number is bigger than another...if you want to imagine what it means for 26 people to be killed by terrorist bombings, imagine 26 people killed by terrorist bombings - not halfway across the world, but in your own city."

So to review:

(1) Absolute numbers aren't sufficient to compare.

(2) Proportions are better for comparison, but even then they should be reserved for 'exceptional' cases (e.g., when you can talk about survivors). The point here is that proportions can be gamed based on what you use as the denominator; so if you come across these kinds of comparisons, be wary and cognizant of the author's possible agenda in swaying you.

(3) A side journey into using proportions: be careful about drawing conclusions when small samples are involved.
