Wednesday, April 6, 2022

Misbehaving Proportions

In Chapter 5, More Pie than Plate, Ellenberg continues his journey into how proportions can mislead. It boggles my mind how something as simple as a percentage or proportion can be gamed! In fact, it's quite possible that the people coming up with these numbers aren't even aware of the problem themselves. Maybe they just do a back-of-the-envelope calculation, see that it advances their agenda, and slap it on an advertisement or endorsement somewhere. But that view is probably too forgiving.

The key lesson in this chapter is: "Don't talk about percentages of numbers when the numbers might be negative...When you disobey the slogan I gave you, all sorts of weird incongruities start to bubble up."

It allows you to tell fake stories!

He gives plenty of real-life examples. One instance is when Governor Scott Walker of Wisconsin touted that "50 percent of U.S. job growth in June came from our state". This was based on the US economy as a whole adding only 18,000 jobs nationally, and Wisconsin having a net increase of 9,500 jobs. Thus 9,500/18,000 is about 53%, just over half. That seems fair at first, but upon closer examination, it's weird that one state could claim over 50% of the jobs created. It turns out this is what happens when you include negative numbers in the analysis. By the same logic, Minnesota could have claimed 70% of the job creation! Texas, California, Michigan and Massachusetts all beat Wisconsin too! How could that be? What's happening is that "job losses in other states almost exactly balanced out the jobs created in places like Wisconsin, Massachusetts and Texas". Taken together, the gains and losses net out to 18,000 jobs nationally. Putting Wisconsin's 9,500 jobs over that small net figure allows Walker to take credit for something that in reality is just a mathematical oddity.
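To make the arithmetic concrete, here is a minimal Python sketch. The Wisconsin and national figures are the ones quoted above; the state-level numbers (`State A` through `State D`) are hypothetical, invented only so that gains and losses net out to 18,000. The helper `share_of_net` is my own name, not anything from the book.

```python
def share_of_net(part, net_total):
    """Return `part` as a percentage of a net total."""
    return 100.0 * part / net_total


# Figures quoted in the text above.
wisconsin = 9_500
national_net = 18_000
print(f"Wisconsin: {share_of_net(wisconsin, national_net):.1f}%")

# HYPOTHETICAL state-level net changes, invented for illustration only.
# They are chosen so that gains and losses sum to the 18,000 national net.
state_changes = {
    "State A": 13_000,
    "State B": 9_500,
    "State C": 12_000,
    "State D": -16_500,  # a state with net job losses
}

shares = {
    name: share_of_net(change, national_net)
    for name, change in state_changes.items()
}
for name, pct in shares.items():
    print(f"{name}: {pct:.1f}%")

# Add up only the states that gained jobs.
positive_total = sum(pct for pct in shares.values() if pct > 0)
print(f"Sum of positive shares: {positive_total:.1f}%")
```

With these made-up numbers, the gaining states' shares add up to well over 100%, which is exactly the "more pie than plate" problem: once the total contains negative pieces, the individual percentages stop meaning "a slice of the whole".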

Other juicy examples are:

  1. Income Inequality: A common way to pick apart the data is to separate out the 1% and the 99% and show how the 99% keep losing ground. However, if you dive into the rest of the top 10% (excluding the top 1%), you can see that this next cohort (roughly the 90th to 99th percentile) has also been increasing its gains in income. And for this to happen, everyone below that cohort must actually be making negative gains! So to accommodate this kind of growth for the top 10% or so, the other 90% or so are actually making less! This should probably be more of a headline than just saying that the top 1% or 10% are outpacing everyone else in their shares of income.
  2. An ad campaign in the 2012 US election claimed that "Women account for 92.3% of all jobs lost under Obama". This kind of outlandish claim is suspect on its face, so there must be some kind of gaming of numbers here. And indeed there is. The Romney campaign, which was responsible for this ad, took the net job loss among women over a set period of time and divided it by the net job loss overall to arrive at the 92.3% figure. While the mechanics of the calculation are technically correct, this is not really the right thing to measure. It's a spin on the numbers to claim that Obama is responsible for the drop. If the purpose of the ad is to make Obama look bad, mission accomplished! "The net job loss is positive sometimes, and negative other times, which is why taking percentages of it is a dangerous business." So if you want the truth, you need to ask a different question (and by association, make a different calculation). But that's only if you want the truth.

One indication that this is a bad methodology is that if they had shifted the start of their calculation to a different month (i.e., started in Feb 2009 rather than Jan 2009), they would have shown that "women accounted for over 3,000% of all jobs lost on Obama's watch!" But that is OBVIOUSLY silly, so they couldn't make that claim and have it taken seriously. A result that swings this wildly with the choice of starting month should put us on red alert that there's something fishy about the claim. This is reminiscent of Gary Smith's chapters on Data Mining; he raised the example of the Foolish Four stock-picking strategy. Of course, you had to buy the stocks in January. Researchers tried the strategy but chose July as the month to purchase the stocks, and guess what? It was a miserable failure. Looks like the Romney campaign employed its own data-mining strategy.
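The start-date sensitivity can be sketched in a few lines of Python. All the monthly numbers below are hypothetical; I picked them so the two starting months roughly mimic the 92% and 3,000% figures, and they are not the actual jobs data. The function name `women_share_from` is also just mine.

```python
# HYPOTHETICAL monthly net job changes (in thousands), invented for
# illustration. Index 0 stands in for the first month of the window.
total_change = [-99, -50, 20, 29]   # everyone
women_change = [-62, -20, -15, 5]   # women only


def women_share_from(start):
    """Women's net change as a percentage of the total net change,
    counting from month `start` to the end of the window."""
    women_net = sum(women_change[start:])
    total_net = sum(total_change[start:])
    return 100.0 * women_net / total_net


for start in range(3):
    print(f"start at month {start}: {women_share_from(start):.1f}%")
```

Starting at month 0, the net loss is large and the share looks plausible. Starting one month later, the total net change is close to zero, so the same division explodes. Starting at month 2, the total is a net gain and the "share of jobs lost" comes out negative. Same data, same formula, three very different stories.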


What I find fascinating in all these examples is that one of the most fundamental concepts in math, one that elementary schoolers learn to compute (the proportion or percentage), can be misused in such a dramatic way! Just because you know how to compute something doesn't mean it's the right thing to compute. Take a grade-school example: dividing 3 apples by 4 oranges when the question asks for the ratio of oranges to apples. There, the mistake is obvious. These real-life examples are more insidious, because it's not readily apparent what the problem is. You take net numbers, divide, and show a drastic increase in crimes, or that women account for x% of the jobs lost.

Either the people doing these calculations aren't aware of what their 'innocent' calculations actually imply, or they are intentionally misleading us. How do we poke holes in this? Training ourselves to recognize the ways in which we can be misled is important...

 

