Tuesday, November 21, 2006

Look beyond the numbers

Information visualization teaches us the intricacies of displaying data graphically, and how one must look beyond first impressions to see the "truth" behind the data (you can see a gallery of "best" and "worst" examples here).

To take a (presumably) fictitious example: a line chart showing how the cost of living has gone down at the same time that sales of the automotive Big Three have gone up. A politician representing Michigan might be able to get considerable mileage (no pun intended) out of a chart like that. Problem is, the data may be true in and of themselves, but placing them together gives the impression that they are interrelated, when in fact it's very likely that they're not.
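To see how easily two unrelated trends can look linked, here is a minimal Python sketch. The numbers are invented purely for illustration (as fictitious as the chart itself), and `pearson` is a hand-rolled helper written for this post, not a library function.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Made-up yearly figures: one series drifts down, the other drifts up.
cost_of_living_index = [100, 98, 97, 95, 92, 90]
big_three_sales = [50, 53, 54, 58, 61, 63]

r = pearson(cost_of_living_index, big_three_sales)
print(f"correlation: {r:.2f}")
```

The coefficient comes out strongly negative simply because each series follows its own trend over the same years. Nothing in the calculation says that one drives the other.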

The lesson is: the data doesn't necessarily support the picture the chart intends to portray.

Recently I came across a website for an elite boarding school in the Northeast. In the Admissions section, there is a page detailing grants and financial aid, with the following sentence:

"Intent on enrolling an economically diverse student body, (the school) offers
financial aid to students, based on financial need. In this school year, $2.8
million of financial aid was awarded to 28 percent of the student body."

A very admirable goal, and with board-in tuition at just shy of $39,000/year, $2.8 million in grant aid is truly nothing to sneeze at. This chart is offered as an outline of their distribution method:

Income Level          Grants more than   Grants from          Grants up to
                      $30,000            $15,001 - $30,000    $15,000
$0 - $25,000          9                  0                    0
$25,001 - $50,000     5                  1                    0
$50,001 - $75,000     13                 6                    1
$75,001 - $100,000    5                  6                    2
Over $100,000         9                  22                   7
Total                 41                 35                   10

After looking at it for a while, I decided to do the math, using the maximum dollar amounts for each grant award type (i.e., for this example, "grants more than $30,000" equals the full in-board price of $38,800) and came up with the following:

Income Level          % of grant    % of grant    Grant $ awarded
                      recipients    money         (in dollars)
$0 - $25,000          10%           12.5%         349k
$25,001 - $50,000     7%            8%            224k
$50,001 - $75,000     24%           25%           699k
$75,001 - $100,000    15%           14.5%         404k
Over $100,000         44%           40%           1114k
Total                                             2790k
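
For anyone who wants to check the arithmetic, here is a minimal Python sketch. The grant counts come from the school's chart. The dollar value per grant is my assumption, as described above: the top of each band, with the open-ended band capped at the full $38,800 price. The names `counts`, `MAX_GRANT`, and `summarize` are mine.

```python
# Grant counts per income band, copied from the school's chart:
# (grants > $30,000, grants $15,001-$30,000, grants up to $15,000)
counts = {
    "$0 - $25,000": (9, 0, 0),
    "$25,001 - $50,000": (5, 1, 0),
    "$50,001 - $75,000": (13, 6, 1),
    "$75,001 - $100,000": (5, 6, 2),
    "Over $100,000": (9, 22, 7),
}

# ASSUMPTION: every grant is worth the maximum of its band; the
# open-ended top band is capped at the full price of $38,800.
MAX_GRANT = (38_800, 30_000, 15_000)


def summarize(counts, max_grant=MAX_GRANT):
    """Return ({band: (recipients, dollars)}, total recipients, total dollars)."""
    rows = {}
    for band, per_type in counts.items():
        dollars = sum(n * value for n, value in zip(per_type, max_grant))
        rows[band] = (sum(per_type), dollars)
    total_recipients = sum(r for r, _ in rows.values())
    total_dollars = sum(d for _, d in rows.values())
    return rows, total_recipients, total_dollars


rows, total_recipients, total_dollars = summarize(counts)
for band, (recipients, dollars) in rows.items():
    print(
        f"{band:<20}"
        f"{recipients / total_recipients:>8.1%}"
        f"{dollars / total_dollars:>8.1%}"
        f"{dollars:>12,}"
    )
print(f"{'Total':<20}{'':>16}{total_dollars:>12,}")
```

Because every grant is counted at its band's maximum, the dollar figures are upper bounds rather than actual awards, and the percentages may differ slightly from the rounded ones in the table.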

Let's take it a bit further -- remember, according to the opening statement, 72% of the student body is not receiving any grant aid, and it's reasonable to assume that those students come from households earning more than $100,000/year. If almost half (44%) of the remaining 28% are from >$100k households that are receiving grant aid of some kind, that means that roughly 85% of the total student body (72% plus 44% of 28%, or about 12%) falls into that income category.
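
The same estimate as a short Python sketch. It rests entirely on the assumption stated above, that every student not receiving aid comes from a >$100k household; the function name `share_over_100k` is mine.

```python
def share_over_100k(aided_share, aided_over_100k_share):
    """Estimate the fraction of all students from >$100k households.

    ASSUMPTION: every student who receives no aid comes from a
    household earning more than $100,000/year.
    """
    unaided_share = 1.0 - aided_share
    return unaided_share + aided_share * aided_over_100k_share


# 28% of students receive aid; 44% of those recipients are >$100k.
estimate = share_over_100k(aided_share=0.28, aided_over_100k_share=0.44)
print(f"{estimate:.1%}")
```

If some of the unaided students actually come from lower-income households, the true share would be lower, so this is best read as a ceiling.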

So, upon closer examination of the data, those for whom "economic diversity" is a major concern might want to start asking some detailed questions.

Of course, there is equal danger in using this data to make a potentially false judgement of the school as a haven for the well-to-do -- after all, I'd venture to guess that many of those >$100k households are middle-class two-earner families, just trying to stay afloat while giving their children access to an academically advantageous environment.

The point here is not to use data to make snap judgements, but rather to advocate for a healthy skepticism of data representations, so that more informed decisions can be made.
