When I got away
I only got so far
--Slipknot, "Dead Memories"
I am a new parent; my daughter is 3 years old. Like many new parents, I’m incredibly stressed that I’m doing it wrong, whatever "it" is, with her. I’m talking too much, or not enough. I’m praising too much, or not enough. I’m allowing her to fail too much, or … well, you get the picture.
And there is no lack of “research” to guide — or exacerbate — my worrying. I read it all — books, magazines, blog posts. You name it, I’ve probably read it.
It’s not surprising, I guess, that it all seems compelling. In fact, two pieces of research suggesting the exact *opposite* of each other can be equally compelling.
Sigh, this parenting thing is hard.
So, when confronted with something hard and anxiety-inducing, I fall back on the math.
Because, my dear readers, the math does, in fact, matter.
I am a music junkie. And I grew up during the PMRC era, when bad hair and lazy thinking created a cultural meme about the dangers of music. Parenthetically, Dee Snider’s Congressional testimony on the topic is epic.
But, lazy thinking aside, there must be some relationship between the media we are exposed to and our views of the world and willingness to act. I mean, propaganda works, right? So music must have some impact. Seems logical.
So when that renowned journalistic empire, USA Today, posted an article about the connection between rock music lyrics and binge drinking, I listened.
And then read.
And then sighed.
OK, the relevant paper is “Receptivity to and Recall of Alcohol Brand Appearances in U.S. Popular Music and Alcohol-Related Behaviors”.
I won’t go through the article in detail, but let’s pick on the math. First, they created a complicated coding system for whether you liked a song, owned a song, and could identify any of the brands in the song.
They used 10 songs, scoring each song once if you liked it and a second time if you owned it. So, there were up to 20 points to be had on the song side. The mean score of song points was 3.7. The standard deviation was 4.2.
Sigh. Keep in mind the normal distribution rules: ~68% of all responses fall between the mean plus one standard deviation and the mean minus one standard deviation. Here the standard deviation was 4.2, against a mean of 3.7, so the standard deviation is greater than the distance from the mean to zero. For normal distribution rules to hold in this case, a substantial number of respondents would have had to have scores that are less than zero (one standard deviation below the mean is 3.7 - 4.2 = -0.5). Negative scores require, yup, that the subjects give points back in some magic way.
That’s unlikely. So, obviously, the distribution isn’t normal. And, it can’t be normal, anyway, because it’s a count, and counts aren’t normal, but whatever.
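To put a rough number on "substantial", here is a minimal Python sketch. It assumes, purely for the sake of argument, that the scores really were normal with the reported mean and standard deviation, and asks what share of respondents would land below zero. The helper name `normal_cdf` is mine, not anything from the paper.

```python
import math

def normal_cdf(x, mean, sd):
    """P(X <= x) for a normal distribution, computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

# Mean and standard deviation of the song score, as reported in the paper.
MEAN, SD = 3.7, 4.2

# Share of respondents a normal curve would place below a score of zero.
share_below_zero = normal_cdf(0.0, MEAN, SD)

print(f"Implied share of negative scores: {share_below_zero:.1%}")
```

Under that assumption, somewhere just under a fifth of the sample would need an impossible negative score, which is why the normal model can't be the right one here.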
So, any traditional significance coefficient, or confidence interval, is largely random — it might be right, but you can’t assess how likely it is to be right.
They then took this wildly strange variable and broke it up into thirds (low, medium, high). (Humorously, they refer to these as “tertiles” which is correct, but incredibly pompous.) Low was defined as 0, medium as 1 - 4, high as 5 or more. Keep in mind that a respondent scoring 5, the threshold for the “highs”, needed to like and own only about 2 of the 10 songs. By breaking the results up, they are able to obfuscate the non-normality and dial up any potential relationships.
Back to the research, but this time on the alcohol brand recognition side. Only about 8% of all respondents were able to correctly identify even a single brand across the 10 songs. In other words, 92% of the respondents couldn’t figure out a single brand. Naturally, when facing what is called a wildly unbalanced data set, they did what you shouldn’t do — split the results into 2 groups (yes or no), which makes it sound like you have half the results in each group.
Their behavior measures make more sense mathematically, but are confusing to me semantically. Their most serious categories are “have reported bingeing at least monthly” (which is >6 drinks on one occasion; yikes!) and “reported problems such as injuries due to alcohol” (which sounds absolutely terrifying). The “injuries” category includes a bunch of scary things, but then also includes “feeling guilty after drinking” (which seems far less serious than the category sounds). Since answering yes to any of the 7 injury questions makes "been injured" a yes, I wonder how much of that data is due to guilt rather than injuries. Don't know. But regardless, that’s a semantic issue, not directly a math one, so I’ll move on.
...to another point they get wrong. The authors use an Odds Ratio to see if the groups have differential odds for the alcohol outcomes based on the segments from above. I think they really mean a risk ratio, since that captures the relative likelihood of something bad happening in different cases. Odds ratio and risk ratio are approximately the same under something called the “rare disease assumption”, which basically says “the outcome happens so infrequently that the odds and the risk are nearly the same number”. I don't think that applies here, so I think they simply used the wrong statistic.
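A quick sketch of why the distinction matters. The counts below are invented for illustration and are not from the paper; the function names are mine. When the outcome is rare the two statistics nearly agree, and when it is common the odds ratio overstates "how many times as likely".

```python
def risk_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the probability of the outcome in the two groups."""
    return (cases_exposed / n_exposed) / (cases_unexposed / n_unexposed)

def odds_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the odds (cases / non-cases) in the two groups."""
    odds_exposed = cases_exposed / (n_exposed - cases_exposed)
    odds_unexposed = cases_unexposed / (n_unexposed - cases_unexposed)
    return odds_exposed / odds_unexposed

# Hypothetical rare outcome: 2 of 100 exposed vs. 1 of 100 unexposed.
rare_rr = risk_ratio(2, 100, 1, 100)
rare_or = odds_ratio(2, 100, 1, 100)

# Hypothetical common outcome: 60 of 100 exposed vs. 40 of 100 unexposed.
common_rr = risk_ratio(60, 100, 40, 100)
common_or = odds_ratio(60, 100, 40, 100)

print(f"rare outcome:   RR = {rare_rr:.2f}, OR = {rare_or:.2f}")
print(f"common outcome: RR = {common_rr:.2f}, OR = {common_or:.2f}")
```

In the made-up common case the exposed group is 1.5 times as likely to have the outcome, yet the odds ratio comes out at 2.25. Reading an OR as "twice as likely" is only safe when the outcome is rare.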
Here's where it gets interesting. If you read the text of their results, it sounds like the results are stunningly clear: this factor has an OR of 2 (by which they mean that this is twice as likely as that), this other factor has an OR of about 3, and so on. They also give confidence intervals of the OR (which can’t be computed, since we are in a wildly non-normal distribution that is then chopped up into dichotomous variables that look balanced, but aren’t). The results section of their paper, which drove the USA Today article, reads like a very clear indictment of alcohol in music.
However, when you look at the tables of their results, you discover that in almost all cases, the OR is driven by an alias for age: people who are 15, for example, are wildly less likely to have had a full drink than the sample as a whole, while people who are 21 are wildly more likely to have had a full drink than the sample as a whole. Right. 21 is the legal drinking age. It’s possible that on or about one's 21st birthday *everyone* goes out and drinks. This weird age effect, if ignored, would create the entire OR finding about binge drinking.
White on
White translucent capes
--Bauhaus, "Bela Lugosi's Dead"
In other words, their data — insofar as I can recreate it from their paper — simply shows that 21 year olds drink more than 15 year olds. Which isn’t terribly surprising.
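Here is a toy version of how that can happen. Every count below is made up (I don't have their raw data), and the group labels are hypothetical. Within each age group, the "high music score" and "low music score" respondents drink at exactly the same rate, so the odds ratio is 1. Pool the ages together, though, and an odds ratio of 4 appears out of nowhere, simply because the older group both scores higher and drinks more.

```python
def odds_ratio(cases_exposed, n_exposed, cases_unexposed, n_unexposed):
    """Ratio of the odds (cases / non-cases) in the two groups."""
    odds_exposed = cases_exposed / (n_exposed - cases_exposed)
    odds_unexposed = cases_unexposed / (n_unexposed - cases_unexposed)
    return odds_exposed / odds_unexposed

# Hypothetical counts per age group:
# (drinkers with high score, n high score, drinkers with low score, n low score)
strata = {
    "15-year-olds": (10, 100, 40, 400),   # 10% drink, regardless of score
    "21-year-olds": (240, 400, 60, 100),  # 60% drink, regardless of score
}

# Odds ratio within each age group: music score makes no difference.
or_by_age = {age: odds_ratio(*counts) for age, counts in strata.items()}

# Odds ratio after pooling the age groups and ignoring age entirely.
pooled_counts = [sum(column) for column in zip(*strata.values())]
crude_or = odds_ratio(*pooled_counts)

for age, value in or_by_age.items():
    print(f"{age}: OR = {value:.2f}")
print(f"ages pooled: OR = {crude_or:.2f}")
```

Nothing about music is doing any work in this toy data. The pooled odds ratio is entirely an age effect, which is the pattern their tables suggest.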
This kind of thing really annoys me. There is some scary data in this paper about binge drinking. There should be a policy implication there, or something we should be paying attention to. By wrapping that fact in a bodyguard of mathematical lies, they created an unworthy article (that was USA Today-worthy) that added absolutely nothing to the ongoing policy debate about music, media, and development.
We can do better.