Science Memes

24 July 2011 by Drew Crawford Published in: rants 2 comments

The trouble with science is that we tend to take results out of context, treating them as magical universal truth instead of something that happened at a particular time under particular conditions.

Audio compression

David Pogue from the NY Times writes:

Yes, these songs are encoded at a higher bit rate (256 kbps instead of 128). But does that mean twice the quality?
Not by a long shot. So far, two different tech outfits — Gizmodo and Maximum PC — have performed careful comparison tests, A/B tests using fancy headphones, to see if people can hear any difference between the old and new iTunes formats. And you know what?
People really can’t.

This is true, probably. “People” can’t hear the difference between bitrates–but I am not some arbitrary person, I am a specific person. And I can tell the difference.

Methodology

I chose three tracks from my library with which I am intimately familiar and which I know from experience make lesser encoders cry. Groups “A” and “B” each contained all three tracks, with A encoded at one bitrate and B at another. I could listen to any track for as long as I wanted, but ultimately had to pick the group with the higher bitrate. I used a pair of Tapco S8 studio monitors and a Sony MDR-7508 headset in any combination I wanted (my own equipment, with the response of which I am intimately familiar).

A couple of things to note here. First of all, while being fully scientifically valid, my methodology is more realistic to my own listening (and less clinical): I could play whole songs, repeat them as often as I liked, and so on. Many people in the cited “studies” do not use their own equipment or music library, and in my experience it takes time to acclimate to new equipment and music, which makes their results somewhat questionable. Second, my equipment provides fairly flat frequency response for some definition of reasonable price (audiophiles, man the vinyl-mobile!). A lot of other fancy equipment produces a more aesthetically appealing response curve, which can mask compression artifacts but in my personal judgment is not true to the original track.

And let’s face it–in real life, you listen to music on your own equipment over and over again, not in some room with questionable acoustical properties!

Results

Lame VBR 3.x, n >= 30

64kbps vs 128kbps (p = .01)

128kbps vs 192kbps (p = .01)

192kbps vs 256kbps (p = .03)

256kbps vs 320kbps (p = .07)

AAC VBR (iTunes 9), n >= 30

64kbps vs 128kbps (p = .01)

128kbps vs 192kbps (p = .02)

192kbps vs 256kbps (p = .05)

256kbps vs 320kbps (p = .16)

So as you can see, I can hear MP3 artifacts up to probably 320kbps, and AAC artifacts up to 256kbps. I don’t necessarily mean to slam those “studies”, as they measure the general population, and I am a very specific person. But, when people on HN trot out things like “320kbps is just a waste of bandwidth”, we’re stretching the available “science” quite a bit too far.
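
For a sense of what those p-values demand, here is a minimal sketch. It assumes each comparison was scored with a one-sided exact binomial test against coin-flip guessing, and it assumes exactly 30 trials. The post only says n >= 30 and does not name the test, so both are assumptions, and `tail_probability` and `min_correct` are hypothetical helpers written for this illustration.

```python
from math import comb

def tail_probability(n, k, chance=0.5):
    """P(at least k correct out of n) if every pick is a pure guess."""
    return sum(
        comb(n, i) * chance**i * (1 - chance) ** (n - i)
        for i in range(k, n + 1)
    )

def min_correct(n, alpha, chance=0.5):
    """Smallest number of correct picks whose one-sided p-value is <= alpha."""
    for k in range(n + 1):
        if tail_probability(n, k, chance) <= alpha:
            return k
    return None  # alpha is too strict to reach with only n trials

# Assumed trial count of 30; the thresholds are the p-values reported above.
for alpha in (0.01, 0.03, 0.07, 0.16):
    print(alpha, min_correct(30, alpha))
```

Under these assumptions, reaching p = .01 takes at least 22 correct picks out of 30, while p = .16 takes only 19.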

Ask yourself this question: would Apple double their bandwidth costs to deliver 256kbps AAC files to customers unless some nontrivial fraction of their customer base could hear a difference? Don’t be so cynical as to suggest that it’s “just” marketing. Steve Jobs is in the habit of telling customers when they’re wrong (see Antennagate). They wouldn’t have done it unless their internal testing (much more rigorous than Gizmodo or me) revealed that some part of their customer base could tell. And incidentally, 256kbps AAC is right around my cutoff point, so they picked the right bitrate. Not a coincidence.

Bottled Water

Next on the “pop science” chopping block: bottled water. Penn & Teller famously did an exposé on people being unable to taste the difference between bottled waters, which makes for great TV but terrible science. The “prevailing wisdom” is that people simply can’t tell which water is which…

…and perhaps that might be true for the general population, but not for me.

Methodology

I poured “bottled” water (reverse osmosis-purified from a grocery store machine) and Austin tap water into two identical glasses and chilled them to an identical temperature. Temperature is critical to my ability to distinguish different waters, yet it is never mentioned in any blind taste test writeup I’ve ever read. The ideal temperature for me is slightly below room temperature, but above “refrigerated” or “cold” temperatures. Excess coldness numbs the tongue and masks the flavor; warm water seems to have a similar, but more subtle, effect.

Every couple of hours, I would spin the glasses on a rotating platform (think of a record player) to scramble the order, and then taste test each glass.

Results

In 24 trials, I made only one misidentification.
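
To put a rough number on that result, here is a quick sketch. It assumes each trial was an independent two-way identification with a 50% chance of guessing right; the post does not say how the trials were scored, so that is an assumption rather than the analysis actually used.

```python
from math import comb

TRIALS = 24
CORRECT = 23  # one misidentification

# Guess patterns with 23 or 24 hits, divided by all 2**24 equally likely
# patterns (assumes independent 50/50 guesses on every trial).
p_value = sum(comb(TRIALS, k) for k in range(CORRECT, TRIALS + 1)) / 2**TRIALS
print(f"{p_value:.2e}")
```

Under that assumption, only 25 of the 16,777,216 possible guess patterns do this well, or about 1.5 in a million.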

Further comments

I’ve always been able to give information about water by taste. What I actually “taste” in water is the mineral content remaining in the water from the various types of filtration. Municipal water supplies are generally processed through various forms of filtration (usually sedimentation), together with disinfecting agents. Some taste superior to others. The worst type of filtration to my taste is actually “activated carbon” filtration, a small-scale filtration that is used in some types of bottled waters and many cheap home tap filters. Reverse osmosis is my preferred type of water. With further research and access to different filtration systems I could probably narrow my tasting abilities to particular minerals left or removed by various types of treatment.

Conclusion

The moral of the story is, don’t confuse some article Gizmodo wrote about four people listening to music or some TV show hidden-camera BS (even a show produced by otherwise-intelligent skeptics) with real science. (Or this blog post either, for that matter.) Those “studies” don’t go beyond the middle school science fair level of rigor. While they’re maybe better than “absolutely no” evidence, they’re a far cry from the longitudinal studies that people pretend they are on the Internet.

Second, suppose that both of the “studies” in question actually were longitudinal, rigorous studies with sample sizes in the hundreds or thousands. Even then, that doesn’t say anything about my ability to hear different bit rates or taste water purification. All it says is that most people can’t. I’m not most people. I’m a particular person. And the birthday problem tells us that every particular person is likely to be superhuman in some area. Even if it’s an area as mundane as tasting water.
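
The arithmetic behind that last claim can be sketched as follows. Strictly speaking, this is the “at least one hit in many tries” calculation that also drives the birthday problem, and the numbers are purely illustrative assumptions: abilities are treated as independent, and “superhuman” is taken to mean the top 1% of the population. The function name is hypothetical.

```python
def chance_of_any_outlier(num_traits, top_fraction=0.01):
    """P(a person lands in the top `top_fraction` on at least one of
    `num_traits` independent traits)."""
    return 1 - (1 - top_fraction) ** num_traits

# Illustrative trait counts only; nobody knows how many distinct abilities exist.
for num_traits in (10, 100, 300):
    print(num_traits, round(chance_of_any_outlier(num_traits), 3))
```

With 100 independent traits the chance of being a top-1% outlier in at least one of them is already about 63%, and it keeps climbing as more traits are counted.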

Comments

Andy Brice

Mon 25th Jul 2011 at 4:15 am
I understand that many HiFi magazines refuse to do blind testing. The obvious implication being that even their ‘golden ears’ can’t really tell the difference between expensive high end equipment (let alone cables!).

How did you ‘blind’ the audio test?

kats

Tue 26th Jul 2011 at 10:16 am
This reminds me of the TED Talk by Malcolm Gladwell: http://www.ted.com/talks/view/id/20
