After my hysterics and humiliation subsided, it was time for reflection. I knew perfectly well that, no matter how many trials I ran, the probability of picking the 'R' earphone on any one of them would remain 50%, provided the earphones were marked 'L' and 'R' and no other variable consistently biased which one I grabbed. Yet after dozens of trials my sample proportion of 'R' was 100%. I am ambidextrous, and the way I uncoiled and recoiled the earbuds could not have had an effect, especially as the darned things always got tangled in their storage pouch. The elementary lesson of coin-flipping probability is that no matter how many times in a row the coin lands heads up, the cosmic accountants do not owe me a 'tails' flip, any more than I was due to select the 'L' earbud the next time.
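To put a number on that mismatch, here is a minimal sketch of the arithmetic, assuming each pick is independent and truly 50/50. The function name `prob_all_right` and the trial count of 24 are illustrative assumptions; the story only says "dozens of trials."

```python
from fractions import Fraction


def prob_all_right(n_trials: int, p_right: Fraction = Fraction(1, 2)) -> Fraction:
    """Probability that every one of n independent picks is the 'R' earbud."""
    if n_trials < 0:
        raise ValueError("n_trials must be non-negative")
    # Independent picks multiply: p * p * ... * p, n times.
    return p_right ** n_trials


# 24 is a hypothetical stand-in for "dozens of trials"; no exact count is given.
n = 24
p = prob_all_right(n)
print(f"P({n} straight 'R' picks | fair 50/50) = {p} (about {float(p):.2e})")
```

A probability this small does not mean the next pick is "due" to be 'L'. It suggests that the 50/50 assumption itself deserves a second look.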
Two confounding elements of this situation should have given any statistician pause. First, there were times when the earbuds were uncomfortable to wear. Second, some songs that I knew fairly well from past listening sounded indefinably different. Both were hard to investigate, and that difficulty encouraged improbable and empirically lazy explanations. For instance, I dismissed the occasional physical discomfort as an inconsistency in my own physicality; why shouldn’t the furls of my ear cartilage change from day to day? As for the songs, I use my device only in shuffle mode, so the order of tracks is random. Scanning back to analyze a familiar song that did not sound right was technologically difficult and often impractical in my normal usage context (e.g., riding my bike on a rough trail). Even if I had suspected the stereo mix, I could not have verified it: I never sit in a consistent position relative to my home speakers, so I had no reliable earlier observation to compare against. My aural memory is not eidetic enough to catch when, say, the bass should thump on the left and the horns should blast on the right.
In this case, reasonable doubt failed to overcome the awe of my empirical observations and expose the obvious failure of face validity. Good research should always keep reasonable doubt at the fore; it is part of the regular and necessary consideration of face validity in any experiment. Yet people suspend it all the time: theories attribute stock market performance to lunar cycles, athletes repeatedly wear filthy 'lucky socks' to maintain a winning streak, and gamblers perform elaborate rituals when throwing dice in order to increase their chances of success. The irony is that assessing face validity and employing reasonable doubt, both products of a human being’s innate rational faculty and both inexplicably simple, are often suspended when our preferences are better served by the completely illogical explanation of luck. This could be construed as a nice testament to the power of hope and the expectation that underdogs can triumph against insurmountable odds, but that does an injustice to our more useful faith in the explanatory powers of logic.
The mystery of my earbuds was not a statistical anomaly; it was merely a production mistake. A quick shot of reasonable doubt and a test of face validity would have revealed that the problem in my analysis lay in my own measurement of the data. I assumed that an earbud’s orientation was fully captured by its marking as 'L' or 'R' and that each earbud of the pair carried a different marking. Statistics textbooks do not take special pains to discuss whether the coin in the probability module is correctly minted with two distinct sides. In fact, the predominant theme when it comes to worries over bias in research is the common adage that 'statistics don’t lie, but liars use statistics.' Who could imagine that my little earbuds were dishonest?
The grand lesson of this story is that reasonable doubt and face validity are essential parts of statistical analysis. They bring to bear a uniquely fantastic and effective human capacity for detecting improbability, one more nuanced than any mathematical filtering algorithm. As elements of investigative practice, they have a long pedigree in the history of thought. Great minds like Thomas Aquinas, William of Ockham (Occam’s razor), Isaac Newton, and Bertrand Russell all offered some formulation of the logical postulate that the simplest, most parsimonious and obvious explanation is the most likely one. Appropriately, the most elementary phrasing comes from Sherlock Holmes [2]: 'When you have eliminated the impossible, whatever remains, however improbable, must be the truth.'
References
- 1. Babbie, E. (2011). The Basics of Social Research. Belmont, CA: Wadsworth.
- 2. Doyle, A. C. (1890). The Sign of the Four. London: Spencer Blackett.