A piece appeared last week in the WSJ on the rating of wines, describing how a winery proprietor and professor conceived and executed a blinded trial to assess whether wine tasting can produce consistent and reproducible results. The piece can be found at:
A relevant excerpt which summarizes the key findings is:
"The unlikely revolutionary is a soft-spoken fellow named Robert Hodgson, a retired professor who taught statistics at Humboldt State University. Since 1976, Mr. Hodgson has also been the proprietor of Fieldbrook Winery, a small operation that puts out about 10 wines each year, selling 1,500 cases.
A few years ago, Mr. Hodgson began wondering how wines, such as his own, can win a gold medal at one competition, and "end up in the pooper" at others. He decided to take a course in wine judging, and met G.M. "Pooch" Pucilowski, chief judge at the California State Fair wine competition, North America's oldest and most prestigious. Mr. Hodgson joined the Wine Competition's advisory board, and eventually "begged" to run a controlled scientific study of the tastings, conducted in the same manner as the real-world tastings. The board agreed, but expected the results to be kept confidential...
In his first study, each year, for four years, Mr. Hodgson served actual panels of California State Fair Wine Competition judges—some 70 judges each year—about 100 wines over a two-day period. He employed the same blind tasting process as the actual competition. In Mr. Hodgson's study, however, every wine was presented to each judge three different times, each time drawn from the same bottle.
The results astonished Mr. Hodgson. The judges' wine ratings typically varied by ±4 points on a standard ratings scale running from 80 to 100. A wine rated 91 on one tasting would often be rated an 87 or 95 on the next. Some of the judges did much worse, and only about one in 10 regularly rated the same wine within a range of ±2 points.
Mr. Hodgson also found that the judges whose ratings were most consistent in any given year landed in the middle of the pack in other years, suggesting that their consistent performance that year had simply been due to chance."
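That last finding is an instance of regression to the mean: when year-to-year consistency is driven by chance, the judge who happens to look most reliable one year will typically look unremarkable the next. The sketch below is a minimal simulation with hypothetical numbers (judge count aside, none of these figures are Hodgson's data): every judge's score is just a wine's underlying quality plus random noise, yet some judges still appear far more "consistent" than others in any single year.

```python
import random

random.seed(42)

NUM_JUDGES = 70   # roughly the panel size in Hodgson's study
NUM_WINES = 30    # hypothetical; chosen for a quick simulation
TRIALS = 3        # each wine poured three times from the same bottle

def simulate_year(noise_sd=4.0):
    """One competition year: every judge scores every wine three times.
    Judges have no real skill here; a score is the wine's 'true' quality
    plus Gaussian noise. Returns each judge's average within-wine spread
    (max score minus min score across the three identical pours)."""
    spreads = []
    for _ in range(NUM_JUDGES):
        wine_spreads = []
        for _ in range(NUM_WINES):
            true_quality = random.uniform(84, 96)
            scores = [true_quality + random.gauss(0, noise_sd)
                      for _ in range(TRIALS)]
            wine_spreads.append(max(scores) - min(scores))
        spreads.append(sum(wine_spreads) / NUM_WINES)
    return spreads

year1 = simulate_year()
year2 = simulate_year()

# The judge who looked most consistent (smallest spread) in year 1...
best_judge = min(range(NUM_JUDGES), key=lambda j: year1[j])
# ...lands at an arbitrary rank in year 2, because the year-1 result
# reflected luck, not skill.
rank_year2 = sorted(range(NUM_JUDGES), key=lambda j: year2[j]).index(best_judge)
print(f"Year-1 most consistent judge ranks {rank_year2 + 1} of {NUM_JUDGES} in year 2")
```

Even though every judge is statistically identical by construction, the spread of "consistency" scores in a single year is wide enough to crown apparent stars, and their rank the following year carries no information.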
There is an important lesson to be learned from this study. Perhaps the most important knowledge anyone can have is the knowledge required to recognize that you don't know. This reminded me of an experience from my internship in the late 1970s, when I spent a year as a general medical intern. I worked in a general hospital caring for patients with acute, bread-and-butter medical problems. We had a substantial number of patients with acute strokes, and it was the beginning of the era of sophisticated imaging tools. However, our hospital did not have a CAT scanner; we were able to get scans, in a not-so-timely fashion, from a sister hospital in the area.
For the first half of the year the medical service would admit the patient and get an immediate Neurology consultation. The Neurology resident would come with their big black bag and use the time-honored tools of the bedside neurological exam to localize the lesion. They would then write a detailed note confidently describing where the stroke lesion resided in the brain. They did so based on years of experience with these tools, and there was great confidence in the utility of these time-honored methods. After their assessment, the Neurology team would recommend the patient receive a CAT scan of the brain when the test could be done.
Unfortunately, the time-honored bedside tools had never actually been validated, since the means to validate them did not exist until new imaging technology was developed and deployed in the late 1970s. During the first six months of my internship the sequence was always neurology evaluation, detailed report, and then CAT scan. The results were remarkable: the CAT scan showed the bedside neurological evaluation was basically always wrong when it came to identifying the actual site of the stroke. During the second half of my internship the sequence became neurology evaluation, CAT scan, and then detailed report. When confronted with unambiguous evidence that the time-honored tools were terribly flawed, these tools were quickly jettisoned.
No one held a strong financial stake in the bedside neurological assessment. In fact, it is not at all surprising that a tedious, time-consuming, and poorly compensated activity such as this would become history when a better tool came along. However, strongly held but weakly supported beliefs are not always let go so readily, particularly when they serve as the underpinnings of financially lucrative activities.
Under those circumstances it is devilishly difficult to get the parties with something at risk to ask fundamental questions objectively: How do I know that what I believe to be true is actually true? What knowledge is really knowable, and how do I know this to be true? While this may all sound like the stuff of late-night bull sessions in a college dormitory, it really is central to any professional activity in which clients come to you as a trusted person of authority. Strongly held beliefs supported by nothing more than strongly held beliefs tend to serve only as rationalizations of self-serving activities.