Noise is bigger than Bias

Noise: A Flaw in Human Judgment, by Daniel Kahneman, Olivier Sibony and Cass R. Sunstein [new release 2021]

In psychology or counselling, “pattern noise” can adversely affect the assessment stage of the process. However, “occasion noise” can adversely affect the whole of the counselling process.

Definition:

“Bias is a disproportionate weight in favour of or against an idea or thing, usually in a way that is closed-minded, prejudicial, or unfair. Biases can be innate or learned. People may develop biases for or against an individual, a group, or a belief.”

---------

Description from the 2021 book “Noise: A Flaw in Human Judgment” by Kahneman et al., page 78, “Speaking of Analysing Noise”:

Level noise is when judges show different levels of severity. Pattern noise is when they disagree with one another on which defendants deserve more severe or more lenient treatment. Part of pattern noise is occasion noise: when judges disagree with themselves [e.g. on the occasion of their football team losing at the weekend].
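
The three kinds of noise described above can be put into numbers. The following is only a rough sketch, not the authors' own procedure. It assumes a small made-up table of sentences (in years) in which every judge rules on every case, and it relies on the relation that system noise squared equals level noise squared plus pattern noise squared. The function name `noise_components` and the data are hypothetical.

```python
from math import sqrt
from statistics import fmean


def noise_components(judgments):
    """Return (level, pattern, system) noise for judgments[judge][case].

    Assumes a balanced table: every judge has judged every case once.
    """
    n_judges = len(judgments)
    n_cases = len(judgments[0])
    cells = [(j, c) for j in range(n_judges) for c in range(n_cases)]

    grand = fmean(judgments[j][c] for j, c in cells)
    judge_means = [fmean(row) for row in judgments]
    case_means = [fmean(row[c] for row in judgments) for c in range(n_cases)]

    # Level noise: how far each judge's average sits from the overall average.
    level_var = fmean((m - grand) ** 2 for m in judge_means)

    # Pattern noise: what is left once the judge's overall severity and the
    # case's overall seriousness have both been taken out.
    pattern_var = fmean(
        (judgments[j][c] - judge_means[j] - case_means[c] + grand) ** 2
        for j, c in cells
    )

    # System noise: disagreement between judges on the same case.
    system_var = fmean((judgments[j][c] - case_means[c]) ** 2 for j, c in cells)

    return sqrt(level_var), sqrt(pattern_var), sqrt(system_var)


# Hypothetical sentences in years: three judges (rows), three cases (columns).
sentences = [
    [2, 5, 9],
    [4, 6, 12],
    [3, 8, 7],
]
level, pattern, system = noise_components(sentences)
print(f"level={level:.2f} pattern={pattern:.2f} system={system:.2f}")
```

Occasion noise cannot be separated out of a table like this one, because that would need the same judge to see the same case on more than one occasion.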

What are decision-making judgments? Page 371 states:

“The goal of judgment is accuracy, not individual expression” … “But when it comes to making a judgment … expressions of individuality are a source of noise.”

In the cognitive principle matrix theory, the main cause of “noise” would be explained by the two hemispheres [left: conscious, right: subconscious] not working together to make a correct judgment or decision. Normally, the conscious and the subconscious mind work on the goal together. The conscious mind uses automatic thoughts based on logic to achieve the goal, and it applies memory in the form of "knowledge, skill and experience" received from the subconscious. The conscious mind will also apply the "rules and boundaries" supplied by the subconscious mind. However, pattern noise and level noise can come from core beliefs held in the subconscious mind, which will affect decision making in the conscious mind.

Occasion noise comes in the form of triggers from two sources: internal and external. For example, a comment from a client may trigger a negative pattern match and an emotion from the past. The subconscious mind then makes a negative prediction which is not relevant, but which adversely affects the evaluation in the conscious mind. Likewise, the subconscious mind may have a sensing experience in which the person suddenly feels tired and loses concentration, which affects the evaluation in the conscious mind.

There is also a second source of pattern noise, which most people aren't aware of, that comes from the Real self. Note: In the diagram below, Sys 2 is the conscious mind and Sys 1 is the subconscious mind. On most occasions the Remembering self will make consistent decisions based on the past, unless new information is available. The Real self includes the conscience and holds "Your truth". A new trigger in Sys 2 could prompt the Real self to override the Remembering self and make a better decision. The conscious mind may or may not be aware of why the decision changed.

[Diagram not available]

However, most of the time bias comes in the form of defense mechanisms to protect the Remembering self and the Experiencing self [ego], and true decisions aren't made by the Real self.

---------------------------------------------------------------------------------------------------------------

The following is a review of the book by Steven Brill, May 18, 2021.

A study of 1.5 million cases found that when judges are passing down sentences on days following a loss by the local city’s football team, they tend to be tougher than on days following a win. The study was consistent with a steady stream of anecdotal reports beginning in the 1970s that showed sentencing decisions for the same crime varied dramatically — indeed scandalously — for individual judges and also depending on which judge drew a particular case.

A study at an oncology center found that the diagnostic accuracy of melanomas was only 64 percent, meaning that doctors misdiagnosed melanomas in one of every three lesions.

When two psychiatrists conducted independent reviews of 426 patients in state hospitals, they came to the equivalent of a tossup: agreement 50 percent of the time on what kind of mental illness was present.

When a large insurance company, concerned about quality control, asked its underwriters, who determine premium rates based on risk assessments, to come up with estimates for the same group of sample cases, their suggested premiums varied by an eye-popping median of 55 percent, meaning that one adjuster might have set a premium at $9,500 while a colleague set it at $16,700.
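
The 55 percent figure lines up with the two premiums quoted if the gap is measured as the difference between the two numbers divided by their average. That reading is an assumption, since the review does not spell out the formula, and the helper name `relative_difference` is made up for illustration.

```python
def relative_difference(a, b):
    """Gap between two estimates as a fraction of their average."""
    return abs(a - b) / ((a + b) / 2)


# The two premiums quoted in the review.
gap = relative_difference(9_500, 16_700)
print(f"{gap:.0%}")  # prints 55%
```

The arithmetic is 7,200 divided by 13,100, which is about 0.55.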

Doctors are more likely to order cancer screenings for patients they see early in the morning than late in the afternoon.

If employers rely on only one job interview to pick a candidate from among a similarly qualified group, the chances that this candidate will indeed perform better than the others are about 56 percent to 61 percent. That’s “somewhat better than flipping a coin, for sure,” the authors of “Noise” write, “but hardly a fail-safe way to make important decisions.”

In a study of the effectiveness of putting calorie counts on menu items, consumers were more likely to make lower-calorie choices if the labels were placed to the left of the food item rather than the right.

“When calories are on the left, consumers receive that information first and evidently think ‘a lot of calories!’ or ‘not so many calories!’ before they see the item,” Daniel Kahneman, Olivier Sibony and Cass R. Sunstein explain in this tour de force of scholarship and clear writing. “By contrast, when people see the food item first, they apparently think ‘delicious!’ or ‘not so great!’ before they see the calorie label. Here again, their initial reaction greatly affects their choices.” This hypothesis is supported, the authors write in a typically clever aside, by the “finding that for Hebrew speakers, who read right to left, the calorie label has a significantly larger impact if it is on the right rather than the left.”

These inconsistencies are all about noise, which Kahneman, Sibony and Sunstein define as “unwanted variability in judgments.”

Sometimes we treasure variability — in artistic tastes, political views or picking friends. But in many situations, we seek consistency: medicine, criminal justice, child custody decisions, economic forecasts, hiring, college admissions, fingerprint analysis or business choices about whether to greenlight a movie or consummate a merger.

Consistency equals fairness. If bias can be eliminated and sensible processes put in place, we should be able to arrive at the “right” result. Lack of consistency too often produces the wrong results because it’s often no better, the authors write, than the random judgments of “a dart-throwing chimpanzee.” And, of course, unexplained inconsistency undermines credibility and the systems in which those judgements are made.

As the authors explain in their introduction, a team of target shooters whose shots always fall to the right of the bull’s-eye is exhibiting a bias, as is a judge who always sentences Black people more harshly. That’s bad, but at least they are consistent, which means the biases can be identified and corrected. But another team whose shots are scattered in different directions away from the target is shooting noisily, and that’s harder to correct. A third team whose shots all go to the left of the bull’s-eye but are scattered high and low is both biased and noisy.
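
The shooting teams can be turned into a small worked example. This is a minimal sketch with invented numbers: each shot is reduced to a single left/right miss distance (negative means left of the bull's-eye), bias is taken as the average miss and noise as the spread around that average. With those definitions, the mean squared error is always bias squared plus noise squared, which is a standard statistical identity. The function name `bias_noise_mse` is hypothetical.

```python
from statistics import fmean, pstdev


def bias_noise_mse(errors):
    """Return (bias, noise, mean squared error) for a list of miss distances."""
    bias = fmean(errors)                # average miss: the shared error
    noise = pstdev(errors)              # scatter around that average
    mse = fmean(e * e for e in errors)  # overall error
    return bias, noise, mse


# Invented miss distances for three teams.
teams = {
    "biased": [2.0, 2.1, 1.9, 2.0],
    "noisy": [-2.0, 2.0, -1.0, 1.0],
    "biased and noisy": [-3.0, -1.0, -2.5, -1.5],
}

for name, errors in teams.items():
    bias, noise, mse = bias_noise_mse(errors)
    print(f"{name}: bias={bias:+.2f} noise={noise:.2f} mse={mse:.2f}")
```

In this made-up data the noisy team has no bias at all, yet its overall error is still large, which is the sense in which noise can matter as much as bias.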

Despite its prominence in so many realms of human judgment, the authors note that “noise is rarely recognized,” let alone counteracted. Which is why the parade of noise examples that the authors provide are so compelling, and why gathering the examples in one place to demonstrate the cost of noise and then suggesting noise reduction techniques, or “decision hygiene,” makes this book so important. We are living in a moment of rampant polarization and distrust in the fundamental institutions that underpin civil society. Eradicating the noise that leads to random, unfair decisions will help us regain trust in one another.

“Noise” seems certain to make a mark by calling attention to the problem and providing a tangible guide to reducing it. Despite the authors’ intimidating academic credentials, they take pains to explain, even with welcome redundancy, their various categories of noise, the experiments and formulas that they introduce, as well as their conclusions and solutions.

Some decision hygiene is relatively easy. “Occasion noise” — the problem of a judge handing out stiffer sentences depending on whether a favorite sports team won or lost or whether it’s before or after lunch (yes, studies have found that, too) — can, like bias, be recognized during a “noise audit” and presumably dealt with. “System noise,” in which insurance adjusters, doctors, project planners or business strategists assess the same facts with that unfortunate variability, requires a more energetic decision hygiene.

However, as the authors point out, the steps of decision hygiene — like those of common hygiene, such as washing hands — “can be tedious. Their benefits are not directly visible; you might never know what problem they prevented from occurring.”

One example of effective decision hygiene has to do with the Apgar score, which looks at the overall health of newborns. Doctors score the baby on five criteria ranging from appearance of its skin to its heart rate, with scores of zero to two for each category. If the scores, once added up, arrive at a seven or higher, the baby is considered to be in good health.

“The Apgar score exemplifies how guidelines work and why they reduce noise,” the authors explain. “Unlike rules or algorithms, guidelines do not eliminate the need for judgment: The decision is not a straightforward computation. Disagreement remains possible on each of the components and hence on the final conclusion. Yet guidelines succeed in reducing noise because they decompose a complex decision into a number of easier subjudgments on predefined dimensions.”
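
The way the Apgar score breaks one judgment into five smaller ones can be sketched as follows. The zero-to-two scale and the threshold of seven come from the review above. The five criterion names follow the usual Apgar mnemonic (Appearance, Pulse, Grimace, Activity, Respiration), which the review does not list in full, and the function names are made up for illustration. This is not a clinical tool.

```python
APGAR_CRITERIA = ("appearance", "pulse", "grimace", "activity", "respiration")


def apgar_total(scores):
    """Add up the five sub-scores, each of which must be 0, 1 or 2."""
    missing = set(APGAR_CRITERIA) - set(scores)
    if missing:
        raise ValueError(f"missing criteria: {sorted(missing)}")
    for name in APGAR_CRITERIA:
        if scores[name] not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1 or 2")
    return sum(scores[name] for name in APGAR_CRITERIA)


def in_good_health(scores, threshold=7):
    """Apply the cut-off mentioned in the review: seven or higher."""
    return apgar_total(scores) >= threshold


# Hypothetical newborn: each sub-score is still a judgment by the doctor.
baby = {"appearance": 1, "pulse": 2, "grimace": 2, "activity": 1, "respiration": 2}
print(apgar_total(baby), in_good_health(baby))  # prints 8 True
```

Two doctors can still disagree on any one sub-score, but a one-point disagreement on a single criterion moves the total far less than two different overall impressions might.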

Another compelling example of “decomposing” a decision involves a case study of a corporate merger decision. Rather than the bankers and executive team giving the company’s board the usual pro or con presentation, the C.E.O. first tasked various senior executives to come up with their assessments on seven aspects of the merger, ranging from talent of the team to be acquired to the possible financial benefits. Importantly, there were separate teams working on each aspect, so that their judgment was not colored by positive or negative noise emanating from another verdict, falling into the trap of what the authors call “excessive coherence.”

It’s also for that reason that none of four people interviewing a job candidate should know what their colleagues’ opinions are until they write down their own.

In other arenas, such as insurance underwriting, “Noise” does lean more toward establishing hard and fast rules and even using algorithms, which the authors assert should, in theory, “eliminate noise entirely.” However, they acknowledge that the way information gets entered into algorithms can itself be undermined by bias or noise.

The authors are sensitive to the costs of noise reduction, a point they illustrate in part with the story of the company that tangled itself up in an annual employee review process that included an overly complicated feedback questionnaire. Forty-six ratings on eleven dimensions for each rater and person being rated is just too much.

Similarly, the costs of eliminating noise have to be weighed. A fifth grader’s essay will be more fairly and accurately graded if five teachers read it independently using five or 10 criteria and averaging their assessments, instead of one teacher reading it and providing an overall impression. So will a high school senior’s college application. We can accept the noise in the fifth grader’s essay grade much more easily than we can accept it when deciding a college applicant’s fate.
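
Why five independent readers give a steadier grade than one can be shown with a small simulation. This is only a sketch under stated assumptions: the essay has a made-up "right" grade of 70, each teacher's grade misses it by a random amount with a spread of 10 points, and the teachers' errors are independent of one another. None of these numbers come from the book or the review.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

TRUE_GRADE = 70.0     # hypothetical "right" grade for the essay
TEACHER_NOISE = 10.0  # hypothetical spread of a single teacher's grade


def one_grade():
    """One teacher's overall impression of the essay."""
    return rng.gauss(TRUE_GRADE, TEACHER_NOISE)


def panel_grade(n_teachers=5):
    """Average of several teachers who grade independently."""
    return fmean(one_grade() for _ in range(n_teachers))


# Grade the same essay many times over, each way.
single = [one_grade() for _ in range(2000)]
panel = [panel_grade() for _ in range(2000)]

single_noise = pstdev(single)
panel_noise = pstdev(panel)
ratio = single_noise / panel_noise
print(f"one teacher: {single_noise:.1f}  panel of five: {panel_noise:.1f}")
```

Under these assumptions the panel's spread should come out at a little under half that of a single teacher, since averaging n independent judgments divides the noise by the square root of n. The benefit shrinks if the teachers influence one another, which is why the independence matters.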

Beyond bureaucracy and cost, there’s a loss of dignity when people are treated like numbers instead of individuals. There’s also the danger of forcing a rule — think of Jack Welch, the former C.E.O. of General Electric, who made it a set practice to fire a percentage of his lowest performers each year, even if many were still performing well. Forced ranking in this context, or in the case of an elite military unit, makes no sense, and relative scales and relative judgments would have made for better decision hygiene. In other situations, the opposite approach can create problems: rating everyone individually with no comparisons, such as the loosey-goosey standards that allow over 98 percent of the federal civil servant work force to be judged “fully successful.”

Thus, the authors cite the lawyer and author Philip Howard, who in books such as “The Death of Common Sense” has documented the dangers of bureaucracy, laws, rules and numerical ratings replacing human judgment in so many decisions.

Kahneman, Sibony and Sunstein also acknowledge the judicial backlash against federal sentencing guidelines passed in 1987 that were meant to reduce the massive inconsistencies. Many judges fervently believed that these federal guidelines — and even more stringent ones legislated in many states — sidelined them from making the human judgments they were put on the bench to make. That continuing backlash, and the fact that prosecutors and judges learned to game the new rules, has been a key force behind recent criminal justice reform efforts.

The authors’ general argument, however, is that there is now so much noise that a major hygiene effort is in order across multiple disciplines. In too many arenas, they maintain persuasively, we’ve allowed too much noise at too high a cost.

The trick is finding the right balance, not looking for perfect fairness or accuracy, which will always be illusory. A rule that sets a birth date of Jan. 1 for entrance into kindergarten is going to be arbitrary and unfair to the child born at 11:59 the night before. (Although another rule will give her parents a bonus tax deduction for being born in that earlier year.) But it’s a better way to choose who gets to start elementary school than interviewing every 4- to 6-year-old.

A digital body scan examined only by an algorithm might be an efficient way to check for melanoma, but I’d rather trust the terrific doctor who checks me every few months. Then again, I wouldn’t mind if he checked his conclusion against the algorithm.

“Noise” is about how our most important institutions can make decisions that are more fair, more accurate and more credible. That its prescriptions will not achieve perfect fairness and credibility, while creating pitfalls of their own, is no reason to turn away from this welcome handbook for making life’s lottery a lot more coherent.
