Human beings understand the world through storytelling. The organizing narrative structure of a beginning, middle, and end helps us make sense of complexity. We are all hard-wired to love stories. Why do we love stories? Typically, stories have emotional content, which appeals to our conscious understanding of the presentation, our subconscious intake of information in memory, and our ability to rationalize through perceived causal relationships.
Many of you may be familiar with the famous phrase “correlation does not imply causation” that we learn in statistics class. The shocking thing is how often executives, managers, and other highly educated people who are well aware of the correlation/causation difference still fall into this trap every day. In industries that rely on data-driven analyses or policy-making, or even in a friendly conversation with a colleague, stories trapped in the correlation/causation fallacy may negatively govern the way we think. In such cases, selecting observations or data to support a narrative (or conversely selecting a narrative to support an observation or data) becomes an incredibly persuasive tool we use to establish a chain of causation in our minds: a chain that can be destructively riddled with errors in judgment and ultimately influence our beliefs and decisions.
Statistician Nassim Taleb coined the term “Narrative Fallacy” for this type of flawed logic in his 2007 book, The Black Swan. He opined that the Narrative Fallacy increases our “impression of understanding,” which negatively impacts our interpretation of facts, because we strive to force logical links between a series of independent phenomena. He stated: “The narrative fallacy addresses our limited ability to look at sequences of facts without weaving an explanation into them, or, equivalently, forcing a logical link, an arrow of relationship upon them. Explanations bind facts together. They make them all the more easily remembered; they help them make more sense. Where this propensity can go wrong is when it increases our impression of understanding.”
How do stories impact our impression of understanding data in science and policy? Well, we don’t just employ one logical flaw, but rather multiple logical fallacies and cognitive biases that work together to inflate our impression of understanding. As explained by Daniel Kahneman in his 2011 book, Thinking, Fast and Slow: “the explanatory stories that people find compelling are simple rather than complex; are concrete rather than abstract… and focus on a few striking events.” Stories are then easily compounded by the halo effect (where people generate an overall impression of someone’s personality or characteristics based on an unrelated trait) when delivered by a person of perceived stature. Throw in the affect heuristic, in which people let their likes and dislikes determine their beliefs about the world (i.e. your political preference determines the arguments that you find compelling), and confirmation bias, the tendency to seek out only information that supports your existing viewpoint, and you start to realize why we’re facing such challenging times in understanding data and its implications at the intersection of science and policy.
As you’ve noticed, I’ve been highlighting a specific phrase in Mr. Taleb’s definition: “impression of understanding.” What we observe in our own environments, what we choose to learn, what values we subscribe to, our likes, our dislikes, who we trust, etc., all contribute to our impression of understanding. As humans we are hard-wired to be skeptical or dismissive of data, observations, or opinions that go against our own impression of understanding. Most importantly, however, we form judgments, express opinions, and make decisions every day through our own impressions of understanding.
The vast majority of these everyday judgments, opinions, and decisions are insignificant in a community or societal sense: Should I take a walk around the block during my lunch hour? Which is better for you: making your own meal or ordering take-out? Dogs are better pets than cats because they’re more playful. However, when we make decisions that potentially affect other people, whether directly or indirectly, I argue that we all should take the extra time to examine our own cognitive biases and impressions of understanding in order to make more wholesome and educated judgments, opinions, and decisions.
It’s Getting Hot in Here
Take the following example: if I say, “it’s too hot,” and you say, “it’s too cold,” those are just opinions; neither of us is right! If instead you state “it’s 70 degrees,” that is a non-debatable fact, and we can have a more fruitful conversation about its implications. Let’s consider another familiar analogy: “the cup is half full” and “the cup is half empty” are both accurate opinions (subjective reality) about the accurate statement “the 8oz cup has 4oz of water in it” (objective reality). The latter is not any “truer” than the former.
In an era with an unparalleled abundance of facts (“big data” collected on everything, accelerating scientific progress, unprecedented innovation, etc.), we should be closer to the “truth” than any past generation has ever been. And yet our times are also filled with increasingly diverging opinions and beliefs, not only on ideologies, but also on core facts (fake news anyone?). Never in our modern era have facts and data been more abundant, and yet “truth” more elusive, relative and subjective.
Immanuel Kant said, “we never really experience reality but only perceive it through the veil of [our senses]” (he called it the “veil of perception”). It is good to remember that data is itself an artificial human construct: it is how we perceive reality, and it’s appropriate to say it passes through a “veil” as well. Trusting the data itself, rather than the person articulating the data and its purported significance, is a uniquely difficult task.
In the sections below, I highlight several logical fallacies and cognitive biases that significantly influence the interpretation of facts through data visualization and narrative. Some of these biases are inherent in our conscious and subconscious, others rooted in our intrinsic beliefs, culture and worldviews, and finally others in external factors, such as the group we happen to be in when we “consume” the facts.
“A Picture is Worth a Thousand Words”
You’ve heard the old adage. But its simplicity rings true in the way we interpret the presentation of data. The way we process information depends only partly on the presentation itself, and just as much on our own biased perception. Whereas the visualization (i.e. charts, graphs, colors, etc.) of a data plot carries the biases that the author introduces in conveying the data, our interpretation carries the biases that the reader introduces in making sense of it (i.e. in answering the “why”).
Consider the example above about the United States’ military engagement in Iraq. What opinions do you form about the data?
Your interpretation may differ from mine, of course. What is striking in the chart is the use of the red “bloody” color and the inverted axis that suggests dripping blood. It is extremely effective in stirring the reader’s emotions. The title “Iraq’s Bloody Toll” is quite fitting as well.
Now consider the chart above. What opinions do you form?
Did you form the same opinions about the two charts? Most likely not. In fact, not only do both charts include the same data, but both charts are the same: all the illustrator did was rotate the chart upwards and give it a different color. Of course, a different title is now more appropriate: “Iraq: Deaths on the decline”.
The point of these illustrations is not to show that the data itself is wrong, but rather how the author’s presentation of the data, as well as our own subjective impressions of it, lead us to one conclusion or another (i.e. the glass is half empty or half full). As you can probably imagine, this type of bias is present in all industries and occupations: common examples include the financial industry, sales analysis, supply chain/logistics, healthcare, etc.
Intrinsic Beliefs Bias Our Interpretation of Data
Subconscious biases are intrinsic to human nature and, by definition, common across individuals. In contrast, our values and beliefs are specific to each person as they are influenced by our individual education, upbringing, cultural environment, past experience, etc. The topic of climate change is ideal to showcase how these values can affect the interpretation of the data. The following chart from NOAA shows the increase in CO2 in the earth’s atmosphere, which has recently surpassed 410ppm.
CO2 ppm trend — source: climate.gov
Mark Levin, the conservative commentator and radio host, has long used this information to argue that CO2 is only a trace gas in the planet’s atmosphere. He condemns “the establishment[’s]” often strong efforts to dissuade counterarguments, stating: “They never mention what a tiny fraction of the atmosphere CO2 is.” https://www.washingtonexaminer.com/americans-for-carbon-dioxide-mark-levins-idea-whose-time-has-come
From a pure factual perspective, Levin is right: CO2 is indeed a tiny fraction of the atmosphere. But perhaps someone should give him a glass of water with a tiny fraction of arsenic in it and ask him to show us how brave he is — after all it’s so tiny …
Joking aside, this is a perfect illustration of how adherence to moral values and group thinking (in this case the climate-change denialist group) can heavily affect the interpretation of facts. A more nuanced chart could be the following, which looks at a much longer time period:
CO2 ppm trend — Source: NOAA at Climate.gov
This shows that the 410ppm level is unprecedented; a more dramatic figure to quote would be “the CO2 concentration today is more than 30% higher than its historical peak over the last 800,000 years.” But what we see with Levin is confirmation bias at work: when exposed to several facts, people will pick and choose the data that best supports their current beliefs and ignore what runs counter to their values and pre-existing beliefs. And confirmation bias is not confined to one political ideology. We see the same bias on the other end of the political spectrum, with some people picking whichever data sounds most dramatic.
While the facts and data are often undisputed, there is a near-unanimous, though not absolute, consensus in the scientific community on the causal relationships (e.g. is this caused by human activity?) and the forecasting models (i.e. how fast will the impact happen?) regarding earth’s climate. This is where the feeding frenzy of political diatribe occurs. The scientific near-unanimity on climate change itself, its causes, and its prevention becomes irrelevant to the climate-change denialist group. They disregard probability, instead attacking the people (or the “establishment”) and creating a narrative (the word “hoax” comes to mind) based on data that conforms to their pre-existing beliefs and values.
Now there is WAY more to this debate than what I have outlined in a simple paragraph, but my goal is to simply illustrate how one’s subjective beliefs and values can inhibit one’s ability to formulate wholesome opinions and make well-rounded decisions when presented with data. I know politics is a hot-button issue, so I want to be very clear in saying that these biases affect all people regardless of political persuasion.
External Factors are Important Too!
I am borrowing this example from Daniel Kahneman’s book, Thinking, Fast and Slow, but I think it does a great job illustrating how external factors influence our subjective interpretations of events and even our own opinions.
Consider the following:
You are a data analyst at one of the leading private laboratory companies developing a new cancer treatment, and you have been tasked by leadership with understanding the geographies with the highest incidence of kidney cancer. You do your research and present the following data map to the leaders. Note: this scenario is purely fictitious, but the data is real, from the National Cancer Institute. Still, form your own conclusion from the data: where, and why, is the incidence of kidney cancer higher in some places than others?
Incidence rate of kidney cancer per 100,000 vs. US Population by county — Source: data by Howard Wainer and Harris L. Zwerling — visualization by Wissam Kahi
You show the chart to the executive leaders assembled around the table, and for the first minute or so nobody knows what to make of it. But then the CEO, who has a sharp eye, says: “I get it! The counties where the cancer incidence is highest are mostly rural and sparsely populated, in the Midwest and the South. These states have higher poverty, a higher-fat diet, too much alcohol and tobacco, and no access to good medical care. It makes total sense!” And immediately everyone sees the pattern and nods in agreement.
If you nodded in agreement as well, think again, because the conclusion is in fact false. Indeed, the counties where the cancer incidence is lowest are also rural, sparsely populated, and in the Midwest, South, and West. The shift toward one conclusion was driven by power dynamics, role modeling, and influence from external factors (in this instance, the CEO). While this example has been deliberately exaggerated for the purpose of illustration, note that this happens in organizations extremely frequently, albeit in more subtle ways. These group dynamics emerge, for example, through who has the stronger character or authority, or sometimes simply through who gets to state an opinion first.
The rational explanation here, however, is the law of small numbers: smaller sample populations show higher variance in the observed incidence rate. As a result, extremes (both low and high) are more likely to appear in the smaller samples (in this case, smaller counties), whereas the larger samples will tend to sit closer to the mean with less variation (indeed, you can see a higher percentage of larger counties in the “moderate” category).
The above explanation is essentially random—there is no causal reason why smaller counties would have lower or higher incidence. But our minds don’t like randomness and will always actively seek a causal relationship, i.e. The Narrative Fallacy!
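A quick simulation makes the law of small numbers tangible. The sketch below is a hypothetical setup: it assumes every county shares the same invented true rate of 50 cases per 100,000, then shows that the extreme observed rates appear almost exclusively in the small counties, purely by chance:

```python
import numpy as np

rng = np.random.default_rng(0)
TRUE_RATE = 50 / 100_000  # identical underlying incidence everywhere

# Simulate 1,000 small (pop. 1,000) and 1,000 large (pop. 1,000,000) counties.
small_pop, large_pop = 1_000, 1_000_000
small = rng.binomial(small_pop, TRUE_RATE, size=1_000) / small_pop * 100_000
large = rng.binomial(large_pop, TRUE_RATE, size=1_000) / large_pop * 100_000

# Observed rates per 100,000: small counties swing wildly, large ones hug the mean.
print(f"small counties: {small.min():.0f} to {small.max():.0f} per 100k")
print(f"large counties: {large.min():.0f} to {large.max():.0f} per 100k")
```

With no causal difference whatsoever, the small counties show rates of zero as well as rates several times the true value, while the large counties cluster tightly around 50 — exactly the pattern the CEO “explained.”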
Mental Shortcuts Causing Problems
If you just skipped to this section without reading the rest of this article, you’ve taken a shortcut—making me cry in the process—but a shortcut nonetheless. That was a conscious decision. In addition to conscious shortcuts like that one, we make subconscious shortcuts every day in forming opinions and making decisions. These subconscious shortcuts come in the form of heuristics.
I simply wouldn’t be doing the psychology world or the data-analytics world justice by discussing the narrative fallacy without also discussing heuristics (availability and representativeness) and the plausibility-probability paradigm. Heuristics are cognitive shortcuts we use when we must make a decision but lack either ample time or the accurate information necessary to make it. Heuristics are advantageous in that they aid quick decision-making, but their use can lead to inaccurate predictions. In general, heuristics are automatic cognitive processes; that is, people use them in decision-making situations without necessarily being aware that they are doing so.
For instance, if I were to ask you: Do more words in the English language begin with the letter “K,” or have “K” as the third letter?
Most likely (in the actual study it was over 70% of people) you’ve answered, incorrectly, that more words in the English language begin with the letter “K” (“K” is the third letter twice as many times). This is because it is much easier for people to think of words that begin with “K” than words that have “K” as the third letter. Since words that begin with K are easier to think of, it seems like there are more of them. This is a prime example of the availability heuristic—i.e. making a judgment about something based on how available examples are in your mind.
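The claim is easy to check in code. The word list below is a tiny hand-picked sample purely for illustration; to see the real ratio you would feed the function a full dictionary file (e.g. /usr/share/dict/words on many Unix systems):

```python
def k_position_counts(words):
    """Count words starting with 'k' vs. words with 'k' as the third letter."""
    first = sum(1 for w in words if len(w) >= 1 and w[0] == "k")
    third = sum(1 for w in words if len(w) >= 3 and w[2] == "k")
    return first, third

# Tiny illustrative sample (not representative of English as a whole).
sample = ["kitchen", "keep", "ask", "make", "take", "like", "bake", "lake"]
print(k_position_counts(sample))  # (2, 6): 'k' in third place already dominates here
```

Even in this small sample, nothing about the count is obvious until you actually tally it, which is exactly why intuition falls back on ease of recall instead.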
Representativeness is used when we judge the probability that an object or event A belongs to class B by looking at the degree to which A resembles B. Simply put, we decide whether a person or an event belongs in a certain category by judging how similar that person or event is to the prototypical member of that category. The fallacy here is in assuming that similarity in one aspect implies similarity in other aspects. While availability has more to do with memory of specific instances, representativeness has more to do with memory of a prototype, stereotype, or perceived average. When we make decisions based on representativeness, we are likely to make more errors by overestimating the likelihood that something will occur: just because an event or object is representative does not mean its occurrence is more probable. In succumbing to the logical flaws of the representativeness heuristic, people will “force” statistical patterns to match their beliefs about them.
Consider the following famous experiment (Kahneman, D., & Tversky, A. (1973). “On the psychology of prediction.” Psychological Review, 80(4), 237–251):
Tom is a college student of high intelligence, although lacking in true creativity. He has a need for order and clarity, and for neat and tidy systems in which every detail finds its appropriate place. His writing is rather dull and mechanical, occasionally enlivened by somewhat corny puns and by flashes of imagination of the sci-fi type. He has a strong drive for competence. He seems to feel little sympathy for other people and does not enjoy interacting with others. Self-centered, he nonetheless has a deep moral sense.
Based on this description: In which of these fields is Tom most likely to be a student? How would you rank these fields of study in terms of the likelihood that Tom W is a student in that field (1 being most likely; 9 being least likely)?
1. Business Administration
2. Computer Science
3. Humanities and Education
7. Library Science
8. Physical and Life Sciences
9. Social Sciences and Social Work
When Kahneman and Tversky performed this experiment, here is the average order that their subjects provided:
1. Computer Science
3. Business Administration
4. Physical and Life Sciences
5. Library Sciences
8. Humanities and Education
9. Social Sciences and Social Work
In all likelihood, your answers were not far off from the average. But here’s the mistake these subjects (and probably you) made: they based the most likely field of study (computer science) on Tom’s description, rather than on the base rates: the number of students actually enrolled in each field. Thus, the representativeness heuristic drove the evaluation that Tom was more likely to be studying computer science than the subject a randomly chosen student is far more likely to be studying (i.e. social sciences/social work).
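The base-rate logic the subjects ignored is just Bayes’ rule. In the sketch below every number is invented purely for illustration (the real enrollment shares and “fit” probabilities are not given in the study): even if Tom’s description fits the computer-science stereotype four times better, a large enough gap in base rates flips the conclusion.

```python
# All numbers are hypothetical, for illustration only.
# base_rate: assumed share of all graduate students in the field.
# fit: assumed P(a description like Tom's | student is in that field).
fields = {
    "computer science":              (0.03, 0.20),
    "social sciences & social work": (0.20, 0.05),
}

# Bayes' rule, unnormalized: P(field | description) is proportional to
# P(field) * P(description | field).
scores = {name: base_rate * fit for name, (base_rate, fit) in fields.items()}

total = sum(scores.values())
for name, score in scores.items():
    print(f"{name}: {score / total:.0%} posterior probability")
```

Despite the 4-to-1 advantage in stereotype fit, the social sciences come out ahead, because under these assumed numbers roughly seven times more students are enrolled there.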
Representativeness-based evaluations are a common cognitive shortcut across contexts. For example, a consumer may infer a relatively high product quality from a store (generic) brand if its packaging is designed to resemble a national brand (Kardes et al., 2004). Representativeness is also at work if people think that a very cold winter is indicative of the absence of global warming (Schubert & Stadelmann, 2015) or when gamblers prefer lottery tickets with random-looking number sequences (e.g. 7, 16, 23, …) to those with patterned sequences (e.g. 10, 20, 30, ….) (Krawczyk & Rachubik, 2019). In finance, investors may prefer to buy a stock that had abnormally high recent returns (the extrapolation bias) or misattribute a company’s positive characteristics (e.g., high quality goods) as an indicator of a good investment (Chen et al., 2007).
It may not sit well in our guts, but that’s what judgment biases, like the availability and representativeness heuristics, do. We make judgments and decisions every day based on the representativeness of the information relative to the instances and stereotypes that readily come to mind. And unfortunately this flawed logic can affect your belief system, your relationships, and even your business decisions and bottom line.
We’ve now discussed how subconscious biases, intrinsic beliefs and values, and external factors may lead us to jump to conclusions, and ultimately hinder our data analysis and decision-making. The narrative fallacy and the heuristics of judgment occur because we take a mental shortcut from the perceived plausibility of a scenario to its actual probability. The most coherent stories are not necessarily the most probable, but they are certainly plausible, and the two qualities are easily confused by the unwary.
I urge you to stay vigilant in formulating a comprehensive opinion before jumping to conclusions. We all have much to learn, but taking the small step of understanding our own biases will only enhance our judgments and decisions, leading us toward a more empathetic, diplomatic, and conscientious life. How you choose to remedy these biases is totally up to you. I will do it by asking more questions, engaging in active listening, and educating myself on a topic before drawing a conclusion about it. I encourage everyone to follow suit.