Berkson’s Bias and Trade-Offs

Abstract: Most real-world “trade-offs” in decision-making are artifacts of decision-maker preferences for certain types of outcomes. If we considered all possible choices, instead of just the ones that intuitively have some value to decision-makers (i.e., are not “dominated” by others), then there would be no inverse correlation between the relevant outcome axes. Thus it is the act of selection, equivalent to the conditioning on a common effect that produces Berkson’s bias, that creates most classes of trade-offs.

Acknowledgments: Thanks to Brian Potter for discussion related to this idea.

First Published: 12/30/12. Last Updated: 12/30/12.

A choice is dominated if some other choice is at least as good on every parameter and strictly better on at least one, so that no weighting of the parameters could give the dominated choice the higher value. For example, consider the following types of MP3 players:

Figure: MP3 Players

Reasonable people might choose A or B, depending on whether they prefer more storage or a lower price. However, no matter what your preferences on these two variables are, there is never any reason to choose option C over option A, since C is more expensive and has less storage capacity. In the real world, consumers tend to ignore choices that are dominated. Indeed, businesses tend not to bother offering such choices, unless they think that their product offers some other quality, such as brand name, that will also merit consumer consideration as a third parameter.

More generally, imagine that you have a decision to make and can choose only one option. You have many choices, each with two parameter values that you care about, and each value is sampled independently from a uniform distribution. The salient class consists of the options that no other option dominates, since those are the only ones you would seriously consider. Within that class the two parameters are inversely correlated, even though they are independent across the full set of choices.

Figure: Uniform Distributions
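The thought experiment above can be reproduced with a short simulation. This is only a sketch under stated assumptions: higher is better on both parameters, and the sample size, the seed, and the helper names `pearson` and `non_dominated` are illustrative choices that do not come from the original analysis.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def non_dominated(points):
    """Keep the points that no other point beats on both axes (higher is better)."""
    frontier = []
    best_y = float("-inf")
    # Scan from the largest x downward; a point survives only if its y
    # exceeds the y of every point that has a larger x.
    for x, y in sorted(points, reverse=True):
        if y > best_y:
            frontier.append((x, y))
            best_y = y
    return frontier

rng = random.Random(0)  # arbitrary fixed seed, for reproducibility only
options = [(rng.random(), rng.random()) for _ in range(10_000)]  # assumed sample size
salient = non_dominated(options)

r_all = pearson(*zip(*options))
r_salient = pearson(*zip(*salient))
print(f"all options:     n={len(options)}, r={r_all:.3f}")
print(f"salient options: n={len(salient)}, r={r_salient:.3f}")
```

Across all options the correlation should sit near zero, while among the non-dominated options it is negative by construction: moving along the frontier, a gain on one parameter always comes with a loss on the other.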

Just to be clear, note that the existence of a selection-based trade-off does not imply that the two variables are distributed independently of one another. My point is merely that the existence of a trade-off does not imply non-independence.

Berkson’s paradox says that two independent events can be made dependent by conditioning on a consequence of both of them. As an example, consider three events:

  • A = Randomly Choose Between Integers 1 to 10,
  • B = Randomly Choose Between Integers 1 to 10,
  • C = A + B,

which establishes a causal network of A -> C <- B. Imagine that you only select triplets (A, B, C) such that C > 16. Although A and B are chosen independently, conditioning on C in this way will induce an inverse correlation between A and B, because a selected triplet with a lower A must have a higher B to clear the threshold. If A = 7, for instance, then B must be 10. And this is exactly the sort of selection you impose upon possible choices when you make a decision that involves a trade-off.
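Because A and B take only ten values each, this example can be checked by enumerating all 100 equally likely pairs rather than by sampling. The sketch below does that; the helper name `pearson` is an illustrative choice, not something from the original post.

```python
from itertools import product

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Every (A, B) pair is equally likely, so enumerating them is exact.
pairs = list(product(range(1, 11), repeat=2))

# Condition on the common effect: keep only pairs with C = A + B > 16.
selected = [(a, b) for a, b in pairs if a + b > 16]

r_all = pearson(*zip(*pairs))
r_selected = pearson(*zip(*selected))
print(f"all pairs:      n={len(pairs)}, r={r_all:.2f}")
print(f"pairs with C>16: n={len(selected)}, r={r_selected:.2f}")
```

Ten pairs survive the selection, and among them the correlation between A and B is -0.5, compared with zero over the full set.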

Sexy vs Skilled Actors

Here is Manfred’s nice example:

Sexy people are more likely to be hired as actors. Good actors are also more likely to be hired as actors. So if we look at “people who are actors,” then we’ll get people who are sexy but can’t really act, people who are sexy and can act, and people who can act and aren’t really sexy. If sexiness and acting ability are independent, these three groups will be about equally full.

Thus if we look at actors in general in our simple model, 2/3 of them will be sexy and 2/3 of them will be good actors. But of the ones who are sexy, only 1/2 will be good actors. So being sexy is correlated with being a bad actor! Not because sexiness rots your brain (a), or because acting well makes you ugly (b), and not because acting classes cause both good acting and ugliness, or diet pills cause both beauty and bad acting (c). Instead, it’s just because how we picked actors made sexiness and acting ability “compete for the same niche.”

Thus if you consider the set of Hollywood actors, you might conclude that beautiful people are less likely to be able to actually act. But this observed relationship doesn’t mean that these two qualities are intrinsically inversely correlated. Instead, selection effects explain the relationship quite well.
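The fractions in the quoted example can be verified with exact arithmetic. The sketch below assumes, as the quote does, that the three groups of hired actors are equally large; the variable names are illustrative.

```python
from fractions import Fraction

# The three equally sized groups of hired actors from the quoted example,
# written as (is_sexy, is_good_actor) flags.
groups = [(1, 0), (1, 1), (0, 1)]
weight = Fraction(1, len(groups))  # equal group sizes, per the quote

p_sexy = sum(weight * s for s, _ in groups)
p_good = sum(weight * g for _, g in groups)
p_both = sum(weight * s * g for s, g in groups)

# Share of good actors among the sexy ones, and among everyone else.
p_good_given_sexy = p_both / p_sexy
p_good_given_not_sexy = (p_good - p_both) / (1 - p_sexy)

# A negative covariance means the two traits are inversely related
# within the selected population.
covariance = p_both - p_sexy * p_good

print(p_sexy, p_good, p_good_given_sexy, p_good_given_not_sexy, covariance)
```

This gives 2/3 sexy and 2/3 good actors overall, but only 1/2 good actors among the sexy ones, with a covariance of -1/9 between the two traits.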

Sodium vs Sugar In Starbucks Drinks

As an example with real data, consider the nutritional content of the set of Starbucks drinks circa 2008. In particular, compare the sugar content of a drink to its sodium content (the data now requires an e-mail request; code):

Figure: Starbucks Drinks (normalized to the number of calories, filtered for >50 calories only)

Although the relationship is far from perfect, the selection effect here is clear: in order to actually be bought, every drink needs to offer some sort of benefit to consumers. Consumers typically value taste, to which both sugar and sodium contribute, so a drink that is low in one tends to be high in the other. I expect that drinks that are low in both offer something else, such as catechins.

Using Berkson’s Bias To Classify Trade-Offs

Many trade-offs are the result of a Berkson’s paradox-like selection bias. Still, trade-offs can exist in the absence of selection. For example, consider the false alarm vs oversight trade-off, which can be parametrized in a null hypothesis testing setting by the p-value threshold. No selection is necessary to produce a trade-off between the rate of false positives and the rate of false negatives. Over the full range of both error rates, raising the p-value threshold increases the type I error rate and decreases the type II error rate, and lowering it does the reverse.
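To make the error-rate trade-off concrete, here is a minimal sketch for one particular setting: a one-sided z-test with a fixed true effect. The test, the effect size of 2 standard errors, the list of thresholds, and the function name `type_ii_error` are all assumptions chosen for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution

def type_ii_error(alpha, effect=2.0):
    """Type II error rate of a one-sided z-test run at threshold alpha.

    `effect` is the true mean shift in standard-error units; the default
    of 2.0 is a hypothetical value picked for illustration.
    """
    # Reject the null when the z statistic exceeds this critical value.
    critical = STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative the statistic is centered on `effect`, so a
    # miss happens when it still falls at or below the critical value.
    return STD_NORMAL.cdf(critical - effect)

alphas = [0.001, 0.01, 0.05, 0.10, 0.25, 0.50]  # type I error rates
betas = [type_ii_error(a) for a in alphas]      # matching type II error rates

for a, b in zip(alphas, betas):
    print(f"type I = {a:.3f}   type II = {b:.3f}")
```

Every increase in the threshold lowers the type II error rate, with no filtering of cases involved: the two rates are tied together by the test itself.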

Whether selection is required is thus a property we can use to carve up trade-off space into two classes:

– trade-offs between two variables which are “intrinsically” mostly independent, but become dependent as a result of our selecting only a particular subset of them, or

– trade-offs between two variables which are dependent, to at least some degree, over their full range of possible values.

It also suggests that another potentially fruitful way of dividing the first set of trade-offs is to think about what type of process or agent is performing the selection.