This is a post about hidden correlations.[1]

Part of my job is thinking through how disease control interventions will fail, or, more gently, will fall short of their goals. One pattern I see over and over again is that someone comes up with a good intervention, lots of people work together to get it to people who will benefit, and things get better for the people who get the help. And then they notice there are other people who would benefit too, so the helpers do more, and things get better still for more people. But then, just when it’s tempting to think that doing the good intervention more and better will solve the problem, progress pretty much stops, no matter how much effort is applied.

What happened? While all that progress was being made, not enough people noticed that some parts of the problem weren’t changing. It isn’t obvious why progress isn’t being made there. The situation looks the same, the data don’t show any major differences between people or places, and no new risk factors are discovered, and yet the problem isn’t going away in some places or for some people. Lurking underneath the progress was the 5% of the problem that we don’t know how to solve, and that 5% becomes 100% of the problem.

But that sounds obvious enough. Why is it hard to anticipate this dynamic?

The problem is ubiquitous because it’s inherently mathematical. There are always risk factors we[2] can’t measure and some we don’t know exist, and nothing in Nature is independent of anything else. Which means that there are surely risk factors we don’t know about that are correlated with all the risk factors we do know. But we design interventions to act on the risk factors we know. (Because how else would we do it? Actually, comment with ideas please.) And so if we take all our interventions, and apply them along the risk factor axes we understand, there will always be clusters of risk factors in which the limitations of our interventions all land on groups with correlated risk factors we can’t address with the tools we have. After our best efforts, these areas remain.
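
To make the "failing together" dynamic concrete, here is a toy simulation. It is a sketch under invented assumptions, not a model of any real program: the function name `all_fail_fraction`, the 20% per-intervention failure rate, the three interventions, and the correlation of 0.8 are all hypothetical choices made for illustration. Each intervention fails for a person when a latent score crosses a threshold, and the scores share one unmeasured factor.

```python
import random
from statistics import NormalDist


def all_fail_fraction(n_people, n_interventions, fail_rate, rho, seed=0):
    """Fraction of simulated people for whom every intervention fails.

    Assumed toy model: each intervention fails for a person when a standard
    normal latent score exceeds a threshold. The scores share one unmeasured
    factor, which gives every pair of scores correlation `rho`.
    """
    rng = random.Random(seed)
    # Threshold chosen so each intervention, taken alone, fails at `fail_rate`.
    threshold = NormalDist().inv_cdf(1.0 - fail_rate)
    shared_weight = rho ** 0.5
    own_weight = (1.0 - rho) ** 0.5
    everything_failed = 0
    for _ in range(n_people):
        hidden = rng.gauss(0.0, 1.0)  # the risk factor nobody measured
        failures = [
            shared_weight * hidden + own_weight * rng.gauss(0.0, 1.0) > threshold
            for _ in range(n_interventions)
        ]
        everything_failed += all(failures)
    return everything_failed / n_people


independent = all_fail_fraction(50_000, 3, fail_rate=0.2, rho=0.0)
correlated = all_fail_fraction(50_000, 3, fail_rate=0.2, rho=0.8)
print(f"all three fail, independent failures: {independent:.3f}")  # theory: 0.2 ** 3 = 0.008
print(f"all three fail, hidden shared factor: {correlated:.3f}")
```

Each intervention looks equally good on its own in both runs, since the marginal failure rate is 20% either way. The only difference is the hidden shared factor, and in this toy model it leaves a considerably larger group of people for whom nothing works.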

The first tricky part about this is that when we design intervention studies, we purposely randomize to remove the effects of unknown correlations. And so when we study our interventions to figure out how well they work, we often purposely don’t have the information we’d need to figure out when they will fail. And with that information lost separately for each intervention “rigorously studied with gold standard methods”, we have no information to anticipate where interventions will collide to fail together. Remember this next time you hear someone say something to the effect of “randomized control studies are the gold standard in evidence based medicine.”

The second tricky part is that the larger the number of risk factors that combine to meaningfully affect risk, the larger the fraction of people who will live in some correlated cluster where interventions fail together, and the more difficult it will be to reduce their risk. This is a basic fact of mathematics. The larger the dimension of the space of variation, the more of the probability mass in that space is at the edges. This is the mathematics of structural inequity. Correlated risk factors, coupled with tools to address them that don’t understand the correlations, leave people in whom everything collides most at risk and least able to benefit from the interventions that have worked so far.
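
The claim about dimension and edges can be checked with a few lines of arithmetic. This is a deliberately simplified sketch: it assumes each risk factor is an independent uniform percentile on [0, 1], which is an assumption made here for illustration and ignores the correlations this post is about. Under that assumption, the fraction of people in the top or bottom `margin` of at least one of `n_factors` factors is 1 - (1 - 2 * margin) ** n_factors. The function name `edge_fraction` and the 5% margin are hypothetical.

```python
def edge_fraction(n_factors: int, margin: float = 0.05) -> float:
    """Fraction of a unit hypercube within `margin` of at least one face.

    Assumed toy model: each risk factor is an independent uniform percentile
    on [0, 1], and "at the edge" means being in the top or bottom `margin`
    of at least one factor.
    """
    if n_factors < 1:
        raise ValueError("n_factors must be at least 1")
    if not 0.0 <= margin <= 0.5:
        raise ValueError("margin must be between 0 and 0.5")
    # Probability of sitting in the middle band on every single factor.
    interior = (1.0 - 2.0 * margin) ** n_factors
    return 1.0 - interior


for n in (1, 2, 5, 10, 20, 50):
    print(f"{n:>2} factors: {edge_fraction(n):.1%} of people are extreme on at least one")
```

With one factor, 10% of people are at an edge. With ten factors it is about 65%, and the share keeps climbing toward everyone as more factors matter.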

To the extent that this can be anticipated and the stalled progress prevented, it requires noticing when tools are working less well in some places than others and not assuming that is just random chance but rather that there must be a reason. There is always a reason. We need ways to notice and study under-performance before we hit the wall where progress is stalled. Once stalled, it’s much harder to figure out what we’re missing because there is no signal left – nothing works. So we always need to keep an eye on things that are acting funny, or don’t quite look right, or might be doing something unexpected. And we must reserve some of our effort for looking for new factors and unknown correlations in the places where things are acting funny, so that by the time those places are the only places, we’ve learned what to do next.

This is a post about polio eradication. Or the panic-neglect cycle and burden of COVID-19. Or the persistence of AIDS. Or the drivers of addiction. Or the stability of homelessness. Or…


For attribution, please cite this work as

Famulare (2022, May 13). Nothing is one thing: The 5% of the problem we can't solve becomes 100% of the problem. Retrieved from https://famulare.github.io/2022/05/13/The-5-Percent-We-Cant-Solve.html


  1. It’s also a post that doesn’t really capture what I want to say, but let’s call it a first try. 

  2. I intend the collective “We, humanity”, although being a Seattlite working in global health, I must also mean the interventionist we, the personal we, the colonial we. 

