A problem of curves

First published by Linux User & Developer.

What does the phrase ‘problem solving’ evoke for you? Maths puzzles, or maybe an engineering experiment? Or do you think of that moment when the code runs and does what it’s supposed to?

But when solving problems, it’s all too easy to jump in at the deep end and start the search for a solution without considering what exactly it is that we’re facing. Yet how we tackle our problems depends on how we frame them. Sudoku, for example, has a very simple framing: figure out which numbers go in which squares – do it correctly and every row, column and 3×3 block will contain the numbers 1 to 9. The ‘solving’ part of the process is relatively easy, particularly once you have a sound methodology.

For political problems the framing is crucial, but the more complex the problem, the harder it is to model it. An overly simple model will fail to reflect reality and so give an ineffectual – and potentially even damaging – solution. Indeed, oft-times, we see politicians propose ‘sticking plaster’ solutions to complex problems which clearly won’t work: they have mis-framed the problem and thus produce inadequate plans for tackling it.

I recently read an article by Malcolm Gladwell in The New Yorker in which he discusses the framing of problems like homelessness, car exhaust emissions and corruption in the LAPD in terms of normal (bell-curve) and power-law distributions.

After Rodney King, the LAPD were investigated to find out the extent of racism, violence and ill-discipline amongst officers. According to Gladwell, the assumption was that “those problems had spread broadly throughout the rank and file”, and that if you drew a graph, it would be a bell-curve with “a small number of officers at one end of the curve, a small number at the other end, and the bulk of the problem situated in the middle.”

He adds, “the bell-curve assumption has become so much a part of our mental architecture that we tend to use it to organize experience automatically”.

But it turns out that the bell-curve doesn’t fit. Instead, most officers are well behaved and only a minority are troublesome. It’s a power-law problem, and the types of solution that would be effective in a bell-curve scenario, such as widespread training, do nothing to improve the behaviour of the small number of officers who are causing all the trouble.
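To make the difference between the two curves concrete, here is a minimal Python sketch. It does not use any real LAPD data: the sample size, the mean and spread of the bell curve, and the Pareto shape of 1.2 are all arbitrary assumptions chosen for illustration, and `top_share` is a hypothetical helper. It simply asks how much of the total 'trouble' the worst 1% of a population accounts for under each kind of distribution.

```python
import random


def top_share(values, fraction=0.01):
    """Return the share of the total contributed by the highest `fraction` of values."""
    ordered = sorted(values, reverse=True)
    k = max(1, int(len(ordered) * fraction))
    return sum(ordered[:k]) / sum(ordered)


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 100_000              # hypothetical population size

# Bell curve: made-up "incidents per person" clustered around a mean of 10.
bell = [max(0.0, rng.gauss(10, 2)) for _ in range(n)]

# Power law: Pareto-distributed values; the shape 1.2 is an assumption.
power = [rng.paretovariate(1.2) for _ in range(n)]

bell_share = top_share(bell)
power_share = top_share(power)

print(f"Bell curve: worst 1% account for {bell_share:.1%} of the total")
print(f"Power law:  worst 1% account for {power_share:.1%} of the total")
```

In the bell-curve sample the worst 1% contribute only slightly more than their 1% share, so a broad measure that shifts everyone a little can move the total. In the power-law sample a handful of individuals account for a large slice of the total, which is why measures aimed at everyone barely touch it.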

Think now about terrorism. Terrorism is not a bell-curve problem but a power-law one: A tiny minority of people commit the worst atrocities. But the solutions the government want to put into place are bell-curve solutions. Data retention, for example, does not focus on identifying and gathering intelligence on the dangerous minority, but on collecting and storing data on the majority – the lump in the bell-curve. ID cards are the same – lots of data collected about lots of people who have nothing to do with terrorism or organised crime.

If anything, it is the anomalous criminals in the head of the power-law who will be most likely and most able to circumvent data retention, or to forge their ID. The solutions will fail because the problem has been framed improperly.

It’s the same with child abuse. There are 11.7 million children in the UK and, according to the NSPCC, 32,000 are known to be at risk of abuse; on average, 77 are murdered each year. If you plotted a graph of the risk to each child, you’d see a long tail of millions for whom there are no abnormal risk factors, a short curve of increasing risk for 32,000 children, then a big spike of lots of risk for a minority. Another power-law curve.
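The scale of that minority is easy to check with the figures quoted above. This is only a back-of-the-envelope sketch: the variable names are mine, and the inputs are the two numbers already given in the text.

```python
# Figures quoted in the text: 11.7 million children, 32,000 known to be at risk.
children_in_uk = 11_700_000
known_at_risk = 32_000

# Proportion of all children who are known to be at risk.
share_at_risk = known_at_risk / children_in_uk

# The same ratio expressed as "one child in every N".
children_per_at_risk_child = children_in_uk / known_at_risk

print(f"Known to be at risk: {share_at_risk:.2%} of all children")
print(f"That is about 1 child in {round(children_per_at_risk_child)}")
```

That works out at roughly 0.27% of children, or about 1 in 366. The other 99.7% make up the long tail.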

And another inadequate solution. Huge databases that track every child in the UK are not going to solve the problem because what’s required is intensive, focused work to locate and protect the vulnerable minority.

But although child abuse affects only a small number of children, it has a ‘big’ emotional impact on us. Our instincts tell us, wrongly, that big problems need big solutions. A national database is a big solution which appeals to our desire to protect, but it won’t work.

Of course, the government tell us that they need all this data in order to identify the minority that are at risk – or posing a risk – but what they don’t seem to understand is that the more data you gather, the longer it takes to process and the more deeply buried crucial information becomes. Millions of records with hundreds or thousands of datapoints might seem like a good idea, but are current data mining techniques really up to making sense of mass surveillance?

People who feel threatened want to see a visible show of force by their ‘protectors’ so that they can feel safe. Sadly, the government know this, and are more interested in ostentatious – but ineffective – bell-curve answers than in re-framing problems correctly and finding more appropriate power-law solutions.