Saturday, July 5, 2008

In Defense of Occam’s (Ockham’s) Razor

Tēnā koutou katoa - Welcome to you all
There’s been a lot said about Occam’s Razor on the Net recently, though not always supportive of its application. Also known as the law of economy, or the law of parsimony, it derived its name from a principle held by the medieval scholar William of Ockham.

He is believed to have said non sunt multiplicanda entia praeter necessitatem - entities are not to be multiplied beyond necessity. It is thought that this principle may have been broached before Ockham by a French Dominican theologian, Durand de Saint-Pourçain (1270 - 1334).

A well used blade

It has been used in many disciplines. Science in particular has enjoyed its application over the centuries, and it has been exploited a fair bit in philosophy. Nicole d’Oresme, Galileo, Newton and Einstein all drew on Occam’s Razor in some form in their studies. Einstein used it in his theory of special relativity, which predicted that moving objects shorten in length and moving clocks run slow. In considering the tangible, explicit aspects of a theory, Stephen Hawking, in A Brief History of Time, claims “it seems better to employ the principle known as Occam's razor and cut out all the features of the theory that cannot be observed.”

Stroppy criticism

Lately the principle seems to have come under the knife, and its validity as a tool has recently been given the chop by several writers.

Is it condemned to the scrap-heap of pre-21st century gadgets?

Or is there life in the old blade yet?

These questions came to mind when I read Peter Turney’s post, Ockham’s Razor is Dull. He speaks disparagingly of its use in the theoretical context of machine-learning and cites Geoffrey Webb’s “Further Experimental Evidence against the Utility of Occam's Razor”. Webb criticises Occam’s Razor on the grounds that it doesn’t always work when applied, as it commonly is, to modern machine-learning.

While I’ve no doubt that Turney’s opinion has some merit, I feel that the original intention behind the principle of Occam’s Razor has been forgotten, or at least severely misunderstood in its application to machine-learning. I suspect the initial application of the principle in such a context was ill-advised, that it was assumed as a means to an end rather than a way of viewing things.

A useful cutter

Occam’s Razor is often applied in many acute and incisive ways. Approximations when quoting populations, expressed to the nearest thousand heads, have been shaved by it. The so-called rounding up of numbers involves its use in trimming away minor fractional odds and ends. But its willy-nilly application can cause sharp problems for the unwary, as any keen statistician or accountant can attest.
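
To put some numbers on that warning, here is a minimal Python sketch. The town head counts are invented purely for illustration, and `to_nearest_thousand` is a hypothetical helper of my own, not anything from a library. It shows how shaving each figure before adding them up can give a different answer from shaving the total.

```python
# Hypothetical head counts for three towns (illustrative figures only).
populations = [12_449, 8_449, 3_449]


def to_nearest_thousand(n: int) -> int:
    """Round a non-negative count to the nearest thousand (halves round up)."""
    return (n + 500) // 1000 * 1000


# Shave each town's figure first, then add them up.
rounded_each = [to_nearest_thousand(p) for p in populations]
sum_of_rounded = sum(rounded_each)

# Add up the exact figures first, then shave the total.
rounded_sum = to_nearest_thousand(sum(populations))

print(rounded_each)    # each town to the nearest thousand
print(sum_of_rounded)  # total of the shaved figures
print(rounded_sum)     # shaved total of the exact figures
```

With these made-up figures the two totals differ by a full thousand, which is just the sort of nick an accountant would notice.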

It is most fittingly used as a way of looking at abstractions. For some practical purposes it can be brought into play when a decision is to be made on what to select as best from an array of ideas, or processes and strategies that give rise to almost identical outcomes.

Phil Gibbs’ useful statement of the principle for scientists is "when you have two competing theories which make exactly the same predictions, the one that is simpler is the better." In that regard it has application in efficient decision making.

It’s all to do with what is perceived to be significant. I suspect that the ability to discern keenly is a vital component of the capacity to use the device effectively. This talent, present almost subconsciously in some people, may be what puts the human ability to differentiate well above that of a machine.

Gibbs’ delightfully brief “What is Occam’s Razor?” lists an array of statements that seem to have been derived from the principle:

If you have two theories which both explain the observed facts then you should use the simplest until more evidence comes along.

The simplest explanation for some phenomenon is more likely to be accurate than more complicated explanations.

If you have two equally likely solutions to a problem, pick the simplest.

The explanation requiring the fewest assumptions is most likely to be correct.

Keep it simple, otherwise known as the KIS approach.
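
The statements above can be sketched in a few lines of Python. This is only a rough illustration under loud assumptions: the `Theory` type, the `razor` function and the crude count of "assumptions" as a measure of simplicity are all my own inventions for the example, and the observations are made up. The idea is to keep only the theories that agree with every observation, then pick the one carrying the fewest assumptions.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class Theory(NamedTuple):
    name: str
    assumptions: int  # crude, hypothetical stand-in for "complexity"
    predict: Callable[[float], float]


def razor(
    theories: List[Theory],
    observations: List[Tuple[float, float]],
    tolerance: float = 1e-9,
) -> Optional[Theory]:
    """Among theories that fit every observation, return the simplest one."""

    def fits(theory: Theory) -> bool:
        return all(abs(theory.predict(x) - y) <= tolerance for x, y in observations)

    candidates = [t for t in theories if fits(t)]
    if not candidates:
        return None  # no theory explains the facts, so there is nothing to shave
    return min(candidates, key=lambda t: t.assumptions)


# Made-up observations: pairs of (input, observed outcome).
observations = [(0, 0), (1, 2), (2, 4)]

theories = [
    # Agrees with the observations but carries an extra term that is never seen.
    Theory("cubic", 2, lambda x: 2 * x + 5 * x * (x - 1) * (x - 2)),
    # Agrees with the observations and assumes less.
    Theory("linear", 1, lambda x: 2 * x),
    # Simple, but contradicts the observation at x = 1.
    Theory("square", 1, lambda x: x * x),
]

chosen = razor(theories, observations)
print(chosen.name if chosen else "no theory fits")
```

Note that the razor only gets to choose between theories that already agree with the facts. The "square" theory is just as simple as the "linear" one, but it is cut before simplicity is even considered, which fits the first statement's proviso "until more evidence comes along".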

A worthwhile tool

It is an excellent tool to apply when looking for a model or a metaphor to use in a learning context. Map makers use its principle when simplifying the features of a landscape in two dimensions. The famous colourful London Underground Tube maps, aside from now being considered works of art, are splendid examples of the practical use of Occam’s Razor as a design tool.

Current concerns expressed by many bloggers about the glut of data, otherwise known as infowhelm, could well be ameliorated by the judicious and skilful wielding of the blade. A honing of the discerning abilities may help to carve away redundant or duplicated information. Methods for trimming and blood-letting the burgeoning plethora of knowledge on the Internet are ripe for reaping.

Simplification is its keenest feature. A slice or so could well put Occam’s Razor at the cutting-edge of 21st century knowledge management.

Ka kite anō - Catch ya later
