Consistency in Cricket

15th May, 2012

In cricket, we usually compare batsmen by their averages, that is to say their mean scores over their careers (total runs scored divided by number of times out). On this measure, the best batsman of all time is Donald Bradman (pictured), with a test average of 99.94[1], miles beyond his closest rival, Graeme Pollock, on 60.97.

However, as every statistician knows, the mean of a set of data is very far from telling the whole story. The obvious next thing to do is calculate its standard deviation. This will describe how ‘spread out’ a batsman’s scores are.

What does this mean in sporting terms? It is something like measuring his consistency. But better batsmen (i.e. those with higher averages) have more ‘space’ to spread into, which will result in larger standard deviations. On this measure alone the ‘most consistent’ performers are likely to be the lowest average scorers (not very helpful).

To take this into account, Ric Finlay has suggested scaling this down, defining a batsman’s consistency index as his standard deviation divided by his average. (Statisticians know this as the coefficient of variation.) In 20-odd years of following the game, I have never seen these sorts of statistics before.
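As a rough illustration, the calculation can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the batsman is taken to be out in every innings (so that the average is simply the mean score), the population standard deviation is used, and the function name and the two score lists are made up for the example. Finlay's own method may differ on any of these points.

```python
from statistics import mean, pstdev


def consistency_index(scores):
    """Standard deviation divided by mean (the coefficient of variation).

    Assumes every innings ended in a dismissal, so mean == batting average.
    """
    return pstdev(scores) / mean(scores)


# Two hypothetical batsmen: the second scores exactly three times as much.
modest = [0, 10, 20, 30, 40]
prolific = [0, 30, 60, 90, 120]

# The raw standard deviation is larger for the higher scorer...
print(pstdev(modest), pstdev(prolific))
# ...but dividing by the average gives the same index for both.
print(consistency_index(modest), consistency_index(prolific))
```

The second batsman's scores are three times as spread out, but relative to his average he is exactly as consistent as the first, which is the point of the scaling.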

The lists that Finlay produces are fascinating[2], but need to be treated with caution for various reasons: most obviously, they fail to distinguish between consistently good and consistently bad performances! Secondly, consistency is not necessarily a positive thing: if a batsman scores exactly 50 in nine innings and then 200 in his tenth, his average will improve but his consistency index will be damaged. But no-one would argue that the double-century makes him in any sense a worse player. What is more, not every batsman aims for consistency. Players with a more aggressive style deliberately take more risks than those who play more safely and defensively. The pay-off is that they also score more quickly (if they don’t get out).
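The nine-fifties-then-a-double-century example can be checked directly. The sketch below assumes the batsman is out in all ten innings and uses the population standard deviation; both are simplifying assumptions rather than anything taken from Finlay's lists.

```python
from statistics import mean, pstdev

before = [50] * 9          # nine innings of exactly 50
after = before + [200]     # followed by a double-century

avg_before, avg_after = mean(before), mean(after)
index_before = pstdev(before) / avg_before
index_after = pstdev(after) / avg_after

print(avg_before, avg_after)      # the average rises from 50 to 65
print(index_before, index_after)  # the index goes from 0 to 45/65
```

A lower index means more consistent, so the perfectly regular run of fifties scores 0, and the double-century pushes the index up to about 0.69 even as the average improves.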

Nevertheless, with all these caveats, I think the data is very interesting. It seems clear that Bradman’s reputation is enhanced still further, as is that of his successor as Australian captain Ricky Ponting. Meanwhile Sachin Tendulkar and Brian Lara (usually considered the greatest batsmen of the modern age[3]) turn out to be more inconsistent in comparison.

Rather than work with standard deviations, cricketers more commonly talk about consistency in passing certain milestone scores: 1 (‘getting off the mark’), 10 (‘into double digits’), and so on. One of the most important stages is somewhere between 20 and 35, where a batsman is considered to have ‘settled in’. The idea is that even the best batsmen get out to low scores sometimes; a mark of quality is that once they have settled in, they then go on to compile big scores with great reliability. (The cricketing world could really do with a standardized numerical value here; unfortunately the decimal system fails to provide an obvious candidate.)

One common measure here is a batsman’s “conversion rate”, meaning the proportion of the times in which they score 50 that they go on to hit a century. Again, Bradman tops the table, with a conversion rate of 69.95%.
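The definition translates into a short calculation. The following Python sketch uses a made-up list of scores (not any real player's record), and the function name is hypothetical.

```python
def conversion_rate(scores):
    """Proportion of scores of 50 or more that went on to reach 100.

    Returns None if the batsman never reached 50.
    """
    reached_fifty = [s for s in scores if s >= 50]
    if not reached_fifty:
        return None
    reached_hundred = [s for s in reached_fifty if s >= 100]
    return len(reached_hundred) / len(reached_fifty)


# Hypothetical career: five scores of 50+, two of which became centuries.
scores = [12, 57, 103, 0, 64, 150, 8, 99]
print(conversion_rate(scores))  # 0.4
```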

 

[1] Famously, Bradman only needed to make four runs in his final innings in order to achieve an average of 100, but was out for a duck.

[2] Though slightly out of date.

[3] And I’m not saying that this analysis necessarily contradicts that opinion!

Categories: Sport, Stats

AI and the Church-Turing Thesis

8th May, 2012

Over on G+, Alexander Kruel pointed me towards an article by Alex Knapp entitled “Why your brain isn’t a computer”.

The point of this post is not to argue whether it is or it isn’t, but to draw attention to a salient point, which I think Alex waves away rather too quickly (and which in any case is important and interesting).

One might reason that there are plenty of different types of ‘computation’ around these days: ordinary computer programs, embedded systems, neural nets, self-modifying code, and so on. So, with all this variety, why should we expect that a human brain, being a neurochemical network, should fall into the same computational category as a laptop? Might it not simply be that the two have different capabilities? As Alex argues, “Electric circuits simply function differently then electrochemical ones”.

The problem with this argument is that it overlooks the great robustness of the notion of computability, specifically the Church-Turing thesis.

A bit of history: in the 1930s, Alan Turing was investigating the capabilities of Turing machines, funny little devices which creep along a ticker-tape, and respond to the symbols they find there. Meanwhile over in the US, Alonzo Church was exploring the semantics of a formal system he had developed, called lambda calculus.

At first sight, the two topics appear to have little in common. But when the two men encountered each other’s work, they quickly realised something unexpected and profoundly important: that anything which can be expressed in lambda calculus can also be computed by a Turing machine, and vice versa. Shortly afterwards, a third approach known as recursion was thrown into the mix. Again, it turned out that anything recursive is Turing-computable, and vice versa.

This leads to the assertion we know as the Church-Turing thesis: that a process which is computable by any means whatsoever, must also be computable by a Turing machine.
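To get a feel for how different the formalisms look while computing the same things, here is a small sketch of arithmetic done in the lambda-calculus style (Church numerals), written with Python lambdas and checked against ordinary integer arithmetic. It illustrates a single instance of the equivalence and proves nothing; the helper names `to_int` and `from_int` are invented for the example.

```python
# A Church numeral n is the function that applies f to x exactly n times.
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))


def from_int(k):
    """Build the Church numeral for a non-negative integer k."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n


def to_int(n):
    """Decode a Church numeral by counting how many times it applies f."""
    return n(lambda k: k + 1)(0)


two, three = from_int(2), from_int(3)
print(to_int(add(two)(three)))  # 5
print(to_int(mul(two)(three)))  # 6
```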

It is important to stress that the Church-Turing thesis has good experimental support. Every computational system we know of obeys it: cellular automata, neural networks, Post-tag systems, logic circuits, genetic algorithms, string rewriting systems, even quantum computers*. Anything that any of them can do can (in principle) be done by conventional computational means.
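One of the systems on that list can be simulated in a few lines. The sketch below runs a one-dimensional cellular automaton (Rule 110, which is known to be capable of universal computation) in ordinary Python. The wrap-around boundary and the function name are choices made for this example only.

```python
def step(cells, rule=110):
    """Apply one update of an elementary cellular automaton.

    Each cell's new value is looked up from the bits of `rule`, indexed by
    the (left, centre, right) neighbourhood. The row wraps around at the ends.
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (centre << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


row = [0, 0, 0, 1, 0]
for _ in range(3):
    print(row)
    row = step(row)
```

The automaton is not structured like a laptop at all, yet a conventional program reproduces its behaviour step for step.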

So when Alex comments that “the brain itself isn’t structured like a Turing machine”, the obvious response is, “well, no, and neither are lambda calculus, cellular automata, and the rest”. (Come to think of it my phone doesn’t much look like a Turing machine either.)

The ‘dualism’ which distinguishes software from hardware (which Alex argues fails for the human brain), is not something built in from the outset. Rather it emerges from the deep, non-obvious fact that computational systems beyond a certain complexity can all emulate each other.

Needless to say, there has been no shortage of people claiming to have developed systems of different kinds which go ‘Beyond the Turing Limit’. (See Martin Davis’ paper on The Myth of Hypercomputation.) And who knows, maybe our brain embodies such a process. (I have my doubts, but if we’re going to find such a system anywhere, the brain is certainly an obvious place to look.)

The bottom line here is that if you don’t want to accept that

  0)   The human mind is computable

then I’d say you have three positions open to you:

  1)   It requires an extra metaphysical ingredient;
  2)   It’s a hypercomputer which violates the Church-Turing thesis;
  3)   It relies in an essential way on a non-computable process, meaning some inherent element of randomness.

Personally I’d order these 0312, in order of likeliness. (At the same time, I’d say talk of reverse-engineering the human brain is like a toddler planning a manned expedition to Mars. How about we concentrate on crossing the room without falling over first?)

 

*Quantum computers may be able to compute certain things quicker than conventional ones, but they won’t be able to compute essentially different things.

Categories: Bloggery, Brain Science, Logic, Philosophy

Detective Work and P versus NP

13th April, 2012

Nightjack was an award-winning blog which ran from 2008 to 2009, written by an anonymous British police officer. In 2009, in what they claimed to be the public interest – but which struck many observers as an exercise in needless spite – The Times newspaper published an article exposing the author as Detective Constable Richard Horton.

Since then there have been claims and counter-claims about how that story came to press, culminating in the recent revelation that the Times likely misled both the High Court and its own readers.

We now know how Times journalist Patrick Foster knew Nightjack’s identity: he hacked his email account. What is interesting – and where considerations of computational complexity come in – is what happened next. Having found Nightjack’s identity through illegal means, Foster then approached Times lawyer Alastair Brett, and was advised to set to work on reproving Nightjack’s identity, purely through legally available sources. He was indeed later able to do this, with the result that the Times went to court and successfully fought off an injunction barring them from exposing Nightjack. But, critically, during the hearing, they omitted all mention of the original email hack, and spoke solely of Foster’s subsequent reproof.

Brett explained the thinking behind this advice in his testimony to the ongoing Leveson Inquiry into the culture, practices, and ethics of the press:

“If he or any other journalist could identify Nightjack through legitimate sources and information in the public domain then we’ve got what I felt was a perfectly legitimate public interest story…

“He had to demonstrate to me that he could do it legitimately from outside in, and that’s what he did. He persuaded me that he was able, that the only person who could have been Nightjack was DC Horton…

“Rightly or wrongly I had believed you could separate the earlier misconduct by Mr Patrick Foster and you could then say that once he had done this legitimately then that could be presented to the court perfectly properly as he had done it legally. Now I accept that you say that the two are inextricably intertwined, but that, if I may say so is a subjective judgement. I happened to take the view that you could separate out the one from the other.”

Robert Jay QC, cross-questioning, makes the observation that the reproof was “a much easier exercise”, now that he “had the advantage of knowing the answer to his question”.

Of course this is true, but one might ask how much of an advantage he had. Is this a subjective judgement? Perhaps Foster could have determined the answer, legally, from the outset. In order to have a leg to stand on, the minimum he needed to establish was that Nightjack’s identity was in legal P, something which could be discovered legally in polynomial time. All the reproof showed, however, was that it was in legal NP (as well as being in illegal P, of course). Are the two equal…? (And if they are, what is the polynomial mark-up…?)
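The gap between checking an answer you have been handed and finding it from scratch can be sketched with a standard example, subset sum. This is only an analogy for the ‘reproof’, not a model of the case; the function names and the numbers are made up.

```python
from itertools import combinations


def verify(numbers, target, certificate):
    """Check a proposed answer: a quick, polynomial-time test."""
    pool = list(numbers)
    for x in certificate:
        if x not in pool:
            return False
        pool.remove(x)  # each number may be used only as often as it appears
    return sum(certificate) == target


def search(numbers, target):
    """Find an answer with no hint: tries up to 2^n - 1 subsets."""
    for size in range(1, len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None


numbers = [3, 34, 4, 12, 5, 2]
print(verify(numbers, 9, [4, 5]))  # knowing the answer, checking is easy
print(search(numbers, 9))          # without it, we may have to try everything
```

Having a certificate that passes `verify` shows the problem is in NP; it says nothing about whether `search` could have been done quickly, which is exactly the question raised above.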

David Allen Green has been monitoring the case since the beginning, and it was he who likened the identification to solving a maze ‘from the inside out’ (P) rather than ‘from the outside in’ (NP). He has an excellent run-down of the whole affair so far, at The New Statesman. (If you scroll down to the discussion of Oliver Kamm’s blogposts, you will also see how the Times went on to mislead its readers.)

The latest developments are that the Times’ editor James Harding has written to the hearing’s original judge to apologise, and that Nightjack is suing the Times for breach of confidence, misuse of private information and deceit. He is claiming aggravated and exemplary damages.

You can also watch Alastair Brett’s very uncomfortable evidence to the Leveson Inquiry here. It begins at 72′, and the real drama starts at 129′, in which Lord Justice Leveson accuses the Times of providing ‘utterly misleading’ evidence to the Court, an accusation which Brett essentially admits.

Categories: Complexity, Nonsense, Politics

Lawrence Oates

17th March, 2012

Like many British people I was brought up on stories of glorious failure, perhaps epitomised by Captain Robert Scott’s exploration of Antarctica.

Although the expedition was successful in its primary aim (they made it to the South Pole, gathering many scientific specimens along the way) it failed in two major respects. Firstly, they were not the first to reach the Pole, finding on their arrival a tent left behind by Roald Amundsen’s team. Secondly, Scott’s party never made it back.

One of the most famous episodes, documented in Scott’s diaries, took place on the disastrous return trip. With one of their party, Edgar Evans, already dead, the team’s progress was being slowed by another, Captain Lawrence Oates, who was suffering with severe frostbite and possible scurvy. As Scott wrote, “Oates’ feet are in a wretched condition… The poor soldier is very nearly done.”

Aware that he was holding his colleagues back, Oates suggested that they go on without him. They refused; so, on the morning of 16th March 1912, Oates’ 32nd birthday, he took matters into his own hands.

Leaving his boots behind, and uttering the famous words “I am just going outside and may be some time”, Oates stepped out of the tent into a blizzard, never to be seen again.

I have known the heroic tale of Captain Oates for as long as I can remember. What I did not know, until today, was that he lived around the corner from me, in what is now Meanwood Park in Leeds, and was then Meanwoodside, the Oates family estate. Today marks exactly 100 years since Oates ‘went outside’, so an exhibition was held in the park, and a commemorative blue plaque unveiled.

As well as being largely responsible for the park’s attractive appearance, the Oates family included at least one other explorer: Lawrence’s uncle Frank Oates, who explored Central America, before dying during an expedition to Africa. The Oates approach to life can perhaps be summed up by Frank’s saying:

“I like anything that seems difficult of attainment”.

Categories: Uncategorised

John Derrick

10th February, 2012

I meant to post something about John Derrick, a long-standing and much loved member of the logic group at Leeds University, who died in December. I only knew John in his later years (some time after his official retirement), but would regularly see him at the Wednesday afternoon logic seminar, which was often followed by a trip to the pub. He was always a thoughtful and benevolent presence in the seminar room, and made for entertaining and knowledgeable company over a drink.

For younger members, he was also something of a link to an earlier era, the days when the group was led by Martin Löb (of Löb’s theorem fame).

It is a testament to his strength of character, and his love of the subject, that he continued to attend and contribute to these seminars through many years of ill health, up until only a few weeks before his death.

An obituary by Garth Dales appeared in the LMS newsletter, and a longer one can be read on the University of Leeds website.

Categories: Logic, Maths, Philosophy, Uncategorised

Me, Elsevier, and the New Scientist

8th February, 2012

In my last post I said that I had added my name to the growing anti-Elsevier boycott at The Cost of Knowledge.

I need to add something to that, since Yemon Choi has pointed out that the New Scientist magazine, for whom I have done (paid) work in the past (listed here), is owned by Reed Business Information, part of the Reed-Elsevier group.

So am I going to refuse further (paid) work for the New Scientist? It’s a perfectly fair question. My answer is no.

Here is some self-justification: I like the New Scientist as a magazine. Granted, it’s had its share of problems in the past. But overall I believe that it is – in and of itself – a force for good in the world. I regret that RBI is a stable-mate of Elsevier.

I’ll readily admit that there is self-interest at work here too. I like to write about progress in the mathematical sciences. I like my articles to reach a broad audience, and, yes, I also like getting paid.

There are very few outlets where a story about mathematics can be written at reasonable length, without being excessively dumbed down (hopefully!), reach a decent number of people, and earn the author a few quid. So I’m not willing, at this stage, to cut myself off from the biggest one in the UK.

Having said all this, I believe that I can, in good faith, remain a signatory to The Cost Of Knowledge. This is a space to “declare publicly that you will not support any Elsevier journal”. I understand this as referring to academic journals published by Elsevier, rather than magazines published by RBI. Certainly the discussions that I have read around the petition seem to reinforce that interpretation. However, I would be happy to reconsider my position if anyone can make a strong case that I’m guilty of having my cake and eating it.

Categories: Politics

Let it be known that…

6th February, 2012

…I have just signed the Cost of Knowledge petition.

I don’t think I need say any more, since the issues have been thoroughly discussed elsewhere.

Categories: Politics

Pick’s Theorem & Ehrhart Polynomials

1st February, 2012

Pick’s theorem is a simple, beautiful, and useful fact of elementary geometry. It should be much better known than it is! In fact, I have half a mind that it should be on the A-level (high school) syllabus.
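For readers who have not met it: Pick’s theorem says that a polygon whose corners all lie on grid points has area A = I + B/2 − 1, where I is the number of grid points strictly inside and B the number on the boundary. A minimal Python sketch follows, using the standard shoelace formula for the area and a gcd count for the boundary points; the function names and the example triangle are chosen purely for illustration.

```python
from math import gcd


def twice_area(vertices):
    """Shoelace formula; returns 2 * area, which is always an integer here."""
    n = len(vertices)
    total = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total)


def boundary_points(vertices):
    """Grid points on the boundary: gcd(|dx|, |dy|) per edge."""
    n = len(vertices)
    count = 0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        count += gcd(abs(x2 - x1), abs(y2 - y1))
    return count


def interior_points(vertices):
    """Rearranged Pick's theorem: I = (2A - B + 2) / 2."""
    return (twice_area(vertices) - boundary_points(vertices) + 2) // 2


triangle = [(0, 0), (4, 0), (0, 4)]
print(twice_area(triangle) / 2)    # area 8.0
print(boundary_points(triangle))   # 12
print(interior_points(triangle))   # 3, namely (1,1), (1,2) and (2,1)
```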

Less famous – but equally wonderful – are Ehrhart polynomials, which are what you get when you try to lift Pick’s theorem into higher dimensions. Though geometrically intuitive, they quickly lead into deep mathematical waters. They’re also valued as tools in optimisation problems and in other areas of computer science (I’m told).
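In two dimensions the Ehrhart polynomial can be read straight off Pick’s theorem: scaling a lattice polygon by a whole number t gives A·t² + (B/2)·t + 1 grid points in total, where A is the area and B the number of boundary points of the original. The sketch below checks this by brute force for one right-angled triangle; the choice of triangle and the function names are assumptions made for the example.

```python
def count_points(t, side=4):
    """Brute-force count of grid points (x, y) with x, y >= 0 and
    x + y <= side * t, i.e. in the t-fold dilate of the triangle
    with corners (0,0), (side,0), (0,side)."""
    n = side * t
    return sum(1 for x in range(n + 1) for y in range(n + 1 - x))


def ehrhart(t, twice_area=16, boundary=12):
    """A*t^2 + (B/2)*t + 1 for the side-4 triangle (A = 8, B = 12)."""
    return (twice_area * t * t + boundary * t) // 2 + 1


for t in range(1, 4):
    print(t, count_points(t), ehrhart(t))
```

The two columns agree for each t, which is the polygon case of the general statement that the count is a polynomial in t.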

This afternoon I gave a – hopefully fairly accessible – talk on these topics. The slides are available here.

(Update: PDF of slides here).

Categories: Geometry, Maths, Uncategorised
