In cricket, we usually compare batsmen by their *averages*, that is to say their mean scores over their careers (total runs scored divided by number of times out). On this measure, the best batsman of all time is Donald Bradman, with a Test average of 99.94^{[1]}, miles beyond his closest rival, Graeme Pollock, on 60.97.

However, as every statistician knows, the mean of a set of data is very far from telling the whole story. The obvious next thing to do is calculate its standard deviation. This will describe how ‘spread out’ a batsman’s scores are.

What does this mean in sporting terms? It is something like measuring his *consistency*. But better batsmen (i.e. those with higher averages) have more ‘space’ to spread into, which will result in larger standard deviations. On this measure alone the ‘most consistent’ performers are likely to be the lowest average scorers (not very helpful).

To take this into account, Ric Finlay has suggested scaling this down, defining a batsman’s *consistency index* as his standard deviation divided by his average. (Statisticians know this as the coefficient of variation.) In 20-odd years of following the game, I have never seen these sorts of statistics before.
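As a rough illustration of the definition, here is a minimal Python sketch. The scores are invented, the function names are my own, and the choice of the population standard deviation over all innings (including not-outs) is an assumption: Finlay's exact method may well differ.

```python
from statistics import pstdev


def batting_average(scores, dismissed):
    """Total runs divided by the number of times out."""
    outs = sum(dismissed)
    if outs == 0:
        raise ValueError("average is undefined with no dismissals")
    return sum(scores) / outs


def consistency_index(scores, dismissed):
    """Standard deviation of the scores divided by the batting average.

    Assumption: population standard deviation over every innings,
    whether or not the batsman was dismissed.
    """
    return pstdev(scores) / batting_average(scores, dismissed)


# Invented innings; the fourth was a not-out.
scores = [12, 85, 0, 47, 103, 31]
dismissed = [True, True, True, False, True, True]

avg = batting_average(scores, dismissed)   # 278 runs / 5 dismissals
ci = consistency_index(scores, dismissed)
print(f"average = {avg:.2f}, consistency index = {ci:.3f}")
```

A lower index means scores clustered more tightly relative to the average.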

The lists that Finlay produces are fascinating^{[2]}, but need to be treated with caution for various reasons: most obviously, they fail to distinguish between consistently good and consistently bad performances! Secondly, consistency is not necessarily a positive thing: if a batsman scores exactly 50 in nine innings and then 200 in his tenth, his average will improve but his consistency index will be damaged. But no-one would argue that the double-century makes him in any sense a worse player. What is more, not every batsman aims for consistency. Players with a more aggressive style deliberately take more risks than those who play more safely and defensively. The pay-off is that they also score more quickly (if they don’t get out).
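The nine-fifties-and-a-double-century example can be checked numerically. This sketch assumes the batsman was out in every innings (so the average is just the mean score) and uses the population standard deviation; both are my simplifications, and `summarise` is a hypothetical helper.

```python
from statistics import mean, pstdev


def summarise(scores):
    """Return (average, consistency index), assuming every innings ended in a dismissal."""
    avg = mean(scores)
    return avg, pstdev(scores) / avg


before = [50] * 9          # nine innings of exactly 50
after = before + [200]     # then a double century

avg_before, ci_before = summarise(before)
avg_after, ci_after = summarise(after)

print(f"before: average {avg_before:.1f}, index {ci_before:.3f}")
print(f"after:  average {avg_after:.1f}, index {ci_after:.3f}")
```

The average rises from 50 to 65, while the index jumps from 0 (perfectly consistent) to 45/65, roughly 0.69.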

Nevertheless, with all these caveats, I think the data is very interesting. It seems clear that Bradman’s reputation is enhanced still further, as is that of his successor as Australian captain Ricky Ponting. Meanwhile Sachin Tendulkar and Brian Lara (usually considered the greatest batsmen of the modern age^{[3]}) turn out to be more inconsistent in comparison.

Rather than work with standard deviations, cricketers more commonly talk about consistency in passing certain milestone scores: 1 (‘getting off the mark’), 10 (‘into double digits’), and so on. One of the most important stages is somewhere between 20 and 35, where a batsman is considered to have ‘settled in’. The idea is that even the best batsmen get out to low scores sometimes; a mark of quality is that once they have settled in, they then go on to compile big scores with great reliability. (The cricketing world could really do with a standardized numerical value here; unfortunately, the decimal system fails to provide an obvious candidate.)

One common measure here is a batsman’s “conversion rate”, meaning the proportion of his scores of 50 or more that he goes on to turn into centuries. Again, Bradman tops the table, with a conversion rate of 69.05% (29 centuries from 42 scores of 50 or more).
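For concreteness, a conversion rate could be computed along these lines. The innings below are made up and `conversion_rate` is a hypothetical helper, not anyone's official method.

```python
def conversion_rate(scores):
    """Fraction of innings of 50 or more that went on to reach 100.

    Returns None if the batsman never reached 50.
    """
    fifty_plus = [s for s in scores if s >= 50]
    if not fifty_plus:
        return None
    hundreds = sum(1 for s in fifty_plus if s >= 100)
    return hundreds / len(fifty_plus)


# Invented innings: six scores of 50+, three of which are centuries.
scores = [4, 57, 112, 23, 68, 150, 0, 91, 105, 36]
rate = conversion_rate(scores)
print(f"conversion rate = {rate:.1%}")
```

Note that scores below 50 play no part here, which is exactly the point: the measure ignores early failures and looks only at what happens once a batsman is well set.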

[1] Famously, Bradman only needed to make four runs in his final innings in order to achieve an average of 100, but was out for a duck.

[2] Though slightly out of date.

[3] And I’m *not* saying that this analysis necessarily contradicts that opinion!
