Over a third of numbers start with the digit 1

Archive - Originally posted on "The Horse's Mouth" - 2009-06-19 15:42:35 - Graham Ellis

Someone told me recently that over a third of numbers start with the digit one. On the face of it that's untrue: each leading digit should account for 10% of numbers if you allow a leading zero, or 11.11% (one in nine) if you don't.
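As a quick check on the 10% and roughly 11.11% figures, here is a minimal sketch that tallies leading digits over an evenly spread range. The helper name leading_tally and the choice of the range 1 to 999 are my own assumptions for illustration, not part of the original post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tally the first character of every value passed in.
# (leading_tally is a hypothetical helper name used only for this sketch.)
sub leading_tally {
    my %tally;
    $tally{ substr($_, 0, 1) }++ for @_;
    return \%tally;
}

# 1 .. 999 written normally, so no number starts with a zero.
my $plain = leading_tally(1 .. 999);

# 0 .. 999 zero-padded to three digits, so a leading zero is allowed.
my $padded = leading_tally(map { sprintf "%03d", $_ } 0 .. 999);

printf "1..999:   %d of 999 start with 1 (%.2f%%)\n",
    $plain->{1}, $plain->{1} * 100 / 999;
printf "000..999: %d of 1000 start with 1 (%.2f%%)\n",
    $padded->{1}, $padded->{1} * 100 / 1000;
```

Across an even range like this, every leading digit gets the same share, which is the "even balance" you might expect to see everywhere.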

Except ...

If you take all the numbers that are used in a typical report, you won't find that even balance. Ever one to explore, and unable to get into a serious project today as I'm on call, waiting front of house ('can we have some more water', 'can we use a room for a breakout meeting', 'did you make these yummy pastries'), I thought I would test it out. I took the size of every one of yesterday's accesses to our web site and counted up the leading digits:

-bash-3.2$ ./sn ac_20090618
1 - 54658 of 127322 (42%)
2 - 21778 of 127322 (17%)
3 - 7256 of 127322 (5%)
4 - 5988 of 127322 (4%)
5 - 5320 of 127322 (4%)
6 - 15472 of 127322 (12%)
7 - 4985 of 127322 (3%)
8 - 6271 of 127322 (4%)
9 - 5594 of 127322 (4%)
-bash-3.2$


Yup ... in our case, over 40% of numbers start with a 1.

Code:

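#!/usr/bin/perl -na
# -n loops over every line of input; -a autosplits each line into @F.
# Field 9 of each log line holds the size of the access.
$c1 = substr($F[9],0,1);
# Skip lines where that field is "-" or starts with a quote, i.e. is not a number.
if ($c1 ne "-" and $c1 ne '"') {
  $c++;         # total numbers counted
  $co{$c1}++;   # numbers starting with this digit
}
END {
  foreach $p (sort(keys %co)) {
    $pc = sprintf("%d",$co{$p}*100/$c);   # whole-number percentage, truncated
    print "$p - $co{$p} of $c ($pc%)\n";
  }
}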


(Note the use of the command line options -n, to loop through all the lines of an incoming file, and -a, for autosplit mode. Also note the use of sprintf to do the awkward bit of the formatting, while leaving print to do the easy stuff.)
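If the -n and -a options look like magic, this is roughly what they do for you, written out longhand. Treat it as a sketch rather than a drop-in replacement: the subroutine name leading_digit_counts and the sample log lines are made up for illustration, and it assumes the size sits in field 9 as it does in the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count the leading character of field 9 over a list of log lines.
# Returns the total counted and a reference to a hash of digit => count.
# (leading_digit_counts is a hypothetical name used only in this sketch.)
sub leading_digit_counts {
    my @lines = @_;
    my $total = 0;
    my %count;
    foreach my $line (@lines) {           # the loop that -n supplies
        my @F = split ' ', $line;         # the split that -a supplies
        next unless defined $F[9];        # line too short to have a size
        my $c1 = substr($F[9], 0, 1);
        next if $c1 eq '-' or $c1 eq '"'; # not a number, so skip it
        $total++;
        $count{$c1}++;
    }
    return ($total, \%count);
}

# Made-up sample lines in a common web server log layout (not real data).
my @sample = (
    '10.0.0.1 - - [18/Jun/2009:00:00:01 +0100] "GET /a.html HTTP/1.1" 200 1543',
    '10.0.0.2 - - [18/Jun/2009:00:00:02 +0100] "GET /b.html HTTP/1.1" 200 2210',
    '10.0.0.3 - - [18/Jun/2009:00:00:03 +0100] "GET /c.html HTTP/1.1" 304 -',
    '10.0.0.4 - - [18/Jun/2009:00:00:04 +0100] "GET /d.html HTTP/1.1" 200 17',
);

my ($total, $count) = leading_digit_counts(@sample);
foreach my $digit (sort keys %$count) {
    # printf handles the percentage and the rest of the line in one go here.
    printf "%s - %d of %d (%d%%)\n",
        $digit, $count->{$digit}, $total, $count->{$digit} * 100 / $total;
}
```

The longhand version is easier to test and reuse, but for a one-off question on a quiet day the -na one-file script is hard to beat.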