Using Perl 6 to analyse and report on data
Archive - Originally posted on "The Horse's Mouth" - 2016-01-02 17:44:38 - Graham Ellis
Within the last 24 hours I've got my hands on the first "official release" of Perl 6, and I'm going to share with you the first practical program that I've written in the language.
Scenario - I have a data file listing the 2540 railway stations on the national network in the UK, together with annual ticket sales numbers. I'm looking to identify how many of them have passenger numbers below a threshold (35 people joining trains there each day), and what proportion of passenger journeys on the network as a whole start from such lightly used stations. It's all to do with new technology, and whether you would really want / need to install electronic ticket systems at the likes of Sandplace, Shippea Hill and Swale.
Let's take a look at the code - this should be understandable for readers who are experienced Perl scripters already - and for others I'll explain the details when you attend one of our courses ;-)
Tell the operating system / command line handler to find the perl6 program and interpret the rest of the file under that:
  #!/usr/bin/env perl6
Read in the file called rstats2015.txt, split its contents into separate strings (one per line), and store those strings in an array called @stations:
  my @stations = slurp("rstats2015.txt").split("\n");
Set up a number of scalar variables for later use, initialising them to zero, and also set up an empty hash called %small:
  my $countStations = 0;
  my $countSmall = 0;
  my $countIncomplete = 0;
  my $countStationPassengers = 0;
  my $countSmallPassengers = 0;
  my %small;
Set up a variable (which we may change later, or read in from the user) that sets the number of passengers joining trains per day below which we'll regard a station as being too small to warrant its own set of electronic equipment:
  my $daily = 35;
Loop through each of the station records we have read, splitting it into a number of tab-separated fields:
  for @stations -> $station {
    my @fol = $station.split("\t");
Take a look at the last data value in each record, and count the station and sum the number of passengers if there are fewer than the threshold. If you can't even convert the last field to a number, jump to other code to handle that case:
    try {
      if (@fol[*-1] < 365 * 2 * $daily) {
        %small{@fol[6].gist} = @fol[*-1];
        $countSmall++;
        $countSmallPassengers += @fol[*-1];
      }
If there was a valid passenger number (small or large), add that into the grand total:
      $countStationPassengers += @fol[*-1];
And handle the special case of the last field not being a number (it's a station that didn't return any passenger numbers for the year - perhaps because the station isn't open yet ....):
      CATCH {
        default {
          $countIncomplete++;
        }
      }
    }
Finally in the loop, increment the overall count of stations:
    $countStations++;
  }
Once all the data has been analysed and summed, let's output some results. The elems method counts the number of elements in a collection, and the floor method rounds a number down to the whole number below:
  my $unders = %small.keys.sort.join(", ");
  print "$unders\n";
  print "Number of small stations: ", %small.keys.elems, "\n";
  print "1 passenger in ", ($countStationPassengers / $countSmallPassengers).floor, "\n";
Full code [here]. Data file [here].
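If you'd like to try the threshold test and the error handling without downloading the data file, here's a minimal sketch. The three records are invented sample data (not real station figures), and names such as @records, @fields and $threshold are made up for illustration. The factor of 2 in the threshold is presumably there because the annual figure counts both entries and exits at each station; that's an assumption, so check it against your own data.

```raku
#!/usr/bin/env perl6
# Sketch only: these records are invented sample data, not real figures.
my @records =
    "Station A\t1800",
    "Station B\t30000",
    "Station C\tnot available";

my $daily     = 35;
# Assumption: annual figures count entries AND exits, hence the factor of 2.
my $threshold = 365 * 2 * $daily;

my $countSmall      = 0;
my $countIncomplete = 0;
my $total           = 0;

for @records -> $record {
    my @fields = $record.split("\t");
    try {
        # Converting a non-numeric string to a number fails here,
        # and control passes to the CATCH block below.
        my $passengers = +@fields[*-1];
        $countSmall++ if $passengers < $threshold;
        $total += $passengers;
        CATCH {
            default { $countIncomplete++ }
        }
    }
}

say "Threshold: $threshold";
say "Small: $countSmall, incomplete: $countIncomplete, total: $total";
```

Run as it stands, this should report a threshold of 25550, one small station, one incomplete record and a total of 31800.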
If you want to learn Perl 6, please get in touch. From March 2016 we'll be running "Learning to program in Perl 6" for newcomers to programming, and "Perl 6 programming" for experienced programmers in other languages. If you're already a Perl (5) programmer, we can help you with moving on to Perl 6 - that would be a tailored course, based on your background ... there's no "one size fits all" for intermediate / advanced courses.