
Significant work - beyond helloworld in Ruby

Archive - Originally posted on "The Horse's Mouth" - 2015-05-27 07:21:06 - Graham Ellis

A little program can do a LOT of work! Scenario - I have a web server log file of some 50 Mbytes (the data from one particular day on our server) and the second field in each line tells me which of our hosted web sites was being visited. The question I was asked - "how many hits on each host?". Solution - in Ruby:

  # Program to count server accesses to each virtual host
  
  fyle = File.new "ac_20150225"
  
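  # Read the whole day's log into memory, one array element per line
  records = fyle.readlines
  fyle.close
  
  # Hash of host name => number of hits
  counter = Hash.new
  
  records.each do |record|
    pieces = record.split " "
    domain = pieces[1]     # second field is the virtual host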
    if counter[domain] == nil then
      counter[domain] = 1
    else
      counter[domain] = counter[domain] + 1
    end
  end
  p counter
  
  __END__
  
  Sample Output
  
  trainee@kingston:~/lrp$ ruby toppers
  {"www.wellho.net"=>136408, "www.savethetrain.org.uk"=>591,
  "www.firstgreatwestern.info"=>59831, "www.across-the-pond.co.uk"=>207,
  "www.twcrp.org.uk"=>1040, "www.melkshamchamber.org.uk"=>1117,
  "melksh.am"=>192, "twhc.org.uk"=>301, "www.wellhousemanor.co.uk"=>638,
  "transwilts.org.uk"=>70, "railcustomer.info"=>12,
  "thebutlerdidit.info"=>4, "www.consultations.org.uk"=>3}

As a quick and dirty one-off solution, I didn't implement this using objects, and I allowed myself to use the built-in p method to display the data in a crude but effective way once it had been gathered.
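For comparison, here is a minimal sketch of a slightly tidier version, still without objects. It assumes (as above) that the host name is the second whitespace-separated field. The method name count_hosts and the sample log lines are made up for illustration and are not part of the original program. It uses Hash.new(0) so that a missing key starts at zero, skips lines with no second field, and prints a sorted report rather than relying on p.

```ruby
# Hypothetical helper (not in the original program): count hits per
# virtual host, assuming the host is the second field of each line.
def count_hosts(lines)
  counter = Hash.new(0)        # missing keys default to 0
  lines.each do |record|
    domain = record.split[1]   # split on whitespace, take field two
    next if domain.nil?        # skip blank or short lines
    counter[domain] += 1
  end
  counter
end

# Made-up sample lines standing in for the real log file
sample = [
  "10.0.0.1 www.wellho.net GET /index.html",
  "10.0.0.2 melksh.am GET /",
  "10.0.0.3 www.wellho.net GET /resources/",
  ""
]

hits = count_hosts(sample)

# Report the busiest host first, one host per line
hits.sort_by { |_host, count| -count }.each do |host, count|
  puts format("%8d  %s", count, host)
end
```

To run this against the real log, something like count_hosts(File.foreach("ac_20150225")) should work; File.foreach hands over one line at a time, so the whole 50 Mbyte file need not be held in memory.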

I'm presenting a Ruby Training Course this week and we've been discussing how books and web sites seem to go straight from "hello world" to complicated examples. That's because the publication of quick, dirty examples that don't validate data - such as the one above - often leads to criticism from readers who take them out of context, then mark them down or slag them off in public because of their dirtiness. In the interest of providing my readers with an intermediate step, I'm risking the wrath of these out-of-context readers in publishing the above. If you've been on one of our courses, you'll know what I'm doing. If you haven't, and are uncomfortable with the above, please come along - see [here] for a list of currently scheduled public courses.

The complete source of the example above is [here].