Significant work - beyond "hello world" in Ruby
Archive - Originally posted on "The Horse's Mouth" - 2015-05-27 07:21:06 - Graham Ellis
A little program can do a LOT of work! Scenario - I have a web server log file of some 50 Mbytes (the data from one particular day on our server) and the second field in each line tells me which of our hosted web sites was being visited. The question I was asked - "how many hits on each host?". Solution - in Ruby:
# Program to count server accesses to each virtual host
fyle = File.new "ac_20150225"
records = fyle.readlines
counter = Hash.new
records.each do |record|
  pieces = record.split " "
  domain = pieces[1]
  if counter[domain] == nil then
    counter[domain] = 1
  else
    counter[domain] = counter[domain] + 1
  end
end
p counter
__END__
Sample Output
trainee@kingston:~/lrp$ ruby toppers
{"www.wellho.net"=>136408, "www.savethetrain.org.uk"=>591,
"www.firstgreatwestern.info"=>59831, "www.across-the-pond.co.uk"=>207,
"www.twcrp.org.uk"=>1040, "www.melkshamchamber.org.uk"=>1117,
"melksh.am"=>192, "twhc.org.uk"=>301, "www.wellhousemanor.co.uk"=>638,
"transwilts.org.uk"=>70, "railcustomer.info"=>12,
"thebutlerdidit.info"=>4, "www.consultations.org.uk"=>3}
As a quick and dirty one-off solution, I didn't implement this using objects, and I allowed myself to use the built-in p method to display the data in a crude but effective way once it had been gathered.
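For comparison, here is a minimal sketch of how the same count might be written a little more defensively and reported more readably. It is not the program from the post: the method names count_hosts and report are hypothetical, the sample lines are made up (the real log layout is only assumed to put the host in the second whitespace-separated field, as described above), and skipping lines without a second field is just one possible way to handle bad input.

```ruby
# Sketch only: count_hosts and report are hypothetical helper names,
# not part of the original program.

# Count hits per virtual host from any enumerable of log lines.
# Assumes the host is the second whitespace-separated field.
def count_hosts(lines)
  counter = Hash.new(0)      # missing keys start at 0, so no nil check needed
  lines.each do |record|
    domain = record.split[1]
    next if domain.nil?      # skip blank lines and lines with only one field
    counter[domain] += 1
  end
  counter
end

# Turn the counts into report lines, busiest host first.
def report(counter)
  counter.sort_by { |_domain, hits| -hits }
         .map { |domain, hits| format("%8d  %s", hits, domain) }
end

# Made-up sample lines standing in for the real log file.
sample = [
  "10.0.0.1 www.wellho.net GET /index.html",
  "10.0.0.2 melksh.am GET /",
  "10.0.0.3 www.wellho.net GET /ruby.html",
  ""
]
puts report(count_hosts(sample))
```

To run it against the real log, count_hosts(File.foreach("ac_20150225")) should work in place of the sample array. File.foreach reads one line at a time, so the whole 50 Mbyte file need not be held in memory as it is with readlines.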
I'm presenting a Ruby Training Course this week and we've been discussing how books and web sites seem to go straight from "hello world" to complicated examples. That's because the publication of quick, dirty examples that don't validate data - such as the one above - often leads to criticism from readers who take them out of context and down-mark them / slag them off in public because of their dirtiness. In the interest of providing my readers with an intermediate step, I'm risking the wrath of these out-of-context readers in publishing the above. If you've been on one of our courses, you'll know what I'm doing. If you haven't, and are uncomfortable with the above, please come along - see [here] for a list of currently scheduled public courses.
The complete source of the example above is [here].