Main Content

When you should use Object Orientation even in a short program - Python example

Archive - Originally posted on "The Horse's Mouth" - 2012-07-06 10:04:48 - Graham Ellis

I was training earlier this week in a room with a view - a view of a major link road towards Cambridge City centre, and I was struck by the very different mix of vehicles travelling along the road compared to what I would expect to see in Wiltshire. And - ever keen to have new, relevant data sets available for my customers to use I kept a log to see what passed in a few minutes. Here's my data file - with in brackets my guess as to the typical number of seat sin each type of vehicle:

  # As I looked out I saw ...
  36 (4) cars
  26 (5) taxis
  10 (60) local buses
  1 (60) sightseeing bus
  2 (2) lorries
  8 (3) vans
  13 (1) cycles
  1 (6) private hire car
  1 (2) tandem
  0 (1) motor bikes
  3 (70) guided buses
  3 (80) park and ride buses


Exercise set ... to produce (via a Python program) a list of transport modes, sorted by the capacity that had passed me during the survey:

  munchkin:tlcpy grahamellis$ python story 
  local buses          count:  10 number:  60
  park and ride buses  count:   3 number:  80
  guided buses         count:   3 number:  70
  cars                 count:  36 number:   4
  taxis                count:  26 number:   5
  sightseeing bus      count:   1 number:  60
  vans                 count:   8 number:   3
  cycles               count:  13 number:   1
  private hire car     count:   1 number:   6
  lorries              count:   2 number:   2
  tandem               count:   1 number:   2
  motor bikes          count:   0 number:   1
  munchkin:tlcpy grahamellis$ 


Now - there are multiple ways to do this - here is the first one that I came up with / the program that generated the result above:

  import re
  record = re.compile(r'(\d+)\s+\((\d+)\)\s+(.*)')
  
  passengers = {}
  
  for line in open("sob"):
          raw = record.findall(line)
          if not raw: continue
          passengers[raw[0][2]] = [int(raw[0][0]),int(raw[0][1])]
  
  
  def bycapacity(that,this):
          return cmp(passengers[this][1]*passengers[this][0],\
 nbsp;         passengers[that][1]*passengers[that][0])
  
  modes = passengers.keys()
  modes.sort(bycapacity)
  
  for mode in modes:
          print "%-20s count: %3s number: %3s" % \
 nbsp;         (mode,passengers[mode][0],passengers[mode][1])


Well - it's quite short. And it works. But all that double subscript and conversion stuff may cause you to draw a deep breath and say "oh dear ...", and if you had to enhance the program or re-use the data in another program, you would probably do better to start from scratch.

I decided to refactor. To take all the logic that related to a transport mode and put it into its own class, and then create obejcts for each mode. Here's the second piece of code I wrote - refactoring from my initial "spike solution":

  import re
  
  class transport(object):
          record = re.compile(r'(\d+)\s+\((\d+)\)\s+(.*)')
  
          def __init__(self, source):
                  values = transport.record.findall(source)
                  if not values:
                          self.created = False
                  else:
                          self.created = True
                          self.mode = values[0][2]
                          self.seats = int(values[0][1])
                          self.counted = int(values[0][0])
  
 '         @staticmethod
          def factory(source):
                  option = transport(source)
                  if not option.created:
                          return None
                  return option
  
          def __cmp__(this,that):
                  return that.seats * that.counted - this.seats * this.counted
  
          def __str__(this):
                  return "{0:20s} seats: {1:3d} count: {2:4d}".format\
                          (this.mode, this.seats, this.counted)
  
  passengers = []
  
  for line in open("sob"):
          item = transport.factory(line)
          if item: passengers.append(item)
  
  passengers.sort()
  
  for mode in passengers:
          print mode


It's longer - much longer. But it's clearer - much clearer to follow. And the logic which makes use of the objects that I created is down to a few, very simple, lines of code - allowing me to bolt together new applications using the same data, and sharing the same logic. It's also far too easy in the first version to make a change which makes a "silly" use of the data - unintentionally introducing a bug - than in the second.

You should really write data driven code all the time. Take the object approach; it may take you a few more minutes initially during development, and it may result in code that's a bit longer. But that extra time spent coding early on will - 99 times in 100 - pay you huge dividends during the lifetime and maintainaince of the code.


Data - [here].
Source (without objects) - [here].
Source (with objects) - [here].