When you should use Object Orientation even in a short program - Python example
Archive - Originally posted on "The Horse's Mouth" - 2012-07-06 10:04:48 - Graham Ellis
# As I looked out I saw ...
36 (4) cars
26 (5) taxis
10 (60) local buses
1 (60) sightseeing bus
2 (2) lorries
8 (3) vans
13 (1) cycles
1 (6) private hire car
1 (2) tandem
0 (1) motor bikes
3 (70) guided buses
3 (80) park and ride buses
Exercise set ... to produce (via a Python program) a list of transport modes, sorted by the capacity that had passed me during the survey:
munchkin:tlcpy grahamellis$ python story
local buses count: 10 number: 60
park and ride buses count: 3 number: 80
guided buses count: 3 number: 70
cars count: 36 number: 4
taxis count: 26 number: 5
sightseeing bus count: 1 number: 60
vans count: 8 number: 3
cycles count: 13 number: 1
private hire car count: 1 number: 6
lorries count: 2 number: 2
tandem count: 1 number: 2
motor bikes count: 0 number: 1
munchkin:tlcpy grahamellis$
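The ordering above follows from simple arithmetic: the capacity for a mode is the number counted multiplied by the bracketed figure, which the listings in this article treat as seats per vehicle. As a minimal sketch (written for Python 3, with three rows copied by hand from the survey rather than read from the data file), the calculation looks like this:

```python
# (count, seats) pairs copied from three lines of the survey above
survey = {
    "cars": (36, 4),
    "local buses": (10, 60),
    "cycles": (13, 1),
}

# Capacity that passed by = vehicles counted * seats in each
capacity = {mode: count * seats for mode, (count, seats) in survey.items()}

# Rank the modes from most to least capacity
ranked = sorted(capacity, key=capacity.get, reverse=True)

for mode in ranked:
    print(mode, capacity[mode])
```

So 10 local buses (600 seats) outrank 36 cars (144 seats), which is why buses head the list even though cars were the most numerous vehicle.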
There are multiple ways to do this. Here is the first one that I came up with, which is the program that generated the result above:
import re

# Python 2 code: note cmp() and the print statement
# Each data line looks like "36 (4) cars": count, (seats), mode
record = re.compile(r'(\d+)\s+\((\d+)\)\s+(.*)')

passengers = {}
for line in open("sob"):
    raw = record.findall(line)
    if not raw: continue    # skip lines that are not data, such as the comment
    passengers[raw[0][2]] = [int(raw[0][0]),int(raw[0][1])]

# Comparator: highest capacity (count * seats) first
def bycapacity(that,this):
    return cmp(passengers[this][1]*passengers[this][0],\
        passengers[that][1]*passengers[that][0])

modes = passengers.keys()
modes.sort(bycapacity)
for mode in modes:
    print "%-20s count: %3s number: %3s" % \
        (mode,passengers[mode][0],passengers[mode][1])
Well - it's quite short. And it works. But all that double subscript and conversion stuff may cause you to draw a deep breath and say "oh dear ...", and if you had to enhance the program or re-use the data in another program, you would probably do better to start from scratch.
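The listing above is Python 2: cmp() and the comparison-function argument to sort() were both removed in Python 3. For readers on Python 3, here is a rough port of the same dictionary-based approach using a key function instead. It is a sketch rather than the original program: the SAMPLE text (a few survey lines inlined so that the "sob" file is not needed) and the capacity() helper are my own additions.

```python
import re

# Same pattern as the original: count, (seats), mode
record = re.compile(r'(\d+)\s+\((\d+)\)\s+(.*)')

# Assumption: a few survey lines inlined in place of the "sob" data file
SAMPLE = """# As I looked out I saw ...
36 (4) cars
10 (60) local buses
13 (1) cycles
3 (80) park and ride buses
"""

passengers = {}
for line in SAMPLE.splitlines():
    raw = record.findall(line)
    if not raw:
        continue  # skip the comment line and anything else that is not data
    passengers[raw[0][2]] = [int(raw[0][0]), int(raw[0][1])]

def capacity(mode):
    # Hypothetical helper: count * seats, still via the double subscript
    return passengers[mode][0] * passengers[mode][1]

# A key function replaces the Python 2 cmp-style comparator
modes = sorted(passengers, key=capacity, reverse=True)

for mode in modes:
    print("%-20s count: %3s number: %3s"
          % (mode, passengers[mode][0], passengers[mode][1]))
```

The port is no easier to read than the original; the positional subscripts [0] and [1] still carry all of the meaning, and that is the weakness the refactoring addresses.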
I decided to refactor: to take all the logic that related to a transport mode and put it into its own class, and then create objects for each mode. Here's the second piece of code I wrote - refactoring from my initial "spike solution":
import re

class transport(object):

    # Shared pattern: count, (seats), mode
    record = re.compile(r'(\d+)\s+\((\d+)\)\s+(.*)')

    def __init__(self, source):
        values = transport.record.findall(source)
        if not values:
            self.created = False
        else:
            self.created = True
            self.mode = values[0][2]
            self.seats = int(values[0][1])
            self.counted = int(values[0][0])

    # Returns an object for a data line, or None if the line does not match
    @staticmethod
    def factory(source):
        option = transport(source)
        if not option.created:
            return None
        return option

    # Python 2 comparison hook: largest capacity (seats * counted) sorts first
    def __cmp__(this,that):
        return that.seats * that.counted - this.seats * this.counted

    def __str__(this):
        return "{0:20s} seats: {1:3d} count: {2:4d}".format\
            (this.mode, this.seats, this.counted)

passengers = []
for line in open("sob"):
    item = transport.factory(line)
    if item: passengers.append(item)

passengers.sort()
for mode in passengers:
    print mode
It's longer - much longer. But it's clearer - much clearer to follow. And the logic which makes use of the objects that I created is down to a few, very simple, lines of code - allowing me to bolt together new applications using the same data, and sharing the same logic. It's also far easier in the first version than in the second to make a change which makes a "silly" use of the data, unintentionally introducing a bug.
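To illustrate the "bolt together new applications" point, here is a hedged Python 3 variation on the class above. It is a sketch under my own assumptions rather than the original code: the class is renamed Transport, parsing is moved into the factory, __lt__ replaces the Python 2 __cmp__ hook (list.sort() only needs "<"), and the capacity() method, the SAMPLE lines and the total calculation are additions for the example.

```python
import re

class Transport:
    """One survey line: a mode, how many were counted, and seats in each."""

    record = re.compile(r'(\d+)\s+\((\d+)\)\s+(.*)')

    def __init__(self, mode, seats, counted):
        self.mode = mode
        self.seats = seats
        self.counted = counted

    @classmethod
    def factory(cls, source):
        # Return a Transport for a data line, or None for anything else
        found = cls.record.match(source)
        if not found:
            return None
        counted, seats, mode = found.groups()
        return cls(mode, int(seats), int(counted))

    def capacity(self):
        # Hypothetical helper, not in the original class
        return self.seats * self.counted

    def __lt__(self, other):
        # "Less than" means "sorts earlier": larger capacity comes first
        return self.capacity() > other.capacity()

    def __str__(self):
        return "{0:20s} seats: {1:3d} count: {2:4d}".format(
            self.mode, self.seats, self.counted)

# Assumption: a few survey lines inlined in place of the "sob" data file
SAMPLE = ["# As I looked out I saw ...", "36 (4) cars",
          "3 (70) guided buses", "1 (2) tandem"]

passengers = []
for line in SAMPLE:
    item = Transport.factory(line)
    if item is not None:
        passengers.append(item)

passengers.sort()
for mode in passengers:
    print(mode)

# A second "application" on the same objects: total capacity seen
total = sum(item.capacity() for item in passengers)
print("total capacity:", total)
```

The new total needs a single line and never touches a subscript, so it cannot confuse seats with counts in the way a stray passengers[mode][0] could.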
You should really write data-driven code all the time. Take the object approach; it may take you a few more minutes initially during development, and it may result in code that's a bit longer. But that extra time spent coding early on will - 99 times in 100 - pay you huge dividends during the lifetime and maintenance of the code.
Data - [here].
Source (without objects) - [here].
Source (with objects) - [here].