Spike solution, refactoring into encapsulated object methods - good design practise
Archive - Originally posted on "The Horse's Mouth" - 2015-03-05 06:38:57 - Graham Ellis
Initial test code on new data sets - your research work - will most likely take place in smaller chunks of code while you work out just how the data flows and what you need from it (and you are remembering to read documentation supplied by the data originator, aren't you??).
But having done that initial work, with careful analysis of separator characters, field widths and counts, exceptiosn to normal data rules, and so forth you should take an early opportunity to encapsulate your newfound data expertise into classes, objects and methods that others can use without the heartache / of having to learn into that you had, and so that they can ride on your coattails if data formats are amended, requiring code changes, in the future.
On yesteday's Python programming course, we took a typical data set with some glitches and "funnies" in it, and undertook our research. And afterwards we refactored the logic so that we added a calling shell through which others can use the code without the need to understand individual elemental aspects of how the original data is formed.
The code's use (getting the names of various stars from a whole lot of records including names, heights and eights becomdes as staightforward as
for lyne in data.split('\n'):
johnny = person(lyne)
if johnny.isValid() :
print johnny,johnny.getBMI()
else:
pass
but if you want to see the details of how we dealt with different data formats you'll need to look at the complete source code [here]. The beauty of this approach is that the regular user need not take a look - the hard work's already been done and made avaiable.