JSON is the new marshal, pickle and cPickle / Python

Archive - Originally posted on "The Horse's Mouth" - 2015-02-22 12:10:23 - Graham Ellis

Converting objects into serial data, so that they can be stored in a file or passed over a network, and restoring them when read back, is a vital topic within any serious object oriented application. It's all very well working with an object on the heap (i.e. in memory while your program runs), but if you dump it out and then restore it, you need to do so in such a way that it can go to a different memory location, as chances are that the addresses it was saved from will be in other use when you restore. Hence the need for a serialised format, and JSON has become very much the standard - it's compact, character based rather than binary, quite flexible, and widely supported.
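As a minimal sketch of that round trip, the example below turns a small structure into JSON text and back again using Python's standard json module. The sample data and variable names are made up for illustration; they are not from the course material.

```python
import json

# Hypothetical sample data: nested dicts and lists of plain values
record = {"name": "example", "counts": [12, 7, 31], "open": True}

text = json.dumps(record)     # object -> JSON text (a str)
restored = json.loads(text)   # JSON text -> a brand new object

print(text)
print(restored == record)     # same values ...
print(restored is record)     # ... but a separate object in memory
```

The restored object holds the same values but lives at its own memory location, which is exactly why a serialised format is needed rather than a dump of raw addresses.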

In Python, you may still have reasons to use marshal, pickle and cPickle, or XML, but nine times out of ten you'll do better using JSON.

This morning I have been working with a 1 Gbyte data flow, and rather than repeat the analysis of my raw data (136 seconds elapsed time) to create the dicts and lists I want to go on and graph, I've chosen to do the extraction once and dump the JSON data out to a file. Subsequent runs of my program check for the file, and read it if present (well under a second!) - see source code here. It's going to save me an awful lot of time during the course this week (and will stop my delegates getting bored) as I'm using JSON as my cache. Note the "pretty print" option I have added too - allowing my JSON objects to be written raw (which would be the norm), or well formatted (which I've used to show you on our web site - data is [here]).
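The caching pattern described above might look something like the sketch below. This is not the program linked from the post: the file name, the function names and the stand-in analysis step are all assumptions for illustration, and it is written for Python 3.

```python
import json
import os

CACHE_FILE = "analysis_cache.json"   # hypothetical cache file name


def analyse_raw_data():
    """Stand-in for the slow analysis; returns plain dicts and lists."""
    return {"hits_by_hour": [3, 5, 8], "totals": {"pages": 16}}


def load_results(cache_file=CACHE_FILE, pretty=False):
    """Return the results, reading the JSON cache file if it exists."""
    if os.path.exists(cache_file):
        # Fast path: a previous run has already done the work
        with open(cache_file) as fh:
            return json.load(fh)

    # Slow path: do the analysis once, then save it for next time
    results = analyse_raw_data()
    with open(cache_file, "w") as fh:
        if pretty:
            # Indented and key-sorted, easier for people to read
            json.dump(results, fh, indent=4, sort_keys=True)
        else:
            # Raw, compact output, which would be the norm
            json.dump(results, fh)
    return results
```

Calling load_results() the first time runs the analysis and writes the file; later calls just read it back. Passing pretty=True only changes how the file is laid out, not the data that comes back when it is loaded.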