Graphing presentations in Python - huge data, numpy and matplotlib

Archive - Originally posted on "The Horse's Mouth" - 2015-02-28 18:43:27 - Graham Ellis

A picture paints a thousand words. Our server log files for February are 1.6 Gbytes in size, with 5.3 million individual requests. How busy are we at what times of day? I've been looking at this through matplotlib in Python - here's a wireframe of the month - days on the X axis, hour of the day on the Y axis, and height being the accesses per hour:
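For readers who want to try the same kind of plot, here's a minimal sketch of a wireframe with days on the X axis, hours on the Y axis and accesses per hour as the height. It is not the program linked from this post: the function names `hits_grid` and `draw_wireframe`, the `(day, hour)` dictionary keys and the output filename are all made up for illustration, and it assumes a reasonably recent matplotlib (3.2 or later) where `projection="3d"` works without an extra import.

```python
import numpy as np


def hits_grid(counts, days=28, hours=24):
    """Build an (hours, days) array from {(day, hour): hits}.

    Days are 1-based (1..days), hours are 0..23. The dictionary layout
    is an assumption for this sketch, not the format used in the post.
    """
    grid = np.zeros((hours, days))
    for (day, hour), hits in counts.items():
        grid[hour, day - 1] = hits
    return grid


def draw_wireframe(grid, filename="wireframe.png"):
    """Save a 3D wireframe of the grid to a (hypothetical) image file."""
    import matplotlib
    matplotlib.use("Agg")  # draw to a file, so no display is needed
    import matplotlib.pyplot as plt

    hours, days = grid.shape
    # meshgrid returns two (hours, days) arrays, matching the grid's shape
    day_axis, hour_axis = np.meshgrid(np.arange(1, days + 1), np.arange(hours))

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_wireframe(day_axis, hour_axis, grid)
    ax.set_xlabel("Day of month")
    ax.set_ylabel("Hour of day")
    ax.set_zlabel("Accesses per hour")
    fig.savefig(filename)
    plt.close(fig)
```

Calling `draw_wireframe(hits_grid({(1, 9): 120, (1, 10): 340}))` with a couple of invented counts is enough to check that the axes come out the right way round before feeding it real data.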



Rather than extract the data each time we draw a graph, I've written the program in two stages - a data extraction routine that writes out a .json object (source code [here]) and the graph drawing program [here]. The plotting program (also used to illustrate Python graphics and numpy on last week's course) produces four different displays - here's another:
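The two-stage idea can be sketched roughly as follows: read the big log once, count the requests per day and hour, and save those counts as JSON so that each graph only has to load a tiny file. This is an illustration rather than the extraction routine linked above. It assumes Apache-style timestamps such as `[28/Feb/2015:18:43:27 +0000]`, and the `"day-hour"` key format and the function names are hypothetical.

```python
import json
import re
from collections import Counter
from datetime import datetime

# Assumed timestamp layout, e.g. [28/Feb/2015:18:43:27 +0000]
STAMP = re.compile(r"\[(\d{2}/[A-Za-z]{3}/\d{4}:\d{2}:\d{2}:\d{2} [+-]\d{4})\]")


def count_by_day_and_hour(lines):
    """Return {"day-hour": hits} for each line with a parseable timestamp."""
    counts = Counter()
    for line in lines:
        match = STAMP.search(line)
        if match is None:
            continue  # skip anything that doesn't look like a request
        when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        counts[f"{when.day}-{when.hour}"] += 1
    return dict(counts)


def save_counts(counts, path):
    """Stage one ends here: write the summary out as a JSON object."""
    with open(path, "w") as out:
        json.dump(counts, out)


def load_counts(path):
    """Stage two starts here: the plotting program reads the summary back."""
    with open(path) as source:
        return json.load(source)
```

A file object iterates line by line, so `count_by_day_and_hour(open(logfile))` reads a large log without holding it all in memory. String keys are used because JSON object keys must be strings.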



Data's available if you wish to run the graph program - [here]. And if you want to learn about numpy and matplotlib, let me know before your Python course and we'll add an extra session on the penultimate evening of your public course, or build it into your private course. See [here].