
Tcl - Some examples of HOW TO handle data files and formats

Archive - Originally posted on "The Horse's Mouth" - 2011-03-04 09:27:42 - Graham Ellis

During the Tcl course I was running earlier this week, we covered a number of interesting topics, such as:
• How to clean up input lines into lists of fields (use split)
• How to reformat awkward fields (use regexp and regsub)
• How to combine files that vary in format from year to year (define the format in a list of fields for each year)
• How to trap incomplete records (use info vars, and / or catch)
• How to save data based on a non-numeric key (use arrays)
• How to sort an array (You can't - sort a list of the keys)
• How to sort a list of keys based on an array element (use the -command option to lsort)
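
As a rough illustration of how these pieces fit together, here is a minimal sketch. It is not the code from the course: the station codes, names, figures, the two per-year layouts and the proc name byJourneys are all invented for the example, and the data is held in lists rather than read from files. It assumes Tcl 8.4 or later.

```tcl
# Field order for each year's file (hypothetical layouts, invented for this sketch)
array set layout {
    2009 {code name journeys}
    2010 {name code journeys}
}

# Stand-ins for lines read from each year's file (made-up stations and figures)
set rawdata(2009) [list "AAA\tAlpha Road\t12,340" "BBB\tBeta Halt\t980" "CCC\tGamma Park\t40,100"]
set rawdata(2010) [list "Alpha Road\tAAA\t13,015" "Beta Halt\tBBB" "Gamma Park\tCCC\t41,900"]

set rejected {}
foreach year [lsort [array names layout]] {
    foreach line $rawdata($year) {
        # Clear the previous record, then name each field using that year's layout
        unset -nocomplain code name journeys
        foreach fieldname $layout($year) value [split $line \t] {
            if {$value ne ""} { set $fieldname $value }
        }

        # Reading an unset variable raises an error, which catch traps
        if {[catch {set raw $journeys}]} {
            lappend rejected "$year: $line"
            continue
        }

        # Strip everything except digits, so "12,340" becomes 12340
        regsub -all {[^0-9]} $raw "" count

        # Arrays are keyed by strings, so the station code works as a key
        if {![info exists total($code)]} { set total($code) 0 }
        incr total($code) $count
        set station($code) $name
    }
}

# Comparison proc for lsort -command: busiest station first
proc byJourneys {a b} {
    global total
    return [expr {$total($b) - $total($a)}]
}

# An array has no order of its own, so sort the list of its keys instead
set alphabetical [lsort [array names total]]
set ranked [lsort -command byJourneys [array names total]]

foreach code $ranked {
    puts [format "%-4s %-12s %8d" $code $station($code) $total($code)]
}
puts "Rejected: $rejected"
```

The second 2010 record has no journeys field, so it ends up in the rejected list instead of stopping the run. The comparison proc only has to return a negative, zero or positive integer, which is why a simple subtraction of the two array elements is enough.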

I had with me data files for passenger journeys booked to and from stations in Great Britain, year by year, from 2004 to 2010, and a further file with various extra data about each station. During the course I prototyped and demonstrated how, in Tcl, I would take this data through all the steps above to produce a combined file. After the course, I completed the example and added comments too, and it's now [here]. There are good solutions within the code to each of the questions above, and some other nice demos too, such as a use of uplevel to run a command in the variable scope of the calling proc rather than in the called proc's own scope.
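
For readers who have not met uplevel, here is a small sketch of the idea. It is not taken from the linked example: the procs need and report are hypothetical names made up for this illustration.

```tcl
# Hypothetical helper: give a variable in the CALLER's scope a default value
# if the caller has not already set it
proc need {varname default} {
    # uplevel 1 runs the command one level up the call stack,
    # so info exists and set act on the caller's variables, not on ours
    if {![uplevel 1 [list info exists $varname]]} {
        uplevel 1 [list set $varname $default]
    }
}

proc report {} {
    set name "Alpha Road"
    need name "unknown"   ;# already set in this proc, so left alone
    need journeys 0       ;# not set yet, so created here with the default
    return "$name: $journeys"
}

puts [report]
```

Building the command with list before handing it to uplevel keeps values that contain spaces, such as a station name, in one piece when the command is evaluated in the caller's scope.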

You'll see from the code that Tcl remains an excellent language for applications of this type, even though it's very different indeed to languages like PHP and Python, which I code in more frequently. The same exercise could, indeed, have been done in Ruby or Perl, PHP, Lua or Python ... but it was a pleasure to do it in Tcl for a change. See [here] for details of our Tcl courses.

If you have come to this page through an interest in rail, I've uploaded all the data files, including the output, to our rail resource at http://www.wellho.net/resources/Z501.html. The output file (a correlation that researchers may find useful) is at http://www.wellho.net/resources/ex.php4?item=z501/railstats.txt.