Handling Binary data in Tcl (with a note on C)

Archive - Originally posted on "The Horse's Mouth" - 2007-09-09 06:14:06 - Graham Ellis

In Tcl, all values are held as strings, and many of the commands will split strings at newline or space characters by default. However, there are a few commands that do NOT make that distinction, and since a Tcl string (unlike a C string) may contain any bit pattern at all, they provide a very useful tool for binary data handling. Here they are:

read. Read in a certain number of bytes from a file handle (up to a given maximum or to the end of file, whichever comes first, irrespective of the characters read).
Example: set header [read $stuff 10]
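As a rough sketch of read on binary data, the snippet below writes a small scratch file and reads it back in two pieces. The file contents and the variable names (out, in, path, rest) are made up for illustration, and file tempfile assumes Tcl 8.6 or later.

```tcl
# Create a scratch file holding 14 made-up bytes (not a real image).
set out [file tempfile path]
fconfigure $out -translation binary
puts -nonewline $out [binary format a6ssa4 GIF89a 120 80 "\r\n\x00\x1a"]
close $out

# Read it back: binary translation stops Tcl altering \r, \n or \x1a.
set in [open $path r]
fconfigure $in -translation binary
set header [read $in 10]   ;# at most 10 bytes
set rest   [read $in]      ;# no count given: everything up to end of file
close $in
file delete $path

puts "[string length $header] bytes in header, [string length $rest] bytes after it"
# prints: 10 bytes in header, 4 bytes after it
```

If the file is shorter than the count requested, read simply returns what is there, so it is worth checking string length before scanning the result.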

binary scan. Divide a string into a series of separate variables, using a format string that's given as a parameter.
Example: binary scan $header a3a3ss type version ecs why
Takes 3 chars from $header into the variable called type, the next three into the variable called version, the next two bytes (as a 16-bit integer) into ecs, and the next two as another 16-bit integer into why. Formats can include 16- and 32-bit integers, big- and little-endian, etc. (a lowercase s, as used here, is a signed little-endian 16-bit integer).
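To illustrate how the format letters change the result, here is a small sketch that scans the same four bytes in several ways. The byte values and variable names are chosen purely for the demonstration.

```tcl
# Four bytes: 0x01 0x02 0xff 0xfe
set bytes [binary format H* 0102fffe]

binary scan $bytes ss a b   ;# two 16-bit little-endian integers (signed)
binary scan $bytes SS c d   ;# the same bytes read as big-endian
binary scan $bytes I e      ;# one 32-bit big-endian integer

# Scanned integers are signed; mask to get the unsigned 16-bit value.
set bu [expr {$b & 0xffff}]

puts "little-endian: $a $b (unsigned $bu)"   ;# 513 -257 (unsigned 65279)
puts "big-endian:    $c $d"                  ;# 258 -2
puts "32-bit:        $e"                     ;# 16973822
```

The masking step matters for values such as image sizes above 32767, which would otherwise come out negative.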

binary format. Take a series of values and pack them into a single string. This is the opposite of binary scan in many ways, but there are a few differences: the inputs are given as values with $s (as you would expect) rather than as variable names, and the output is returned.
Example: binary format ss $hilda $stan
Returns a 4 character string having packed in the two decimal numbers in the hilda and stan variables (in that order) as 16-bit integers.
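A short sketch of the round trip may help here: pack two numbers with binary format, look at the bytes, and unpack them again with binary scan. The values 300 and 7 and the names packed, hex, h and s are arbitrary choices for this example.

```tcl
set hilda 300
set stan 7

# Pack both values as 16-bit little-endian integers.
set packed [binary format ss $hilda $stan]
puts [string length $packed]   ;# 4

# Show the packed bytes as hex: 300 is 0x012c, stored low byte first.
binary scan $packed H* hex
puts $hex                      ;# 2c010700

# Unpack again; note that binary scan takes variable NAMES, not values.
binary scan $packed ss h s
puts "$h $s"                   ;# 300 7
```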

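puts. Puts will output any string; remember to use the -nonewline option if you don't want an extra newline added. Then remember that you might need flush (to force buffered output to be written) or fconfigure (for example with -translation binary, so that end-of-line translation doesn't alter your data).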
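As a sketch of writing binary data with puts, the snippet below sends six bytes to a scratch file and checks that exactly six arrive. The record layout and the names out, path, record and size are invented for the example, and file tempfile assumes Tcl 8.6 or later.

```tcl
set out [file tempfile path]

# Without this, a 0x0a byte could be rewritten as an end-of-line
# sequence on some platforms, and the channel encoding would apply.
fconfigure $out -translation binary

# 4 characters plus the number 10 as a 16-bit integer (bytes 0a 00).
set record [binary format a4s DATA 10]
puts -nonewline $out $record   ;# -nonewline: add nothing after the data
flush $out                     ;# force buffered bytes out now
close $out

set size [file size $path]
file delete $path
puts "wrote $size bytes"       ;# wrote 6 bytes
```

The flush is redundant just before a close, but it is the call to reach for when the channel stays open, such as a socket or a pipe.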

Putting it all together - code that reads the start of a .gif file and tells the user how tall and wide a clickable image in a web page would be, assuming a default 1 pixel border:

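# Open in binary mode so no newline or encoding translation is applied
set stuff [open tongue.gif r]
fconfigure $stuff -translation binary
set header [read $stuff 10]
close $stuff
# 3-char signature, 3-char version, then width and height
# (each a 16-bit little-endian integer)
binary scan $header a3a3ss type version ecs why
# Allow for a 1 pixel border on each side
incr ecs 2
incr why 2
puts "It is a $type file version $version size $ecs by $why"
puts -nonewline "(Size allows 2 extra pixels for a clickable "
puts "border in this demo)"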



Note on Strings in C. In C, you may store any bit pattern that you like in an array of chars. However, if you use the built-in string handlers to manipulate that array of chars, you'll find that they all assume a null character (\0 or 0x00) as the end of string and will truncate at that point, or overrun if your char array doesn't contain a null.
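For contrast with the C behaviour, here is a minimal Tcl sketch showing that a null byte in the middle of a Tcl string does not end it. The byte values and the names data and hex are arbitrary.

```tcl
# Six bytes: "ab", a null byte, "cd", then 0xff.
set data [binary format a2ca2c ab 0 cd 255]

puts [string length $data]   ;# 6, the null does not truncate the string

# Dump the bytes as hex to confirm that everything is still there.
binary scan $data H* hex
puts $hex                    ;# 6162006364ff
```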