Splitting a record into individual data values in C

Archive - Originally posted on "The Horse's Mouth" - 2012-05-04 08:26:06 - Graham Ellis

Many data files consist of a number of records, each of which is divided into a number of fields. How do you handle such data records in a C program? You could use very low level string handling functions, but it's probably far better to step up one level and use something like the string tokeniser, strtok. There's an example [here] on our web site, written as a demonstration during a C Programming Course.

When you're going to split up (tokenise) a string in C, using strtok, you pass in the character string to be split (literally the address of the first element of the null terminated character array) to the first call:
  char *four_code = strtok(demo,"\t");
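The second parameter is a string of separator characters, any one of which ends the first token (the resulting field). Note that strtok overwrites the separator it finds with a null character, so the string you pass in must be a writable array and not a string literal.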

Subsequent calls, using a NULL first parameter, will return subsequent tokens from the same incoming string, and you also have the opportunity to change the delimiter / separator at the end of each token if you wish. Here are the next lines in my example; my data (like most data files) uses the same delimiter throughout:
  char *tlc = strtok(NULL,"\t");
  char *postcode = strtok(NULL,"\t");


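strtok returns string pointers, so if you want to treat the values as numbers you need to convert them with atof or atoi. Neither of those reports errors, and strtok returns NULL once the tokens run out, so the code that follows relies on every line really having all six fields. Again from our sample program, where the data line included six integer values (passenger counts):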
  for (k=0; k<6; k++) {
    passengers[k] = atoi(strtok(NULL,"\t"));
  }


Splitting lines of data is, of course, only one element of the task of reading in data from a file [more about file reading in C], allocating memory [more about dynamic memory allocation in C] and storing it appropriately [more about data structures in C]. There's also an example showing each of these elements in a single program [here].