Sparse and Greedy matching - Tcl 8.4

Archive - Originally posted on "The Horse's Mouth" - 2007-10-27 07:09:34 - Graham Ellis

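Problem Analysis

Once you have used a sparse or greedy count in a regular expression, all the following counts in that same expression will also be sparse or greedy, whichever you actually wrote - this is a well documented quirk of the regular expression engine in Tcl 8.4!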

Reminder ...
.* is a greedy match - any character and AS MANY AS POSSIBLE
.*? is a sparse match - any character but AS FEW AS POSSIBLE
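
To see the two behaviours in isolation, here is a minimal sketch with just one count in each expression, so the issue described above cannot interfere. The sample string and the variable names greedy and sparse are made up for illustration and are not part of the original post.

```tcl
# Hypothetical sample string, chosen only for this illustration
set sample "<b>bold</b>"

# Greedy: .* grabs as much as it can, so the match runs to the LAST >
regexp {<(.*)>} $sample all greedy

# Sparse: .*? grabs as little as it can, so the match stops at the FIRST >
regexp {<(.*?)>} $sample all sparse

puts "greedy: $greedy"   ;# greedy: b>bold</b
puts "sparse: $sparse"   ;# sparse: b
```
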

set demo abc@def@ghi@jkl
#
# Correctly working greedy matches
regexp (.*)@(.*) $demo all first second
puts "$first and $second"
#
# Correctly working sparse matches
regexp (.*?)@(.*?) $demo all first second
puts "$first and $second"
#
# Both matches will be SPARSE here
regexp (.*?)@(.*) $demo all first second
puts "$first and $second"
#
# Both matches will be GREEDY here
regexp (.*)@(.*?) $demo all first second
puts "$first and $second"


Running that, we get:

earth-wind-and-fire:~/oct07/camb grahamellis$ tclsh spag
abc@def@ghi and jkl
abc and
abc and
abc@def@ghi and jkl
earth-wind-and-fire:~/oct07/camb grahamellis$


Suggested Work-around

You can emulate a sparse match by excluding the delimiter character from the group that you would be sparse matching, and using a greedy match.

Thus
regexp (.*?)@(.*) $demo all first second
can be rewritten as
regexp {([^@]*)@(.*)} $demo all first second
if you really needed sparse followed by greedy!

result:
abc and def@ghi@jkl
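
If you do not need a regular expression at all, another option (not from the original post, just a sketch under that assumption) is to find the first delimiter with string first and cut the string either side of it with string range. The variable name pos is invented for this example.

```tcl
set demo abc@def@ghi@jkl

# Index of the first @, or -1 if there is none
set pos [string first @ $demo]

if {$pos >= 0} {
    # Everything before the first @ (the "sparse" part)
    set first  [string range $demo 0 [expr {$pos - 1}]]
    # Everything after the first @ (the "greedy" part)
    set second [string range $demo [expr {$pos + 1}] end]
    puts "$first and $second"   ;# abc and def@ghi@jkl
}
```

This avoids mixing sparse and greedy counts altogether, at the cost of only handling a fixed, single-character delimiter.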