Global Regular Expression matching in Ruby (using scan)
Archive - Originally posted on "The Horse's Mouth" - 2015-01-08 07:13:16 - Graham Ellis
Regular expression engines start at the left of the input string they are matching against, and counts (quantifiers such as * and +) are greedy by default. The result is the "leftmost, longest" match, which is what users want in most cases.
However, sometimes you may want to work through all (non-overlapping) matches in a string - for example to pick up a series of email addresses or URLs from a line or block of text. In Ruby, you can achieve that through the scan method on a string. A single match:
if myStringRecord =~ /\s*(\S{1,})@(\S+)\s*/
or changing that to a multiple or global match:
myStringRecord.scan(/\s*(\S{1,})@(\S+)\s*/) do |tom,dick|
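which is a loop that runs once per match, populating the variables tom and dick with the two capture groups (the text before and after the @) each time round.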
Complete example [here]. Topic covered on our Ruby Courses.
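As a minimal sketch of the difference between the two forms above, the following uses the same pattern on a made-up line of text. The sample string, the addresses in it and the variable names (text, first, pairs, all, user, host) are assumptions for illustration only, not taken from the original example.

```ruby
# Hypothetical sample input; the addresses are invented for illustration.
text = "Contact alice@example.com or bob@example.org for details"

# Single match: =~ finds only the first (leftmost) match and sets
# $1 and $2 to its capture groups.
first = nil
if text =~ /\s*(\S{1,})@(\S+)\s*/
  first = [$1, $2]
end

# Global match: scan calls the block once per non-overlapping match,
# passing that match's capture groups as the block parameters.
pairs = []
text.scan(/\s*(\S{1,})@(\S+)\s*/) do |user, host|
  pairs << [user, host]
end

# Without a block, scan returns the captures as an array of arrays.
all = text.scan(/\s*(\S{1,})@(\S+)\s*/)

p first
p pairs
```

Called without a block, scan is handy when you just want the list of matches; the block form suits processing each match as it is found.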
Should you wish to match the shortest rather than the longest alternative, add a ? character after each count you want to change from greedy to non-greedy.
"banana" =~ /a(.*)a/
puts $1
"banana" =~ /a(.*?)a/
puts $1
outputs
nan
n
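Non-greedy counts combine naturally with scan when you want every short match rather than one long one. The sketch below assumes a made-up string containing two quoted words; the string and the variable names (line, greedy, lazy_all) are hypothetical and chosen only to show the effect.

```ruby
# Hypothetical input with two double-quoted words.
line = 'say "hi" and "bye"'

# Greedy: .* runs from the first quote to the LAST quote on the line.
# String#[] with a regexp and a group number returns that capture.
greedy = line[/"(.*)"/, 1]

# Non-greedy with scan: .*? stops at the nearest closing quote, and
# scan then carries on to find the next non-overlapping match.
lazy_all = line.scan(/"(.*?)"/)

puts greedy
p lazy_all
```

Here the greedy form captures everything between the outermost quotes, while the non-greedy form inside scan picks up each quoted word separately.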