Processing data line by line - iterator in Ruby with yield

Archive - Originally posted on "The Horse's Mouth" - 2016-05-19 14:27:02 - Graham Ellis

If you have a big file / flow of data in Ruby, you probably don't want to read it all into an array before you do any processing. Not only would that mean you couldn't produce any output until you have read all the data, but it would also make for a big process - probably inefficient, possibly doing a lot of swapping, and perhaps even running out of memory.

Ruby's yield keyword allows you to suspend the operation of a method and pass control to a block defined in the calling code, which may itself call another method. When the block finishes, the method carries on from where it left off. Using this technique, you can in effect write interleaved code (the methods take turns rather than truly running in parallel), where one method is the equivalent of a Python generator, and another method or methods perform the steps in the processing of each record of data released (one by one) by that generator.
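As a minimal sketch of the idea (not the original sample code), the example below has one method play the generator and another do the per-record processing. The method names each_record and summarise, the comma-separated record layout, and the use of StringIO in place of a real file are all assumptions made for illustration.

```ruby
require "stringio"

# Hypothetical generator-style method: reads one line at a time from any
# IO-like object and hands each non-blank record to the caller's block.
def each_record(io)
  io.each_line do |line|
    record = line.chomp
    next if record.empty?   # skip blank lines
    yield record            # suspend here and run the caller's block
  end
end

# Hypothetical processing step, called from inside the block.
# Assumes each record looks like "name,count".
def summarise(record)
  name, hits = record.split(",")
  "#{name}: #{hits.to_i * 2}"
end

# StringIO stands in for a large file or data flow (an assumption for the demo).
data = StringIO.new("alpha,1\n\nbeta,2\ngamma,3\n")

results = []
each_record(data) do |record|
  results << summarise(record)   # only one record is being handled at a time
end

puts results
```

Only the current line is being worked on at any moment, so the input is never loaded into an array in one go (here the results are collected in an array just so they can be printed at the end). For a real file, passing an opened File object to each_record should behave the same way, and Ruby's built-in File.foreach(path) follows the same yield-per-line pattern.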

Sample code [here].