
Efficient calls to subs in Perl - avoid duplication, gain speed

Archive - Originally posted on "The Horse's Mouth" - 2009-03-07 05:41:45 - Graham Ellis

When you call a sub in Perl (a subroutine - a named block of code), the parameters you pass in arrive in the list called @_. It's easy and straightforward ... but as the length of your list increases, it starts to become less than efficient.
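
As a minimal sketch of that mechanism (the sub name count_args and the sample data are made up for illustration), the following shows how a list named in the call is flattened, so that each of its elements becomes a separate entry in @_:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many parameters arrived in @_
sub count_args {
  return scalar(@_);
  }

my @lines = ("horse\n", "cart\n", "stable\n");

# One string plus a three-element list arrive as four separate parameters
my $how_many = count_args("horse", @lines);
print "Received $how_many parameters\n";
```

The sub has no way to tell where the string ends and the list begins, except by position. That is why a parameter such as the search pattern is usually put first.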

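Say - for example - I had a 35 Mbyte log file (and I really did yesterday, when I was looking at our web server log) ... then passing it across to a sub would put every one of its lines into @_, and when I copied them into a named list in the sub that would be a second full copy of the data - 70 Mbytes of memory swallowed up before any work has been done. (The elements of @_ are themselves aliases to the caller's values rather than copies, so it's the assignment to the named list that does the damage.) Here's how that code might look: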

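use strict;
use warnings;

sub mygrep {
  my ($look4, @within) = @_;   # copies every line into @within
  my @send_back = reverse(grep(/$look4/i, @within));
  return @send_back;
  }
open(my $fh, "<", "shortie") or die "Cannot open shortie: $!";
my @stuff = <$fh>;
close $fh;
my @interesting = mygrep("horse", @stuff);
print @interesting;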
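
To see that the sub really is working on a duplicate, here is a small sketch (the sub name shout_copy and the two-line list are invented for the demonstration). It changes the named list inside the sub, then checks that the caller's list has not been touched:

```perl
use strict;
use warnings;

# Hypothetical sub: takes its own copy of the list, then changes the copy
sub shout_copy {
  my ($label, @within) = @_;      # every element is duplicated here
  $_ = uc($_) for @within;        # alters the copy only
  return scalar(@within);
  }

my @stuff = ("horse\n", "cart\n");
my $count = shout_copy("demo", @stuff);

# The caller's list is unchanged, which shows the sub worked on a duplicate
print "Sub saw $count lines; caller still has: @stuff";
```

With two short lines the duplicate costs nothing worth worrying about, but the same assignment on a multi-megabyte list doubles the memory in use.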


But if you pass a REFERENCE to the list across to your sub, you can save yourself all the duplication - change @stuff to \@stuff in the call, collect the parameter in $within rather than @within inside the sub, then dereference it as @$within rather than using @within when you search through it. The code does not look much different:

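use strict;
use warnings;

sub mygrep {
  my ($look4, $within) = @_;   # $within is a reference - no list copy
  my @send_back = reverse(grep(/$look4/i, @$within));
  return @send_back;
  }
open(my $fh, "<", "shortie") or die "Cannot open shortie: $!";
my @stuff = <$fh>;
close $fh;
my @interesting = mygrep("horse", \@stuff);
print @interesting;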


but the performance certainly does change dramatically!
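
If you want to measure the difference for yourself, here is a rough sketch using the Benchmark module that ships with Perl. The names mygrep_copy and mygrep_ref are just the two versions above renamed so that they can live in the same file, and the generated data is a made-up stand-in for a real log file. The timings you get will depend on your machine and on the size of your data:

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# The two versions of mygrep from above, renamed so they can coexist
sub mygrep_copy {
  my ($look4, @within) = @_;      # copies the whole list
  return reverse(grep(/$look4/i, @within));
  }

sub mygrep_ref {
  my ($look4, $within) = @_;      # collects a single reference
  return reverse(grep(/$look4/i, @$within));
  }

# Made-up test data standing in for a large log file
my @stuff = map { $_ % 1000 ? "GET /page$_.html\n" : "GET /horse$_.html\n" } 1 .. 200_000;

# Check that both versions give the same answer before timing them
my @by_copy = mygrep_copy("horse", @stuff);
my @by_ref  = mygrep_ref("horse", \@stuff);
print "Both versions found ", scalar(@by_ref), " matching lines\n";

# Run each version 20 times and report the time taken
timethese(20, {
  copy      => sub { my @found = mygrep_copy("horse", @stuff) },
  reference => sub { my @found = mygrep_ref("horse", \@stuff) },
  });
```

Benchmark only reports time taken. To see the memory saving you would need to watch the process from outside with your operating system's own tools.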

If you're a newcomer to Perl you may find this sort of thing perplexing ... but help is to hand! Subjects such as this are covered on our Perl Programming course. And if you're already somewhat into Perl, but want to go a little further, we run a Perl for larger projects course which covers handling of large data sets, object orientation, databases and more.




Copying the list - lots of duplication
Passing a reference - only the one list