
Finding elements common to many lists / arrays

Archive - Originally posted on "The Horse's Mouth" - 2010-11-26 22:53:48 - Graham Ellis

"What do these lists have in common?" That's a common question - but what does the person asking it actually mean? Is (s)he looking for items that occur in all of the lists, but perhaps at different positions? For items that occur in two or more of the lists? For items that occur at the same position in several or all of the lists? Or just to see which lists are the same length?

Questions about commonality are quite frequent in my email, and aren't always clearly thought through. And there's often a hope (which has to be dashed) from the writer that a one-liner using something like a single grep function call will do the job. Alas, there's rather more to it than that. Even in the case of two incoming lists, each member of the first list needs to be compared to each member of the second list, and as the number of lists increases it gets all the more complex.
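To make the two-list case concrete, here is a minimal sketch of that member-by-member comparison. The variable names and the two short lists are made up for illustration; this is not the code linked from this post.

```perl
use strict;
use warnings;

# Illustrative data only - these two lists are assumptions for this sketch.
my @left  = qw(cat dog hamster);
my @right = qw(mouse cat dog);

# Compare every member of the first list with every member of the second.
my @common;
for my $l (@left) {
    for my $r (@right) {
        if ($l eq $r) {
            push @common, $l;
            last;    # stop at the first match for this member of @left
        }
    }
}

print "In both lists - @common\n";
```

That is already a nested loop for just two lists, and it says nothing about repeats within a list, which is why a counting approach scales better once a third or fourth list turns up.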

I've written an answer in Perl - source code [here] on our web site - that demonstrates a possible approach.

We use one of Perl's hashes to count the number of lists in which each incoming value occurs. And we use a second hash while we're doing so in order to avoid double-counting items which occur several times over in the same list. When the counting has been completed, we can identify all items which occur in all the lists (they have a count equal to the number of lists), items which are unique to a single list (they have a count equal to one), and so on.

Starting off with lists of the pets that we might have at three houses in our street:

  @first = qw(cat dog hamster tortoise snake lizard);
  @second = qw(mouse cat dog cow horse shark);
  @third = qw(cow sheep goat dog snake cat rabbit rabbit rabbit);


We can find what's unique ... and what everyone has!

  In all lists - cat dog
  In several lists - snake cat cow dog
  In just one list - tortoise lizard horse shark rabbit hamster sheep mouse goat
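For readers who want to try this without following the link, the two-hash counting described above can be sketched roughly as follows. This is my own minimal reconstruction, not necessarily the linked source code; the names %count, %seen and @lists are assumptions, and the results are sorted here, so the order within each line may differ from the output shown above.

```perl
use strict;
use warnings;

my @first  = qw(cat dog hamster tortoise snake lizard);
my @second = qw(mouse cat dog cow horse shark);
my @third  = qw(cow sheep goat dog snake cat rabbit rabbit rabbit);

my @lists = (\@first, \@second, \@third);

my %count;    # item => number of lists that contain it
for my $list (@lists) {
    my %seen;    # items already counted for this particular list
    for my $item (@$list) {
        next if $seen{$item}++;    # skip repeats within the same list
        $count{$item}++;
    }
}

my $n     = @lists;              # number of lists
my @items = sort keys %count;    # sorted so the output order is stable

my @in_all     = grep { $count{$_} == $n } @items;
my @in_several = grep { $count{$_} > 1 }   @items;
my @in_one     = grep { $count{$_} == 1 }  @items;

print "In all lists - @in_all\n";
print "In several lists - @in_several\n";
print "In just one list - @in_one\n";
```

Note that rabbit lands in the "just one list" group even though it appears three times in @third: the per-list %seen hash makes sure it is counted once for that house.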


Next Perl courses - start 13th December (beginners) and 20th December (advanced) - see [here] for an updated description and schedule if you're reading this in the archive.