Python sets and frozensets - what are they?

Archive - Originally posted on "The Horse's Mouth" - 2011-10-20 07:56:54 - Graham Ellis

A Python set is like a dictionary that holds only keys, with no value stored against each one.

Quite often, I'll have a long list of names / places / skills / words and I want to make up a unique list - to produce a pulldown menu, for example, in which each occurs just once.

I could do that with a dictionary - either storing the value "1" into each pair's value as I add pairs to the dictionary, or by keeping an arbitrary count of the number of times each name / place / skill / word appears, in the full knowledge that I won't actually be using the count. Example [here].
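As a rough sketch of that dictionary approach (the word list and the names words, seen and unique_words are made up for illustration, and print is written with parentheses so that it also runs under Python 3):

```python
# Hypothetical sample data, not taken from the original example
words = ["Bath", "London", "Bath", "Bristol", "London", "Bath"]

seen = {}
for word in words:
    # Keep a count that will never be used; only the keys matter
    seen[word] = seen.get(word, 0) + 1

# Iterating over a dictionary yields its keys, each exactly once
unique_words = sorted(seen)
print(unique_words)   # ['Bath', 'Bristol', 'London']
```

The counts are dead weight here, which is the motivation for using a set instead.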

A much better solution is to use a set. A set is an unordered collection of keys, held without any values. If I use the add method on a set, a new element will be added if the key is new, but nothing changes if the key is already in the set.
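A minimal sketch of that behaviour of add, using a made-up set called places:

```python
places = set()

places.add("London")
places.add("Reading")
places.add("London")   # already present, so the set is left unchanged

print(len(places))     # 2
```

No error is raised for the repeated key; the second "London" is simply ignored.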

Here's an example of the use of a set - to get me a unique list of places served in a list of public transport routes:

  records = """London Rugby Coventry Birmingham Wolverhampton
  Derby Birmingham Cheltenham Gloucester Bristol Taunton
  London Reading Swindon Stroud Gloucester Cheltenham
  London Reading Swindon Chippenham Bath Bristol Weston-super-Mare"""
 
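  # Start with an empty set; {} would create a dictionary instead
  locations = set()

  for record in records.splitlines():   # one route per line
    for place in record.split():        # places are separated by spaces
      locations.add(place)              # repeated places are ignored

  print(locations)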


Running that under Python 2, I get:

  wizard:oct11 graham$ python pyset
  set(['Cheltenham', 'Wolverhampton', 'Rugby', 'Coventry', 'Stroud',
  'Weston-super-Mare', 'Chippenham', 'Taunton', 'Bath', 'London',
  'Bristol', 'Derby', 'Birmingham', 'Reading', 'Swindon', 'Gloucester'])
  wizard:oct11 graham$


So that's a unique list of all the places, but in no particular order. Full source [here].
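For something like a pulldown menu, a predictable order is usually wanted. One possible approach, sketched here with a shortened, made-up sample of the route data, is to pass the set through the built-in sorted() function, which returns a new list. The sketch also builds the set in one step with set(), rather than with a loop and add; the name menu is hypothetical.

```python
# Shortened, hypothetical sample of the route data
records = """London Reading Swindon
London Rugby Coventry"""

# split() with no argument splits on any whitespace, newlines included,
# and set() keeps just one copy of each place
locations = set(records.split())

# sorted() accepts any iterable and returns a new list in order
menu = sorted(locations)
print(menu)   # ['Coventry', 'London', 'Reading', 'Rugby', 'Swindon']
```

The set itself stays unordered; only the list returned by sorted() has a defined order.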

My set is iterable - in other words I can use it as the incoming "list" to a for loop. I can also use the keyword in to check whether an element exists, and I can use the len() function to find how many elements there are in my set.
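Those three operations might look like this on a small, made-up set (print is written with parentheses so that the sketch also runs under Python 3):

```python
locations = set(["Bath", "Bristol", "London"])

# Iteration visits every element once, in no guaranteed order
for place in locations:
    print(place)

print("Bath" in locations)    # True
print("Derby" in locations)   # False
print(len(locations))         # 3
```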

A frozenset is an immutable set - it's created with the frozenset() function call, and once it's created its contents cannot be altered. Operations which are non-intrusive (i.e. read only) work on a frozenset in the same way that they work on sets, but anything that modifies a set, such as the add method, can't be used on a frozenset.
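A short sketch of the difference, with a made-up frozenset called stations. The read-only operations behave as they do on a set, while the attempt to add fails because a frozenset has no add method at all:

```python
stations = frozenset(["London", "Reading", "Swindon"])

# Read-only operations work just as they do on an ordinary set
print("Reading" in stations)   # True
print(len(stations))           # 3

# Combining frozensets builds a new one and leaves the original alone
extended = stations | frozenset(["Bath"])
print(len(extended))           # 4

# Modifying in place is not possible: there is no add method
try:
    stations.add("Bath")
    outcome = "modified"
except AttributeError as err:
    outcome = "rejected"
    print("cannot modify a frozenset: %s" % err)
```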




A Python set is similar to a Java HashSet - example [here]. And to complete that Java comparison, a Python dictionary is similar to a Java HashMap or Hashtable.