Python - two different splits
Archive - Originally posted on "The Horse's Mouth" - 2007-03-15 17:53:57 - Graham Ellis
In Python, there are two different split methods you can use to break up a string into a number of substrings, based on a particular separator. If you know exactly what character(s) your separator will be - e.g. exactly one space - then you can use the split method of the string class. But if your separator is less well defined - e.g. if it's one or more whitespace characters - then you'll want to use the split in the re module.
import re
space = re.compile(r'\s+')
data = "Perl Python   PHP Prolog\tPascal"
# Split on exactly one space character
langs = data.split(" ")
print(langs)
# Split on one or more whitespace characters
langs = space.split(data)
print(langs)
How does that run?
['Perl', 'Python', '', '', 'PHP', 'Prolog\tPascal']
['Perl', 'Python', 'PHP', 'Prolog', 'Pascal']
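The first split looks fairly poor - we've split at single space characters BUT the input string had multiple spaces in one place (each extra space gives us an empty string in the result), and a tab in another (a tab isn't a space, so Prolog and Pascal stay joined as one element).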
The second split - on a regular expression "one or more whitespace characters" - worked much better, and is typically what you might use for data that was user entered or user edited.
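A related point that's worth a quick sketch: calling the string split method with no argument at all also splits on runs of whitespace, and it behaves a little differently from the regular expression when the data has leading or trailing whitespace. The padded string below is made up for illustration - it isn't part of the original example.

```python
import re

data = "Perl Python   PHP Prolog\tPascal"

# With no argument, str.split() splits on runs of any whitespace
no_arg = data.split()
print(no_arg)

# Illustrative input with leading spaces and a trailing tab
padded = "  Perl Python\t"

# str.split() with no argument ignores leading / trailing whitespace
plain = padded.split()
print(plain)

# The regular expression split returns empty strings at each end
regex = re.split(r'\s+', padded)
print(regex)

# Stripping first gives the same result as the no-argument split
stripped = re.split(r'\s+', padded.strip())
print(stripped)
```

So for simple whitespace-separated data the no-argument split may be all you need, with the regular expression version kept for separators that need a real pattern.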
There's a new example - [source here] - added, January 2011. Subject covered on our Python Courses.