Regular expressions made easy - building from components
Archive - Originally posted on "The Horse's Mouth" - 2007-08-16 20:25:39 - Graham Ellis
There seems to be a certain macho desire in many programmers' minds to write a single complicated regular expression to match against an input line, ignoring the structured approach that everyone accepts quite cheerfully in almost every other case. Have a look at this Python line:
wholeline = r"\d\d-...-\d\d\d\d\s+(\d\d):(\d\d):(\d\d.\d\d),\s+(-?\d+\.\d+),\s+(-?\d+\.\d+),(-?\d+\.\d+),\s+(-?\d+\.\d+),(-?\d+\.\d+),\s+(-?\d+\.\d+)"
Impressive, isn't it?
Yes.
Easy to follow, isn't it?
No!
Much better to build it up from a series of components:
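date = r"\d\d-...-\d\d\d\d"            # dd-mmm-yyyy; ... is any three characters
time = r"(\d\d):(\d\d):(\d\d.\d\d)"    # hours, minutes, seconds as three capture groups
whitespace = r"\s+"                    # one or more whitespace characters
floater = r"(-?\d+\.\d+)"              # decimal number with optional minus sign, captured
wholeline = date + whitespace + time + "," + whitespace + \
    floater + "," + whitespace + floater + "," + floater + \
    "," + whitespace + floater + "," + floater + "," + \
    whitespace + floater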
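As a quick check that the component version really does produce the same pattern as the one-liner, here is a minimal sketch. The sample log line is invented to fit the pattern (the real data is not shown in this article), so treat it as an assumption rather than actual course data.

```python
import re

# Components, as in the article.
date = r"\d\d-...-\d\d\d\d"
time = r"(\d\d):(\d\d):(\d\d.\d\d)"
whitespace = r"\s+"
floater = r"(-?\d+\.\d+)"

wholeline = date + whitespace + time + "," + whitespace + \
    floater + "," + whitespace + floater + "," + floater + \
    "," + whitespace + floater + "," + floater + "," + \
    whitespace + floater

# The single-string version, for comparison.
oneliner = r"\d\d-...-\d\d\d\d\s+(\d\d):(\d\d):(\d\d.\d\d),\s+(-?\d+\.\d+),\s+(-?\d+\.\d+),(-?\d+\.\d+),\s+(-?\d+\.\d+),(-?\d+\.\d+),\s+(-?\d+\.\d+)"

# Both approaches build exactly the same pattern string.
same_pattern = (wholeline == oneliner)

# Hypothetical log line, made up to fit the pattern.
sample = "16-Aug-2007 20:25:39.05, 1.25, -0.50,3.75, 10.00,11.50, 0.01"

match = re.compile(wholeline).match(sample)

# Groups 1 to 3 are the time fields; the remaining six are the readings.
hours, minutes, seconds = match.group(1, 2, 3)
readings = [float(value) for value in match.groups()[3:]]
```

The pattern that reaches the regular expression engine is identical either way, so the readability comes at no cost when matching.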
These examples are from the Python course I have just concluded (the full example is here), in which a log file was analysed and a short report generated to highlight any change of more than 1% in any of the data columns from one line of the data to the next.
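The kind of report described above could be sketched roughly as follows. This is not the full example from the course; the function name `big_changes`, the sample lines, and the decision to measure each change relative to the previous line's value (skipping a column when that value is zero) are all assumptions made for illustration.

```python
import re

# Pattern built from components, as in the article.
date = r"\d\d-...-\d\d\d\d"
time = r"(\d\d):(\d\d):(\d\d.\d\d)"
whitespace = r"\s+"
floater = r"(-?\d+\.\d+)"
wholeline = date + whitespace + time + "," + whitespace + \
    floater + "," + whitespace + floater + "," + floater + \
    "," + whitespace + floater + "," + floater + "," + \
    whitespace + floater
pattern = re.compile(wholeline)

def big_changes(lines, threshold=0.01):
    """Yield (line_number, column, old, new) for each reading that differs
    from the same column on the previous data line by more than threshold,
    measured relative to the old value (hypothetical helper)."""
    previous = None
    for number, line in enumerate(lines, start=1):
        match = pattern.match(line)
        if match is None:
            continue  # not a data record, so ignore it
        current = [float(value) for value in match.groups()[3:]]
        if previous is not None:
            pairs = zip(previous, current)
            for column, (old, new) in enumerate(pairs, start=1):
                # Assumption: a relative change from zero is undefined, so skip it.
                if old != 0 and abs(new - old) / abs(old) > threshold:
                    yield number, column, old, new
        previous = current

# Hypothetical data, invented to fit the pattern.
log = [
    "16-Aug-2007 20:25:39.05, 1.25, -0.50,3.75, 10.00,11.50, 0.01",
    "16-Aug-2007 20:25:40.05, 1.25, -0.50,3.80, 10.05,11.50, 0.01",
]
report = list(big_changes(log))
for number, column, old, new in report:
    print("line %d, column %d: %s -> %s" % (number, column, old, new))
```

Here only the third reading moves by more than 1% (3.75 to 3.80); the fourth moves by 0.5% and is left out of the report.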