XML handling in Python - a new teaching example using etree

Archive - Originally posted on "The Horse's Mouth" - 2015-12-09 09:21:06 - Graham Ellis

From the Python course just completed - a new example of XML handling, through the xml.etree.ElementTree module which was first available in Python 2.5.

  import sys
  import xml.etree.ElementTree as etree

  # args holds the parsed command line (set up earlier in the full program)
  try:
    tree = etree.parse(args.sourcefile[0])
  except Exception as e:
    print('FAILED to parse - reason %s' % e)
    sys.exit(2)


An exception is raised if the file cannot be opened, or if it can be opened but contains badly formed XML, for example:

  munchkin:cambx grahamellis$ ./xmlparserdemo 6_context.xml 
  FAILED to parse - reason not well-formed (invalid token): line 3, column 1
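
If you want to tell the two failure cases apart, one option is to catch them separately. This is only a sketch: the load_tree function name and its return convention are my own for illustration and are not part of the course example. It relies on etree.ParseError carrying a position attribute holding the line and column.

```python
import xml.etree.ElementTree as etree

def load_tree(path):
    """Return (tree, None) on success, or (None, message) on failure.

    load_tree is a hypothetical helper name used for this sketch only.
    """
    try:
        return etree.parse(path), None
    except OSError as e:
        # The file is missing or cannot be read
        return None, 'cannot open file - %s' % e
    except etree.ParseError as e:
        # The file was read but is not well-formed XML;
        # e.position is a (line, column) tuple
        line, column = e.position
        return None, 'bad XML at line %d, column %d - %s' % (line, column, e)
```

Calling load_tree('6_context.xml') would then give back either a tree to work with or a message saying which kind of failure occurred.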


Given a properly formed and readable XML file, you can then iterate through the tree, and indeed do so recursively if you wish:

  root = tree.getroot()
  for thing in root:
    print(thing)
    print(thing.attrib)
    print(thing.tag)
    print(thing.text)
    kids = list(thing)  # getchildren() was removed in Python 3.9
    print(kids)
    print("------------------")


Here's a sample of the results you may get (from a very short piece of XML in this case!):

  munchkin:cambx grahamellis$ ./xmlparserdemo -r 6_context.xml 
  <Element 'WatchedResource' at 0x10a037e10>
  {}
  WatchedResource
  WEB-INF/web.xml
  []
  ------------------


That's a single tag within the root, with no attributes (hence the empty dict) and no children (hence the empty list).
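
For the recursive case, here is a minimal sketch of one way to walk the whole tree. The walk function name is mine, and the inline sample XML (including the Context root tag) is an assumed stand-in for the real file, kept in a string so the snippet runs on its own.

```python
import xml.etree.ElementTree as etree

def walk(element, depth=0):
    """Yield (depth, tag, attrib, text) for an element and all its descendants."""
    # element.text may be None, so fall back to an empty string
    text = (element.text or '').strip()
    yield depth, element.tag, element.attrib, text
    # Iterating over an element gives its direct children
    for child in element:
        yield from walk(child, depth + 1)

# Assumed sample data, not the actual file from the course
sample = '<Context><WatchedResource>WEB-INF/web.xml</WatchedResource></Context>'
root = etree.fromstring(sample)

for depth, tag, attrib, text in walk(root):
    # Indent each line by its depth in the tree
    print('  ' * depth + tag, attrib, text)
```

Writing it as a generator keeps the traversal separate from the printing, so the same function could feed a report or a search instead.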

The complete program using this code is [here]; it starts with an example of command line handling, and the sample XML file I used is [here]. It also includes code to explore all elements of the tree. Finally, the xml.etree.ElementTree module also includes tree searching and amendment methods.
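
As a taste of the searching and amendment methods, here is a short sketch using find, iter, set, SubElement and remove. The XML string, the checked attribute and the Note element are all invented for illustration and do not come from the sample file.

```python
import xml.etree.ElementTree as etree

# Assumed sample data, invented for this sketch
root = etree.fromstring(
    '<Context>'
    '<WatchedResource>WEB-INF/web.xml</WatchedResource>'
    '<WatchedResource>WEB-INF/extra.xml</WatchedResource>'
    '</Context>')

# Searching: find() returns the first matching child,
# iter() visits every matching element in the tree
first = root.find('WatchedResource')
all_texts = [el.text for el in root.iter('WatchedResource')]

# Amending: set an attribute, change the text, add and remove elements
first.set('checked', 'yes')
first.text = 'WEB-INF/other.xml'
etree.SubElement(root, 'Note').text = 'added by script'
root.remove(root.findall('WatchedResource')[1])

# Serialise the amended tree back to a string
result = etree.tostring(root, encoding='unicode')
print(result)
```

To save the changes to disk rather than to a string, you would wrap the root in an etree.ElementTree and call its write method with a file name.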