XML handling in Python - a new teaching example using etree
Archive - Originally posted on "The Horse's Mouth" - 2015-12-09 09:21:06 - Graham Ellis
From the Python course just completed, here is a new example of XML handling using the xml.etree.ElementTree module, which has been part of the standard library since Python 2.5.
import sys
import xml.etree.ElementTree as etree

# args comes from the command line handling at the top of the complete program
try:
    tree = etree.parse(args.sourcefile[0])
except Exception as e:
    print('FAILED to parse - reason %s' % e)
    sys.exit(2)
An exception is raised if the file cannot be opened, or if it can be opened but contains badly formed XML, for example:
munchkin:cambx grahamellis$ ./xmlparserdemo 6_context.xml
FAILED to parse - reason not well-formed (invalid token): line 3, column 1
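If you want to tell those two kinds of failure apart, you can catch them separately. What follows is only a sketch and not part of the course program: load_tree is a name I have made up for illustration. It relies on etree.parse raising OSError when the file cannot be opened and etree.ParseError when the content is not well-formed.

```python
import xml.etree.ElementTree as etree


def load_tree(path):
    """Return (tree, None) on success or (None, message) on failure.

    load_tree is a hypothetical helper, not part of the original program.
    """
    try:
        return etree.parse(path), None
    except OSError as e:
        # File missing, unreadable, a directory, and so on
        return None, 'cannot open - reason %s' % e
    except etree.ParseError as e:
        # File was read, but the content is not well-formed XML
        return None, 'not well-formed - reason %s' % e
```

The caller can then decide whether to report, retry or exit, rather than having the exit code buried in the parsing step.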
Once the program has been given a readable, well-formed XML file, you can iterate through the tree, and indeed do so recursively if you wish:
root = tree.getroot()
for thing in root:
    print(thing)
    print(thing.attrib)
    print(thing.tag)
    print(thing.text)
    # list(thing) replaces getchildren(), which was removed in Python 3.9
    kids = list(thing)
    print(kids)
    print("------------------")
Here's a sample of the results you may get (from a very short piece of XML in this case!)
munchkin:cambx grahamellis$ ./xmlparserdemo -r 6_context.xml
<Element 'WatchedResource' at 0x10a037e10>
{}
WatchedResource
WEB-INF/web.xml
[]
------------------
That's a single tag within the root, with no attributes (hence the empty dict) and no children (hence the empty list).
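The loop above only visits the direct children of the root. As a rough sketch of the recursive approach (not the code from the complete program), a small generator can visit every element at every depth. The walk function is my own illustration, and the Context root tag in the inline sample is an assumption. Only the WatchedResource element comes from the output shown above.

```python
import xml.etree.ElementTree as etree


def walk(element, depth=0):
    """Yield (depth, tag, attributes, text) for element and all descendants."""
    text = (element.text or '').strip()
    yield depth, element.tag, dict(element.attrib), text
    for child in element:
        # Iterating over an element gives its direct children
        yield from walk(child, depth + 1)


# Inline sample; the Context root tag is assumed for illustration
sample = '<Context><WatchedResource>WEB-INF/web.xml</WatchedResource></Context>'
root = etree.fromstring(sample)
for depth, tag, attrib, text in walk(root):
    print('  ' * depth + tag, attrib, text)
```

Indenting by depth gives a quick outline of the document, which is handy when you meet an unfamiliar XML file.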
The complete program using this code is [here] ... it starts with an example of command line handling, and the sample XML file I used is [here]. It also includes code to explore all elements of the tree ... Finally, the xml.etree.ElementTree module also includes tree searching and amendment methods.
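To give a flavour of the searching and amendment methods, here is a minimal sketch using find, findall, set, SubElement and tostring from xml.etree.ElementTree. The inline sample, the checked attribute and the Note tag are all invented for illustration and are not taken from the sample file.

```python
import xml.etree.ElementTree as etree

# Inline sample; the Context root tag is assumed for illustration
sample = '<Context><WatchedResource>WEB-INF/web.xml</WatchedResource></Context>'
root = etree.fromstring(sample)

# Searching: find() returns the first matching child (or None),
# findall() returns a list of all matching children
first = root.find('WatchedResource')
matches = root.findall('WatchedResource')

# Amendment: change the text, set an attribute, append a new child
first.text = 'WEB-INF/other.xml'
first.set('checked', 'yes')             # hypothetical attribute
extra = etree.SubElement(root, 'Note')  # hypothetical tag
extra.text = 'added by script'

# Serialise the amended tree back to a string
print(etree.tostring(root, encoding='unicode'))
```

If you have parsed a file with etree.parse rather than a string, the tree's write method will save the amended document back to disk.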