28/02/2011
Processing XML files is a PITA. There. I said it. Processing XML files using the standard Java XML API, i.e. JAXP, is even worse. There are some nicer APIs out there that alleviate this pain, like XPath, XStream, JAXB, Castor, etc.
But still, they all fall short at some point or another, for reasons ranging from being verbose and tedious to write to not being able to handle large XML streams.
The latter is a killer for many of the higher-level approaches out there, since they typically need the whole document in memory. Try using XPath or any mapping library on a file weighing more than a couple of megabytes.
To handle such cases, we are usually left with StAX. Don’t get me wrong: StAX is not that bad an API, and I’d take it any day over JAXP, even for small files. It is still a very low-level API, though, and parsing the simplest of files requires an impressive amount of code. Also, handling state with StAX is a painful exercise. You usually end up building a full-fledged state machine to do it.
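To give an idea of what that looks like in practice, here is a minimal sketch of a StAX cursor loop. The document shape (`<library>`, `<book>`, `<title>`) and the `StaxTitles` class are made up for illustration; only the `javax.xml.stream` calls are the real API. All it does is collect the titles that sit inside a `<book>`, and it already needs two flags and a buffer to keep track of where it is.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class StaxTitles {

    // Collects the text of every <title> found inside a <book>.
    // Element names are hypothetical; adapt them to the real document.
    static List<String> bookTitles(String xml) throws XMLStreamException {
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        List<String> titles = new ArrayList<>();
        boolean inBook = false;   // hand-rolled state: are we inside a <book>?
        boolean inTitle = false;  // hand-rolled state: are we inside a <title> of a <book>?
        StringBuilder text = new StringBuilder();
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        if ("book".equals(reader.getLocalName())) {
                            inBook = true;
                        } else if (inBook && "title".equals(reader.getLocalName())) {
                            inTitle = true;
                            text.setLength(0);
                        }
                        break;
                    case XMLStreamConstants.CHARACTERS:
                    case XMLStreamConstants.CDATA:
                        // Text may arrive in several chunks, so accumulate it.
                        if (inTitle) {
                            text.append(reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        if (inTitle && "title".equals(reader.getLocalName())) {
                            titles.add(text.toString());
                            inTitle = false;
                        } else if ("book".equals(reader.getLocalName())) {
                            inBook = false;
                        }
                        break;
                    default:
                        break; // comments, whitespace, etc. are ignored
                }
            }
        } finally {
            reader.close();
        }
        return titles;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<library><title>Catalogue</title>"
                + "<book><title>Dune</title></book>"
                + "<book><title>Emma</title></book></library>";
        System.out.println(bookTitles(xml)); // prints [Dune, Emma]
    }
}
```

Note that the top-level `<title>` is skipped only because of the `inBook` flag. Every extra level of nesting you care about means another flag or a depth counter, which is how you end up with the state machine mentioned above.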