TagSoup

John Cowan mentions TagSoup, which “parses HTML as it is found in the wild: nasty and brutish, though quite often far from short.”

It doesn’t repair the HTML source itself; instead it emits a SAX event stream of properly nested elements and attributes that you can catch and process.
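To make "catch and process" concrete, here is a minimal sketch of a SAX handler that records every element it is told about. The class and method names (SaxSketch, ElementLogger, elementNames, jdkReader) are made up for illustration. So that the sketch runs without extra jars, it uses the JDK's built-in XML parser on well-formed input as a stand-in; with the TagSoup jar on the classpath, you would presumably pass its parser (org.ccil.cowan.tagsoup.Parser, which implements XMLReader) to the same method and feed it messy HTML instead.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxSketch {
    /** Hypothetical handler: records the name of each element as it opens. */
    static class ElementLogger extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Runs any SAX XMLReader over the markup and returns the element names seen. */
    static List<String> elementNames(XMLReader reader, String markup) throws Exception {
        ElementLogger handler = new ElementLogger();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.names;
    }

    /** Stand-in reader from the JDK; assumption: swap in TagSoup's Parser for real HTML. */
    static XMLReader jdkReader() throws Exception {
        return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        String markup = "<html><body><p>hi</p></body></html>";
        System.out.println(elementNames(jdkReader(), markup)); // prints [html, body, p]
    }
}
```

Because the handler only depends on the SAX interfaces, nothing in it needs to change when the reader behind it changes; that is the appeal of getting a SAX stream out of wild HTML.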
