John Cowan mentions TagSoup, which “parses HTML as it is found in the wild: nasty and brutish, though quite often far from short.”
It doesn’t repair the HTML source. Instead, it returns a SAX stream of properly nested elements and attributes that you can catch and process.
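To make "catch and process" concrete, here is a minimal sketch of the SAX event model. It is not TagSoup itself: TagSoup is a Java library, and this sketch uses Python's standard `xml.sax` module, which only accepts well-formed input. The `LinkCollector` class, the `collect_links` function, and the sample markup are hypothetical names invented for illustration.

```python
import xml.sax


class LinkCollector(xml.sax.ContentHandler):
    """Hypothetical handler: gathers href values from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []

    def startElement(self, name, attrs):
        # The parser calls this once per opening tag, in document order.
        # attrs is a mapping-like object holding that element's attributes.
        if name == "a":
            href = attrs.get("href")
            if href is not None:
                self.links.append(href)


def collect_links(markup):
    """Parse well-formed markup and return the hrefs seen, in order."""
    handler = LinkCollector()
    xml.sax.parseString(markup.encode("utf-8"), handler)
    return handler.links


# Assumed sample input; already well-formed, unlike HTML in the wild.
sample = (
    "<html><body><p>"
    '<a href="http://example.com/">one</a> '
    '<a href="/two">two</a>'
    "</p></body></html>"
)

print(collect_links(sample))
```

The handler never sees a document tree, only a sequence of start and end events, which is the same shape of output the post describes TagSoup producing from messy HTML.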
Bill Humphries lives and works in Silicon Valley. He can be reached at bill@whump.com.
ISSN: 1533-8088
Opinions expressed here are mine alone, and don't reflect those of my employer.
© 1996-2008, Bill Humphries