Continuing the discussion on feed validity, and punishing users for the mistakes of producers, Brent Simmons writes:
The single most common cause of non-well-formedness that I see is unencoded ampersands. They appear in a feed as & rather than as &amp;. This is most often in <title>s.
In my experience this most often afflicts larger publications, not weblogs using Movable Type or Radio or whatever. My guess is it’s because these larger publications have their own in-house systems. Those systems don’t get tested the way weblog systems get tested. A weblog system will have many thousands of users, but an in-house system has just one user (the publication). (I mean user in the sense of publisher.)
So, our charter is to evangelize large feed producers, who may have to build syndication tools on top of existing publishing systems.
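For a producer bolting a feed onto an existing system, the ampersand problem mostly comes down to escaping text before it goes into the XML. Here is a minimal sketch in Python, assuming the feed is assembled from strings; `render_item` is a hypothetical helper, not part of any publishing system mentioned here, and the title and URL are made up.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def render_item(title: str, link: str) -> str:
    """Hypothetical helper: build one RSS <item>, escaping all text content."""
    # escape() turns & into &amp;, < into &lt; and > into &gt;
    return (
        "<item>"
        f"<title>{escape(title)}</title>"
        f"<link>{escape(link)}</link>"
        "</item>"
    )


# Made-up headline and URL, both containing raw ampersands.
item = render_item("Q&A with the editors", "http://example.com/story?id=1&page=2")

# A strict XML parser accepts the escaped item and hands back the original text.
parsed = ET.fromstring(item)
print(parsed.find("title").text)
```

Generating the feed with a real XML library instead of string concatenation would avoid the problem entirely, but many in-house systems start from templates like this one.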
Why should they bother? Here’s an example:
- A Big Pub creates an RSS feed to increase its exposure. For instance, a feed can reach readers who sit behind a corporate firewall.
- Big Co doesn’t allow its employees out into the internet at large, but they have a corporate portal. Big Co’s IT department integrated a little feed reader into that portal to display work-related headlines.
- Big Co’s IT department uses an off-the-shelf Perl/Java/PHP XML parser to read RSS feeds. They do it to save time and expense. They have many projects to manage.
- If Big Pub’s feed is broken, it doesn’t appear in Big Co’s portal.
- Big Pub loses Big Co’s readership.
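The scenario above can be sketched with any strict XML parser. The example below uses Python's standard `xml.etree.ElementTree` as a stand-in for whatever off-the-shelf parser Big Co actually uses; the feed contents and the `headlines` function are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up feed with a correctly encoded ampersand in an item title.
GOOD_FEED = """<rss version="2.0"><channel><title>Big Pub</title>
<item><title>Mergers &amp; Acquisitions</title></item>
</channel></rss>"""

# The same feed with the ampersand left unencoded, as in the common failure case.
BROKEN_FEED = GOOD_FEED.replace("&amp;", "&")


def headlines(feed_xml: str) -> list:
    """Hypothetical portal code: return item titles, or nothing if parsing fails."""
    try:
        root = ET.fromstring(feed_xml)
    except ET.ParseError:
        # A strict parser rejects the whole document, not just the bad title.
        return []
    return [title.text for title in root.findall("./channel/item/title")]


print(headlines(GOOD_FEED))
print(headlines(BROKEN_FEED))
```

One stray character is enough for the portal to show no headlines at all from that feed, which is the readership loss described in the last bullet.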
I’m wondering what the commercial content management companies are doing for RSS.
