Noise Reduction Techniques for Reading on the Web

Keith Dawson outlines some techniques for working through large amounts of content, along with some regular expressions you can use to transform news site URLs into their printer-friendly versions (which carry much less fat and far fewer ads).
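
To make the URL-rewriting idea concrete, here is a minimal sketch in Python. The hosts and URL shapes are invented for illustration (they are not Dawson's actual expressions and do not describe any real news site); the point is only the pattern of pairing a regular expression with a replacement template.

```python
import re

# Hypothetical rewrite rules: these hosts and URL layouts are made up
# for illustration and are not taken from any real news site.
PRINTER_RULES = [
    # http://news.example.com/story/12345.html
    #   -> http://news.example.com/print/12345.html
    (re.compile(r"^(https?://news\.example\.com)/story/(\d+)\.html$"),
     r"\1/print/\2.html"),
    # https://daily.example.org/articles/<slug>/
    #   -> https://daily.example.org/articles/<slug>?print=1
    (re.compile(r"^(https?://daily\.example\.org/articles/[\w-]+)/?$"),
     r"\1?print=1"),
]


def printer_friendly(url: str) -> str:
    """Return the printer-friendly form of url, or url unchanged if no rule matches."""
    for pattern, replacement in PRINTER_RULES:
        new_url, count = pattern.subn(replacement, url)
        if count:
            return new_url
    return url
```

A proxy or bookmarklet could pass every outgoing link through `printer_friendly` and fall back to the original URL for sites it has no rule for.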

I have to wonder if one of the reasons we aren’t seeing style sheets take hold is that sites beholden to advertisers, by intermingling content and structure, make it that much more difficult to do simple filtering to remove ads and other noise.
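
For a sense of what "simple filtering" could look like when a site does mark up its noise cleanly, here is a rough sketch using Python's standard `html.parser`. It assumes ads live inside elements carrying a class named `ad` (a hypothetical convention, not something any particular site guarantees) and that the markup is well formed enough for open and close tags to balance. Comments and doctype declarations are dropped for brevity.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class NoiseStripper(HTMLParser):
    """Re-emit HTML, skipping any element whose class list names a noisy class."""

    def __init__(self, noisy_classes=("ad",)):  # "ad" is an assumed class name
        super().__init__(convert_charrefs=False)
        self.noisy_classes = set(noisy_classes)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a skipped element

    def _is_noisy(self, attrs):
        classes = (dict(attrs).get("class") or "").split()
        return any(c in self.noisy_classes for c in classes)

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
        elif self._is_noisy(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if not self.skip_depth and not self._is_noisy(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_noise(html: str) -> str:
    parser = NoiseStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

When ads are woven into the same table cells and font tags as the story, there is no such marker to key on, which is exactly the difficulty described above.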

At WHUMP dot COM, if you don’t want the pretty colors, just turn off style sheets. No need to mess with HTML::Parser in your proxy server.

