Is It Time For a Web Crawling Code of Conduct? (expanded)
We've got a guest post up over at ReadWriteWeb entitled "Is it Time for a Web Crawling Code of Conduct?" The RWW post provides a summary of how web crawling can be beneficial. Here are some more specifics on the items mentioned:
Listening to Customers Better:
We have several customers at 80legs that use our service to collect customer reviews from various shopping websites. This data is aggregated and scored by our clients to provide services such as media monitoring and one-stop shopping portals.
An interesting use case for web crawling is discovering and analyzing potential ad channels. Ad networks crawl millions of web pages to find content relevant to their ad inventory.
How it helps individuals: I’ll admit this benefit is somewhat indirect, but I think everyone would prefer relevant ads over irrelevant ones on a web page, given the choice. It should also be noted that web crawling by ad networks means even tiny blogs run by individuals can get better ads with higher click-through rates (CTRs).
Building Better Data Sets:
Companies like Infochimps and Factual use web crawling to build better, more structured data sets from information scattered around the web. This can be anything from property data to sports data. Rather than having this data scattered around the web, it’s now centralized for easy consumption and analysis.
How it helps individuals: Again, the benefit is not immediate, but it’s there. You’ll see Factual datasets being used inline with the content of various websites, enhancing your information experience. As Infochimps grows its dataset store, you’ll have a great resource for dataset searching.
These are just three examples. At 80legs, we have dozens of customer verticals, and all of them contain customers that are building fascinating applications on top of web crawling.
