On Microsoft & Yahoo

As several sources reported yesterday, Microsoft and Yahoo put together a deal whereby Microsoft will use Bing to power Yahoo search, while Yahoo will take over the advertising on all search results.  Overall, we feel that this is a good thing, assuming it does create a more viable challenger to Google.  While we're fully vested in the success of companies using 80legs to power their own search technologies, we feel Bing represents the best opportunity to get consumers used to thinking of alternatives when it comes to search.  Right now your average person only thinks of one thing when it comes to search (Google) and he doesn't even consider any alternatives (niche, deep web, semantic, etc.).  Bing has been able to put some good-sized chinks in Google's armor, and that will help the entire industry.

One interesting side issue is what will happen to all the efforts Yahoo has engaged in to help the search community, such as SearchMonkey, BOSS, etc.  From what we can gather, the fate of these products is sort of up in the air right now.  My guess is that they stay alive, as both Microsoft and Yahoo probably realize they need as many people taking on Google as possible.

Companies we wish we had met earlier: SearchMe

TechCrunch reported today that visual search engine SearchMe is (temporarily) closing its doors.  This is pretty disappointing to us.  From what we've heard, SearchMe was a pretty cool product with some real potential.  I find it especially ironic (in a bad way) that the site now redirects to Google.  Search is not a solved technology, and for companies like SearchMe to not be able to find enough traction to keep going hurts the industry in the long run.

We never got the chance to talk to the folks at SearchMe (I've tweeted Adams to see if he'd be interested in us as a way to cut down on crawling costs, though that seems unlikely since they appear to be re-focusing on a different market), but I'm guessing we could have helped them out.  I found the following quote particularly interesting:
So the plan now (unless a buyer or white knight jumps in at the last moment) is to significantly downsize, take the site down for a while (probably tomorrow) and refocus the tech in a space where we don’t have to have 3,000 servers costing a million a month to run on the back end.
I'm sure most of those 3,000 servers are not for things like crawling and processing web content, but I'm betting some of them are.  The infrastructure requirements of getting into the search game are so daunting that it is very difficult just to throw your hat in.  I've told a lot of people that Google's main advantage is not its technology, but its operational costs.  It can deliver at a lower cost than anyone else, due to its sheer size.  We're hoping 80legs puts more than a few dents in that barrier to entry.

Comparing 80legs to Yahoo! BOSS

Yahoo! recently announced a new pricing scheme for their BOSS platform, so we thought it would be a good idea to provide a comparison between 80legs and BOSS.

Web-Scale Development vs. Re-packaged Yahoo! Search

The biggest difference between 80legs and BOSS is that 80legs is a platform for developing your own web-scale applications while BOSS is an API for retrieving search results from Yahoo!.  In other words, with 80legs you can easily build any kind of web-scale app that accesses the entire Internet.  With BOSS, you are ultimately re-packaging search results from Yahoo!.

Query Types

BOSS lets you make 4 types of queries:
  • Spelling
  • Web
  • News
  • Image
Each of these query types is logically the same type: keyword matching on text content.  The difference between the four is the result type you get with each one.  80legs has no limitations on query types.  With our service, you can do any of the following:
  • Keyword matching on text content (includes all 4 BOSS query 'types')
  • Visual matching on images (e.g., Is Image A similar to Image B?)
  • Programmatic queries (e.g., On which pages does the word 'Obama' appear 4 times?)
  • And any other query type you conceive
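To make the "programmatic query" idea a bit more concrete, here's a minimal sketch in plain Python.  This is not 80legs code (this post doesn't show the 80legs API), so treat the function names `count_word` and `pages_with_exact_count`, the example.com URLs, and the page texts as made up purely for illustration.

```python
import re


def count_word(text, word):
    """Count whole-word, case-sensitive occurrences of `word` in `text`."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return len(re.findall(pattern, text))


def pages_with_exact_count(pages, word, times):
    """Return the URLs whose text contains `word` exactly `times` times.

    `pages` is assumed to be a dict mapping URL -> already-fetched page text.
    """
    return [url for url, text in pages.items() if count_word(text, word) == times]


if __name__ == "__main__":
    # Hypothetical crawled pages; a real crawl would supply these.
    sample_pages = {
        "http://example.com/a": "Obama met Obama's staff. Obama spoke. Obama left.",
        "http://example.com/b": "Obama gave a speech about healthcare.",
    }
    print(pages_with_exact_count(sample_pages, "Obama", 4))
```

The point is simply that the "query" is ordinary code run against each page, so the matching rule can be anything you're able to write.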
Because 80legs is an application development platform, you can create your own code to create any query type you want.

Filtering

Within some of the BOSS query types listed above, you can pass in a limited set of filter options to narrow down the result set your query returns.  For example, with web queries, you can choose from a set of 6 file types.  When filtering with 80legs, you pass in regular expressions instead of pre-defined options.  This gives the developer infinitely more freedom when it comes to filtering result sets.

Pricing

Here's the pricing table for BOSS:

[Image: BOSS pricing table]

Each unit costs $0.10.  This table is a bit opaque, but with a little math we can break it down as follows (MRR = million results returned):
  • $0.10 per MRR: off-peak use
  • $3.00 per MRR: 1,000 results/query, on-peak use
  • $10.00 per MRR: 100 results/query, on-peak use
  • $12.00 per MRR: 50 results/query, on-peak use
  • $30.00 per MRR: 10 results/query, on-peak use
The cost to use 80legs is more straightforward (MPC = million pages crawled):
  • $2.00 per MPC: for crawling/accessing content
  • $0.03 per CPU-hr: for computing/analysis performed on content
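If you want to plug your own numbers into the two lists above, here's a rough sketch of the arithmetic in Python.  The rates are copied from the lists; the workload (5 million results or pages, 100 CPU-hours) is a made-up example, and treating one BOSS result as roughly equivalent to one crawled page is our simplifying assumption, not something either pricing scheme states.

```python
# Rates copied from the lists above, in US dollars.
BOSS_RATE_PER_MRR = {
    "off-peak": 0.10,
    1000: 3.00,  # keys are results per query, on-peak
    100: 10.00,
    50: 12.00,
    10: 30.00,
}
LEGS_RATE_PER_MPC = 2.00
LEGS_RATE_PER_CPU_HOUR = 0.03


def boss_cost(results_returned, tier):
    """Cost of `results_returned` BOSS results at the given tier."""
    return results_returned / 1_000_000 * BOSS_RATE_PER_MRR[tier]


def legs_cost(pages_crawled, cpu_hours):
    """Cost of crawling `pages_crawled` pages plus `cpu_hours` of analysis."""
    crawl = pages_crawled / 1_000_000 * LEGS_RATE_PER_MPC
    compute = cpu_hours * LEGS_RATE_PER_CPU_HOUR
    return crawl + compute


if __name__ == "__main__":
    # Hypothetical workload, for illustration only.
    volume = 5_000_000
    cpu_hours = 100
    print(f"BOSS, off-peak:           ${boss_cost(volume, 'off-peak'):.2f}")
    print(f"BOSS, 50 results/query:   ${boss_cost(volume, 50):.2f}")
    print(f"80legs, crawl + analysis: ${legs_cost(volume, cpu_hours):.2f}")
```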
Now, this is admittedly a bit of an apples-to-oranges comparison (hopefully we've impressed upon you that 80legs is a different animal and has way more features), but it gives you some sense of the difference in pricing.  Companies interested in serious web-scale development could potentially save a lot by going with BOSS during off-peak hours, but I wonder if they would be trying BOSS at all due to the limitations we mentioned above.  (Also, it's not clear what constitutes 'off-peak' at this point.)  Smaller users will be paying less on a per-unit basis.  Again, this is an apples-oranges scenario, so comparing the two pricing schemes is a bit odd, but we like to be thorough :).

Conclusion

80legs and BOSS are two very different things.  80legs is a platform for making any kind of web-scale application.  BOSS is a way to query Yahoo!.  80legs allows much more functionality and enables a much wider variety of services and products looking to do interesting things with Internet data.