Our launch experience, part 3: Now the real work begins

I ended my last post mentioning that I figured we could take a small break the day after DEMO.  Boy, was I wrong.  When I woke up the next day, I saw several hundred emails, about 300 tweets referring to 80legs and dozens of articles discussing us.  So instead of checking out the beach, we spent the morning responding to emails and catching up on all the 80legs discussion.

I think we did a really good job of getting the word out for 80legs.  Here are some quick stats showing how well we did on this front:

  • # of articles on 80legs: 16
  • # of times 80legs was mentioned as "Best of DEMOfall09": 2
  • # of re-tweets of articles: 700+

Here are just some of our favorite articles:

I should also note that we got posted to Hacker News, Digg and Slashdot.  Here's what happened to our web traffic in the week following DEMO:

[Chart: 80legs web traffic in the week following DEMO]

Interesting note: most of our web traffic came from Hacker News.  We check HN regularly and participate in the discussions from time to time, so it was awesome to get so much interest from our own community.  Of course, our main focus is not our web traffic (which I think is pretty good for a non-consumer-facing service), but customer adoption.  Here are a few stats on that:

  • # of users that logged in since DEMO: 1554
  • # of jobs run since DEMO: 1557

Just as an aside, there were about 50 active beta users, and not every user that logged in has run a job.

Another interesting outcome from DEMO is that we've realized there's demand for customized services on top of 80legs.  In other words, people want our team either to build customized products for them that are powered by 80legs, or to build the 80Apps that run within 80legs.  We originally expected third-party companies to build these services and products themselves over time as 80legs became more popular.  In the long term, that is most likely the key to 80legs' sustainable success.  In the short term, however, we think it's prudent to pursue these engagements ourselves.  In fact, it makes sense to modify our business model somewhat and form 2 additional product/service lines: one for developing value-added services on top of 80legs and another for custom implementation of 80Apps.  Of course, we need to consider how to manage these two additional lines while still managing and improving the core service.

I feel our team's experience so far has been pretty awesome.  We spent about 2 years developing what we feel is a pretty cool technology and now we're starting to see the fruits of our labor.  That said, I'm a big believer that developing good technology is just the first step of many when it comes to finding commercial success.  Now we get to focus on execution, customer satisfaction, and delivering on what we've been promising.  Now the real work begins.

Our launch experience, part 2: DEMO

Around July, we started thinking about how to launch the live service.  We were fortunate that our plans lined up with DEMO.  Of course, they also lined up with TechCrunch50.  I imagine some companies have to think about which one is best for them, but for us it was pretty easy.  TC50 required a company to have no public exposure before their event, which of course made us ineligible.

We did have to think a bit about the cost of DEMO.  I talked to my friends that had demoed there and was ultimately convinced that it was a great place to launch a product, provided you took full advantage of it with the press, PR and other media outlets.

Again though, I wasn't sure we would even get into DEMO.  80legs was usable by this point, but again - here was a completely non-shiny service, void of any semblance of a bell or whistle.  Sure, any "big data"-nut is going to think what we do is the coolest thing since SSDs, but will anyone else?  We weren't sure.

Carla from Guidewire was the one that talked to me about our application.  I gave the 5-second spiel, and was excited to hear that she understood it and really liked the idea.  She did wonder about how we could make the demo interesting.  I assured her we could (while making a note to myself: "Figure out how to make demo interesting!").

A few weeks later...

Guys, I've got news.  We're going live in September.

We got into DEMO?

Yep.  So we'll be on stage.  Hundreds of people.  Thousands of Internet viewers.

So we have about 8 weeks to get everything stable, fully tested, and scaled out... oh, and we need to make the web portal look a lot better.

Yep!

Now, it's not like we had been slacking off, but July to September was especially scrambly, particularly for our back-end guys.  On the business and marketing end, we wanted to make sure we took full advantage of not only DEMO itself, but the momentum it could generate after the event.

For that, I sought out a PR firm to help with the media.  I asked a bunch of tech/startup friends in Texas about who to go with, and almost all of them recommended Jones-Dilworth, run by the veteran Josh, who had just left Porter Novelli.  If every trusted source you have recommends the same firm, you should probably go with them!

Josh and his team met with us and mapped out a strategy to garner media attention for DEMO and keep momentum going afterward.  They also helped with training our team for handling interviews, which was a big help.  In the week leading up to DEMO, I did at least 1 interview almost every day.  It was pretty awesome talking to and being interviewed by the same folks I had been reading every day for the past few years.

We got into San Diego on Sunday.  The event and crew at DEMO were very nice and professional.  They definitely run a tight ship, but are also super-approachable.  Everyone on staff seemed to know all the details, where to be, etc.

On Monday, all the demoers went through a few introductory items and then we headed off to a happy hour by the bay.  Mingling with other startups, VCs, and press folks is pretty fun.  It's pretty awesome to be at a party where everyone is doing something interesting or has something engaging to say.  Can't say the same about most bars I go to :).


After that was the "CEO & Dealmakers" dinner, which was only attended by 1 member of each company as well as VCs and other such folks.  While the pre-dinner topic, "The Good, Bad and the Ugly of VC", is something I've read about ad nauseam, hearing it straight from guys like the president of the NVCA was pretty cool.  I got a chance to thank Matt Marshall and Chris Shipley for giving us the chance to DEMO.  Matt and Chris kind of seem like opposites.  Chris was cracking jokes about Pittsburgh (I went to CMU and she's from there), but Matt was like "But seriously, what are you demoing?"


After dinner, I had a cool talk with Flip from Infochimps and Mike Olson from Cloudera about Hadoop and how we might use it for providing post-processing services of crawled data.  Yeah, that's the kind of after-dinner conversation you have at DEMO :)

The real show started on Tuesday, with the first group of presenters in the morning.  There did seem to be a few network issues, which was unfortunate.  Digsby actually ran an "offline" version of their chat client to demo their new Twitter capabilities.  All the data was cached locally.  Now that's what I call a backup plan!

After the presentations, the pavilion was open for a few hours.  Our booth traffic was a bit slow.  Although we had a fair number of people come by, it was nothing like Web 2.0, where a constant stream came by.  I think two factors contributed to this: 1) we hadn't yet presented and 2) we had already talked to almost all the press folks.

Wednesday came along, which meant it was time to demo!  Although people say I always seem uber-calm, I must admit I was just a touch nervous :).  The staff guy pulled me up.  Chris called me out.  I walk out - cameras, lights, hundreds of people before me, time to launch.  "Hi, my name's Shion Deysarkar and I'm here to show you a revolutionary new service called <dramatic pause> 80legs."  I wonder if I'll ever forget the lines?

http://link.brightcove.com/services/player/bcpid980795693?bctid=41312193001

I actually used a pretty cool semantic 80App written by a technology partner of ours and compared what positive and negative things people are saying about DEMO and TC50.  I thought this would be a fun demo for the audience, given the interesting history between the two shows.  I didn't actually show who came out on top though - people had to come by the booth to find out!  It turns out that DEMO just edged out TC50, with a 95% to 91% positive rating.  If you want to learn more about the future of this app, check out these posts.

Side note:  Even though I poked a little fun at the TC crew, I thought they'd like the joke, given their sense of humor and attitude on DEMO.  Most of the audience cracked up at my joke, but a TC writer told me the joke was "lame".  Oh well, can't win them all.

The demo went pretty smoothly, which I was pretty happy about.  It was great to have it out of the way though.  About 2 hours after, I could feel my body crashing, as I could finally relax.  I don't drink a lot of soda, but I went through about 3 Pepsis (why San Diego doesn't have Coke is beyond me) before dinner to keep the energy levels up.

At the end of the show were the awards.  7 companies received DEMOGod awards, and 2 received media prizes - 1 company in the consumer category and 1 in the enterprise category.  I'll admit that I was a bit miffed we didn't win the enterprise category, but c'est la vie.  Oh, we also got treated to a little dance by the DEMO staff.


DEMO was finally over.  It was a great experience, but I was looking forward to a little relaxation the next day.  I figured we'd sleep in, check out San Diego for a bit, and enjoy the moment.  I couldn't have been more wrong...

Next up, part 3: Post-DEMO, or "Now the Real Work Begins"

Our launch experience, part 1: beta

Wow, what a crazy few weeks it has been!  For those of you just tuning in, we just launched 80legs.  Since launching, we've been swamped with emails, press, tweets, and much more, but I thought I'd recap our experience, from beta to launch, including our experience at DEMO.

We announced our private beta at the Launch Pad event during the Web 2.0 Expo in San Francisco, back in April.  We had been working on 80legs since early 2008.  Around February, I decided we'd exhibit at the Web 2.0 Expo to get some early exposure.  When I signed the booth contract, we weren't thinking of making 80legs available in April.  But then I came across the Launch Pad event that they had.  Applying was pretty straightforward - all I had to do was fill out a form.  But the form asked for what kind of demo I could show right now, so that the judges could get a sense for what we did.  At that point, you could run a crawl through 80legs, but there was no pretty interface to it.  It was just command-line Java.  So in the form I said something like "Nothing to show now, but trust me - it will be really cool in April!!" and submitted it.

I was pretty sure nothing would come of it.  Surely they had several applications for products that looked shiny and sexy and would never accept anything as obtuse as a "distributed computing service designed for crawling and processing web content"... that wasn't even ready to show yet.  Then a few weeks later, I get an email welcoming us into Launch Pad.  Ohhh-k :)  I stood up from my desk (this is mid-February, I think) and said:

Guys, I've got news. We're launching our beta in April.

We are?

Yes.  At the Web 2.0 Expo.   In front of hundreds of people.  On stage.

We don't have an interface.  Or any way for people to set up an account.  And we're still making the crawling reliable.

Yeah.  I guess we have a month to do that!

So during March, we scrambled putting together the first version of the web portal, getting the crawling to an acceptable state, and a bunch of other stuff.  It was nose-to-the-ground, grind-away work, but at Launch Pad, we had something to show and it looked good (well, for a beta).  The Launch Pad garnered us some press as well.

We got about 300 sign-ups for the private beta - not bad for a technical product.  We decided on letting them into 80legs in periodic batches.  In retrospect, we could have handled this better.  The first couple of batches we let in responded well and offered substantive feedback.  But later batches, which may have had to wait a few months, had forgotten about us.  The excitement had worn off.  It would have been better to let them all in at once, or to at least have sent them reminders.

During our beta period, we spent a ton of time on collecting feedback from users, quickly implementing suggestions we felt were important, and scaling up our crawling ability.  Every 2-3 weeks we worked on a major new feature, such as crawler improvements, 80Apps, the API and several others.  At the same time, we were implementing a ton of minor features to make the system more robust and usable.

Our beta was going well and was getting to the point where we were starting to think about going live.  But we wanted to make a splash with our live launch.  We needed something that would get the momentum going again.  Something big...

Stay tuned for part 2: DEMO ..!

80legs has launched!!

The day is finally here!  We are now live, beta has exited, 1.0 is a go!

Before I go any further, I want to thank the many beta users that helped us over the last several months by providing feedback, suggestions for improvements, and identifying bugs.  Without your help, we wouldn't have been able to get 80legs to where it is today.

During the private beta, we were working on several features, all of which are now ready for public use.  These features are:

  • True web-scale crawling: crawl up to 2 billion pages per day
  • Usability: easily design your own crawls using an intuitive job form
  • 80Apps: write and run your own applications on over 50,000 computers
  • API: programmatically control 80legs to work for you

There's also one big change that comes with leaving beta - 80legs is no longer completely free to use.  Our pricing is now in effect.  You can still dip your toes in the water and run jobs that crawl fewer than 100 pages.

We are doing our official launch announcement at DEMO.  If you happen to be at the show, please come and visit us at pavilion station #2!

0.9 released! 10M page crawls, API, easy-to-use interface and more!

We just pushed out version 0.9, which is a big, big update to the system.  This release includes several upgrades to our back-end architecture (allowing larger jobs), a Java API (allowing programmatic access), an easy-to-use job form (allowing easier access), and a bunch of other cool things!

Here's a list of the specific features:

  • Large crawls are now supported.  Crawl up to 10 million pages per job!
  • The API is officially released.  Submit jobs, download results and much more using Java.
  • A much easier-to-use job form.  We realized the old job form was a bit clunky.  The new one is much easier to understand.
  • To go along with the new job form, we've updated the entire portal to be easier to navigate and use.
  • You can now load external JARs into your 80Apps.  This lets developers use third-party code more easily.
  • Several improvements to the crawler, including:
    • Options to select your type of crawl.  Choose among fast, comprehensive, and breadth-first.
    • Crawler now crawls https:// pages.
    • Crawler tries to fetch a page more than once before giving up.
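
To make the last item concrete, here is a minimal sketch of a bounded-retry fetch.  This is not the 80legs crawler's code, which isn't shown in this post: the class name RetryFetchSketch, the fetchWithRetry() helper, and the pluggable fetcher are all assumptions made up for illustration.

```java
import java.util.Optional;
import java.util.function.Function;

// Hypothetical illustration only: none of these names come from 80legs.
public class RetryFetchSketch {

    // Calls the fetcher up to maxAttempts times and returns the first
    // non-empty result, or Optional.empty() if every attempt failed.
    public static Optional<String> fetchWithRetry(
            String url,
            Function<String, Optional<String>> fetcher,
            int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<String> body = fetcher.apply(url);
            if (body.isPresent()) {
                return body;
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // A stand-in fetcher that fails twice and succeeds on the third call.
        int[] calls = {0};
        Function<String, Optional<String>> flaky = u ->
                ++calls[0] < 3 ? Optional.<String>empty()
                               : Optional.of("<html>ok</html>");

        Optional<String> page = fetchWithRetry("http://example.com/", flaky, 3);
        System.out.println(page.isPresent() + " after " + calls[0] + " attempts");
    }
}
```

A real crawler would presumably also wait between attempts and respect per-domain throttling; the sketch leaves that out to keep the retry logic visible.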

Since we just released 0.9, I suppose that technically makes us 0.1 from a beta exit!  Some of the upcoming features are:

  • Finalizing the payment system in preparation for beta exit and charging actual money.
  • Providing useful default 80Apps for all users (this is also in preparation for the app store model we'll be pursuing).

See full release log details at http://80legs.pbworks.com/Release-Log.

Testing out some improvements to our crawling back-end

Some of our users may have noticed we recently lowered the limit on # of pages crawled per job to just 10,000.  This is a temporary measure while we test out some major improvements to the crawling back-end.  If the tests go well, the limit will be pushed back up to 1 million (probably 10 million, actually) by the end of this week or early next week.  We think this will be the last major upgrade to our back-end.  It should allow 80legs to more easily scale into the billions of pages crawled per job.  So please bear with us as we continue to work on the service - thanks so much!!

Released 0.83 - performance improvements and large seed lists

We pushed out 0.83 today.  This release was mostly done to push out some improvements in our crawling and back-end data store, which should help the overall performance of 80legs. We also took the opportunity to push out some new functionality, including allowing users to upload very large seed lists (up to 1 GB!).  To upload these seed lists, you'll need to go to the new "Seed Lists" section in the portal.  The interface is still a bit on the "raw" side, so let us know if you encounter any problems. You can see the full list of changes at http://80legs.pbworks.com/Release-Log#Release0838July2009.

Released 0.82 - the improvements keep coming!

We've just pushed out 0.82.  Improvements and changes include:

  • Smarter URL selection for larger crawls
  • Sandbox jobs run automatically and the user gets access to stdout from their 80App
  • Domain throttling information in the portal
  • Time estimates shown in the portal
  • Crawled result files additions:
    • page size
    • parse time in milliseconds
    • process time in milliseconds
    • compute timeouts get COMPUTE_TIMEOUT_GOOD or COMPUTE_TIMEOUT_BAD
  • Several improvements for large job performance
  • Users can specify data with the JAR upload, which gets passed into initialize() during the validation test
  • Fixed problem with multiple Loading Code errors
  • Improved default link parsing
  • Better web portal login behavior

As usual, we've started working on the next release already, which will have things like:

  • Allowing larger crawls
  • Allowing larger seed lists
  • Creating result files on the fly

Check out http://80legs.pbworks.com/Release-Log for all the details!

You can now run custom code on 80legs - version 0.8 released!

We're very excited to announce that you can now run custom code on 80legs.  We have just released version 0.8, which gives users the ability to write their own content analysis logic using processDocument() and their own link extraction logic using parseLinks().  For more information on how to write and run code on 80legs, please visit http://80legs.pbworks.com/Custom-Code.

The full list of changes in this release includes:

  • Custom code initial release (first IWebAnalysisConnector release with parseLinks() and processDocument())
  • Option to analyze specific MIME types
  • Option to preserve query strings when crawling
  • Resulting crawl list shows status codes and other reasons for failing to crawl (e.g. robots.txt, DNS, etc.)
  • Better handling of failed URLs
  • Sandbox server for testing custom code on your own machine using the 80legs framework.
  • Stop problem jobs automatically
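
To give a feel for the two hooks, here is a rough sketch of what link extraction and content analysis logic might look like.  The real IWebAnalysisConnector signatures are documented on the wiki and aren't reproduced in this post, so the class below is a hypothetical stand-in: it borrows only the method names parseLinks() and processDocument(), and the String-in, value-out signatures are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in: this does NOT implement the real
// IWebAnalysisConnector interface, whose signatures aren't shown here.
public class TitleAndLinksSketch {

    // Matches double-quoted href attribute values, in any letter case.
    private static final Pattern HREF =
            Pattern.compile("href\\s*=\\s*\"([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Matches the contents of the first <title> element.
    private static final Pattern TITLE =
            Pattern.compile("<title>(.*?)</title>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Link extraction: returns every double-quoted href value in the page.
    public static List<String> parseLinks(String html) {
        List<String> links = new ArrayList<>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(1));
        }
        return links;
    }

    // Content analysis: returns the trimmed page title, or "" if none.
    public static String processDocument(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? m.group(1).trim() : "";
    }

    public static void main(String[] args) {
        String page = "<html><head><title> Example </title></head>"
                + "<body><a href=\"http://example.com/a\">A</a> "
                + "<a HREF=\"/b\">B</a></body></html>";
        System.out.println(processDocument(page));
        System.out.println(parseLinks(page));
    }
}
```

Regex-based parsing is only good enough for a sketch; for the actual interface and how results are returned, see the Custom-Code page linked above.
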

We've also granted access to several more users on our private beta list.  If you haven't received access yet, but would really like to get access soon, please let us know, and we'll try and include you in the next set of beta users.

We're already working on the new features, such as:

  • A web service for programmatically submitting and managing jobs
  • An "app store" that will allow users to run pre-built applications developed by trusted third-parties
  • Our payment system, which will be released first as a "demo", allowing users to get used to the system before actually requiring payment