The two-day RAMP Conf (organized by Prezi, Ustream and LogMeIn) in Budapest has just finished, and while it wasn't really a life-changing experience, I still quite enjoyed listening to stories from companies like Dropbox, Reddit or Flickr about how to scale to hundreds of millions of users, requests, megabytes of data etc. After all, it's not every day that you can hear lessons from companies that have tackled such issues so successfully. Here are a few highlights, in this rather cartoonish form. Let me start with the most important message of the conference - it was taken from a presentation by Péter Boros (Percona):
I must admit that I don't remember much from the keynote, which doesn't mean that it wasn't interesting, but there was no takeaway for me (so no pictures of that). It was more or less about how open source and scalability go hand in hand.
The first talk after the keynote was by Rajiv Eranki (Dropbox), telling their story and the morals of the story:
Right after Dropbox came the story of Flickr and several other companies that Dathan Pattishall has worked for. His advice was:
I'm very happy that he mentioned this:
More advice regarding this issue:
After a couple of talks by people from the U.S. came the team from Berlin, "the hipster capital of Europe": Sebastian Ohm and Tomás Senart (Soundcloud).
By that time a pattern was emerging: simplicity, redundancy and monitoring were pretty much the keywords. If I had to sum up everything so far, I would say: do nothing but monitor the shit out of your system, find the bottlenecks, eliminate them. The next slide is actually closely related to a problem Kinja is starting to face: how to distribute content (posts, activity feeds etc.) in a social graph. Btw, it's interesting how many presenters mentioned using Cassandra.
I got the impression that they didn't really like Skrillex. He's got a million followers on Soundcloud, so imagine those spikes caused by a new release.
One of the best presentations was by Gareth Rushgrove (gov.uk), who works for the Government Digital Service, a team within the Cabinet Office, so basically he works for the British government. Everything he said was simply surreal, especially to someone living in Eastern Europe. The government in the UK has decided to actually serve its citizens and build something useful for them. A public IT project whose primary aim is not to siphon taxpayer money off somewhere, but to serve the public! He talked about things like providing APIs, using only open source tools ("we're spending taxpayer money after all"), and open sourcing most of this very project. Even their issue tracker was open to the public. I mean just look at this slide - when was the last time your government was concerned with your needs?
The second day of the conference also kicked off with a high-profile company, Reddit. Jeremy Edberg (currently at Netflix) walked us through their journey:
Btw one way or another, Amazon was mentioned in almost all presentations. Another recurring theme was:
I was quite surprised that this is actually a dilemma:
Following that came Gergely Timar (Yahoo!), who summed up some of what they have learnt in the first ten years of building a leading analytics tool, Indextools, in ten lessons. Indextools was one of the first very successful Hungarian startups, and they were acquired by Yahoo! in 2008. I remember having a job interview with them back in 2006, when I was a very junior Java developer, so obviously they didn't hire me. Anyway, no hard feelings! It's no coincidence they've chosen this as the first lesson - they have built their own database engine!
Production is where you will first face the really hard issues, so
I like this next piece of advice. They have clients all over the world, so they need to convert UTC times to other timezones a lot (like for all of their datapoints), while taking stuff like daylight saving into account. It's not a trivial thing to do, especially on the fly, so they ended up generating a table of offsets for all timezones for every hour of every day between the years 1990 and 2020. Just a couple of days ago I suggested that what we should do at Gawker Media is to generate all possible articles, posts and comments (no challenge in the last one haha!) of all of our sites, as static HTML, to really boost performance. Okay, I don't wanna make fun of this lesson, it was good advice indeed.
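The offset-table idea can be sketched roughly like this in Python. This is only my guess at the shape of it, not Indextools' actual implementation: the function names `build_offset_table` and `lookup_offset` are made up for illustration, the end year is treated as exclusive, and hourly granularity is assumed to be precise enough.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


def build_offset_table(tz: tzinfo, start_year: int, end_year: int) -> list:
    """Precompute the UTC offset (in seconds) of `tz` for every UTC hour
    from Jan 1 of start_year up to, but not including, Jan 1 of end_year."""
    t = datetime(start_year, 1, 1, tzinfo=timezone.utc)
    end = datetime(end_year, 1, 1, tzinfo=timezone.utc)
    offsets = []
    while t < end:
        # The expensive part (timezone rules, DST) happens once, up front.
        offsets.append(int(t.astimezone(tz).utcoffset().total_seconds()))
        t += HOUR
    return offsets


def lookup_offset(table: list, start_year: int, utc_dt: datetime) -> int:
    """Return the precomputed offset for an aware UTC datetime.
    At query time this is just index arithmetic plus a list access."""
    epoch = datetime(start_year, 1, 1, tzinfo=timezone.utc)
    index = (utc_dt - epoch) // HOUR  # whole hours since the table start
    if not 0 <= index < len(table):
        raise ValueError("datetime outside the precomputed range")
    return table[index]
```

To cover a real timezone you would pass something like `zoneinfo.ZoneInfo("Europe/Budapest")` as `tz` and build one table per zone for 1990 to 2020. Converting a datapoint is then the UTC timestamp plus `timedelta(seconds=lookup_offset(...))`, with no timezone rules evaluated on the fly.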
Danish geek Poul-Henning Kamp (Varnish Software) started by explaining how the HTTP protocol sucks in its current form, but pretty quickly we found ourselves in the middle of a human rights speech about privacy, where the NSA and even Snowden were mentioned multiple times. He said, for example, that in the UK, if you don't hand over the password to your encrypted files when the authorities ask for it, they can jail you for two years. If someone has put an encrypted file on your machine without you knowing, you're pretty much screwed. I have to say shocking statements regarding human rights are very captivating, even at a scalability conference.
He finished his talk with SPDY/HTTP/2.0, which in his view doesn't really address any of the issues that come from HTTP 1.1 being completely outdated and from a very different era. I guess this has a lot to do with scalability too after all.
The last presentation I sat through was quite a theoretical one, by Zoltan Toth-Czifra (Softonic). Lots of formulas:
The formula below is throughput or capacity (C) as a function of concurrency (N). Throughput here can be something like pages served per second as a function of the number of concurrent clients, or let's say db queries per second as a function of connected threads. α is the contention coefficient, and β is the coherency delay coefficient. Examples of these scalability limits are locks, sync points, shared resources etc. These will define how scalable your system is, and in order to find out their values you need to measure, so measure the shit out of your system. By the way, this formula, not surprisingly, plots as something like a plateau curve.
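The description of C, N, α and β matches Neil Gunther's Universal Scalability Law, so here is a small Python sketch under that assumption (I can't vouch that the slide used exactly this form). In this form C(N) = N / (1 + α(N − 1) + βN(N − 1)), normalized so that C(1) = 1. The function names and the optional `scale` parameter are mine.

```python
import math


def usl_throughput(n: float, alpha: float, beta: float, scale: float = 1.0) -> float:
    """Throughput at concurrency n under the Universal Scalability Law.

    alpha: contention coefficient (serialized work: locks, shared resources)
    beta:  coherency delay coefficient (cost of keeping nodes/threads in sync)
    scale: throughput at n = 1 (hypothetical convenience parameter)
    """
    return scale * n / (1 + alpha * (n - 1) + beta * n * (n - 1))


def peak_concurrency(alpha: float, beta: float) -> float:
    """Concurrency at which throughput is highest (requires beta > 0).

    Setting dC/dN = 0 in the formula above gives N* = sqrt((1 - alpha) / beta).
    """
    return math.sqrt((1 - alpha) / beta)
```

With β = 0 the curve flattens out towards 1/α, which is the plateau. With any β > 0 it peaks around `peak_concurrency(alpha, beta)` and then slowly drops, so adding more clients or threads beyond that point makes things worse. In practice you would measure throughput at a few concurrency levels and fit α and β to those measurements.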
And finally, here's the best sponsor pitch ever at a conference. (Also check out my Hungarian post about conference souvenirs!)