QCon Notes Part 1: The Rise of Erlang

by Can Gencer

Last week most of my time was spent at QCon London. QCon is an annual international software development conference in London that covers a broad range of topics within the software development world. After looking through the schedule for this year, I ended up spending a good chunk of my Red Badger Training Budget on the conference, and it was totally worth it. The conference ran for three days and was packed with interesting content. I will try to write a series of blog posts covering the topics I found most interesting.

One of the most prominent programming languages at the conference was Erlang. As Damien Katz said in his talk, Erlang is a language from the future, built perfectly to scale reliably to multiple processors and cores at a time when such needs did not exist. I attended several sessions relating to Erlang, driven by my curiosity about the language.

Building Highly Available Systems in Erlang

Joe Armstrong

Joe Armstrong is the creator of the Erlang language. He gave an excellent introduction to what highly available systems are and how such a system can be built. Highly available systems need to follow six rules:

  1. Isolation
  2. Concurrency
  3. Failure Detection
  4. Fault Identification
  5. Live Code Upgrade
  6. Stable Storage

Erlang is a programming language designed to satisfy all six of these rules. It is no coincidence that some of the world’s most reliable systems have been written in Erlang.

Erlang programs consist of many small processes, which correspond to something between an object and a thread in terms of size. An empty process is around 300 bytes, and the Erlang VM is capable of hosting millions of such processes. Each process is completely isolated, and processes communicate with each other only through messages. A failing process does not affect the rest of the system, and can easily be restarted by a supervisor process. The lack of shared memory and mutable state makes it easy to produce very reliable code.
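To make the process-and-supervisor idea a bit more concrete, here is a minimal sketch in Python rather than Erlang. It is only an analogy: Python threads are far heavier than Erlang processes and are not truly isolated, and the names `Worker`, `supervise` and `demo` are my own, not anything from Joe's talk or from Erlang/OTP. The worker keeps private state, talks to the outside world only through queues, and is restarted with fresh state when it crashes.

```python
import queue
import threading


class Worker(threading.Thread):
    """A stand-in for an Erlang process: private state plus a mailbox."""

    def __init__(self, mailbox, results):
        super().__init__(daemon=True)
        self.mailbox = mailbox    # messages in
        self.results = results    # messages out
        self.failure = None       # set if the worker crashes

    def run(self):
        total = 0                 # private state; lost when the worker dies
        try:
            while True:
                msg = self.mailbox.get()
                if msg == "stop":
                    return
                total += int(msg)  # a malformed message raises ValueError
                self.results.put(total)
        except Exception as exc:
            self.failure = exc    # record the crash reason for the supervisor


def supervise(mailbox, results, max_restarts=3):
    """Run a worker, restarting it after a crash; return the restart count."""
    restarts = 0
    while True:
        worker = Worker(mailbox, results)
        worker.start()
        worker.join()             # wait for a normal exit or a crash
        if worker.failure is None:
            return restarts
        if restarts == max_restarts:
            raise RuntimeError("too many restarts") from worker.failure
        restarts += 1


def demo():
    mailbox, results = queue.Queue(), queue.Queue()
    for msg in ["1", "2", "oops", "5", "stop"]:
        mailbox.put(msg)
    restarts = supervise(mailbox, results)
    totals = [results.get() for _ in range(results.qsize())]
    return restarts, totals
```

Calling `demo()` returns one restart and the running totals `[1, 3, 5]`: the bad message kills the first worker, and its replacement starts counting again from zero while the rest of the program carries on.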

One of the highlights of the talk was this quote from Alan Kay:

“Folks –
Just a gentle reminder that I took some pains at the last OOPSLA to try to remind everyone that Smalltalk is not only NOT its syntax or the class library, it is not even about classes. I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea.

The big idea is “messaging” — that is what the kernel of Smalltalk/ Squeak is all about (and it’s something that was never quite completed in our Xerox PARC phase)….”

The key idea in OOP has always been messaging and encapsulation rather than classes or methods, which unfortunately is how OOP is generally taught.

An interesting footnote from the session: when asked for his opinion of Node.js, Joe said that he is not really fond of event-based programming and finds that style of programming difficult.

Games for the Masses – How DevOps affects architecture design

Jesper Richter-Reichhelm (Wooga)

Jesper works for Wooga, a German social gaming developer. While their cute Facebook games might not be terribly interesting for software developers, the backend for a single game deals with more than 20 million requests per day and more than 100,000 DB operations per second, which makes things a little more interesting.

Jesper outlined the journey Wooga took in terms of evolving architecture, where each new game gave them an opportunity to try something new and evolve their technology.

Starting with a traditional technology stack (MySQL/PHP/Ruby on Rails), the engineers at Wooga eliminated their database bottleneck first by using Redis and ultimately by switching from a stateless server to a stateful one.

To build a robust stateful server, they used Erlang, which brought other problems with it, such as code readability, testability and maintainability. Their ultimate solution was to use Erlang for the core parts of their backend and hand off data to small workers written in Ruby using a message queue, which gave them the best of both worlds.
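The shape of that final design, a stateful core handing jobs to small stateless workers over a queue, can be sketched roughly as follows. This is purely illustrative and in Python: an in-process `queue.Queue` stands in for whatever message queue Wooga actually uses (the talk notes here do not say), and `core_server`, `worker` and `run` are hypothetical names of mine.

```python
import json
import queue
import threading


def core_server(events, outbox):
    """Stateful core: keeps scores in memory and queues follow-up jobs."""
    scores = {}
    for user, points in events:
        scores[user] = scores.get(user, 0) + points
        # Serialize the job so the worker could live in another language.
        outbox.put(json.dumps({"user": user, "score": scores[user]}))
    outbox.put(None)              # sentinel: no more jobs
    return scores


def worker(inbox, processed):
    """Stateless worker: handles one job at a time from the queue."""
    while True:
        raw = inbox.get()
        if raw is None:
            break
        job = json.loads(raw)
        processed.append(f"{job['user']}:{job['score']}")


def run(events):
    jobs, processed = queue.Queue(), []
    consumer = threading.Thread(target=worker, args=(jobs, processed))
    consumer.start()
    scores = core_server(events, jobs)
    consumer.join()
    return scores, processed
```

The point of the split is that the core never waits on the worker: it updates its in-memory state, drops a serialized message on the queue and moves on, so the worker side can be written in whatever language is most readable for that job.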

Jesper emphasized how Wooga’s focus on small teams, collaboration, generalists, effort reduction and innovation paid off in spades in their journey to become the 2nd biggest social media games development company.

Building Distributed Systems with Riak Core

Steve Vinoski (Basho)

Riak is a distributed database which is roughly based on Amazon’s Dynamo paper. It is similar to NoSQL databases such as Cassandra, CouchDB and Voldemort. Riak Core is a separate component from Riak DB, and deals with the distributed systems aspect of Riak DB.

The session was a deep dive into how Riak Core implements availability, eventual consistency and partition tolerance, which are the three key aspects of any distributed system. Possibly one of the most technical sessions that I’ve attended at QCon, it was an inside look into how a distributed system works and how Riak Core solves many of the problems such systems encounter.
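I can't reproduce the whole session here, but one building block from the Dynamo paper is easy to sketch: consistent hashing, where keys and nodes are hashed onto the same ring and each key is stored on the next few distinct nodes clockwise from it. The Python below is a minimal sketch of that idea under my own assumptions, not Riak Core's actual implementation; `Ring`, `preference_list`, and the `vnodes` and `n` defaults are illustrative choices.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string onto the ring using SHA-1 (a 160-bit integer)."""
    return int.from_bytes(hashlib.sha1(key.encode("utf-8")).digest(), "big")


class Ring:
    def __init__(self, nodes, vnodes=8):
        # Each physical node owns several points ("virtual nodes") on the ring.
        self._points = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [h for h, _ in self._points]

    def preference_list(self, key, n=3):
        """Return the first n distinct nodes clockwise from the key's position."""
        start = bisect.bisect_right(self._hashes, _hash(key))
        owners = []
        for offset in range(len(self._points)):
            node = self._points[(start + offset) % len(self._points)][1]
            if node not in owners:
                owners.append(node)
                if len(owners) == n:
                    break
        return owners
```

The useful property is that membership changes are local: with `Ring(["a", "b", "c", "d"])`, removing a node that is not in a key's preference list leaves that list unchanged, so only the keys that node was responsible for have to move.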

Not surprisingly, Riak Core is written in Erlang, which makes messaging across a distributed system easy, since Erlang processes communicate with each other in the same way regardless of whether they reside on the same machine or not.

A lot of the time, we as software developers take for granted that the systems we use should just “work”. This abstracts the underlying complexity away from us and makes it easier to think in our problem domain. However, having a little insight into how a complex distributed system works under the hood is always interesting and good to know.

Erlang in the real world: CouchDB

Damien Katz (Couchbase)

Damien is the original creator of CouchDB, a document oriented database written in Erlang. Having worked with Erlang for a real application, he shared several of his observations.

  • Erlang is simple: the core of the language is small; there are very few types, no classes and no object orientation.
  • Erlang is weird: It has a syntax influenced by Prolog, which nobody uses and is nothing like other programming languages.
  • Erlang is extremely productive: You can be very productive with it and produce small, reliable code once you come to grips with the syntax.
  • Erlang is built for the current reality: The Erlang model of isolated memory and processes is a better match for today's multi-core architectures than the shared memory space most programming languages use.

However, there is a caveat: Erlang is slow. The Erlang VM, while beautiful in design, is not as fast as other VMs like the JVM. Damien traced the reason for this all the way back to Erlang's strange syntax. A language needs mass adoption and investment to be fast, and for this to happen it needs to be familiar to programmers. Erlang’s unfamiliar and “weird” syntax is preventing it from getting mass adoption.

Erlang is not a perfect fit for every problem, string processing being one example, but it is perfect for distributed systems that need to be reliable.

A lot of the benefits of Erlang can be achieved in C/C++ by following certain practices. However, this takes as much as 5-10 times the coding effort, in exchange for 5-10 times the performance.

Damien’s new venture, Couchbase, uses a hybrid of Erlang and C/C++ because pure Erlang simply cannot compete on raw performance. However, an interesting point he made was that if you are running your own application, it might be cheaper to solve performance bottlenecks by simply spending more money on CPUs rather than on engineering time.

Wrap Up

Erlang, even with its quirks, seems to be growing, and in Damien’s session somebody mentioned that there is a new $5 million investment in the language to improve performance. Patterns found in Erlang can certainly be applied in other languages and can help a software engineer approach problems differently. The programming language itself will no doubt continue to grow and be a major player in the concurrent future.