Last month I spent a week at PLISS, the Programming Language Implementation Summer School, held in a castle in Bertinoro, Italy, learning about programming language implementation.
I have always been fascinated by languages, both natural and programming. My favourite courses at university were French, German, Compilers and Formal Semantics for Programming Languages, the last one taught by the main architect of the Lua language, Roberto Ierusalimschy. Since then, I have been very active in the Lua community and have kept informal ties with its research lab.
As soon as I saw friends talking about this summer camp, my eyes lit up with excitement. It was a great opportunity for me to learn more about research in programming languages and deepen my knowledge of the field. The camp had a focus on dynamic languages, which is great because that is exactly the family Lua belongs to, along with the languages I use most at work, such as JS and Ruby.
Knowing more about programming language internals has made me a better programmer in general. For example, it’s always good to know how garbage collection actually works. It’s also useful when picking tools, especially when weighing up the trade-offs among different interpreters. Values such as readability, predictability and conciseness also start to matter more to you and carry over to other areas of software development, such as tweaking UX or defining an API. There are many more advantages to understanding the internals of programming languages, which is unsurprising since they are one of the main tools of the trade. On the downside, it might have made me slightly smug, because once you start learning more about them, you suddenly realise that all programming languages suck. How awful programming languages can be was a recurring theme during the camp.
Using MicroVM to build better programming languages
Steve Blackburn, from the Australian National University, gave a lecture on MicroVM, a project aiming to replace or complement LLVM and serve as a solid, minimal foundation for building new programming languages. While pointing out that many of the design flaws found in programming languages are implementation-driven, he shared many gems about infamous behaviours. All the JS jokes were known to me, but it turns out you can still learn more bad things about PHP even after reading the “PHP: a fractal of bad design” post:
    $a = array(5);
    $c = $a;        // PHP always assigns arrays by copy
    echo $c[0];     // 5
    $a[0] = 1;      // so changing $a shouldn't change $c
    echo $c[0];     // 5, correct
Now let’s modify this code and create a $b referencing the first element of $a:
    $a = array(5);
    $b = &$a[0];    // reference to an element of $a
    $c = $a;        // copying
    echo $c[0];     // 5
    $a[0] = 1;
    echo $c[0];     // 1
In case you are wondering, this happens because, when you take a reference to a value inside $a, PHP moves that value somewhere else for performance reasons and makes both the array slot and $b point to it. So when you copy $a to $c, you copy that reference, not the value. This behaviour was originally reported as a bug back in 2002, but it was never fixed as it would make PHP too slow. They decided to just document it and it remains like that today.
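The mechanism is easier to see in a toy model. The sketch below is written in Python and is not how PHP is actually implemented; the names `Ref`, `PhpArray`, `make_ref`, `copy` and `run` are invented for illustration. It assumes an array slot holds either a plain value or a shared cell, and that copying an array copies the slots but not the cells they point to.

```python
class Ref:
    """A shared cell, standing in for a PHP reference (hypothetical model)."""
    def __init__(self, value):
        self.value = value


class PhpArray:
    """Toy array whose slots hold either plain values or shared Ref cells."""
    def __init__(self, values):
        self.slots = list(values)  # new list, but any Ref objects are shared

    def get(self, i):
        slot = self.slots[i]
        return slot.value if isinstance(slot, Ref) else slot

    def set(self, i, value):
        slot = self.slots[i]
        if isinstance(slot, Ref):
            slot.value = value     # write through the shared cell
        else:
            self.slots[i] = value  # plain slot: only this array changes

    def make_ref(self, i):
        """Model `$b = &$a[i]`: turn the slot into a shared cell."""
        if not isinstance(self.slots[i], Ref):
            self.slots[i] = Ref(self.slots[i])
        return self.slots[i]

    def copy(self):
        """Model `$c = $a`: slots are copied, Ref cells stay shared."""
        return PhpArray(self.slots)


def run(with_reference):
    a = PhpArray([5])
    if with_reference:
        b = a.make_ref(0)  # b is never used again, yet it changes the outcome
    c = a.copy()
    a.set(0, 1)
    return c.get(0)


print(run(with_reference=False))  # 5: the copy is independent
print(run(with_reference=True))   # 1: the copy shares the referenced cell
```

In this model the surprising part is that merely creating `b` changes what a later copy of `a` means, even though `b` is never used again.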
By building a new programming language on top of a tool like MicroVM, you are no longer responsible for things like Just-In-Time compilation or garbage collection. You maintain only the parts higher up in the abstraction, which reduces the chance of implementation details influencing the design of your language.
Another highlight of the summer camp for me was the lecture by Matt Might, from the University of Utah, advisor to the White House and also known for writing The Illustrated Guide to a Ph.D. He covered static analysis and abstract interpretation. What impressed me the most about his lecture is that I understood it: it was broken down really well. This is not exactly the same lecture, but since there are no videos from PLISS online yet, I recommend taking a look at this other presentation:
To sum up very roughly, these techniques are used to explore all possible execution paths of a program without actually executing it. This is very useful for engineering safer software, because it gives you a predictive model of the program’s behaviour and lets you catch errors before they happen. It can also be very useful for building optimisations. By doing this, we bring software engineering closer to other engineering fields, where behaviour is predicted before anything is built, as opposed to the run-it-and-see approach we normally take today.
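To make the idea concrete, here is a minimal sketch of one classic abstract interpretation, a sign analysis, written in Python. It is my own illustration rather than material from the lecture. The tuple-based expression format and the names `analyse`, `lift`, `SAFE` and `RISKY` are invented for this example, and division is treated as real (not integer) division. Instead of computing numbers, the analysis computes the set of signs each expression could have, and it follows both branches of a conditional because it cannot know which one will run.

```python
NEG, ZERO, POS = "-", "0", "+"
ANY = frozenset({NEG, ZERO, POS})  # "we know nothing about this value"


def sign_of(n):
    return frozenset({NEG if n < 0 else ZERO if n == 0 else POS})


def add_signs(x, y):
    if x == ZERO:
        return {y}
    if y == ZERO or x == y:
        return {x}
    return {NEG, ZERO, POS}  # positive + negative could be anything


def mul_signs(x, y):
    if ZERO in (x, y):
        return {ZERO}
    return {POS} if x == y else {NEG}


def lift(op, xs, ys):
    """Apply a sign rule to every combination of possible signs."""
    out = set()
    for x in xs:
        for y in ys:
            out |= op(x, y)
    return frozenset(out)


def analyse(expr, env, warnings):
    """Return the set of signs `expr` may have, without ever running it."""
    kind = expr[0]
    if kind == "num":
        return sign_of(expr[1])
    if kind == "var":
        return env.get(expr[1], ANY)
    if kind == "if":
        _, cond, then_e, else_e = expr
        analyse(cond, env, warnings)  # only to collect warnings inside it
        # The condition is unknown statically: explore both paths and join.
        return analyse(then_e, env, warnings) | analyse(else_e, env, warnings)
    if kind not in ("add", "mul", "div"):
        raise ValueError(f"unknown expression kind: {kind}")
    left = analyse(expr[1], env, warnings)
    right = analyse(expr[2], env, warnings)
    if kind == "add":
        return lift(add_signs, left, right)
    if kind == "mul":
        return lift(mul_signs, left, right)
    if ZERO in right:
        warnings.append("possible division by zero")
    # For non-zero divisors, the quotient's sign follows the product rule.
    return lift(mul_signs, left, right - {ZERO})


# 100 / (if flag then x else 1)
SAFE = ("div", ("num", 100), ("if", ("var", "flag"), ("var", "x"), ("num", 1)))
# 100 / (if flag then x else x + (-1))
RISKY = ("div", ("num", 100),
         ("if", ("var", "flag"), ("var", "x"),
          ("add", ("var", "x"), ("num", -1))))

env = {"x": frozenset({POS})}  # assumption: x is known to be positive
for program in (SAFE, RISKY):
    found = []
    signs = analyse(program, env, found)
    print(sorted(signs), found)
```

For `SAFE` the analysis proves the result is always positive and raises no warning. For `RISKY` it flags the division, because `x - 1` may be zero when `x` is 1, and it does so without picking a value for `x` or `flag`. The price is precision: a domain this coarse will sometimes warn about code that is actually fine.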
There were many other great lectures during PLISS. I also greatly enjoyed learning more about the Julia language with Jeff Bezanson, one of its co-creators. Julia is a great dynamic language for scientific computing, with speeds close to LuaJIT but a rather complex type system. I could definitely see it being used in fields such as data science and machine learning, if it isn’t already. It looks like a better alternative to Python or R, but I would have to look up more details about it to make a fair comparison with Lua.
The summer camp was not only hard work, however. After all, it was Italy, with a bunch of students and researchers, so there was a lot of chatting, wine, pizza and some hiking involved!
Overall it was a great experience! Special thanks go to the organisers of this amazing event, especially Jan Vitek, Laurence Tratt and Lucie Lerch, and to Red Badger for covering my registration through the company’s training budget. If you decide to join us, you will also get a training budget of your own and can attend exciting camps such as this one!