Erlang Hot Code Swapping -> Hacking Nirvana
When I first heard of Erlang hot code swapping, I thought, “What a fantastic (no, wait, essential) feature for systems that have five nines availability requirements. No wonder Erlang probably powers my phone company’s 911 switch. Too bad I won’t get to enjoy this powerful feature in my after-work Erlang hacking.”
I’m happy to say I was wrong.
In my free time, I’ve been hacking a haXe remoting adapter into Yaws, a very powerful and scalable Erlang web server. I picked this project because I think haXe is a great web client language and Erlang is unbeatable on the server side for certain purposes. I mentioned some of the reasons in previous posts and will probably discuss this more in the future (haXe is also a very good server language, by the way, and is arguably better than Erlang for many applications). What could be better than integrating the two so I can use them both in future projects?
I’m still fairly new to Erlang, and since I only work on this project in my free time, it’s not going as fast as I would have liked. Oh well.
I got the Yaws source code, and frankly I was a little lost at first. Where do I start? I decided that my first victim would be the Yaws JSON serializer/deserializer because it’s an independent module. I copied json.erl to haxe.erl, opened Emacs (which I haven’t used for programming since college) and a separate Erlang shell window, and modified the module’s functions while testing them in the shell. That was relatively straightforward. The most challenging parts were wrapping my head around Continuation Passing Style, which the JSON parser uses, figuring out the haXe binary format, which isn’t well documented, and mapping haXe types to Erlang types. haXe has class objects and Enums, which Erlang doesn’t have. At first, I tried to simulate classes and Enums in Erlang, but later I realized that using such structures in Erlang code would be too laborious. I decided to remove support for such types, also because I believe that arrays and anonymous objects should suffice for most RPC needs.
Now that my serializer/deserializer was finished, I got back to hacking Yaws’s internals, and to my original state of confusion. I didn’t know exactly how all the Yaws modules interact with each other. All I knew was that yaws_jsonrpc.erl contained the JSON RPC handling logic into which I wanted to hook. I wasn’t sure how I would isolate this module from the rest of the system in order to test my implementation, which, at least initially, depended on a haXe client sending requests to the server.
My first approach was to stop Yaws, hack yaws_jsonrpc.erl (generally by adding logging statements in a few places to figure out the code flow), then run the Yaws build and install script, and restart Yaws. Needless to say, this was a very slow development effort, reminiscent of Java servlet hacking in the pre-Eclipse server integration days (a torture so horrible I wouldn’t wish it upon even the new landlord who won’t renew my lease :) ).
Then I had one of those earth-shattering, life-changing realizations that shook my foundations and elevated me to a higher plane of existence: This isn’t Java — it’s Erlang. I can hack the code while Yaws is running and hot-deploy my changes!
Yes, it works, and it’s wonderful. I run Yaws in interactive mode, where Yaws exposes an Erlang shell. Every time I make a change to a file, I simply recompile it by calling “c(FileName).” and the changes are deployed into Yaws while Yaws is running. This brings about such a speed-up in prototyping and development that any nostalgia I had left for some IDE-supported keyboard shortcut for a maddeningly slow server restart has gone in a puff of smoke.
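To make the mechanics concrete, here is a minimal sketch of a long-running process that picks up recompiled code. The module name hot_demo and its functions are hypothetical and have nothing to do with the Yaws source. The detail that matters is that the loop re-enters itself through a fully qualified call (?MODULE:loop()), which is how a running process moves to the newest loaded version of its module.

```erlang
-module(hot_demo).
-export([start/0, loop/0, version/0]).

%% Hypothetical example module, not part of Yaws.
%% Edit this return value and recompile with c(hot_demo). in the shell.
version() -> 1.

%% Start the long-running process.
start() ->
    spawn(?MODULE, loop, []).

loop() ->
    receive
        {version, From} ->
            %% Fully qualified calls always run the newest loaded code.
            From ! {version, ?MODULE:version()},
            ?MODULE:loop();
        stop ->
            ok
    end.
```

After starting the process with hot_demo:start(), changing version/0 and calling c(hot_demo). in the same shell should make the next {version, self()} request report the new value, with no restart. One caveat: the VM keeps only two versions of a module, so a process that is still sitting in the old code when the module is reloaded a second time gets killed.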
This hot code-deploy trick is probably old news to experienced Erlang hackers, but for me it was exciting. Now that I’m armed with new knowledge, my challenge is to stop blogging about coding and to actually write code so I can get this haXe remoting adapter finished.
Why I Moved from Blogger to Wordpress
I used to use Blogger, but I recently decided to move my blog to Wordpress. The primary reason I decided to leave Blogger is Blogger’s pathetic security, mostly due to the lack of SSL access. I picked Wordpress for my blog’s new home because Wordpress has some of the best features and the most positive overall experience of all the blogging services I know. In fact, Wordpress’s only minor drawback in my mind is the lack of manual control over the templates, but I’m not a customization freak, so this isn’t a big concern for me.
Blogger doesn’t even let you log in over SSL, not to mention keeping your session over SSL while you’re editing your blog. When you change your password, Blogger doesn’t even send you a validation email. What does that mean? Every 12-year-old hacker armed with Ethereal or tcpdump can steal your password by eavesdropping on your connection, and can then go ahead and change your password and thereby hijack your blog.
Your blog is a large part of your online identity. It’s often the first thing that shows up on search engines when people search for your name. It’s valuable. I’m not comfortable with the thought that my blog could be hijacked so easily and there’s nothing I can do to prevent it. (I did read that certain blogging applications let you use Blogger over SSL, but that’s one more hoop than I’m willing to jump through.)
I dread the day when somebody stages a large scale attack on Blogger and hijacks thousands if not millions of blogs. Maybe such an event would kick Google’s butt into action, getting it to turn on the SSL switch on the Blogger servers. I suppose that if this happened, Blogger could mitigate the disaster by rolling back all changes that happened during the attack, and then resetting all passwords. The damage would be significant, but not irreversible. I’m actually more concerned about individual blogs getting hijacked without Blogger’s knowing or caring.
Wordpress has SSL access, so this problem largely doesn’t affect Wordpress users (I say “largely” because the Wordpress servers could always be cracked and the user data could be stolen, but the risk is very small). That’s a huge advantage for Wordpress, and is the primary reason I moved here. I must say I’m happy here so far. I may decide to host my blog on my own server eventually, which would have its downsides, but it’s likely that Wordpress will remain my blog’s permanent home.
More Erlang
Strange trends are taking place in the web programming world. As new languages come and go, developers are overlooking a mighty beast whose unparalleled power is $0 plus a mental barrier away: Erlang.
I mentioned Erlang in previous posts. Here’s a quick recap of Erlang’s history: in the early 1980s, Ericsson assembled a team of computer scientists who were to devise the best methods for developing scalable, fault-tolerant systems with soft real-time performance requirements. After much experimentation and development, Erlang, a functional language with built-in notions of concurrency, was born. The need for a new language was real: no existing language was suitable for solving Ericsson’s problems, and when you’re in the business of selling telephone switches to the world’s largest telcos, you can’t let a language with inadequate notions of concurrency and fault tolerance get in your way. The design decisions behind Erlang turned out to be very powerful, and this eventually gave Ericsson a solid market lead over the competition and positioned Ericsson as a dominant force in the telecom switch market.
Fortunately, the power of Erlang isn’t stashed away in some grey corporate computer lab. In 1998, Ericsson released Erlang to the open source community, thereby giving every developer the power to build scalable distributed backends with (relative) ease.
Since its release, Erlang has been making headway in the open source world. An example of a recent convert is jabber.org, home of the Jabber Software Foundation (Jabber is the leading open IM standard, used by numerous organizations and IM providers, including Google Talk and Gizmo Project). jabber.org has recently switched its Jabber server from jabberd, which is written in C, to ejabberd, written in Erlang. This press release discusses jabber.org’s move. jabber.org operates an instant messaging service with very high requirements for reliability and for handling large numbers of simultaneous connections (just like a telephone exchange), so it’s no surprise that a server written in Erlang was jabber.org’s server of choice.
I think that Erlang’s strengths in the areas of concurrency, scalability and fault tolerance make it a good contender for being a more widely used web development language. The main reasons web developers haven’t adopted Erlang in large numbers yet are, in my opinion, 1) Erlang has different semantics, which will always discourage some developers 2) Erlang needs better PR and 3) Erlang doesn’t have an integrated web development framework like Ruby on Rails (I’m a huge Ruby on Rails fan, by the way). Efforts to build such a framework are apparently under way. Once they are mature, web developers will be able to tap into Erlang’s strengths more easily, and Erlang will in turn enjoy the best kind of marketing: word-of-mouth.
How does Erlang achieve much greater scalability with large numbers of concurrent processes than other programming languages? Erlang processes are very lightweight, much more so than OS processes and threads, and the Erlang VM, BEAM, does the scheduling. BEAM is mostly event driven, and no lightweight process blocks the whole VM for very long. On multi-processor machines, BEAM launches (by default) one scheduler per processor. Erlang applications are normally designed from the ground up with concurrency in mind, so it’s easy for Erlang code to take advantage of most, if not all, available processors. In a recent posting on the Erlang mailing list, Joe Armstrong described an experiment he conducted on a Sun Niagara box with 32 CPUs, in which changing a single function call from map() to pmap() made his application’s performance scale with up to 16 CPUs. With upcoming BEAM improvements, additional scalability is expected. Joe gives background to the experiment here. Quote:
Erlang also maps nicely onto multi-core CPUs – why is this? – precisely because we use a non-shared lots of parallel processes model of computation. No shared memory, no threads, no locks = ease of running on a parallel CPU.
Believe me, making your favourite C++ application run really fast on a multi-core CPU is no easy job. By the time the Java/C++ gang have figured out how to throw away threads and use processes and how to structure their application into small lightweight processes they will be where we were 20 years ago.
Does this work? – yes – we are experimenting with Erlang programs on the Sun Niagara – the results are disappointing: our message passing benchmark only goes 18 times faster on 32 CPU’s – but 18 is not too bad – if any C++ fans want to try the Niagara all they have to do is make sure they have a multi-threaded version of their application, debug it -’cos it probably won’t work and they can compare their results with us (and I’m not holding my breath).
Turning a sequential program in a parallel program for the Niagara is really easy. Just change map/2 to pmap/2 in a few well chosen places in your program and sit back and enjoy.
Efficiency comes from a correct underlying architecture, in this case being able to actually use all the CPUs on a multi-core CPU. The ability to scale an application, to make it very efficient, to distribute it depends upon how well we can split the application up into chunks that can be evaluated in parallel. Erlang programmers have a head start here.
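The map-to-pmap swap Joe describes can be sketched in a few lines. Note that pmap/2 is not part of the standard lists module, and the quote doesn’t show Joe’s own implementation, so the module below (pmap_demo, a hypothetical name) is only one plausible way to write it: spawn one process per list element, then collect the replies in the original order.

```erlang
-module(pmap_demo).
-export([pmap/2]).

%% Hypothetical parallel map: applies F to every element of List,
%% each in its own process, and returns the results in list order.
%% A sketch only: if F crashes in a worker, the caller waits forever.
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                spawn(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Ref is already bound here, so each receive picks out
    %% exactly the reply tagged with that reference.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

A call such as pmap_demo:pmap(fun(X) -> X * 2 end, [1, 2, 3]) has the same shape as lists:map/2, which is why the change can be a one-word edit at the call site. How much it helps depends on how expensive F is and on how many schedulers the VM is running, which erlang:system_info(schedulers) reports.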
The following graph shows the result of an experiment Joe and colleagues conducted to compare the performance of Yaws, an Erlang web server, and Apache under very high load (in effect, a simulated DDoS attack):
Here’s Joe’s explanation:
Apache (blue and green) dies when subject to a load of c. 4000 parallel sessions. Yaws (red) works well even when subject to high load.
The red curve is yaws (running on an NFS file system). The blue curve is apache (running on an NFS file system). The green curve is apache (running on a local file system).
…
Our figure shows the performance of a server when subject to parallel load. This kind of load is often generated in a so-called “Distributed denial of service attack”.
Apache dies at about 4,000 parallel sessions. Yaws is still functioning at over 80,000 parallel connections.
You can read the full description of the experiment on Joe’s website.
Erlang is powerful, and once it has a good web development framework, I think it will become many more developers’ web language of choice. Interesting times are ahead for Erlang.
HelenOS
I just saw on Digg a link to an interesting new open source operating system, HelenOS. Among other things, HelenOS has support for SMP, kernel threads, userspace threads, userspace pseudo-threads (“Userspace pseudo threads are very lightweight threads running in the context of one userspace thread”) and IPC (“the ability of userspace threads to communicate with other threads (possibly from different tasks) via sending and receiving, synchronously or asynchronously, short messages”). The full list is here.
Some of these features strike me as very similar to those provided by Erlang and its virtual machine. I wonder if the HelenOS developers took some cues from Erlang’s success at scaling to large numbers of concurrent processes by keeping them very lightweight. This raises the interesting question of whether such features, when provided by the OS, make it possible to write C/C++ programs with the same scalability characteristics as Erlang programs at large numbers of concurrent processes. I should stress that the operative word here is “possible”, not “easy”!
Let’s sit back and wait for the benchmarks.
Erlang
A few months ago, I discovered Erlang and quickly became fascinated by it. Erlang is a dynamically typed functional programming language that runs on a special virtual machine (called BEAM) and comes with a set of libraries developed by Ericsson for building large-scale, distributed, fault-tolerant applications with soft real-time performance requirements.
Erlang used to be a proprietary technology developed and owned by Ericsson for building large telephone switches. In 1998, Ericsson released Erlang to the community under an open source license. I first heard about Erlang in the context of ejabberd, when I was looking at different Jabber servers for possible deployment at my company. I initially rejected the idea of deploying a server written in a weird, obscure language called Erlang, but as I dug deeper I discovered Erlang’s beauty and the power it gives developers for building scalable, robust distributed applications. (Although we didn’t end up deploying ejabberd, a good testament to its quality is that Jabber.org has recently made the switch from jabberd, written in C/C++, to ejabberd.)
Erlang’s support for distributed programming is unmatched by any other language I know. A core capability of Erlang is spawning lightweight processes that can send messages to and receive messages from each other. A message can be any Erlang term (e.g. {foo, bar, 34, [4,5,6]}), and Erlang’s message sending and pattern matching syntax makes message processing a breeze. I know you may think you can imitate the same concurrency facilities in [your favorite language here] using its threading API, but you’re probably mistaken. Erlang processes are much more lightweight than OS threads, and hence Erlang scales much better with large numbers of concurrent processes. In addition, Erlang has capabilities such as hot code swapping and remote code deployment, which, like lightweight processes, most languages are probably many years away from having.
Consider the following example. It shows how to spawn a process, send it a message to print to the console, and then send it a message to terminate.
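    -module(example).
    -export([start/0, listen/0]).

    %% Wait for messages: print each {msg, Text}, exit on stop.
    listen() ->
        receive
            {msg, Text} ->
                io:format("got message: ~s~n", [Text]),
                listen();
            stop ->
                io:format("goodbye~n", [])
        end.

    %% Spawn the listener, send it one message to print, then stop it.
    start() ->
        Pid = spawn(example, listen, []),
        Pid ! {msg, "hello world"},
        Pid ! stop.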
I hope this gives you a sense of how easy Erlang makes concurrent programming. Of course, this only scratches the surface. There’s much more, including a super high performance web server called Yaws and a distributed transactional database called Mnesia, both written in Erlang.
I’ll write more about Erlang in the future. For now, I hope I’ve been able to pique your curiosity. In case you’re not fully convinced that Erlang is very powerful, consider the fact that Erlang powers the telephone system in the UK with 31ms downtime per year, which works out to 99.9999999% availability. That’s very impressive.
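For anyone who wants to check the arithmetic behind that figure, here is a small sketch that converts an availability fraction into expected downtime per year. The module and function names are made up for illustration, and the calculation assumes an average year of 365.25 days.

```erlang
-module(availability).
-export([downtime_ms_per_year/1]).

%% Milliseconds in an average (365.25-day) year; an assumption
%% made for this sketch.
-define(MS_PER_YEAR, 365.25 * 24 * 60 * 60 * 1000).

%% Expected downtime per year, in milliseconds, for an availability
%% given as a fraction (e.g. 0.99999 for "five nines").
downtime_ms_per_year(Availability) ->
    (1 - Availability) * ?MS_PER_YEAR.
```

Calling availability:downtime_ms_per_year(0.999999999) gives a bit under 32 ms, in line with the 31ms quoted above, while five nines (0.99999) allows a little over five minutes of downtime per year.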
MacBook: 1, Dual G5: 0
I knew the Intel Macs were fast, but I didn’t expect my (low-end) MacBook to put my Dual G5 PowerMac to shame in a task that’s both CPU and I/O intensive.
I timed the compilation of a mid-size C/C++ project using xcodebuild, and here are my results:
So, the MacBook compiles about 40% faster than the Dual G5.
That’s pretty awesome.
Secure Portable Storage with OS X
If you’re an OS X user and you store sensitive files on your iPod or flash drive, you’re probably looking for ways to secure your data in case your portable storage device falls into the wrong hands. Some flash drives have proprietary data protection mechanisms, but they often don’t work with OS X. More importantly, the iPod doesn’t have such a capability built in. The best mechanism I found was to create an encrypted disk image and use it as a virtual drive for your sensitive files. This disk image is safe to carry around because it protects your data with 128-bit AES encryption, which is uncrackable for all practical purposes.
Here’s how you do it:
Open the terminal and type
    cd /Volumes/[name of portable storage device]
    hdiutil create -fs HFS+ -encryption -type SPARSE -volname "My Drive" securedrive
This creates a new disk image on your portable storage device called securedrive.sparseimage. You can mount the disk image by executing “hdiutil mount securedrive.sparseimage” or by double clicking on the disk image in Finder. This will show the virtual drive in Finder as volume “My Drive” as well as in the /Volumes directory.
You can copy or drag and drop your files into the newly mounted virtual drive and your data will be safe. Just don’t forget to cleanly eject (unmount) the virtual drive (using the Finder eject button or by executing ‘hdiutil unmount "/Volumes/My Drive"’), as well as your portable storage device, before you physically disconnect the portable storage device from your computer.
Keep in mind that when you delete files from the virtual drive, the disk image doesn’t shrink automatically and the physical space taken by the files remains unavailable. To reclaim this space, unmount the virtual drive and type
    cd /Volumes/[name of portable storage device]
    hdiutil compact securedrive.sparseimage
That’ll give you those precious bytes back.

