Introducing Presidential Vimagi, Where Anyone Can Paint Political Cartoons for US Presidents
With elections around the corner and politics everywhere these days, I couldn’t resist throwing some Erlang into the mix. Don’t worry, I’m not about to start discussing my political opinions on this blog. You can find plenty of political commentary on other sites. Instead, I decided to create a sub-Vimagi for political cartoons: Presidential Vimagi.
On Presidential Vimagi, you don’t write or talk about your political views. You paint them. If a picture is worth a thousand words, a painting is worth, well, you do the math.
To give you inspiration, I assembled an all star cast of American presidents, vice presidents and presidential candidates: George Bush, Dick Cheney, Barak Obama, Hillary Clinton, Mike Huckabee, John McCain, Ron Paul, Bill Clinton and Al Gore. They each have a profile and you can paint on their profile’s v.boards. You can also become friends with them if you want to show your support.
Presidential Vimagi is still a bit rough around the edges, so please let me know if you find any bugs or issues. Feature suggestions are also welcome, of course.
Update: I decided to take the site down for now. It’s too prone to vandalism and I don’t have the time to constantly monitor it to remove the offensive stuff. It was a fun experiment while it lasted. Oh well :)
Lisp Flavored Erlang!
Erlang has a Lisp syntax! Robert Virding just released LFE, his Lisp syntax compiler for Erlang. Finally, Erlang hackers enjoy the full power Lisp-style macros. I suspect it won’t be long before you’ll be able to hack ErlyWeb apps in LFE :)
While you’re here, please answer this Wufoo poll:
Your Daily Dose of Erlang Evangelism
This is a good one I fished from the mailing list:
Using Erlang continually makes me both smile and cry at the same time. I smile because of the overall simplicity it brings to solving all those hard issues I mentioned above, but I also cry knowing how many hours, days, weeks, and months my former colleagues and I spent trying
to solve all those really hard issues.
http://www.nabble.com/Steve-Vinosky-interview-to15709698.html#a15720750
Seaside-Style Programming in ErlyWeb
The Arc Challenge started an interesting thread in the ErlyWeb mailing list about continuations-driven web frameworks. ErlyWeb doesn’t have built-in support for continuations, but Arc does and so does Seaside. I haven’t paid much attention to the use of continuations in web frameworks before the Arc challenge, but I became especially interested in experimenting with them after seeing Seaside solution.
In case you haven’t read it, this is the requirement of the Arc challenge:
Write a program that causes the url said (e.g. http://localhost:port/said) to produce a page with an input field and a submit button. When the submit button is pressed, that should produce a second page with a single link saying “click here.” When that is clicked it should lead to a third page that says “you said: …” where … is whatever the user typed in the original input field. The third page must only show what the user actually typed. I.e. the value entered in the input field must not be passed in the url, or it would be possible to change the behavior of the final page by editing the url.
This is the original Arc solution:
(defop said req
(aform [w/link (pr "you said: " (arg _ "foo"))
(pr "click here")]
(input "foo")
(submit)))
This is how you would write the same logic in Erlang, if it had an Arc-like web framework:
said(A) ->
form(
fun(A1) ->
link(fun(A2) -> ["you said ", get_var(A1, "foo")] end,
"click here")
end,
[input("foo"),
submit()]).
The Erlang code is a bit more verbose, mostly because Erlang macros don’t allow you to hide the “fun() -> … end” syntax the way Lisp macros let you hide the (lambda ..) keyword.
This is the Seaside solution:
| something |
something := self request: 'Say something'.
self inform: 'Click here'.
self inform: something
IMO, this solution cheats a bit by using high-level functions for generating the HTML tags whereas the Arc version generates them explicitly. However, putting minor complaints aside, I think the Seaside version is the winner in readability. As a reddit comment said, it reads like prose. It doesn’t even declare any closures explicitly — it says exactly what it does and nothing more.
Wouldn’t it be cool if we could use this style of programming in ErlyWeb applications?
Fortunately, we can! I hacked a continuations plugin for ErlyWeb that lets you write Seaside-style code so fans of this programming style would feel at home with ErlyWeb. (This is all done in 105 lines of code :) ) Before I explain how the plugin works, I’ll show you how to create an ErlyWeb controller that implements the the Arc challenge using this plugin:
-module(said_controller).
-compile(export_all).
-import(continuations, [ask/2, confirm/2, pr/1]).
index(A) ->
continuations:do(
A, fun(K) ->
Name = ask(K, "name"),
confirm(K, "click here"),
pr(["your name: ", Name])
end).
This may seem more verbose than the Seaside code because of the module declarations at the top, but the “meat” is about the same. (I could make this code even smaller by integrating continations.erl deep into ErlyWeb, which would remove the explicit call to continuations:do(). However, I didn’t want to go too far with this proof of concept.)
How does this work? Using concurrency, of course! For each “continuation”, the plugin spawns a process and registers it in a Mnesia table according to a randomly generated key. The key (K) is encoded in the URLs to which the <form> and <a> tags point. When a request arrives, continuations:do() looks up the process in Mnesia and sends it a message of the type {A, self()}. The process does some work and sends back in reply the HTML to be rendered, and waits for the next message. The web server process receives the rendered HTML and sends it to the client using the normal ErlyWeb mechanisms.
If a process doesn’t receive a message in 10 seconds, it dies and removes itself from the Mnesia table, which provides automatic garbage collection to stale sessions.
You can get the code for continuations.erl here. Just remember it’s a proof of concept and I don’t recommend using it in a production environment.
(Before you use it in your application, make sure you call continuations:start().)
Final word: IMO, although continuations help write more natural code in certain multi-page interactions, most of the logic in web applications involves rendering dynamic pages for RESTful URLs. So, if your web framework doesn’t support continuations, don’t worry about it too much. It’s likely the code for your application wouldn’t be dramatically smaller if you could write it with the use of continuations. (That said, take my advice with a grain of salt. I haven’t used a continuations based framework to build a real application, so I may be missing something.)
More Erlang Fun: Distributed, Fault Tolerant MapReduce
I the Erlang Challenge posting, I showed how to implement a simple distributed parallel map function. You can pass it a list of nodes, and it performs each application of F1 on a random node (”node” means a connected Erlang VM that may exist on a different machine) from the list. Finally, it collects the results in the order of the arguments.
What Erlang nailed beside concurrent and distributed programming is fault tolerance. When processes die or nodes go down, the VM notifies their supervisors so they can perform error recovery. My previous example didn’t demonstrate this capability, so in this article I’ll show how to implement a similar solution but also taking into account node failures. In addition, instead of doing just a simple parallel map, this example will also fold the results, making it a basic MapReduce implementation.
(The reason I’m adding the ‘reduce’ phase is because it allows me to not worry about the order at which the results are accumulated. This implies that the reduce operation must be commutative.)
The function mapreduce() takes a function F1 and applies to elements of list L. Each application is performed on a remote node from Nodes. As opposed to the pmap() example, this implementation chooses the next node from the list in a rotating fashion instead of randomly (i.e., if Nodes is [A, B, C], then F1 is applied on remote nodes with the following order: A, B, C, A, B, C, A …). The results are collected, and if any nodes went down, their computations are retried on the remaining nodes until all computations succeed or no nodes are left. If the computations all succeed, we call lists:foldl(F2, Acc, Vals), where Vals contains the results of the computations.
Implementation details:
- do_map() returns a list of {Pid, X} tuples. Pid is the process on the remote node in which F1 was applied to X. The reason we want to remember X is that if Pid’s node goes down, we can later retry to evaluate F1(X) on a different node.
- For each {Pid, X} tuple, collect() receives either an {ok, Pid, Val} message, indicating the computation was performed successfully, or {nodedown, Node}, indicating the node on which Pid lived went down. collect() folds the result into a tuple of type {Successes, Failures, RemainingNodes}. If Failures is not empty, collect() calls do_map() on the list of Failures and the remaining nodes, and then recursively calls itself on the result and appends the outcome to Successes.
The code is below. (Note that I didn’t test it, so it may have bugs :) )
-module(mapreduce).
-export([mapreduce/5]).
mapreduce(F1, F2, Acc, L, Nodes) ->
%% map F1 to the elements of L on nodes from Nodes
Results = do_map(F1, L, Nodes),
%% collect the results, retrying in the event of failures
Vals = collect(F1, Results, Nodes),
%% perform the reduce operation
lists:foldl(F2, Acc, Vals).
%% exit if we ran out of good nodes
do_map(_F1, _L, []) -> exit(no_nodes);
do_map(F1, L, Nodes) ->
Parent = self(),
%% apply F1 to values of L on remote nodes in a rotating fashion
{Results, _Nodes1} =
lists:foldl(
fun(X, {Acc, [Node | Rest]}) ->
erlang:monitor_node(Node),
Fun = fun() -> Parent ! {ok, self(), (catch F1(X))} end,
Pid = spawn(Node, Fun),
{[{Pid, X} | Acc], Rest ++ [Node]}
end, {[], Nodes}, L),
Results.
collect(F1, Results, Nodes) ->
collect(F1, Results, Nodes, []).
collect(F1, Results, Nodes, Acc) ->
{Successes, Failures, RemainingNodes}=
lists:foldl(
fun({Pid, X}, {Successes1, Failures1, Nodes1}) ->
Node = node(Pid),
receive
{ok, Pid, Val} ->
{[Val | Successes1], Failures1, Nodes1};
%% we may receive this message because of call to
%% monitor_node()
{nodedown, Node} ->
{Successes1, [X | Failures1], Nodes1 -- Node}
end
end, {[], [], Nodes}, Results),
if Failures =/= [] ->
%% retry the failed computations on the remaining nodes
%% and add the results to the current list of successes
Results2 = do_map(F1, Failures, RemainingNodes),
collect(F1, Results2, RemainingNodes, Successes ++ Acc);
true ->
Successes ++ Acc
end.
This is more complex than the pmap() example, but I think it packs a lot of functionality into a relatively small amount of code
Erlang Challenge Followup
After I published the Erlang Challenge, I saw a few comments on Hacker News and Reddit repudiating it with the following arguments, to which I’d like to respond:
- It was a snide response to the Arc challenge.
Maybe I should have read the article a couple of times over before I published it late last night, because if it came off as snide, I didn’t communicate it as I had intended. I’ve found the Arc challenge fun and interesting (I even submitted my own Erlang solution), and I wouldn’t intentionally respond to it in a snide way. The reason I said the Erlang challenge is in the spirit of the Arc challenge is that the Arc challenge highlights Arc’s strengths, whereas the Erlang challenge highlights Erlang’s strengths.
- It was rigged in favor of Erlang.
That was basically the point. I wanted the Erlang challenge to demonstrate the kinds of problems that Erlang excels at solving. Maybe I should have made this clearer when I wrote the article — it was more of an illustration of Erlang’s strengths than a “real” challenge. I wasn’t really expecting other languages to compete with Erlang on its home turf, the same way I wouldn’t expect Erlang to compete with, say, C or C++ at low-level systems programming or with ActionScript at Flash programming.
- It showed that Erlang is good at solving a narrow scope of problems, but I didn’t show that Erlang is a good general purpose tool.
I don’t think it’s possible to show in 10 lines of code how good (or bad) a language is at solving anything but problems that are very similar to those in the example. The only true way to judge whether a language is good at building certain types of systems is to build them using the language and evaluate the results.
I know that people have used Erlang successfully to build phone switches, a MMORPG, a massively scalable DBMS, web applications, a collaboration application, high performance messaging servers, ad servers, web servers, a scalable poker server, and a 3D modeling tool. When I say that Erlang is a good language for building certain applications, that’s what I base it on.
Edit: I forgot to list CouchDB, an open source document storage system, and Triggit, a widget server unlike any other.
The Erlang Challenge
In the spirit of Paul Graham’s Arc challenge, and in spite of the “controversy” around it, I devised a programming challenge of my own. It’s called the Erlang Challenge.
The goal is to implement an advanced “parallel map”: a function (F1) that applies another function (F2) to every element (E) of a list (L1) concurrently and then gathers the results in a new list (L2). The order of results in L2 corresponds to the order of elements in L1, and the elements of L1 are guaranteed to remain unchanged no matter what function F2 does. To make things more interesting, the F1 should also accept a list of cluster nodes (NL) onto which the computations can be distributed. Each application of F2 to an element of L1 should be done on a random node from NL, and the result should be send back to the node on which F1 was called (N). If NL is empty, all computations should occur locally on N.
This challenge isn’t just about succinctness, but, like most real-world systems, also about scalability and performance. Therefore, it has the following requirements:
- The implementation must be able to spawn at least 1 million concurrent processes on a modern server machine.
- On multi-core machines, the application’s performance must increase (almost) linearly with the number of available CPUs.
- The application as a whole must exhibit soft real-time performance. Specifically, this means that processes won’t freeze for a long time during global garbage collection sweeps, and that no process will be able to block the entire VM to do IO or call native code, regardless of what F2 does. In other words, it is impossible to create a function F2 that could block the scheduler while F2 executes.
Here’s how I would implement it in Erlang (this is loosely based on Joe Armstrong’s pmap implementation):
pmap(Fun, List, Nodes) ->
SpawnFun =
case length(Nodes) of
0 -> fun spawn/1;
Length ->
NextNode = fun() -> nth(random:uniform(Length), Nodes) end,
fun(F) -> spawn(NextNode(), F) end
end,
Parent = self(),
Pids = [SpawnFun(fun() -> Parent ! {self(), (catch Fun(Elem))} end)
|| Elem <- List],
[receive {Pid, Val} -> Val end || Pid <- Pids].
How would you implement it in your favorite language?
Update: please read the followup.
Debugging with Distel
Until recently, I’ve done my Erlang debugging either using the built-in debugger or not using a debugger at all and instead relying on the good old io:format() function. The reason I haven’t been keen on using the Erlang debugger is that it’s not integrated into my editor (I use Emacs). To debug a file, I’d have to open it separately in the debugger UI, where I could set breakpoints and step through the code. For simple bugs, this is more effort than adding a few io:format() statements, reloading the page, and figuring out the bug from the output.
I’m not an IDE fanatic (well, duh, I use Emacs) — I can do fine without most of the features in advanced IDEs — but I do expect any decent editor to have at least 3 features: syntax highlighting, auto-indentation, and integrated debugging.
(I’m ambivalent about IDEs with code completion. It adds convenience, but it also it lets you get by without really remembering your APIs. Without code completion, I remember most of the my application’s function names and their parameters simply because not remembering them slows me down too much. This is akin to the cellphone effect on remembering numbers: before I had a cellphone, I remembered all my close friends and family members’ numbers. Looking them up in an address book was too much work. Now, I remember just a fraction of them. My cellphone made me lazy.)
The Emacs mode for Erlang provides syntax highlighting and auto-indentation, but no integrated debugging. AFAIK, neither Erlide nor ErlyBird have integrated debuggers. I knew Distel was supposed to add debugging support, and I even tried setting it up a while ago but gave up after running into problems I couldn’t figure out. Last weekend, though, I finally got it working after following Bill Clemenson’s Distel tutorial. I must say that having a debugger integrated into Emacs is a huge improvement. I recommend it to every Erlang hacker.
My only peeve with this debugger is that you must mark each file you want to debug as interpreted (the Emacs shortcut is “C-c C-d i”) before you can set breakpoints in it and step into its functions. This seems unnecessary — Erlang .beam files usually contain metadata including the location of their source files. I wish the debugger used this information to auto-interpret all the files you’re trying to debug.
This is a relatively small peeve, though. The Distel debugger is definitely a step up from io:format().
I’m still hoping someone will create a more user-friendly Erlang IDE with a slick UI and an integrated debugger. I like the TextMate interface, but TextMate doesn’t do auto-indentation for Erlang or debugging. I’ll stick to Emacs+Distel for now.
Triggit: Why Didn’t I think of That?
Triggit received some great publicity last week. Triggit, a small startup in San Francisco, recently released a product makes it easy to add widgets and ads to any site by dragging and dropping — no HTML editing required. I think it can really catch on among bloggers who are disinclined to muck around in mounds of PHP code in order to add some widgets to their blogs. I can also see this kind of technology fitting very well in sites like MySpace, where non tech-savvy users could use Triggit to effortlessly soup up their profiles.
Ryan Tecco, Triggit’s CTO, told me Triggit uses Erlang for all the heavy lifting in the backend. You probably won’t find this shocking, but I think using Erlang is a smart choice. Low latency, scalability and high availability are all essential for this kind of product: it’s one thing to have performance/availability issues on *your* site, but it’s much worse to let those issues affect other people’s sites as well.
Finally, Triggit is an interesting data point on how very small teams can successfully use Erlang to achieve impressive results in a short time frame.
Update: I added this video with Triggit :)
How to Use Concurrency to Improve Response Time in ErlyWeb Facebook Apps
When I was building the Vimagi Facebook app, I came across a common scenario where using concurrency can make your application more responsive for its users.
The typical flow of responding to requests coming from Facebook looks like this:
1) request arrives
2) do some stuff (mostly DB CRUD operations)
3) call Facebook API to send notifications / update newsfeeds and profile FBML
4) send response
When you’re building a facebook app with ErlyWeb, you can instead do the following:
1) request arrives
2) do some stuff (mostly DB CRUD operations)
spawn(fun() ->
3) call Facebook API to send notifications / update newsfeeds and profile FBML
end)
4) send response
The Facebook API calls in step 3 are much more expensive than the typical ErlyWeb controller operations because these calls involve synchronous round trips to the Facebook servers plus XML processing for the responses. By performing the Facebook API calls in a new process we can return the rendered response to the browser immediately and let the Erlang VM schedule the Facebook API calls to happen leisurely in the background.
The only gotcha is that if an error occurs in the spawned process we can’t notify the user right away — but this isn’t really problem because we can log the errors and retry later, which is arguably a better approach anyway from a usability standpoint.
It’s really that easy! A simple call to spawn() makes our app *feel* much faster. This puts the debate around language performance comparisons in a new light: how do you take into account the observation that some languages make “cheating” so much easier? :)
