👨‍🔬Alpha #5: Compiling with Continuations

I spent most of the week re-studying the Compiling with Continuations book and planning/mapping continuations to Alpha. The main complications are multi-methods (that require elaborate closure representation) and the fact that Alpha is JIT-compiled (so compilation units are smaller and are added dynamically). Not much code has been written this week, and there are zero features added, so I’ll submit a book review and my musings about continuations as this week’s report 😁.

Continuation Passing Style (CPS)

Continuation Passing Style (CPS) is a programming style when functions do not return values. Instead, they take a continuation function as an additional argument and call it with the result when it’s ready. You can think of continuation passing style as “programming with callbacks.”

JavaScript developers, who are here long enough to remember the world before promises and async/await, surely know what this programming style leads to—callback hell. Back then, every time you called an asynchronous function in JavaScript, you had to pass a callback as an argument, which quickly leads to indentation going all the way to the right:

function getBalance(username, callback) {
  findUserByName(username, (err, userId) => {
    if (err) return callback(err);

    getUserProfile(userId, (err, profile) => {
      if (err) return callback(err);

      getAccount(profile.accountId, (err, account) => {
        if (err) return callback(err);

        callback(null, account.balance);
      })
    })
  })
}

In continuation passing style, every function work like this—even primitives.

# C = (F - 32) * (5/9)
fn fahrenheit_to_celsius(f, callback) =
  -(f, 32.0, (t1) =>
    /(5.0, 9.0, (t2) =>
*(t1, t2, callback)))

And that’s precisely what I’ll do with Alpha! The catch is to do this in the compiler, not the user code—the user doesn’t have to descend this hell, only I do 😅

Compiling with Continuations

The compiler can convert user code to CPS form and use it as an intermediate representation. The translation is pretty straightforward, and it turns out, this representation has many nice properties.

CPS form is friendly for inlining. This is important because LLVM JIT does not support cross-module optimizations/inlining, so I’ll have to do this in the frontend.

It’s easy to implement closure conversion in CPS form. In the getBalance example above, all callbacks have closures because they capture variables from the outer scope. In machine code, functions are just a sequence of machine instructions—they can’t have any data associated with them. Any captured variables must be passed as arguments.

So the above code compiles to something like this. This transformation is called “closure conversion.”

// Showing “this” as an explicit argument to emphasize it
// is an argument.
function getBalance(this, username, callback) {
  const closure = { callback };
  findUserByName(username, cb1.bind(closure));
  // You can think of .bind() as producing an object:
  // { this: closure, fn: cb1 }
  //
  // A call to a function can be imagined as:
  // const callback = { this: closure, fn: cb1 }
  // callback.fn(callback.this, ...other arguments)
}

// Note that all callbacks are top-level functions now that
// refer to their parameters and globals only.
//
// Their closure is passed as `this`.

function cb1(this, err, userId) {
  if (err) return this.callback(err);
  // Because all callbacks capture the same set of
  // variables, they share the same closure.
  getUserProfile(userId, cb2.bind(this))
}

function cb2(this, err, profile) {
  if (err) return this.callback(err);
  getAccount(profile.accountId, cb3.bind(this));
}

function cb3(this, err, account) {
  if (err) return this.callback(err);
  this.callback(null, account.balance);
}

CPS form is isomorphic to SSA form. SSA stands for “Static Single Assignment”—it is the form used by LLVM IR. When the compiler is done with Alpha-specific transformations, it can convert CPS to LLVM IR and let LLVM do generic optimizations and generate machine code.

Continuation is a reification of the program control state. In the context of programming, “reification” means representing something intangible as a first-class object. Continuation represents a point in program execution: it captures the position of the instruction pointer (function) and the state of execution (as variables in a closure).

This makes continuations a powerful mechanism to implement many control flow structures on top of: exceptions, loops, generators, coroutines, and async functions, to name a few. It’s most powerful when all program is represented with continuations.

One thing I want to implement is async functions.

We know that callback hell in JavaScript has been solved with promises, generators, and now async/await:

// getBalance with promises:
function getBalance(username) {
  return findUserByName(username)
    .then((userId) => getUserProfile(userId))
    .then((profile) => getAccount(profile.accountId))
    .then((account) => account.balance);
}

// ...and async/await:
async function getBalance(username) {
  const userId = await findUserByName(username);
  const profile = await getUserProfile(userId);
  const account = await getAccount(profile.accountId);
  return account.balance;
}

Async/await surely looks much better, but it’s still not ideal. Having promises and async/await introduces function color—the code is divided into two camps: synchronous and asynchronous. You can only call an async function from another async function, and you cannot “extract” a value from a Promise (as many newbies have tried).

So, what’s better than async/await? No async/await!

function getBalance(username) {
  const userId = findUserByName(username);
  const profile = getUserProfile(userId);
  const account = getAccount(profile.accountId);
  return account.balance;
}

Of course, this doesn’t work in JavaScript as the code becomes synchronous. But that will work in Alpha—every function in Alpha will be async, and every call is an await call, so there is no reason to write async/await everywhere.

fn get_balance(username) = {
  let user_id = find_user_by_name(username)
  let profile = get_user_profile(user_id)
  let account = get_account(profile.account_id)
  account.balance
}

Continuations also make it easy to implement green threads. Because continuations are reified program control state, writing a scheduler is just a matter of juggling continuations and calling them on multiple threads. So besides being async, Alpha code will be concurrent.

Compiling with Continuations, the book

Compiling with Continuations by Andrew W. Appel is one of the most insightful books on compilers I’ve ever read. It has become my desk book for this week as I studied it through and through.

You can take it as a compiler implementation guide (especially if you’re implementing a functional language), even though it skips the worn-out topics of lexing, parsing, and type checking.

It introduces a single abstraction—continuations—and takes it through the compilation process. The book shows how CPS form can be constructed, how to implement lambda lifting and closure conversion, how to do many optimizations, how to implement exceptions and garbage collection, how to do register allocation, and generate machine code.

It’s not an easy read, but it’s very rewarding. The book is packed with practicality as it follows full-featured real-world language and compiler (Standard ML of New Jersey)—no toy examples!

Current status and next steps

I’m in the middle of introducing CPS IR. It took me quite a while to adjust the book to Alpha (multi-methods and JIT). I have finished AST-to-CPS translation and implemented lambda lifting.

I skip all optimizations, for now, to get code to the working state sooner. But they are definitely on the table.

The next step is to implement closure conversion and then CPS-to-LLVM translation. I hope to finish with these two during the week.

Backlinks