Alpha #7: HIR, High-level Intermediate Representation
Well, I have decided to go without continuations. And after three weeks of struggling, I'm finally done with the transition!
Why? A slide from the "Compiling with Continuations or without? Whatever." presentation clicked for me.
It says that a direct-style IR is better suited for the early stages of compilation and CPS for the later ones. I then remembered that even in the "Compiling with Continuations" book, the CPS is constructed from an already-simplified version of ML, with types resolved and operations lowered.
And that's precisely why I started adding an IR in the first place: I needed a place to expand syntax sugar and to lower types and methods.
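To make the direct-style vs. CPS distinction concrete, here is a minimal sketch in plain Rust (not Alpha's IR). The function names are made up for illustration. The same computation is written twice: once returning values to the caller, and once passing every result to an explicit continuation `k`.

```rust
// Direct style: each call returns its result to the caller,
// and evaluation order is implied by the nesting of expressions.
fn square(x: i64) -> i64 {
    x * x
}

fn sum_of_squares(a: i64, b: i64) -> i64 {
    square(a) + square(b)
}

// CPS: instead of returning, every function hands its result to an
// explicit continuation `k`. Every intermediate value gets a name and
// the evaluation order is spelled out.
// (Rust still needs a return type, so `R` just threads the final
// answer back out; in a real CPS IR every call would be a tail call.)
fn square_cps<R>(x: i64, k: impl FnOnce(i64) -> R) -> R {
    k(x * x)
}

fn add_cps<R>(a: i64, b: i64, k: impl FnOnce(i64) -> R) -> R {
    k(a + b)
}

fn sum_of_squares_cps<R>(a: i64, b: i64, k: impl FnOnce(i64) -> R) -> R {
    square_cps(a, |a2| square_cps(b, |b2| add_cps(a2, b2, k)))
}
```

Calling `sum_of_squares_cps(3, 4, |r| r)` yields the same value as `sum_of_squares(3, 4)`. The CPS version is noisier to write by hand, which hints at why it is more comfortable as a later-stage form generated from an already-simplified program.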
Swift Intermediate Language (SIL)
I also watched the Swift Intermediate Language (SIL) presentation, and I liked it. SIL is similar to LLVM IR but simpler, and it preserves Swift-specific information, which allows Swift-specific analyses to run on SIL itself. (Clang, by contrast, builds a separate CFG that mirrors LLVM IR in many ways and implements its own analyses on it. As you might imagine, that's a lot of duplication.)
I took a couple of lessons from SIL:
- Rich type system. SIL preserves most of Swift's type system, which enables more optimizations, resolving some dynamic method calls at compile time, etc. I'll need this to implement method specialization and static method dispatch.
- Requiring every expression to be assigned to a variable. This provides a place to attach information such as source locations and types. SIL goes even further than LLVM IR here, requiring constants to be assigned to variables as well. (This removes the Value/Constant divide found in LLVM IR.)
- It's more heterogeneous than paper IRs. IRs in papers are only concerned with expressions: the whole compilation unit is an expression. SIL (like LLVM IR) has a notion of a module that has a name, type declarations, global variables, functions, interfaces, and whatnot. I previously tried to cram Alpha into the neat, uniform world of expressions. It's possible, but having a Module type and top-level declarations is easier.
- Basic blocks with parameters. SIL also has cool basic blocks with parameters. Instead of using phi nodes, a basic block can accept parameters, and a branch instruction becomes a "jump with arguments." This is easier to understand and reason about. (And basic blocks with parameters are continuations!) SIL's basic blocks are a feature I won't use in Alpha, though, because I don't use a CFG (control flow graph) representation.
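The "jump with arguments" idea can be sketched with a toy interpreter. This is an illustrative Rust sketch with made-up types; it is neither SIL nor Cranelift IR. Each block declares parameters, and every jump supplies one argument per parameter, which does the job a phi node would otherwise do.

```rust
use std::collections::HashMap;

// All names here are hypothetical; this only illustrates the idea.
#[derive(Clone, Copy)]
enum Operand {
    Var(&'static str),
    Const(i64),
}

enum Inst {
    // dst = lhs + rhs
    Add(&'static str, Operand, Operand),
}

// A jump target: block index plus one argument per block parameter.
type Target = (usize, Vec<Operand>);

enum Terminator {
    Jump(Target),
    // Go to the first target if the operand is non-zero, else to the second.
    BranchIf(Operand, Target, Target),
    Return(Operand),
}

struct Block {
    params: Vec<&'static str>,
    insts: Vec<Inst>,
    term: Terminator,
}

fn eval(op: Operand, env: &HashMap<&'static str, i64>) -> i64 {
    match op {
        Operand::Var(name) => env[name],
        Operand::Const(c) => c,
    }
}

// Runs the program starting at block 0 with the given arguments.
fn run(blocks: &[Block], entry_args: &[i64]) -> i64 {
    let mut env: HashMap<&'static str, i64> = HashMap::new();
    let mut current = 0;
    let mut args: Vec<i64> = entry_args.to_vec();
    loop {
        let block = &blocks[current];
        // Binding parameters to the jump's arguments replaces phi nodes.
        for (param, value) in block.params.iter().zip(&args) {
            env.insert(*param, *value);
        }
        for inst in &block.insts {
            match inst {
                Inst::Add(dst, lhs, rhs) => {
                    let v = eval(*lhs, &env) + eval(*rhs, &env);
                    env.insert(*dst, v);
                }
            }
        }
        let (target, target_args) = match &block.term {
            Terminator::Return(op) => return eval(*op, &env),
            Terminator::Jump(target) => target,
            Terminator::BranchIf(cond, then_t, else_t) => {
                if eval(*cond, &env) != 0 { then_t } else { else_t }
            }
        };
        args = target_args.iter().map(|op| eval(*op, &env)).collect();
        current = *target;
    }
}

// Sums n + (n - 1) + ... + 1 for a non-negative n.
fn sum_to_n() -> Vec<Block> {
    use Operand::{Const, Var};
    vec![
        // block0(n): jump block1(n, 0)
        Block { params: vec!["n"], insts: vec![],
                term: Terminator::Jump((1, vec![Var("n"), Const(0)])) },
        // block1(i, acc): if i != 0 goto block2() else goto block3(acc)
        Block { params: vec!["i", "acc"], insts: vec![],
                term: Terminator::BranchIf(Var("i"), (2, vec![]), (3, vec![Var("acc")])) },
        // block2(): acc1 = acc + i; i1 = i + -1; jump block1(i1, acc1)
        Block { params: vec![],
                insts: vec![Inst::Add("acc1", Var("acc"), Var("i")),
                            Inst::Add("i1", Var("i"), Const(-1))],
                term: Terminator::Jump((1, vec![Var("i1"), Var("acc1")])) },
        // block3(result): return result
        Block { params: vec!["result"], insts: vec![],
                term: Terminator::Return(Var("result")) },
    ]
}
```

In a phi-based IR, `block1` would start with two phi nodes merging the values coming from `block0` and `block2`. Here the merge is visible at each jump site instead, so `block1(i, acc)` reads like a function being called, which is the sense in which such blocks resemble continuations.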
Cranelift
Another interesting project I found is Cranelift. It's another compiler framework (like LLVM) but designed with JIT compilation in mind: it favors fast compilation at the cost of performing fewer optimizations. This is usually a good trade-off for a JIT-compiled language.
Cranelift is written in Rust, which is a nice bonus given that Alpha is written in Rust as well. Cranelift IR is similar to LLVM IR, although simpler. (And it also uses basic blocks with parameters!)
It might be a good target to consider in the future, either as a second target alongside LLVM or as a replacement. But not now.
Current status
The new IR is called HIR (High-level Intermediate Representation). It is defined in src/hir/hir.rs.
The old compiler has been replaced with AST→HIR and HIR→LLVM IR translations. The code is much simpler now, and there is less duplication. I like it. You can check it at rasendubi/alpha#1 refactor: use HIR.
Everything works except built-in functions (print, multiplication, etc.), which is why CI is failing.
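As a rough illustration of what an AST→HIR step can look like, here is a hypothetical sketch. It is not the contents of src/hir/hir.rs, and every type and function name below is an assumption made for the example. It expands operator sugar into an ordinary call and gives every intermediate expression, including constants, its own variable inside a function that lives in a module.

```rust
// Hypothetical surface AST (not Alpha's real AST).
enum Ast {
    Num(i64),
    Var(String),
    // Operator sugar: `a * b` is lowered to a call of the function `*`.
    BinOp(char, Box<Ast>, Box<Ast>),
    Call(String, Vec<Ast>),
}

// Hypothetical HIR: every value is bound to a named variable.
#[derive(Debug, PartialEq)]
enum Value {
    Const(i64),
    Call { callee: String, args: Vec<String> },
}

#[derive(Debug, PartialEq)]
struct Let {
    var: String,
    value: Value,
}

struct Function {
    name: String,
    params: Vec<String>,
    body: Vec<Let>,
    result: String,
}

// Top-level container, as opposed to "everything is one expression".
struct Module {
    name: String,
    functions: Vec<Function>,
}

struct Lowering {
    lets: Vec<Let>,
    next_tmp: usize,
}

impl Lowering {
    // Binds a value to a fresh temporary and returns the temporary's name.
    fn bind(&mut self, value: Value) -> String {
        let var = format!("%{}", self.next_tmp);
        self.next_tmp += 1;
        self.lets.push(Let { var: var.clone(), value });
        var
    }

    // Lowers an expression and returns the variable holding its result.
    fn lower(&mut self, ast: &Ast) -> String {
        match ast {
            Ast::Num(n) => self.bind(Value::Const(*n)),
            Ast::Var(name) => name.clone(),
            Ast::BinOp(op, lhs, rhs) => {
                let args = vec![self.lower(lhs), self.lower(rhs)];
                self.bind(Value::Call { callee: op.to_string(), args })
            }
            Ast::Call(callee, args) => {
                let args: Vec<String> = args.iter().map(|a| self.lower(a)).collect();
                self.bind(Value::Call { callee: callee.clone(), args })
            }
        }
    }
}

fn lower_function(name: &str, params: &[&str], body: &Ast) -> Function {
    let mut cx = Lowering { lets: Vec::new(), next_tmp: 0 };
    let result = cx.lower(body);
    Function {
        name: name.to_string(),
        params: params.iter().map(|p| p.to_string()).collect(),
        body: cx.lets,
        result,
    }
}

fn lower_module(name: &str, functions: Vec<Function>) -> Module {
    Module { name: name.to_string(), functions }
}
```

Lowering the body `print(x * 2)` with this sketch produces three bindings in order: a constant `2`, a call to `*` on `x` and that constant, and a call to `print` on the product. Each binding is a natural place to hang a type or a source location, and a later HIR→LLVM IR pass could walk the flat list without recursing into nested expressions.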
Previously, ExecutionSession manipulated Alpha objects to define datatypes and attach methods. This required quite a few steps for every built-in function added, which is why Alpha had very few of them (print, *, and type_of were the only built-in functions).
Now I'm thinking of adding Alpha syntax to reference Rust functions. This way, I can write more of the standard library in Alpha itself. I'll focus on that this week.