đ¨âđŹAlpha #3: dynamically-sized types and almost-finished garbage collection
This week in Alpha was an exhausting one. Alpha got new dynamically sized types (Symbol, String, SVec) and an almost-finished garbage collection. The types were a quick add, but then most of the week was spent debugging garbage collection, segmentation faults, and memory crashes.
SVec
SVec is a small vectorâan immutable length-prefixed array of pointers.
Itâs awful performance-wiseâadding a single element is (compared to amortized for vectors in other languages). Lookup is though.
The main advantage is that itâs simple, persistent, and allows building more sophisticated data structures on top. (I will cover some cool immutable vectors in the future.)
String
The String is similar to SVec, but it is an immutable length-prefixed array of bytes encoding a UTF-8 string. The string is both length-prefixed and NUL-terminatedâthis will allow easy interop with C in the future.
Symbol
Symbols are very much like strings. The only difference is that symbols are interned (aka hash-consed)âtwo symbols with the same name will always resolve to the same pointer. It provides for a fast comparisonâitâs just a pointer equality check.
Interning is implemented by storing all symbols in a binary search tree sorted by string hash. The tree is never re-balanced, but that is fine because hashes are uniformly distributed, so the search is still (because the tree is expected to be balanced for ).
These new structures allowed me to update definitions of DataType and Method, so they no longer contain Rust-specific types (Vec, String). This, in turn, allows them to be allocated on the GC heap and be traversed by the garbage collector as normal values.
This leads us to the next topicâŚ
Garbage collection
Regardless of the garbage collection algorithm, the first step is determining which objects are still reachable and therefore alive. The search starts with GC rootsâpointers into GC-managed heap that are not themselves stored on the heap. These are global and stack variables.
Two major approaches to reachability analysis are conservative and precise.
Conservative algorithms are simpleâthey traverse all global variables and the stack and treat anything that could be a pointer as a live reference. It might produce false positives (e.g., an integer holding a value similar to a pointer), but it never misses live references. The primary benefit is that conservative algorithms can be implemented without any compiler backend cooperationâthe algorithm doesnât need to know the layout of the stack or where variables are. Boehm GC, the most popular plug-n-play garbage collector, is conservative.
The downside is that conservative garbage collectors cannot relocate objects. Therefore, you cannot make a conservative compacting or copying garbage collector.
Precise algorithms only traverse known pointers. Therefore, they need exact information on where pointers are located in the stack and the objectsâ layout (where pointers are stored in each object).
Precise algorithms are harder to implement, but they have no false positives and allow objects relocation.
Traditionally, precise algorithms required compiler backend to provide memory layout for stack framesâwhere pointers are located in the stackâwhich is hard to set up when using third-party or multiple backends. However, there is a backend-independent technique that can provide this informationâshadow stack.
Shadow stack
Shadow stack is a technique of maintaining a linked list of frames that mimics the call stack but contains the information you need (e.g., location of pointers). This adds runtime overhead, but the technique is cross-platform, doesnât require backend support, and is easy to implement.
The idea is simpleâwhenever you have local variables or intermediary results that point into the GC heap, you store their addresses in an array and push it to a thread-local list of frames. When you leave the function, you pop the frame from the stack. This way, the GC can get addresses of all variables thatâll need patching when values are moved. (The following paper describes the technique in more detail: Accurate Garbage Collection in an Uncooperative Environment.)
â Alpha #4: garbage collection and golden testing for updated picture.
: This is a bad representation for shadow stack. SeeLLVM claims to support shadow stack, but I couldnât get it working. I get the following error:
LLVM ERROR: unsupported GC: shadow-stack (did you remember to link and initialize the CodeGen library?)
After a bit of googling, Iâve found zero examples and basically zero usage in other languages. Even Julia (which I heavily copy draw inspiration from) does not use LLVM GC support and reimplements shadow stack from scratch.
GC implementation process and debugging tips
Introducing garbage collection is toughâyou have to carefully annotate all your functions to properly maintain the shadow stack, and you have to debug the garbage collection algorithm so it does not leave broken pointers.
Debugging garbage collection is tough as errors manifest as spooky action at a distanceâsegmentation faults, random memory changing its value, assertions failing, etc.âthings youâre not accustomed to seeing in Rust (or any language, really).
Thatâs why you want to add GC as early as possible. Maintaining it afterwards is an incremental effort.
Here are some GC debugging tips. They fall into two categories: getting more info on whatâs going on or catching errors earlier.
Use debugger to see backtraces
Segmentation faults are not your regular language exceptionsâthey terminate the application without showing you a backtrace or anything useful. Use a debugger to get the backtrace and examine variables around.
Tracing and memory dumps
Print backtraces for every allocation (so you know which function is allocating the memory), trace the garbage collection process, and dump memory buffers to debug whatâs going on.
TRACE alpha::gc > from@0x7ffff3855000: Length: 472 (0x1d8) bytes
0000: 18 32 98 57 55 55 00 00 00 00 00 00 00 00 00 00 d8 30 98 57 55 55 00 00 08 00 00 00 00 00 00 00
0020: a8 68 f8 57 55 55 00 00 18 60 86 f3 ff 7f 00 00 00 00 00 00 00 00 00 00 08 50 85 f3 ff 7f 00 00
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0060: d8 30 98 57 55 55 00 00 ff ff ff ff ff ff ff ff 78 6a c4 57 55 55 00 00 38 31 98 57 55 55 00 00
0080: 01 7c ff ff ff 7f 00 04 08 50 85 f3 ff 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 51 85 f3 ff 7f 00 00 d8 30 98 57 55 55 00 00
00c0: 08 00 00 00 00 00 00 00 28 73 f8 57 55 55 00 00 18 40 86 f3 ff 7f 00 00 00 00 00 00 00 00 00 00
00e0: 08 50 85 f3 ff 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0100: 00 00 00 00 00 00 00 00 d8 30 98 57 55 55 00 00 00 00 00 00 00 00 00 00 38 6a f8 57 55 55 00 00
0120: 38 31 98 57 55 55 00 00 00 00 00 00 00 00 00 00 08 50 85 f3 ff 7f 00 00 00 00 00 00 00 00 00 00
0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 18 32 98 57 55 55 00 00
0160: 01 00 00 00 00 00 00 00 98 51 85 f3 ff 7f 00 00 18 32 98 57 55 55 00 00 02 00 00 00 00 00 00 00
0180: 38 31 98 57 55 55 00 00 38 31 98 57 55 55 00 00 98 32 98 57 55 55 00 00 b0 51 85 f3 ff 7f 00 00
01a0: 00 e0 85 f3 ff 7f 00 00 18 32 98 57 55 55 00 00 01 00 00 00 00 00 00 00 58 c1 85 f3 ff 7f 00 00
01c0: 98 32 98 57 55 55 00 00 78 51 85 f3 ff 7f 00 00 00 b4 aa 55 55 55 00 00
TRACE alpha::gc > to @0x7ffff3854000: Length: 472 (0x1d8) bytes
0000: 18 32 98 57 55 55 00 00 00 00 00 00 00 00 00 00 d8 30 98 57 55 55 00 00 08 00 00 00 00 00 00 00
0020: a8 68 f8 57 55 55 00 00 18 60 86 f3 ff 7f 00 00 00 00 00 00 00 00 00 00 08 40 85 f3 ff 7f 00 00
0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0060: d8 30 98 57 55 55 00 00 ff ff ff ff ff ff ff ff 78 6a c4 57 55 55 00 00 38 31 98 57 55 55 00 00
0080: 01 7c ff ff ff 7f 00 04 08 40 85 f3 ff 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 10 41 85 f3 ff 7f 00 00 d8 30 98 57 55 55 00 00
00c0: 08 00 00 00 00 00 00 00 28 73 f8 57 55 55 00 00 18 40 86 f3 ff 7f 00 00 00 00 00 00 00 00 00 00
00e0: 08 40 85 f3 ff 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0100: 00 00 00 00 00 00 00 00 d8 30 98 57 55 55 00 00 00 00 00 00 00 00 00 00 38 6a f8 57 55 55 00 00
0120: 38 31 98 57 55 55 00 00 00 00 00 00 00 00 00 00 08 40 85 f3 ff 7f 00 00 00 00 00 00 00 00 00 00
0140: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 18 32 98 57 55 55 00 00
0160: 01 00 00 00 00 00 00 00 90 41 85 f3 ff 7f 00 00 98 32 98 57 55 55 00 00 a8 41 85 f3 ff 7f 00 00
0180: 00 b4 aa 55 55 55 00 00 98 32 98 57 55 55 00 00 c8 41 85 f3 ff 7f 00 00 00 e0 85 f3 ff 7f 00 00
01a0: 18 32 98 57 55 55 00 00 02 00 00 00 00 00 00 00 38 31 98 57 55 55 00 00 38 31 98 57 55 55 00 00
01c0: 18 32 98 57 55 55 00 00 01 00 00 00 00 00 00 00 58 c1 85 f3 ff 7f 00 00
Itâs hard to read these, but itâs better than nothing.
If you use Rust, backtrace and pretty-hex are some useful crates here. tracing was also helpful to get more structured logs that show call nesting. (Itâs funny because it basically maintains a shadow stack of its own.)
Run GC as often as possible
Instead of running GC when a threshold is reached, run in before every allocation. This way, youâll quickly catch what variables are not properly rooted. (Thatâs a performance kill, but thatâs for debugging only.)
Use page protection to catch bugs even sooner
If you develop a compacting/copying garbage collector, itâs helpful to use page protection.
After you move values out of a page, you can use mprotect to set permissions to noneâeven reading from a page is forbidden. This might seem uselessâwho the fuck needs a memory that they cannot read?âbut this helps catching errors earlier. Instead of reading values from invalid memory, processing them, and crashing somewhere down the line, your app will crash as soon as it tries to access the old memory. If you see SIGSEGV: address access protected
, you know some pointer didnât get updated by the GC.
I use region for this.
Write tests
It might seem complicated to write tests for a garbage collector, but that is still possible. You donât have to be clever to cover all casesâsimply allocating different types, and testing invariants was good enough for me.
This is the test that caught the last bug Iâve been tracking down for days:
#[test]
#[serial]
fn type_t() {
crate::types::init();
unsafe {
let ty = DataType::new(symbol("TestDataType"), ANY_T.load(), 0, &[]);
gc_box!(ty);
let t = Type::new(ty.load());
gc_box!(t);
collect_garbage();
assert_eq!((*t.load()).t, ty.load());
}
}
Take breaks
Iâve implemented many memory allocators before, but copying garbage collector is much more challengingâthe memory is constantly moving, pointers change, a typical trace is 50k+ lines long, and itâs hard to grasp whatâs going on. Itâs very exhausting, and tracking down all bugs took 4 days.
Taking breaks is a rather universal tip when youâre stuck in debugging for long, but I am still surprised how well it worksâmost of the progress was made right after a nightâs sleep or a good long break.
Current status and next steps
Garbage collection is almost finished: the collector works, global variables are rooted, and Rust code maintains the shadow stack (I hope). The last bit is modifying the compiler, so generated Alpha code maintains the shadow stack, too.
After that is done, it will be an excellent opportunity to refocus on code quality. The previous weeks were rather expansiveâIâve added new features fast, the code quality has suffered and needs a cleanup. After GC is done, Alpha wonât be limited by a fixed-size memory, so I could write more tests and refactor aggressively.
It might also be a good time to provide some overview of the project structure.
Backlinks
- đ¨âđŹ Alpha
- đ¨âđŹ Alpha #4: garbage collection and golden testing
- đ¨âđŹ Alpha #2: multi-methods, type hierarchy, and dot desugaring