Implementing C#-style Async/Await in raw x86-64 Assembly: Lessons from building the FluxSharp compiler

Hi everyone,

I’ve spent the last few months building FluxSharp, a systems programming language that brings C# syntax (Classes, Inheritance, Async) to a bare-metal environment. The compiler is written in Rust using the Pest PEG parser, but instead of targeting LLVM or a VM, it generates raw x86-64 Assembly (NASM).

I wanted to share some technical insights and challenges I faced, specifically regarding the async/await implementation without a heavy runtime.

1. The "No-Runtime" Async Challenge

Unlike the .NET CLR, FluxSharp doesn't have a managed thread pool. To make async/await work, I had to implement a custom state machine at the assembly level.

  • When a function is marked async, the compiler transforms its AST into a series of states.
  • The "Promise" system is actually a struct in memory that holds a function pointer (the continuation) and a context pointer.
  • The Event Loop: I wrote a minimal event loop in ASM that polls these promises. When an I/O operation (or a timer) completes, the loop restores the registers and jumps to the continuation address.
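
The three bullets above can be sketched in Rust, which is easier to read than NASM. This is a minimal illustration, not FluxSharp's actual layout: every name here (`Context`, `Promise`, `Poll`, `count_to_three`, `run`) is hypothetical. A plain function call stands in for the "restore registers and jump to the continuation" step, and round-robin polling stands in for real I/O or timer completion.

```rust
// All names are hypothetical; this models the idea, not FluxSharp's real structs.

/// Per-task context: the current state index plus locals that must survive a suspension.
pub struct Context {
    pub state: u32,
    pub counter: u64,
    pub result: Option<u64>,
}

/// What a continuation reports back to the event loop.
pub enum Poll {
    Pending,
    Ready,
}

/// The "promise": a continuation function pointer plus a context pointer.
pub struct Promise {
    pub continuation: fn(&mut Context) -> Poll,
    pub context: Box<Context>,
}

/// An async function lowered to a state machine: each `await` becomes a state boundary.
pub fn count_to_three(ctx: &mut Context) -> Poll {
    match ctx.state {
        // State 0: code before the first await.
        0 => { ctx.counter = 1; ctx.state = 1; Poll::Pending }
        // State 1: code between the first and second await.
        1 => { ctx.counter += 1; ctx.state = 2; Poll::Pending }
        // State 2: final segment, which produces the result.
        2 => {
            ctx.counter += 1;
            ctx.result = Some(ctx.counter);
            ctx.state = 3;
            Poll::Ready
        }
        // Already finished.
        _ => Poll::Ready,
    }
}

/// Minimal event loop: poll every pending promise in turn until all have completed.
pub fn run(mut pending: Vec<Promise>) -> Vec<u64> {
    let mut results = Vec::new();
    while !pending.is_empty() {
        let mut still_pending = Vec::new();
        for mut p in pending {
            let step = p.continuation;
            match step(&mut *p.context) {
                Poll::Pending => still_pending.push(p),
                Poll::Ready => results.push(p.context.result.unwrap_or(0)),
            }
        }
        pending = still_pending;
    }
    results
}
```

In generated assembly, the `match ctx.state` dispatch would typically become a jump table or a chain of compares at the top of the function, and `Context` would be the memory block that the context pointer refers to.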

2. Direct x86-64 Code Generation

Since I'm not using LLVM, I have to handle register allocation and stack frames manually.

  • Calling Convention: I'm following a modified version of the System V AMD64 ABI.
  • Safety: To implement null-safety and bounds checking, the compiler injects CMP and conditional-jump (JE/JNE) instructions before every array access or object dereference. If a check fails, the generated code issues an exit syscall with a specific error code.
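
As an illustration of the safety bullet, here is a rough Rust sketch of how a code generator might emit such checks as NASM text. The function name, label scheme, and register choices are hypothetical. It assumes a Linux target, where `exit` is syscall number 60 on x86-64. It also uses JAE (an unsigned compare) for the bounds check instead of the JE/JNE mentioned above, so that a negative index fails the same test. That is a substitution for the sake of the example, not necessarily what FluxSharp emits.

```rust
/// Emit NASM text that null-checks `base_reg` and bounds-checks `index_reg`
/// against `len_reg`. `id` keeps the local labels unique per check site.
/// Hypothetical helper: not taken from the FluxSharp source.
pub fn emit_checked_index(
    base_reg: &str,
    index_reg: &str,
    len_reg: &str,
    id: u32,
    exit_code: u8,
) -> String {
    let mut asm = String::new();
    // Null check: a zero base pointer goes to the failure path.
    asm.push_str(&format!("    cmp {base_reg}, 0\n"));
    asm.push_str(&format!("    je .fail_{id}\n"));
    // Bounds check: unsigned "index >= length" also catches negative indices.
    asm.push_str(&format!("    cmp {index_reg}, {len_reg}\n"));
    asm.push_str(&format!("    jae .fail_{id}\n"));
    asm.push_str(&format!("    jmp .ok_{id}\n"));
    // Failure path: exit(exit_code) via the Linux x86-64 syscall ABI (assumed target).
    asm.push_str(&format!(".fail_{id}:\n"));
    asm.push_str("    mov eax, 60\n");
    asm.push_str(&format!("    mov edi, {exit_code}\n"));
    asm.push_str("    syscall\n");
    // Success path: the actual load or store would follow this label.
    asm.push_str(&format!(".ok_{id}:\n"));
    asm
}
```

Emitting the failure path inline at every access is the simplest form but also the largest. Jumping to one shared failure stub per error code keeps each check to two compare-and-branch pairs.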

3. Parsing with Pest (Rust)

Using PEG (Parsing Expression Grammars) via the Pest crate was a game changer for the C#-like syntax. However, mapping the concrete syntax tree (CST) to a lean AST for ASM generation required a recursive descent pass that collapses nested expressions to avoid stack overflows during compilation of complex logic.
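
One way such a collapsing pass could look is sketched below. It does not use Pest's real types: `Cst` is a hypothetical stand-in for a parse-tree node, and the rule names ("number", "op") are invented for the example. The point is that single-child wrapper chains (expr -> term -> factor -> number) are skipped in a loop, so recursion depth grows only with genuine operator nesting and not with grammar depth.

```rust
/// Hypothetical stand-in for a PEG parse-tree node (not Pest's actual `Pair` type).
pub struct Cst {
    pub rule: &'static str,
    pub text: String,
    pub children: Vec<Cst>,
}

/// Lean AST: only the nodes the code generator cares about.
#[derive(Debug, PartialEq)]
pub enum Ast {
    Num(i64),
    Binary { op: char, lhs: Box<Ast>, rhs: Box<Ast> },
}

/// Lower a CST node to an AST node, collapsing wrapper rules along the way.
pub fn lower(mut node: &Cst) -> Result<Ast, String> {
    // Walk through single-child wrappers iteratively instead of recursing per level.
    while node.children.len() == 1 {
        node = &node.children[0];
    }
    match (node.rule, node.children.as_slice()) {
        // Leaf literal.
        ("number", []) => node
            .text
            .parse::<i64>()
            .map(Ast::Num)
            .map_err(|e| e.to_string()),
        // Binary shape: [lhs, operator, rhs]. Recursion happens only here.
        (_, [lhs, op, rhs]) => {
            let op = op.text.chars().next().ok_or("empty operator")?;
            Ok(Ast::Binary {
                op,
                lhs: Box::new(lower(lhs)?),
                rhs: Box::new(lower(rhs)?),
            })
        }
        (rule, _) => Err(format!("unexpected CST shape under rule `{rule}`")),
    }
}
```

A deep chain of same-precedence operators would still recurse once per operator in this sketch. Handling that fully would need an explicit work stack, which is left out here for brevity.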

Technical Stack:

  • Frontend: Rust + Pest (Grammar)
  • Backend: Custom x86-64 ASM Generator
  • Assembler/Linker: NASM & LD
  • Security: Built-in overflow protection and bounds checking at the instruction level.

I'm particularly interested in hearing from anyone who has worked on assembly-level coroutines. My current approach saves only the registers that the System V ABI requires to be preserved across calls (RBX, RSP, RBP, R12-R15), but I'm curious whether there are more efficient ways to handle the state machine transition in raw ASM without bloating the binary size.
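
To make the per-coroutine cost of that register set concrete, here is a small Rust sketch of a save area and the store sequence a generator might emit for it. The struct layout, field order, and helper name are assumptions for illustration. Only the register list itself comes from the post.

```rust
/// Hypothetical save area for the registers listed above, in declaration order.
/// `repr(C)` fixes the layout so the offsets match what hand-written ASM would use.
#[repr(C)]
#[derive(Default, Clone, Copy)]
pub struct SavedRegs {
    pub rbx: u64,
    pub rsp: u64,
    pub rbp: u64,
    pub r12: u64,
    pub r13: u64,
    pub r14: u64,
    pub r15: u64,
}

/// Bytes of context needed per suspended coroutine for registers alone.
pub const SAVED_REGS_SIZE: usize = std::mem::size_of::<SavedRegs>();

/// NASM store sequence that spills each register to `[ctx_reg + offset]`.
pub fn save_sequence(ctx_reg: &str) -> Vec<String> {
    ["rbx", "rsp", "rbp", "r12", "r13", "r14", "r15"]
        .iter()
        .enumerate()
        .map(|(i, reg)| format!("mov [{ctx_reg} + {}], {reg}", i * 8))
        .collect()
}
```

With this layout each suspension point costs seven stores and each resume seven loads, and the context needs 56 bytes for registers. Sharing one save routine and one restore routine across all suspension points, instead of inlining the sequence at each one, is one way to keep code size down.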

The project is open-source (MIT), and you can see the compiler source and the generated ASM output in the repo: https://github.com/Yvan4001/FluxSharp

Full documentation on the architecture is here: https://flux-sharp.sivagames.eu/docs/

Looking forward to your technical feedback!

submitted by /u/Drenfa