Tuesday, September 30, 2014

Durst progress

These last two weeks have been quite slow on the code side of things, primarily because I've spent my time reading up on LLVM and secondarily because I have been less productive. Hopefully, this reasearch will allow me to actually get code generation working. This post is somewhat of a haphazard collection of notes about what I have read.

Unfortunately, the compilation process is very complicated. To create an executable in a scalable way, I must transform the code to LLVM IR, emit an object file, track which object files need to be used, and link them together. The IR creation step seems thankfully quite simple, but object files and linking can be quite confusing. For now, though, I plan on just generating a single object and executable, so I shouldn't have to worry too much.

The bindings to LLVM that I plan on using in the rustc_llvm crate are unstable and badly documented, but they seem to mainly correspond directly to LLVM functions, so I should hopefully be able to just follow the Kaleidoscope tutorial and this implementation of it. As of yet, durst has not control flow, which makes things easier.

Rustc, whose source I've been reading and modifying in hopes of understanding compilers better, is very complex. Unfortunately, I haven't really managed to get much about compiler design yet, since I'm having trouble even getting started with the whole compilation system.

Hopefully, I'll actually have significant code changes to talk about next week.

Monday, September 15, 2014

Durst progress

My new language is moving along slowly but steadily. The parser is now complete, and though I am tempted to rewrite it after reading about Marpa parsing, this would probably be counter-productive. Adding an interpreter was fairly simple, but this is running on an expression AST, not any sort of CFG.

The current plan is to first extend the language with functions, including some basic IO, then move on to compilation. Compilation will be done by first converting to an SSA CFG similar that that of LLVM, then converting to LLVM and letting it do the hard work of actually producing machine code.

Coming up with a good representation for the CFG is a bit difficult, because I want to have want operation to point to its operands, but naively done this is impossible in Rust, which forbids cycles of pointers. As such, I plan to create an (internally unsafe) module for a CFG using raw pointers, and use that.

Once I have basic compilation working, the plan is to play around with optimization by creating a framework similar to Hoopl. I have a number of ideas of how to improve upon that work, including interleaving the backwards and forwards passes and immediately applying transformations that are known to be valid.

Tuesday, September 2, 2014

Durst: Creating a new language in Rust

I've started a new major project, with high and lofty goals that will probably never be achieved. I am creating my own programming language, with the goal to be similar to Rust in many aspects, but ensuring greater safety using dependent types.

So far, I've built a basic compiler executable that calls into a parsing library that I wrote. The executable is fairly simple, but the parsing library was interesting, as Rust gave me goals and restrictions that weren't present in other parsing libraries I've written.

First, I wanted everything to be allocated on the stack - this allows for a compact representation and allows the compiler to perform more optimizations. This was actually pretty easy, as Rust has very simple and easy to use structs and doesn't box anything without explicit requests.

Second, I wanted the parsing framework to support recursion. This was where I started to run into problems, because Rust requires all memory to be initialized. As such, it is impossible to easily create cyclic data structures. Worse, even if I were to use something like RefCell to get around these restrictions, cycles just don't fit well into Rust's ownership model - it is hard to ask who owns whom and to prove that backreferences will only live as long as the forward references.

To get around this problem, I simply required users of recursion to actually create their own implementation of the Parser trait, meaning that they can just create the cyclic parsers on the fly.

By next week, I hope to have a full AST defined and parsing, so that I can move on to interpretation.