Feb. 2nd, 2013

A while ago, Andy Keep defended his thesis here at IU on the nanopass framework, a super-slick (to say the least) DSL for writing compilers in Scheme. One of the cool things about it is that you only have to define the local parts of passes: if there's a peephole optimization that you want to do, you only have to specify the conversion for the part of the AST that you want to change; everything else stays the same.
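I don't have the nanopass syntax in front of me, but the shape of the idea fits in a few lines of Python: a pass is a table of rewrite rules keyed by node tag, and any node whose tag has no entry is rebuilt unchanged. The tuple encoding and rule names here are invented for illustration; this is a sketch of the idea, not of nanopass itself.

```python
# ASTs are nested tuples like ("add", x, y); a pass is a dict from node
# tags to rewrite functions, and every tag without an entry is copied
# unchanged. That's the "only write the interesting cases" trick.

def run_pass(rules, node):
    """Rewrite `node` bottom-up, applying a rule wherever one exists."""
    if not isinstance(node, tuple):
        return node  # leaves (numbers, variable names) pass through
    tag, *kids = node
    rebuilt = (tag, *[run_pass(rules, k) for k in kids])
    rule = rules.get(tag)
    return rule(rebuilt) if rule else rebuilt

# A peephole pass where the only case we specify is (add e 0) => e.
def fold_add_zero(node):
    tag, a, b = node
    return a if b == 0 else node

peephole = {"add": fold_add_zero}

tree = ("mul", ("add", ("add", "x", 0), 0), "y")
print(run_pass(peephole, tree))  # prints ('mul', 'x', 'y')
```

Note that `mul` never gets mentioned in the pass at all, which is the whole point.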

This reminded me a little bit of substructural logic programming; there's a lot going on there, but the idea is that you use a substructural logic (such as linear logic) as the semantics of your logic programming language. This gives you the ability to write what are essentially state-machine transitions as a forward-chaining logic program. I can imagine a sufficiently stringy AST definition that "flattens" the AST, putting everything in the context at once and using something like pointers (or perhaps even ordered contexts, if using ordered logic programming) to tell what the structure of the term is.
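Here's a rough Python picture of what I mean by flattening, since it's easier to show than to describe: the AST becomes a flat context of facts keyed by id, with pointers to children, and a transition rule fires against the context until quiescence, in the spirit of forward chaining. A real substructural engine would consume and produce facts rather than mutate a dict, so take the encoding with a grain of salt.

```python
# A "flattened" AST: every node lives in one flat context, keyed by an
# id, with child pointers instead of nesting. Roughly what facts in a
# forward-chaining logic program would look like.

ctx = {
    0: ("num", 0),
    1: ("var", "x"),
    2: ("add", 1, 0),    # x + 0
    3: ("var", "y"),
    4: ("mul", 2, 3),    # (x + 0) * y
}

def step(ctx):
    """One forward-chaining step: find an (add e 0) fact and rewrite
    the pointer structure so the add node becomes its left child."""
    for nid, fact in ctx.items():
        if fact[0] == "add" and ctx[fact[2]] == ("num", 0):
            ctx[nid] = ctx[fact[1]]   # replace the add with its left child
            return True               # a rule fired; caller should retry
    return False                      # quiescence: no rule applies

while step(ctx):
    pass
print(ctx[4])  # prints ('mul', 2, 3); node 2 is now ('var', 'x')
```

The "munging around in C" feeling I mention below is exactly the pointer-chasing in `ctx[fact[1]]`.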

Such a representation raises questions of its own: is flattening the AST like that a good idea? Is there a way to do it that doesn't feel like munging around in C? Could it be done behind-the-scenes so that none of the ensuing ugliness would be exposed to the programmer? It seems like it might make a cool backend for nanopass or something similar, for example. But I just want to ask these questions, not answer them; the point is, if we have this kind of representation, it seems like we then automatically get free parallel peephole optimizations. Write the transition rule and run it, and it will automatically apply wherever it can.
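To make "applies wherever it can" concrete, here's the same flat-store toy again, but finding every match first and then rewriting them as a batch. Since the matched sites here are disjoint facts, the batch could in principle run in parallel; a real engine would also have to worry about overlapping matches, which I'm cheerfully ignoring.

```python
# A transition rule applied at every matching site in one sweep.

store = {
    0: ("num", 0),
    1: ("var", "x"), 2: ("add", 1, 0),   # x + 0
    3: ("var", "y"), 4: ("add", 3, 0),   # y + 0
    5: ("mul", 2, 4),
}

def matches(store):
    """All node ids where the (add e 0) => e rule fires."""
    return [nid for nid, f in store.items()
            if f[0] == "add" and store[f[2]] == ("num", 0)]

for nid in matches(store):          # one sweep, every site at once
    store[nid] = store[store[nid][1]]

print(store[2], store[4])  # prints ('var', 'x') ('var', 'y')
```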

But can we do better than just myopic, local transformations? I was talking to Aaron Todd, another grad student here, and he came up with an idea to parallelize compilation:

<toddaaro> it seems like in a piece of code the syntax monotonically gets closer to machine code, and the steps along the way are triggered by reaching an information threshold
<toddaaro> which could let you easily parallelize compilers

His idea was to write passes in terms of the pieces of syntax themselves, rather than the entire AST. Ideally, this would result in abstracting away from information dependencies and allowing the code to run in parallel, to some extent; some dependencies exist, of course, but it seems like it would be possible to block on that information becoming available in a sufficiently expressive setting.
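A crude way to sketch that idea in Python: treat each piece of syntax as a task that blocks until the pieces it depends on have produced their results, so independent subtrees get lowered in parallel. The "lowering" below is just string-building, and all the names are invented; the point is only the blocking-on-information shape.

```python
# Each syntax piece becomes a future; a parent blocks on its children's
# results, so siblings can lower concurrently.
from concurrent.futures import ThreadPoolExecutor

nodes = {                    # flat AST again: id -> (tag, *children)
    "a": ("var", "x"),
    "b": ("num", 3),
    "c": ("add", "a", "b"),
    "d": ("mul", "c", "c"),
}

def lower(nid, done):
    tag, *rest = nodes[nid]
    if tag in ("var", "num"):
        return str(rest[0])
    # Block until this node's information threshold is reached,
    # i.e. every child has finished lowering.
    args = [done[k].result() for k in rest]
    return f"({tag} {' '.join(args)})"

with ThreadPoolExecutor() as pool:
    done = {}
    for nid in nodes:        # children appear before parents, so their
        done[nid] = pool.submit(lower, nid, done)  # futures already exist
    print(done["d"].result())  # prints (mul (add x 3) (add x 3))
```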

I think substructural logic programming might provide such a setting. With well-defined interfaces between each part of a pass, you could write everything declaratively and trust the logic programming engine to do the rest. This would require some good mechanisms for modularity in SSLP, but there's at least been some thought put into this already; if this kind of approach is feasible, I think the main question is whether to push SSLP as a compiler implementation paradigm or to use it as a backend for some kind of (potentially nanopass-like) DSL.

This is, of course, assuming the approach I've outlined above works out, but I think it should. A compiler is, at its core, a series of translations, and translations can be thought of as rewrite rules. In this case they'll be stateful, but SSLP is already good at dealing with state (it is, if anything, even more convenient than using the State monad in Haskell, which has been a lifesaver for me in this course so far!). All in all, it seems like a perfect fit! But anyway, this is a pretty fluffy blog post, so what do y'all think? I might try to come up with a few code examples, if there's interest.
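For what it's worth, here's one tiny sketch already, of the "stateful rewrite rule" point (in Python, not SSLP, with all names invented): flattening nested expressions into three-address code needs a fresh-name supply threaded through the translation, exactly the kind of state I'd otherwise reach for the State monad to handle.

```python
import itertools

def make_lowerer():
    fresh = itertools.count(1)      # the threaded state: a name supply
    code = []                       # accumulated three-address statements
    def to_tac(node):
        if not isinstance(node, tuple):
            return node             # variables pass through unchanged
        tag, l, r = node
        a, b = to_tac(l), to_tac(r)
        tmp = f"t{next(fresh)}"     # the stateful part: a fresh temporary
        code.append((tmp, tag, a, b))
        return tmp
    return to_tac, code

to_tac, code = make_lowerer()
print(to_tac(("add", ("add", "x", "y"), "z")))  # prints t2
print(code)  # prints [('t1', 'add', 'x', 'y'), ('t2', 'add', 't1', 'z')]
```

In SSLP the name supply and the emitted statements would both just be facts in the context, which is part of why the fit seems so good to me.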


