Boilerplate Compiler

I want to finish my boilerplate CPU, but the simulator and editing environment are way too slow to handle big machines. The simulator currently floodfills from every engine every tick, and I'm currently rendering using canvas. Both are fine answers for trivial machines, but my little macbook air gets about 4 FPS if I zoom out while looking at my CPU.

I really love performance tweaking, but modern CPUs are so fast that you so rarely get meaty performance problems to solve. And if you start tweaking performance before you hit a bottleneck, you'll optimize for the wrong things.

So I've started writing both a compiler and a webgl renderer. I don't think the compiler is actually needed yet, but its more interesting so I'm doing it first.

I'm still planning to make a complete boilerplate introduction at some point, but as a teaser check this out - so given my partially complete CPU:

cpu

I can now generate a connectivity graph showing how regions are connected to each other:

connectivity (Click to get a bigger view)

Each node in the graph is a connected region in the boilerplate program which can never have a shuttle in it. There'll be a region each of the pink, green, grey and white lines.

I connect regions together if air can pass between them sometimes. Shuttles are often used to block air sometimes and allow it to pass at other times. I assign a number to each of the positions a shuttle can be in, and then connect the regions together when shuttles allow air to pass in some states.

Compilers

I've never written a compiler before, but as I understand it most compilers first tokenize the text, then parse it into an abstract syntax tree (AST). So, if you were compiling console.log(123), the text would first be transformed into a list of tokens: IDENTIFIER('console'), DOT, IDENTIFIER('log'), OPEN_PARENS, NUMBER_CONSTANT(123), CLOSE_PARENS.

The token stream is then transformed into a syntax tree based on the rules of the language:

InvokeMethod:  
  Method:
    PropertyAccess:
      LHS: Identifier('console')
      RHS: Property('log')
  Arguments:
    1: NumberConstant(123)

For normal sourcecode, I'm always surprised how big the AST ends up compared to the original program. I'm not sure if ASTs are an inefficient representation, or if text-based programming languages are a really compressed representation or something.

Compiling boilerplate

I'm not sure what the equivalent to a tokenizer is for boilerplate, but the graph I'm generating is the AST. Here's a simpler example:

4 way spinner

(Aside: What do you think will happen next step?)

Take a look at the top right corner. That corner currently has a negative value because the shuttle to the left is in the left position. When that shuttle moves to the right, the corner will go back to having no pressure.

There's a region for each of the 4 corners in the grid. Each corner will have a negative value if the shuttle next to it is pushed back, like it is right now in the top right corner.

My compiler also makes a region for each edge of each engine. This is because its important that if a shuttle is right next to an engine, we need a region to push the shuttle! So there's actually 8 regions here we care about - the other 4 regions are adjacent to each of the negative engines.

Here's what my compiler produces:

4 way spinner AST

The black ovals are the corners, and the red ovals correspond to the edges of the engines which touch the shuttles.

The numbers are all arbitrary - but the cool thing about this is that it (correctly!) says that each corner will share 1 negative pressure based on the state of one of the shuttles.

I'm generating a lot more information than this that I'm not showing on the graph. For each shuttle, I also know where it can move and how it will be pushed by pressure in the adjacent regions.

Anyway, I'm really excited by this. The next step is to figure out how to generate code based on the AST. For this 4-way ticker I want to generate something like this:

var shuttleState = [1,1,1,0];

var pressure = [];  
pressure[4] = shuttleState[1] == 1 ? -1 : 0;  
pressure[5] = shuttleState[0] == 1 ? -1 : 0;  
pressure[14] = shuttleState[3] == 1 ? -1 : 0;  
pressure[15] = shuttleState[2] == 0 ? -1 : 0;

shuttleState[0] = successor[0][pressure[14]]; // Or something like that.  
shuttleState[1] = successor[1][...];  
...

The big problem is that the graph can have cycles, and I'm not sure how to handle them. But its a deliciously interesting problem.