Monday, February 2, 2009

Join Points

There were no updates last week on account of it being the Australia Day public holiday on monday, and me going to fp-syd later in the week.

Although the SPARC NCG now passes almost all of the testsuite, and can compile the Haskell source of GHC itself, it still has some problems when compiling the RTS. Unlike the Cmm code emitted when compiling Haskell source, the hand written Cmm code in the RTS contains loops as well as join points.

Mid last week I hit a bug in the linear register allocator, but didn't finish fixing it. Here is some emitted assembler code when compiling rts/PrimOps.cmm

 .Lch:
....
st %l0,[%l6]
add %l6,4,%l6
b .Lci <--- JUMP to .Lci
nop

.Lci:
st %l0,[%i6-84]
ld [%i6-76],%l0
st %l0,[%i6-76]
sll %l0,2,%l0
....

.Lcj:
....
st %l6,[%i6-76]
add %l0,8,%l6
b .Lci <--- JUMP to .Lci
nop

.Ln1l:
b .Lci <--- JUMP to .Lci
mov %l7,%l0
ld [%i6-100],%l7 <--- ??? slot 100 never assigned to
b .Lci <--- unreachable code??
nop


Label .Lci is a join point because it is the target of several jumps. The code in block .Ln11 is supposed to be fixup code. Fixup code is used to make sure that register assignments match up for all jumps to a particular label.

This code is badly broken though
1) .Ln11 is unreachable, nothing ever jumps to it.
2) There should probably be a .nop after the first branch instruction in .Ln11 to fill the SPARC branch delay slot. That's if that first jump is supposed to be there at all.
3) The instruction ld[%i6-100],%l7 loads from spill slot 100, but that slot is not stored to anywhere else in the code.
4) The three instructions after the first are also unreachable.

This is a bit funny considering I spent July-Sept 2007 at GHC HQ writing a new graph colouring register allocator for GHC, specifically because the linear allocator was known to be dodgy.

Alas, the graph allocator suffered the fate of many such "replacement projects". It is a little slower on some examples, and it comes with an ambient FUD cloud because it is fresh code. Sure, I could just use the graph allocator for SPARC, but that would still leave known bugs in the linear allocator for someone else to trip over. Until such time as we ditch the linear allocator entirely, for all architectures, we still need to maintain it.

The reason why .n1l wasn't being referenced was that the code to change the destination of the jump instruction at the end of block .Lch was missing. ... finally got sick enough of the organisation of RegAllocLinear.hs that I spent most of the day splitting it out into its own set of modules, which was well over due.

Some time later...

It seems as though the code in block .Ln1l is supposed to be swapping the values in regs %l7 and %l0. In block .Lch vreg %vI_co is in %l7 and %vI_cp is in %l0, but the block .Lci wants them the other way around, what a nightmare.

More time later...

From handleComponents in RegAllocLinear.hs

 -- OH NOES!
(_, slot) <- spillR (RealReg sreg) spill_id

The _ is supposed to match the instruction, generated by spillR, that stores the value in sreg into the spill slot at [%i6-100]. The comment is mine.

No comments:

Post a Comment