GHC on SPARC: Liveness lies

Yesterday I fixed the linear allocator to handle floating point register twinning, or at least I thought I did. The output code looked ok, but the programs I tried still crashed. I ended up spending the rest of the day writing a tool (mayet) to compare the -fasm and -fvia-c versions. Mayet takes the two .s files and splits them up into parts belonging to the individual closures. It then slowly substitutes the dubious -fasm sections for the known good -fvia-c sections.
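The split-and-substitute idea can be sketched roughly as below. This is not mayet's actual code: the names `split_sections` and `hybrid` are made up for illustration, and it assumes that every closure starts at a label in column 0 and that local labels (the ones starting with a dot) stay inside their section.

```python
import re

# Assumption: a top-level label starts in column 0 and does not begin with '.'.
LABEL = re.compile(r"^([A-Za-z_][\w.$]*):")

def split_sections(asm_text):
    """Split assembly text into {label: lines}; lines before any label go under ''."""
    sections = {"": []}
    current = ""
    for line in asm_text.splitlines():
        m = LABEL.match(line)
        if m:
            current = m.group(1)
            sections.setdefault(current, [])
        sections[current].append(line)
    return sections

def hybrid(good, suspect, use_suspect):
    """Rebuild the known good file, taking the sections named in use_suspect from the suspect file."""
    out = []
    for label, lines in good.items():
        source = suspect if label in use_suspect and label in suspect else good
        out.extend(source[label])
    return "\n".join(out) + "\n"
```

Growing `use_suspect` one label at a time, and rerunning the test program after each step, narrows the crash down to a single section.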

Last night I got enough of it working to find a bad -fasm closure in cg034, which tests out floating point math. Curiously, the closure itself didn't do any float or double math. This morning I hand adjusted the -fvia-c version to look like the -fasm one until it exhibited the same problem.

After some wibbling around I found this:

             ld [%l1+12],%vI_s1GO
                    # born:    %vI_s1GO
                     
             cmp %vI_s1GO,0
                    # r_dying: %vI_s1GO       <--------- LIES
                     
             bne .Lc2cm

                ......
                ......
        c2cm:
             ld [%l1+8],%vI_n2cH
                    # born:    %vI_n2cH
                     
             st %vI_n2cH,[%i0-12]
                    # r_dying: %vI_n2cH
                     
             sethi %hi(base_GHCziFloat_a_closure),%l2
                     
             or %l2,%lo(base_GHCziFloat_a_closure),%l2
                     
             or %g0,%vI_s1GO,%l3
                    # r_dying: %vI_s1GO       <---------

This is a dump of register liveness information. The line marked LIES shows that the allocator thinks the variable %vI_s1GO isn't used after the cmp instruction. Unfortunately, after the branch, it's used in an or. The vreg %vI_n2cH got allocated to the same register as %vI_s1GO, clobbering the contained value and causing the crash.

Turns out the register liveness determinator wasn't treating BI and BF as though they were branch instructions, so liveness information wasn't being propagated across the basic blocks properly.
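Here is a toy model of the bug, a sketch only and not GHC's liveness code. The mini-IR, the vreg names without their prefixes, and the helper names are all invented for illustration. It ignores fall-through edges and loops, so it needs no fixed-point iteration.

```python
# Hypothetical mini-IR: (opcode, vregs read, vregs written, branch target or None).
BLOCKS = {
    "entry": [
        ("ld",  [],       ["s1GO"], None),
        ("cmp", ["s1GO"], [],       None),
        ("bne", [],       [],       "c2cm"),
    ],
    "c2cm": [
        ("ld", [],       ["n2cH"], None),
        ("st", ["n2cH"], [],       None),
        ("or", ["s1GO"], ["l3"],   None),
    ],
}

def live_in(block, live_out):
    """Walk a block backwards: live = (live - writes) | reads."""
    live = set(live_out)
    for _op, reads, writes, _target in reversed(block):
        live = (live - set(writes)) | set(reads)
    return live

def live_out_of_entry(branch_opcodes):
    """Vregs live at the end of 'entry', given which opcodes count as branches."""
    out = set()
    for op, _reads, _writes, target in BLOCKS["entry"]:
        if op in branch_opcodes and target is not None:
            out |= live_in(BLOCKS[target], set())
    return out

def dies_at_cmp(branch_opcodes):
    """True if the analysis claims the cmp is the last use of s1GO."""
    # The bne reads no vregs, so the live set after the cmp equals the block's live-out.
    return "s1GO" not in live_out_of_entry(branch_opcodes)

print(dies_at_cmp(set()))    # True: bne not recognised, so s1GO "dies" at the cmp
print(dies_at_cmp({"bne"}))  # False: liveness flows back from c2cm
```

With the branch unrecognised, nothing stops the allocator from reusing the register for another vreg in the target block, which matches the clobbering seen in the dump.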

Fixing that problem stopped cg034 from crashing, though it still gave the wrong answer. During debugging, I noticed that if ghc is run with -v or -ddump-reg-liveness then the top level labels emitted in the .s file change, which confuses mayet. Hmm.. let that be a lesson to all of us: changing compiler flags should not change top level names, if at all possible.

More digging

        (_s1Ri::F32,) = foreign "ccall" 
               __encodeFloat((_c2sm::I32, `signed'), (_c2sn::I32, PtrHint),
                             (_c2so::I32, `signed'))[_unsafe_call_];
        F32[Sp] = _s1Ri::F32;

Is translated to:

        call __int_encodeFloat,2
        nop

        st %f28,[%i0]         <- BOGUS %f28

A floating point return value should be placed in %f0, but for some reason the GHC code that does just that was missing. Fixed that, and it almost works... just gives the wrong answer.

Loading of doubles looks broken.

via-c says:


        ld [%l1+3], %f8
        fitod %f8, %f2

but the NCG does:


        ld [%l1+3],%l0
        st %l0,[%o6-8]
        ld [%o6-8],%f10
        fitos %f10,%f10

Hmm.
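A small sketch of why a single-precision conversion is bad news when a double is expected. It assumes the later code reads the result as a double from a register pair and that the other half of the pair happens to hold zero. Neither assumption is confirmed by the dumps above, and the helper names are made up.

```python
import struct

def single_bits(n):
    """Bit pattern an int-to-single conversion would produce for n."""
    return struct.unpack(">I", struct.pack(">f", float(n)))[0]

def double_from_words(hi, lo):
    """Reinterpret two 32-bit words (high word first) as one big-endian double."""
    return struct.unpack(">d", struct.pack(">II", hi, lo))[0]

n = 3
hi = single_bits(n)                # single-precision 3.0 is 0x40400000
misread = double_from_words(hi, 0) # assumed: the low word of the pair is zero
print(hex(hi), misread)            # the value read back is 32.0, not 3.0
```

No crash, just a plausible-looking wrong number.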

Remember that comment from a few days ago:

-- ToDo: Verify correctness

Turns out it wasn't correct.. Who would have known :P

That fixed cg034 and cg035. Now we're down to:


Unexpected failures:
   2080(optasm)   -- segv
   cg015(optasm)  -- unknown unary match op
   cg021(optasm)  -- segv
   cg022(optasm)  -- segv
   cg026(optasm)  -- segv 
   cg044(optasm)  -- segv
   cg046(optasm)  -- segv
   cg054(optasm)  -- genSwitch 
   cg058(optasm)  -- segv
   cg060(optasm)  -- segv

Wednesday, January 14, 2009
