Changed the instruction sequence for procedure return (and computed
author Stephen Adams <edu/mit/csail/zurich/adams>
Fri, 17 Oct 1997 01:25:41 +0000 (01:25 +0000)
committer Stephen Adams <edu/mit/csail/zurich/adams>
Fri, 17 Oct 1997 01:25:41 +0000 (01:25 +0000)
commit e4ab9662d1d554e2fb9e1a49e565dcf471c660be
tree 8bb0d3b03c67d8d689d28f64cde7360d0c2b68e1
parent f6ac78a356717413255ca374875675b12bd30e67
Changed the instruction sequence for procedure return (and computed
jump).  The code for clearing the type code from a continuation now
loads the value into a register instead of modifying it in-place on
the stack.
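
As a rough illustration of the two return sequences, here is a small Python model. It is only a sketch: the 32-bit word with a 6-bit type code in the top bits, and the names ADDRESS_MASK, tagged, return_old and return_new, are assumptions made for the example and are not taken from rules3.scm. It shows that masking the continuation in place on the stack and masking it in a register produce the same return address; the difference is only in where the read-modify-write happens.

```python
# Assumed layout for illustration only: a 32-bit word whose top 6 bits
# hold the type code and whose low 26 bits hold the address.
WORD_BITS = 32
TYPE_BITS = 6
ADDRESS_MASK = (1 << (WORD_BITS - TYPE_BITS)) - 1


def tagged(type_code, address):
    """Build a tagged word from a type code and an address (hypothetical helper)."""
    return (type_code << (WORD_BITS - TYPE_BITS)) | address


def return_old(stack):
    """Old style: clear the type code in place on the stack, then pop it."""
    stack[-1] &= ADDRESS_MASK   # read-modify-write on the stack slot
    return stack.pop()          # models RET taking the cleaned address


def return_new(stack):
    """New style: load the value into a register, mask it, jump through it."""
    reg = stack.pop()           # load the continuation into a register
    reg &= ADDRESS_MASK         # clear the type code in the register
    return reg                  # models the indirect jump target


continuation = tagged(0x0A, 0x123456)
assert return_old([continuation]) == return_new([continuation]) == 0x123456
```

Both functions leave the stack one word shorter and yield the same target, so the change affects only the instructions used, not the result of the return.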

I have left the code using an indirect jump.  An alternative is to
push the value back on the stack and do a RET.  The indirect jump
seems faster, especially when returning to the same address as the
previous jump, but the branch prediction mechanisms for RET and JMP
seem quite different.

Speeds up the modified Gabriel Benchmark Suite (/scheme/8.0/src/bench)
by 10% overall!  I guess this is because the Pentium Pro really
doesn't like the old read-modify-write instruction.

Test       Old    New   Ratio
ctak      11.59  11.54  0.996
conform    0.62   0.50  0.806
traverse   1.57   0.92  0.586
takl       0.23   0.20  0.870
peval      0.40   0.35  0.875
browse     0.59   0.56  0.949
tak        0.28   0.25  0.893
wttree     1.61   1.49  0.925
deriv      0.34   0.29  0.853
boyer      0.47   0.42  0.894
div        0.42   0.39  0.929
dderiv     0.44   0.38  0.864
cpstak     0.42   0.41  0.976
matmul1    0.27   0.27  1.000
fib        0.68   0.55  0.809
fcomp      0.61   0.54  0.885
triangle   2.89   2.36  0.817
puzzle     0.47   0.47  1.000
matmul2    0.66   0.69  1.045
destruct   0.28   0.28  1.000
~a.mean       -      -  0.899
~g.mean       -      -  0.892
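
The two summary rows can be rechecked from the Old and New columns. The following Python sketch recomputes each ratio as New/Old and then takes the arithmetic and geometric means of the ratios. It assumes that "~a.mean" and "~g.mean" stand for those two means; the variable names are made up for the example.

```python
import math

# (old, new) timings copied from the table above.
timings = {
    "ctak": (11.59, 11.54), "conform": (0.62, 0.50), "traverse": (1.57, 0.92),
    "takl": (0.23, 0.20), "peval": (0.40, 0.35), "browse": (0.59, 0.56),
    "tak": (0.28, 0.25), "wttree": (1.61, 1.49), "deriv": (0.34, 0.29),
    "boyer": (0.47, 0.42), "div": (0.42, 0.39), "dderiv": (0.44, 0.38),
    "cpstak": (0.42, 0.41), "matmul1": (0.27, 0.27), "fib": (0.68, 0.55),
    "fcomp": (0.61, 0.54), "triangle": (2.89, 2.36), "puzzle": (0.47, 0.47),
    "matmul2": (0.66, 0.69), "destruct": (0.28, 0.28),
}

# Ratio below 1.0 means the new code is faster.
ratios = [new / old for old, new in timings.values()]

a_mean = sum(ratios) / len(ratios)
g_mean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

print(f"a.mean {a_mean:.3f}")
print(f"g.mean {g_mean:.3f}")
```

The results should agree with the table's 0.899 and 0.892 up to rounding, which is where the "10% overall" figure comes from.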
v7/src/compiler/machines/i386/rules3.scm