Add another level of indirection to the representation of promises.
A promise is now a cell whose contents is a pair. The pairs never
change, so merely loading the pair object from the cell gives us an
atomic snapshot of it. This way, there is no need for
without-interrupts in promise-forced?.
This makes each promise cost one more word (previously: one word to
represent plus three words of heap space; now one word to represent
plus four words of heap space), but cutting down on without-interrupts
is a big win -- this halves the running time of test-promise.scm on my
machine.
Of course, on a parallel system, the without-interrupts in %force is
still not enough (and we'll need the cell-contents operations to be
load-acquire operations, not just plain loads).
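
For illustration only, here is a minimal sketch of the idea, not the
actual runtime code. The sketch-* names and the (forced? . datum) pair
layout are assumptions made up for this example; it ignores
delay-force chains and does not reproduce the word counts above. It
relies only on make-cell, cell-contents, set-cell-contents! and
without-interrupts.

```scheme
;; Hypothetical model: a promise is a cell holding a pair that is
;; never mutated after creation:
;;   (#f . thunk)  while unforced
;;   (#t . value)  once forced
(define (sketch-make-promise thunk)
  (make-cell (cons #f thunk)))

(define (sketch-promise-forced? promise)
  ;; A single load of the cell yields a consistent snapshot of the
  ;; state, so no without-interrupts is needed here.
  (car (cell-contents promise)))

(define (sketch-force promise)
  (let ((state (cell-contents promise)))
    (if (car state)
        (cdr state)
        (let ((value ((cdr state))))
          ;; Publishing the result still happens with interrupts off.
          (without-interrupts
           (lambda ()
             ;; Recheck: the thunk may have forced this promise itself.
             (let ((state* (cell-contents promise)))
               (if (car state*)
                   (cdr state*)
                   (begin
                     ;; Install a fresh pair instead of mutating the
                     ;; old one, so earlier snapshots stay valid.
                     (set-cell-contents! promise (cons #t value))
                     value)))))))))
```

With (define p (sketch-make-promise (lambda () (+ 1 2)))),
(sketch-promise-forced? p) is #f until (sketch-force p) returns 3,
after which it is #t.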