Use CALL/RET for pushing and returning to continuations on amd64.

Calls now look like:

    ;; (assign (register #x123) (cons-pointer tag (entry:continuation cont)))
    (CALL (@PCR pushed))
    (JMP (@PCR cont))
pushed:
    (OR Q (@R ,rsp) (&U ,tag))
    ...
    (JMP (@PCR uuo-link))

Returns now look like:

    ;; (pop-return)
    (AND Q (@R ,rsp) (R ,regnum:datum-mask))
    (RET)
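
As a rough illustration of the two stack-slot operations above, the
following Python sketch models the OR applied to the word that CALL
pushes and the AND applied just before RET.  The 6-bit tag over a
58-bit datum, the shift of the tag into the high bits, and the names
DATUM_BITS, EXAMPLE_TAG, tag_return_address, and untag_return_address
are assumptions made for this example only; none of them is taken from
the compiler or the microcode.

```python
# Assumed layout: 6 tag bits above a 58-bit datum (not stated in this message).
DATUM_BITS = 58
DATUM_MASK = (1 << DATUM_BITS) - 1
EXAMPLE_TAG = 0x28  # hypothetical tag value, for illustration only


def tag_return_address(slot, tag):
    """Model (OR Q (@R ,rsp) (&U ,tag)): tag the word CALL just pushed."""
    return slot | (tag << DATUM_BITS)


def untag_return_address(slot):
    """Model (AND Q (@R ,rsp) (R ,regnum:datum-mask)): strip the tag before RET."""
    return slot & DATUM_MASK


if __name__ == "__main__":
    # Stand-in for the address of the (JMP (@PCR cont)) that follows the CALL.
    pushed = 0x00007F0012345678
    tagged = tag_return_address(pushed, EXAMPLE_TAG)
    print(hex(tagged), hex(untag_return_address(tagged)))
```

The round trip matters because RET must see exactly the address that
CALL pushed, which is the address of the (JMP (@PCR cont)) instruction.
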
Each CALL that pushes a continuation should be paired with a RET that
returns to it, so that we can take advantage of the CPU's return
address branch target prediction stack rather than abusing the
indirect jump branch target predictor.
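
The pairing argument can be sketched with a toy model of a return
address prediction stack in Python.  This is a deliberate
simplification, not a description of any particular CPU: real
predictors have a small fixed depth and implementation-specific
behaviour.  The class ReturnStackModel and the helpers run_paired and
run_unpaired are hypothetical names introduced for this example.

```python
class ReturnStackModel:
    """Toy return-address predictor: CALL pushes a prediction, RET pops one."""

    def __init__(self):
        self.predictions = []
        self.hits = 0
        self.misses = 0

    def call(self, return_address):
        # A CALL records where the matching RET is expected to go.
        self.predictions.append(return_address)

    def ret(self, actual_target):
        # A RET is predicted from the most recent unmatched CALL, if any.
        predicted = self.predictions.pop() if self.predictions else None
        if predicted == actual_target:
            self.hits += 1
        else:
            self.misses += 1


def run_paired(depth):
    """Every continuation is pushed with CALL and later popped with RET."""
    model = ReturnStackModel()
    addresses = [0x1000 + 16 * i for i in range(depth)]
    for address in addresses:
        model.call(address)
    for address in reversed(addresses):
        model.ret(address)
    return model


def run_unpaired(depth):
    """Continuations are pushed as plain data (no CALL) but popped with RET."""
    model = ReturnStackModel()
    addresses = [0x1000 + 16 * i for i in range(depth)]
    for address in reversed(addresses):
        model.ret(address)
    return model


if __name__ == "__main__":
    for name, run in (("paired", run_paired), ("unpaired", run_unpaired)):
        model = run(8)
        print(name, "hits:", model.hits, "misses:", model.misses)
```

In this model every paired RET is predicted correctly, while a RET
with no matching CALL has nothing to be predicted from.
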
WARNING: This changes the amd64 compiled code interface, so new
compiled code requires a new microcode. (A new microcode might be
able to handle existing compiled code just fine.)