mit-scheme.git
5 years agoCheck at runtime whether feenableexcept does anything.
Taylor R Campbell [Wed, 23 Jan 2019 07:50:38 +0000 (07:50 +0000)]
Check at runtime whether feenableexcept does anything.

The bits are defined on aarch64, but apparently some CPUs are
fabricated without support for them, so they just read back as zeros.
Bummer!
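
A minimal sketch of the probe idea in C, not the code from this commit: set the bits, read the register back, and see whether they stuck. The `ctl_reg` interface, both simulated registers, and the `TRAP_BITS` mask are hypothetical stand-ins for illustration; a real check would presumably call glibc's `feenableexcept` and then `fegetexcept` instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical control-register interface: read the value, write a new one. */
typedef struct {
    uint32_t (*read)(void);
    void (*write)(uint32_t value);
} ctl_reg;

/* Try to set `bits`, read the register back, then restore the old value.
   Returns true only if every requested bit actually stuck. */
static bool trap_bits_stick(const ctl_reg *reg, uint32_t bits)
{
    uint32_t saved = reg->read();
    reg->write(saved | bits);
    bool stuck = (reg->read() & bits) == bits;
    reg->write(saved);
    return stuck;
}

/* Assumed position of the trap-enable bits; illustrative only. */
#define TRAP_BITS UINT32_C(0x1f00)

/* Simulated register that implements the trap-enable bits. */
static uint32_t full_state;
static uint32_t full_read(void) { return full_state; }
static void full_write(uint32_t v) { full_state = v; }

/* Simulated register on a CPU without them: writes to the bits are
   ignored, so they always read back as zero. */
static uint32_t raz_state;
static uint32_t raz_read(void) { return raz_state; }
static void raz_write(uint32_t v) { raz_state = v & ~TRAP_BITS; }

static const ctl_reg full_reg = { full_read, full_write };
static const ctl_reg raz_reg = { raz_read, raz_write };
```

Here `trap_bits_stick(&full_reg, 0x200)` succeeds and `trap_bits_stick(&raz_reg, 0x200)` fails, which is the distinction a compile-time check on the bit definitions cannot make.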

5 years agoFix excessive sign bits in uuo link instruction.
Taylor R Campbell [Wed, 23 Jan 2019 07:49:00 +0000 (07:49 +0000)]
Fix excessive sign bits in uuo link instruction.

5 years agoFix displaced byte load rule.
Taylor R Campbell [Wed, 23 Jan 2019 05:33:39 +0000 (05:33 +0000)]
Fix displaced byte load rule.

The offset is not a machine register!  Yikes.

5 years agoFix FIXNUM-NOT rule: don't set the low bits.
Taylor R Campbell [Wed, 23 Jan 2019 05:21:52 +0000 (05:21 +0000)]
Fix FIXNUM-NOT rule: don't set the low bits.

5 years agoImplement aarch64 logical immediate encoding.
Taylor R Campbell [Wed, 23 Jan 2019 05:21:34 +0000 (05:21 +0000)]
Implement aarch64 logical immediate encoding.

5 years agoFix far uuo links.
Taylor R Campbell [Wed, 23 Jan 2019 03:02:31 +0000 (03:02 +0000)]
Fix far uuo links.

Apparently ADRP really does do Rd <- (PC & ~0xfff) + (imm << 12), not
PC + (imm << 12), which means it's gonna cause some trouble for the
assembler in LIAR, since it means the code needs to know its own
offset within a page of memory and the target's offset within a page
of memory.
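
The arithmetic above can be sketched in C. This is an illustration of the ADRP formula quoted in the message, not LIAR's assembler; all function names are made up for the example.

```c
#include <stdint.h>

#define LOW12_MASK UINT64_C(0xfff)

/* What ADRP computes: the PC is rounded down to a 4 KiB page first. */
static uint64_t adrp_result(uint64_t pc, int64_t imm)
{
    return (pc & ~LOW12_MASK) + ((uint64_t)imm << 12);
}

/* Page delta that must be encoded so ADRP yields the target's page. */
static int64_t adrp_imm_for(uint64_t pc, uint64_t target)
{
    return (int64_t)(target >> 12) - (int64_t)(pc >> 12);
}

/* Low 12 bits of the target, supplied by a following ADD. */
static uint64_t lo12(uint64_t target)
{
    return target & LOW12_MASK;
}

/* Address produced by an ADRP/ADD pair located at pc. */
static uint64_t adrp_add(uint64_t pc, uint64_t target)
{
    return adrp_result(pc, adrp_imm_for(pc, target)) + lo12(target);
}
```

With the code at 0x400ffc and the target at 0x401008 the immediate is 1; move both up by four bytes (0x401000 and 0x40100c) and the immediate becomes 0, although the distance between them is unchanged. That dependence on in-page position is the trouble described above.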

5 years agoFix loading complemented immediates.
Taylor R Campbell [Wed, 23 Jan 2019 03:02:16 +0000 (03:02 +0000)]
Fix loading complemented immediates.

5 years agoFix UNSIGNED-LESS-THAN-FIXNUM? branch condition.
Taylor R Campbell [Wed, 23 Jan 2019 03:01:43 +0000 (03:01 +0000)]
Fix UNSIGNED-LESS-THAN-FIXNUM? branch condition.

Add some condition code aliases while here and clarify comments.

5 years agoFix indexing of remote links.
Taylor R Campbell [Tue, 22 Jan 2019 09:01:33 +0000 (09:01 +0000)]
Fix indexing of remote links.

5 years agoFix byte ordering in GENERATE/NSECTS.
Taylor R Campbell [Tue, 22 Jan 2019 03:47:47 +0000 (03:47 +0000)]
Fix byte ordering in GENERATE/NSECTS.

5 years agoFix register choices in GENERATE/REMOTE-LINKS.
Taylor R Campbell [Tue, 22 Jan 2019 03:47:35 +0000 (03:47 +0000)]
Fix register choices in GENERATE/REMOTE-LINKS.

5 years agoFix sense of INVOCATION-PREFIX:DYNAMIC-LINK choice.
Taylor R Campbell [Tue, 22 Jan 2019 03:47:03 +0000 (03:47 +0000)]
Fix sense of INVOCATION-PREFIX:DYNAMIC-LINK choice.

5 years agoFix reference to constant section in GENERATE/REMOTE-LINKS.
Taylor R Campbell [Tue, 22 Jan 2019 01:55:18 +0000 (01:55 +0000)]
Fix reference to constant section in GENERATE/REMOTE-LINKS.

5 years agoSign-extend PC-relative branch target.
Taylor R Campbell [Mon, 21 Jan 2019 23:37:32 +0000 (23:37 +0000)]
Sign-extend PC-relative branch target.
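
For illustration only, a C sketch of sign-extending a PC-relative branch offset, assuming the instruction is an AArch64 unconditional B (a signed 26-bit count of 4-byte instructions). The function names are hypothetical and this is not the code changed by the commit.

```c
#include <stdint.h>

/* Byte offset encoded in an AArch64 B instruction. */
static int64_t b_byte_offset(uint32_t insn)
{
    uint32_t imm26 = insn & UINT32_C(0x03ffffff);
    /* Sign-extend from bit 25: flip the sign bit, then subtract it. */
    int64_t words = (int64_t)(imm26 ^ UINT32_C(0x02000000))
        - INT64_C(0x02000000);
    /* The field counts instructions; scale to bytes. */
    return words * 4;
}

/* Branch target for a B instruction located at pc. */
static uint64_t b_target(uint64_t pc, uint32_t insn)
{
    return pc + (uint64_t)b_byte_offset(insn);
}
```

Without the sign extension, a backward branch such as 0x17ffffff would decode as a large forward offset instead of -4 bytes.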

5 years agoFix indexing in MOVE-FRAME-UP code: objects, not bytes, here.
Taylor R Campbell [Mon, 21 Jan 2019 22:39:29 +0000 (22:39 +0000)]
Fix indexing in MOVE-FRAME-UP code: objects, not bytes, here.

And with this, the cold load completes on aarch64!

5 years agoFix large application setup.
Taylor R Campbell [Mon, 21 Jan 2019 22:39:11 +0000 (22:39 +0000)]
Fix large application setup.

5 years agoTeach cmpintmd to flush the instruction cache on aarch64.
Taylor R Campbell [Mon, 21 Jan 2019 20:59:02 +0000 (20:59 +0000)]
Teach cmpintmd to flush the instruction cache on aarch64.

5 years agoFix argument to PUSH_D_CACHE_REGION.
Taylor R Campbell [Mon, 21 Jan 2019 20:53:14 +0000 (20:53 +0000)]
Fix argument to PUSH_D_CACHE_REGION.

Takes startptr/count, not startptr/endptr.

This was not an issue before because until aarch64, the only extant
port that even used this, i386, ignored the argument as a macro and
flushed the entire cache.

5 years agoFix branch instruction in uuo link stub.
Taylor R Campbell [Mon, 21 Jan 2019 19:06:38 +0000 (19:06 +0000)]
Fix branch instruction in uuo link stub.

5 years agoTweak read/write_compiled_closure_target for clarity and assertions.
Taylor R Campbell [Mon, 21 Jan 2019 19:06:20 +0000 (19:06 +0000)]
Tweak read/write_compiled_closure_target for clarity and assertions.

5 years agoFix cache-assignment code generation.
Taylor R Campbell [Mon, 21 Jan 2019 19:06:02 +0000 (19:06 +0000)]
Fix cache-assignment code generation.

5 years agoFix case.
Taylor R Campbell [Mon, 21 Jan 2019 19:05:51 +0000 (19:05 +0000)]
Fix case.

5 years agoFix LSR instruction encoding.
Taylor R Campbell [Mon, 21 Jan 2019 01:20:14 +0000 (01:20 +0000)]
Fix LSR instruction encoding.

5 years agoFix scale->shift.
Taylor R Campbell [Mon, 21 Jan 2019 00:37:29 +0000 (00:37 +0000)]
Fix scale->shift.

5 years agoFix read/write_compiled_closure_target.
Taylor R Campbell [Sun, 20 Jan 2019 21:36:42 +0000 (21:36 +0000)]
Fix read/write_compiled_closure_target.

Byte offsets, not object or instruction word offsets.

5 years agoFix comment.
Taylor R Campbell [Sun, 20 Jan 2019 20:10:39 +0000 (20:10 +0000)]
Fix comment.

5 years agoFix PC-relative calculations to work entirely in newspace.
Taylor R Campbell [Sun, 20 Jan 2019 00:19:13 +0000 (00:19 +0000)]
Fix PC-relative calculations to work entirely in newspace.

5 years agoFix read/write_compiled_closure_target offsets.
Taylor R Campbell [Sun, 20 Jan 2019 00:18:55 +0000 (00:18 +0000)]
Fix read/write_compiled_closure_target offsets.

5 years agoAllow non-branch in cc_return_address_to_entry_address.
Taylor R Campbell [Sat, 19 Jan 2019 23:57:34 +0000 (23:57 +0000)]
Allow non-branch in cc_return_address_to_entry_address.

This happens for trampolines.  Maybe this should be a special case.

5 years agoFix scaling of PC offsets: they're byte offsets, not word offsets.
Taylor R Campbell [Sat, 19 Jan 2019 23:57:08 +0000 (23:57 +0000)]
Fix scaling of PC offsets: they're byte offsets, not word offsets.

5 years agoFix some symbol sizing.
Taylor R Campbell [Sat, 19 Jan 2019 23:56:55 +0000 (23:56 +0000)]
Fix some symbol sizing.

5 years agoTidy up interface_to_C.
Taylor R Campbell [Sat, 19 Jan 2019 23:56:45 +0000 (23:56 +0000)]
Tidy up interface_to_C.

5 years agoNote there is a way to do negative offsets.
Taylor R Campbell [Sat, 19 Jan 2019 23:56:31 +0000 (23:56 +0000)]
Note there is a way to do negative offsets.

5 years agoMake C_to_interface go through interface_to_scheme.
Taylor R Campbell [Sat, 19 Jan 2019 22:43:03 +0000 (22:43 +0000)]
Make C_to_interface go through interface_to_scheme.

This way C_to_interface sets up VAL, which is necessary in case it is
invoking a continuation.

5 years agoFix encoding of ROR and EXTR instructions.
Taylor R Campbell [Sat, 19 Jan 2019 21:20:47 +0000 (21:20 +0000)]
Fix encoding of ROR and EXTR instructions.

5 years agoLoad UARG2, don't clobber UARG1, in apply hooks.
Taylor R Campbell [Sat, 19 Jan 2019 20:51:56 +0000 (20:51 +0000)]
Load UARG2, don't clobber UARG1, in apply hooks.

5 years agoFix calculation of hook instruction address.
Taylor R Campbell [Sat, 19 Jan 2019 20:51:44 +0000 (20:51 +0000)]
Fix calculation of hook instruction address.

5 years agoFix order of arguments to load-tagged-immediate.
Taylor R Campbell [Sat, 19 Jan 2019 18:33:01 +0000 (18:33 +0000)]
Fix order of arguments to load-tagged-immediate.

5 years agoFix reversed byte order branches in read_uuo_frame_size.
Taylor R Campbell [Sat, 19 Jan 2019 08:03:54 +0000 (08:03 +0000)]
Fix reversed byte order branches in read_uuo_frame_size.

5 years agoFix extraction of PC offset from branch instruction.
Taylor R Campbell [Sat, 19 Jan 2019 08:03:41 +0000 (08:03 +0000)]
Fix extraction of PC offset from branch instruction.

5 years agoFix format word padding and tweak block offsets.
Taylor R Campbell [Sat, 19 Jan 2019 08:02:50 +0000 (08:02 +0000)]
Fix format word padding and tweak block offsets.

We already arranged for all entries to be 64-bit aligned, so we might
as well take advantage of that in block offsets.
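
A rough C sketch of why coarser units help, under assumptions: the 15-bit field width and the function names are hypothetical, chosen only to show the effect of scaling by the alignment.

```c
#include <stdbool.h>
#include <stdint.h>

#define FIELD_BITS 15 /* hypothetical width of the offset field */

/* Encode a byte offset in units of `unit` bytes.  Fails if the offset
   is not a multiple of the unit or does not fit in the field. */
static bool encode_block_offset(uint64_t byte_offset, unsigned unit,
                                uint16_t *field)
{
    if (byte_offset % unit != 0)
        return false;
    uint64_t scaled = byte_offset / unit;
    if (scaled >= (UINT64_C(1) << FIELD_BITS))
        return false;
    *field = (uint16_t)scaled;
    return true;
}

/* Recover the byte offset from the scaled field. */
static uint64_t decode_block_offset(uint16_t field, unsigned unit)
{
    return (uint64_t)field * unit;
}
```

With byte units this field reaches offsets below 0x8000; with 8-byte units the same field reaches eight times as far, at the cost of rejecting offsets that are not 8-byte aligned.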

5 years agoFix uuo link and trampoline instructions.
Taylor R Campbell [Fri, 18 Jan 2019 08:15:28 +0000 (08:15 +0000)]
Fix uuo link and trampoline instructions.

5 years agoMake interface_to_scheme match reality, not sensibility.
Taylor R Campbell [Fri, 18 Jan 2019 07:13:32 +0000 (07:13 +0000)]
Make interface_to_scheme match reality, not sensibility.

Should change cmpint.c so we pass a separate dispatch routine in for
entries and continuations, but that requires changing all the
cmpauxen at once.

5 years agoCompiler oughta agree with cmpauxmd about what register is stack pointer.
Taylor R Campbell [Fri, 18 Jan 2019 07:13:15 +0000 (07:13 +0000)]
Compiler oughta agree with cmpauxmd about what register is stack pointer.

5 years agoSimplify format words: make them always be instruction words.
Taylor R Campbell [Fri, 18 Jan 2019 07:03:11 +0000 (07:03 +0000)]
Simplify format words: make them always be instruction words.

No need for endianness conditionalization.

5 years agoFix passage of dynamic-link. Only machine register, not regblock.
Taylor R Campbell [Fri, 18 Jan 2019 06:23:00 +0000 (06:23 +0000)]
Fix passage of dynamic-link.  Only machine register, not regblock.

5 years agoAssert block offset is zero.
Taylor R Campbell [Fri, 18 Jan 2019 06:22:18 +0000 (06:22 +0000)]
Assert block offset is zero.

5 years agoAdd a TODO.
Taylor R Campbell [Wed, 16 Jan 2019 04:48:27 +0000 (04:48 +0000)]
Add a TODO.

5 years agoTeach ucode identify about aarch64.
Taylor R Campbell [Wed, 16 Jan 2019 04:47:27 +0000 (04:47 +0000)]
Teach ucode identify about aarch64.

Also make this always return a string here, so it doesn't crash on
boot if it hasn't been taught about new compiled code types.

5 years agoSave an instruction in multiplication with CSETM.
Taylor R Campbell [Wed, 16 Jan 2019 04:47:13 +0000 (04:47 +0000)]
Save an instruction in multiplication with CSETM.

5 years agoTweak some register numbering to reduce a bit of code.
Taylor R Campbell [Wed, 16 Jan 2019 04:47:00 +0000 (04:47 +0000)]
Tweak some register numbering to reduce a bit of code.

5 years agoFix register block indexing: no hooks in the register block here.
Taylor R Campbell [Wed, 16 Jan 2019 04:46:17 +0000 (04:46 +0000)]
Fix register block indexing: no hooks in the register block here.

5 years agoFix add/sub immediate syntax and criterion.
Taylor R Campbell [Tue, 15 Jan 2019 17:27:45 +0000 (17:27 +0000)]
Fix add/sub immediate syntax and criterion.

5 years agoUse a temporary if necessary in AFFIX-TYPE.
Taylor R Campbell [Tue, 15 Jan 2019 16:37:11 +0000 (16:37 +0000)]
Use a temporary if necessary in AFFIX-TYPE.

5 years agoDraft aarch64 cmpauxmd.
Taylor R Campbell [Tue, 15 Jan 2019 16:29:02 +0000 (16:29 +0000)]
Draft aarch64 cmpauxmd.

5 years agoFix push order in move-frame-up / dynamic-link.
Taylor R Campbell [Tue, 15 Jan 2019 03:48:25 +0000 (03:48 +0000)]
Fix push order in move-frame-up / dynamic-link.

5 years agoFix some instruction syntax bugs.
Taylor R Campbell [Tue, 15 Jan 2019 03:20:21 +0000 (03:20 +0000)]
Fix some instruction syntax bugs.

- Specify target _and_ source -- we're not x86 here.
- Specify operand size.
- Specify multipliers correctly.

5 years agoAvoid REGISTER-COPY-IF-AVAILABLE and TEMPORARY-COPY-IF-AVAILABLE.
Taylor R Campbell [Tue, 15 Jan 2019 03:19:18 +0000 (03:19 +0000)]
Avoid REGISTER-COPY-IF-AVAILABLE and TEMPORARY-COPY-IF-AVAILABLE.

These give out register references, which are a pain.  Just use
REUSE-PSEUDO-REGISTER-IF-AVAILABLE! to get the machine register
number.

5 years agoDisable floating-point vector primitives too.
Taylor R Campbell [Tue, 15 Jan 2019 03:18:32 +0000 (03:18 +0000)]
Disable floating-point vector primitives too.

Until we have open-coded floating-point arithmetic.

5 years agoMake RTL:CONSTANT-COST always return positive.
Taylor R Campbell [Tue, 15 Jan 2019 03:17:35 +0000 (03:17 +0000)]
Make RTL:CONSTANT-COST always return positive.

Otherwise CSE might substitute constants for registers where at best
it's not helpful and at worst we don't have rules for it.

5 years agoFix up some instruction descriptions.
Taylor R Campbell [Tue, 15 Jan 2019 03:15:35 +0000 (03:15 +0000)]
Fix up some instruction descriptions.

- Migrate some things with citations and updates to instr1.scm.
- No need for `(evaluation ,terms) in fixed-width instructions.
- Fix some missing or duplicated bits.
- Add some more instructions.

5 years agoUmptuple-check that instruction widths sum to multiples of 32 bits.
Taylor R Campbell [Tue, 15 Jan 2019 03:14:40 +0000 (03:14 +0000)]
Umptuple-check that instruction widths sum to multiples of 32 bits.

5 years agoPut something in these stub files so they compile as code.
Taylor R Campbell [Tue, 15 Jan 2019 03:12:46 +0000 (03:12 +0000)]
Put something in these stub files so they compile as code.

Otherwise the portable fasdumper barfs trying to fasdump a pathname.

5 years agoUpdate config.guess and config.sub so they recognize aarch64.
Taylor R Campbell [Tue, 15 Jan 2019 03:12:25 +0000 (03:12 +0000)]
Update config.guess and config.sub so they recognize aarch64.

5 years agoFix configure goo for aarch64 with byte order specified.
Taylor R Campbell [Tue, 15 Jan 2019 03:11:36 +0000 (03:11 +0000)]
Fix configure goo for aarch64 with byte order specified.

5 years agoBlock offset units are instructions, not bytes, so we get two more bits.
Taylor R Campbell [Tue, 15 Jan 2019 03:09:58 +0000 (03:09 +0000)]
Block offset units are instructions, not bytes, so we get two more bits.

5 years agoVarious work to get this going.
Taylor R Campbell [Mon, 14 Jan 2019 07:43:42 +0000 (07:43 +0000)]
Various work to get this going.

Enough to compile and assemble advice.scm, the first file in the
runtime.  Still a ways from doing anything.

5 years agoTeach assembler about MODULO.
Taylor R Campbell [Mon, 14 Jan 2019 07:44:17 +0000 (07:44 +0000)]
Teach assembler about MODULO.

XXX Should maybe do EUCLIDEAN-REMAINDER or the full gamut of division
operators, but this is all I need for now.

5 years agoReport bad expressions more clearly.
Taylor R Campbell [Mon, 14 Jan 2019 07:44:05 +0000 (07:44 +0000)]
Report bad expressions more clearly.

5 years agoFill in some more files, add some build goo, fix some bugs.
Taylor R Campbell [Sun, 13 Jan 2019 22:52:06 +0000 (22:52 +0000)]
Fill in some more files, add some build goo, fix some bugs.

Invent a way to do assembler macros so we can do legible branch
tensioning rules and reuse ADRP/ADD patterns.

5 years agoDraft aarch64 back end.
Taylor R Campbell [Sun, 13 Jan 2019 06:08:23 +0000 (06:08 +0000)]
Draft aarch64 back end.

Nowhere near completion yet, long TODO list, not compile-tested, &c.
Not sure if I'll find any more copious spare time to work on this for
a while.

5 years agoFix multiplication and division by purely imaginary numbers.
Taylor R Campbell [Tue, 20 Aug 2019 03:40:24 +0000 (03:40 +0000)]
Fix multiplication and division by purely imaginary numbers.

That is, complex numbers whose real part is exact zero.

5 years agoTest multiplication and division by +i and -i.
Taylor R Campbell [Tue, 20 Aug 2019 03:13:51 +0000 (03:13 +0000)]
Test multiplication and division by +i and -i.

We do not currently follow Kahan's recommendation that when the real
part is exactly zero, the arithmetic be done by negation rather than
multiplication.
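
A small C sketch of the difference, for illustration only. C has no exact zero, so a floating-point 0.0 real part stands in for it here; the `cplx` type and function names are made up for the example and are not the runtime's arithmetic code.

```c
#include <math.h>

typedef struct { double re, im; } cplx;

/* Sample operand with an infinite real part. */
static const cplx sample = { INFINITY, 1.0 };

/* Textbook product: (a+bi)(c+di) = (ac - bd) + (ad + bc)i. */
static cplx mul_general(cplx x, cplx y)
{
    cplx r = { x.re * y.re - x.im * y.im, x.re * y.im + x.im * y.re };
    return r;
}

/* Multiply by +i using only a swap and a negation: (a+bi)i = -b + ai. */
static cplx mul_by_i(cplx x)
{
    cplx r = { -x.im, x.re };
    return r;
}
```

Multiplying `sample` by `{0.0, 1.0}` with `mul_general` computes inf * 0 in the real part and yields NaN, whereas `mul_by_i(sample)` gives -1 + inf i. The negation form also preserves the sign of zero: `mul_by_i` maps 1 + 0i to -0 + 1i, where the textbook product gives +0 + 1i.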

5 years agoFix edge cases in ANGLE.
Taylor R Campbell [Tue, 20 Aug 2019 03:03:25 +0000 (03:03 +0000)]
Fix edge cases in ANGLE.

5 years agoExpand edge cases for ANGLE.
Taylor R Campbell [Tue, 20 Aug 2019 02:51:27 +0000 (02:51 +0000)]
Expand edge cases for ANGLE.

Based on Kahan's `Much Ado about Nothing's Sign Bit' paper.  We screw
up some zero edge cases.

5 years agoFix references incorrectly marked with EVR().
Chris Hanson [Mon, 19 Aug 2019 22:33:00 +0000 (15:33 -0700)]
Fix references incorrectly marked with EVR().

5 years ago`x ... ...' is busted in syntax-rules.
Taylor R Campbell [Sat, 17 Aug 2019 13:54:34 +0000 (13:54 +0000)]
`x ... ...' is busted in syntax-rules.

5 years agoMerge branch 'riastradh-20181220-closentry-v12'
Taylor R Campbell [Fri, 16 Aug 2019 05:02:00 +0000 (05:02 +0000)]
Merge branch 'riastradh-20181220-closentry-v12'

5 years agoTweak logit1/2+ condition number plot for clarity.
Taylor R Campbell [Fri, 16 Aug 2019 04:59:52 +0000 (04:59 +0000)]
Tweak logit1/2+ condition number plot for clarity.

5 years agoFactor out common PostScript code for plotting.
Taylor R Campbell [Fri, 16 Aug 2019 03:54:49 +0000 (03:54 +0000)]
Factor out common PostScript code for plotting.

Should make this a little more maintainable.

5 years agoUniform code and style for plots.
Taylor R Campbell [Fri, 16 Aug 2019 02:54:44 +0000 (02:54 +0000)]
Uniform code and style for plots.

Tweak line widths a little bit to roughly match cmmi10 (Computer
Modern Math Italic 10pt) rule widths for axes, and a little thicker
for the plots themselves, for the printed manual.

5 years agoProduce 300dpi, not 72dpi, PNGs for HTML output.
Taylor R Campbell [Fri, 16 Aug 2019 02:51:41 +0000 (02:51 +0000)]
Produce 300dpi, not 72dpi, PNGs for HTML output.

5 years agoUse TLS/SSL for links to <srfi.schemers.org>.
Arthur A. Gleckler [Thu, 15 Aug 2019 20:17:00 +0000 (13:17 -0700)]
Use TLS/SSL for links to <srfi.schemers.org>.

5 years agoAdd release note.
Taylor R Campbell [Thu, 15 Aug 2019 14:24:35 +0000 (14:24 +0000)]
Add release note.

5 years agoBump COMPILER_INTERFACE_VERSION.
Taylor R Campbell [Thu, 15 Aug 2019 05:19:18 +0000 (05:19 +0000)]
Bump COMPILER_INTERFACE_VERSION.

Make attempts to use old .com files fail a little more obviously.

5 years agoSet default target to all for cross-builds too.
Taylor R Campbell [Thu, 15 Aug 2019 04:57:56 +0000 (04:57 +0000)]
Set default target to all for cross-builds too.

No need to make it default to cross-host.  If you want to separate
the cross-host/cross-target stages, you'll know to do cross-host
anyway.

5 years agoAvoid spurious fallthrough (fortunately harmless here).
Taylor R Campbell [Thu, 15 Aug 2019 04:45:27 +0000 (04:45 +0000)]
Avoid spurious fallthrough (fortunately harmless here).

5 years agoTest fma exceptions too.
Taylor R Campbell [Wed, 14 Aug 2019 01:31:56 +0000 (01:31 +0000)]
Test fma exceptions too.

5 years agoAdd fma, fused-multiply/add.
Taylor R Campbell [Tue, 13 Aug 2019 23:25:14 +0000 (23:25 +0000)]
Add fma, fused-multiply/add.

Not yet open-coded anywhere.  Will be a huge pain on x86.  No aarch64
flonum open-coding at all yet.

(Maybe flo:fast-fma? should return false if it's not open-coded...)

5 years agoUse a different reflect code number for compiled invocations. origin/riastradh-20181220-closentry-v12
Taylor R Campbell [Sun, 6 Jan 2019 03:59:31 +0000 (03:59 +0000)]
Use a different reflect code number for compiled invocations.

Teach the continuation parser about it.

Turns out this doesn't actually coincide with the format the v8
microcode used for APPLY-COMPILED, which also has a frame size,
presumably so arity dispatch could be done in the callee.

(Not that the v8 stuff matters these days; maybe we should just flush
those parts of conpar.scm.)

5 years agoOpen-code WITH-STACK-MARKER too.
Taylor R Campbell [Sat, 5 Jan 2019 15:53:23 +0000 (15:53 +0000)]
Open-code WITH-STACK-MARKER too.

Saves a trip through reflect-to-interface, which would break the
return address branch target prediction stack.

5 years agoShare closure interrupt labels.
Taylor R Campbell [Sat, 5 Jan 2019 06:31:35 +0000 (06:31 +0000)]
Share closure interrupt labels.

The interrupt-handling subroutine just uses the tagged entry on the
stack, so no need for a separate call for each closure.  If nothing
else this should save some code size.

Also, in open-coding of with-interrupt-mask, reuse pop-return with
interrupt checks.

5 years agoTidy up compiler utility return addresses.
Taylor R Campbell [Sat, 5 Jan 2019 03:36:51 +0000 (03:36 +0000)]
Tidy up compiler utility return addresses.

Use compiled returns for the ones that are likely to return to Scheme
like lookups and assignments, and compiled entries for the ones that
are likely to return to microcode like interrupts.

Architectures on which compiled entries and compiled returns have the
same format will see no difference: compiled code passes in an
untagged return address either way.

On amd64, where compiled entries and compiled returns are different:

- For hooks that act like leaf subroutines and never return to
  microcode, use plain CALL/RET in pairs.

- For hooks that are subroutines likely to return to Scheme
  immediately but might return to microcode in screw cases, use

        (CALL ,hook)                    ; Invoke hook with untagged ret addr...
        (JMP (@PCR ,continuation))      ; ...which jumps to formatted entry.
        (WORD ...)
        (BLOCK-OFFSET ,continuation)
        (QUAD U 0)
       (LABEL ,continuation)
        ...                             ; continuation instructions

  For the non-screw cases this keeps CALL/RET paired.

- For hooks that always defer to microcode, namely to handle
  interrupts, use

        (LEA Q (R ,rbx) (@PCR ,continuation))
        (JMP ,hook)

  Here it doesn't really matter whether the CALL/RET is paired because we're
  going to wreck the return address branch prediction stack no matter
  what, but it is convenient to have the entry address rather than
  the return address in the compiled utility.

5 years agoUse ret for returns from interface and from generic arithmetic hooks.
Taylor R Campbell [Fri, 4 Jan 2019 04:58:51 +0000 (04:58 +0000)]
Use ret for returns from interface and from generic arithmetic hooks.

Let's take advantage of the return address stack branch target
predictor rather than unceremoniously trash it, shall we?

5 years agoOpen-code with-interrupt-mask, with-interrupts-reduced.
Taylor R Campbell [Thu, 3 Jan 2019 19:10:45 +0000 (19:10 +0000)]
Open-code with-interrupt-mask, with-interrupts-reduced.

Not open-coded at the RTL level, but at the LAP level.

This way we avoid going through a return trampoline, which wrecks the
return address stack branch target predictor as long as we transition
between Scheme and C to handle trampolines.

Most of the work, of munging MEMTOP and STACK_GUARD, is relegated to
an assembly hook subroutine so the code doesn't expand too much.  The
format of the stack still uses reflect-to-interface so that this
should require no changes to the continuation parser to get the
interrupt masks right, but with an intermediate empty-frame
continuation that actually calls the assembly hook and then pops
reflect-to-interface off.

5 years agoAllow return_to_compiled_code to return to compiled entries.
Taylor R Campbell [Thu, 3 Jan 2019 03:19:54 +0000 (03:19 +0000)]
Allow return_to_compiled_code to return to compiled entries.

The earlier compiled entry/return split left various utility calls
pushing compiled entries, rather than compiled return addresses, for
continuations on the stack -- notably interrupt routines, the linker
utility, and interpreter calls.

I arranged for these all to use RETURN_TO_SCHEME_ENTRY (or
JUMP_TO_CC_ENTRY), but missed one spot: the continuations constructed
by STACK-FRAME->CONTINUATION, which use return_to_compiled_code,
which in turn expected a compiled return rather than a compiled entry
and choked.

The interrupt routines, linker utility, and interpreter calls should
all be adapted to take returns rather than entries (which is another
ABI-breaking flag day), but this will do for now.

5 years agoSave interpreter result too before anything in continuation.
Taylor R Campbell [Wed, 2 Jan 2019 23:44:09 +0000 (23:44 +0000)]
Save interpreter result too before anything in continuation.

On x86, the interpreter call result register is eax/rax, register 0,
which is also the first register we hand out for register allocation.
The continuation for an interpreter call result uses register 0, but
if the caller uses a dynamic link, the continuation first pops its
frame via the dynamic link...using a temporary register that is
guaranteed to be register 0 since it's the first one the register
allocator hands out.  The code sequence looks something like this:

;; (interpreter-call:cache-reference label-10 (register #x24) #f)
(mov q (r 2) (r 1))
(call (@ro 6 #xd0))
;; (continuation-entry label-10)
(word u #xfffc)
(block-offset label-10)
label-10:
;; (assign (register #x25) (post-increment (register 4) 1))
(pop q (r 0))
;; (assign (register #x26) (object->address (register #x25)))
(and q (r 0) (r 5))
;; (assign (offset (register 6) (machine-constant 4)) (register #x26))
(mov q (@ro 6 #x20) (r 0))
;; (assign (register #x23) (register 0))
(jmp (@pcr label-13))

On entry to the continuation, register 0 holds the value we want,
chosen as a machine alias for pseudo-register #x23 in the procedure
body, but the first thing the continuation does is pop the dynamic
link into register 0, ruining the party.

This is rather tricky to trigger because it turns out in _non-error_
cases, compiled code never asks the interpreter to evaluate a cache
reference that will return a value.  But you can trigger this by
referencing an unassigned variable and invoking a restart, which does
cause the cache reference to return a value:

;; Unassigned, so compiled code will ask interpreter for help.
(define null)

;; Recursive procedure for which the compiler uses a dynamic link.
(define (map f l)
  (let loop ((l l))
    (if (pair? l)
        (cons (f (car l)) (loop (cdr l)))
        null)))

;; Invoke the restart that will return from the cache reference with
;; a value.
(bind-condition-handler (list condition-type:unassigned-variable)
    (lambda (condition)
      condition
      (use-value '()))
  (lambda ()
    (map + '(1 2 3))))
;Value: (1 2 3 . #[false 15 #xea9c18])

Here #[false 15 #xea9c18] is the (detagged) dynamic link, a pointer
into the stack, not the result we wanted at all.

5 years agoMake entries point to _after_ the PC offset.
Taylor R Campbell [Mon, 31 Dec 2018 21:08:22 +0000 (21:08 +0000)]
Make entries point to _after_ the PC offset.

This saves a jump in closure headers, and makes non-closure entries
have a nice PC offset of 0 rather than an awkward PC offset of 8.
However, this causes all indirect calls to have an additional offset
of -8 in the addressing mode -- not clear yet how much this hurts.

WARNING: This changes the amd64 compiled code interface so that new
compiled code requires a new microcode and vice versa.  Further, you
must set compiler:cross-compiling? to #t to compile the system,
because compiled code block offsets are now in a different place
relative to compiled entries, so the native fasdumper of an old
microcode can't handle compiled entries produced by a new compiler.

5 years agoLoad the fallback into rax so caller needs no conditional branch.
Taylor R Campbell [Wed, 2 Jan 2019 06:10:52 +0000 (06:10 +0000)]
Load the fallback into rax so caller needs no conditional branch.

WARNING: This changes the amd64 compiled code interface so that new
compiled code requires a new microcode.  (However, a new microcode
should handle old compiled code without trouble, since old compiled
code treats rax as garbage at this point, and LEA does not affect
flags.)

5 years agoUse BTS to affix single-bit type tags.
Taylor R Campbell [Mon, 31 Dec 2018 20:32:37 +0000 (20:32 +0000)]
Use BTS to affix single-bit type tags.
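
A hedged sketch in C of what setting a single tag bit amounts to, assuming a 64-bit object word with a 6-bit type code in the high bits; the layout constants and function names are assumptions for illustration, not taken from this commit. The likely motivation is that amd64 OR cannot take a 64-bit immediate, while BTS names the bit with a small immediate.

```c
#include <stdbool.h>
#include <stdint.h>

#define TAG_BITS 6                  /* assumed width of the type code */
#define DATUM_BITS (64 - TAG_BITS)  /* assumed width of the datum */

/* General case: OR the shifted type code into the datum. */
static uint64_t affix_type_or(unsigned type, uint64_t datum)
{
    return ((uint64_t)type << DATUM_BITS) | datum;
}

/* True if the type code has exactly one bit set. */
static bool single_bit_type(unsigned type)
{
    return type != 0 && (type & (type - 1)) == 0;
}

/* Index of the word bit a bit-set instruction would target.
   Precondition: single_bit_type(type). */
static unsigned tag_bit_index(unsigned type)
{
    unsigned bit = 0;
    while ((type >> bit) != 1)
        bit++;
    return DATUM_BITS + bit;
}

/* Single-bit case: set just that one bit, as BTS would. */
static uint64_t affix_type_bts(unsigned type, uint64_t datum)
{
    return datum | (UINT64_C(1) << tag_bit_index(type));
}
```

For a type code such as 8 both routes produce the same word; for a code such as 6, which has two bits set, only the OR form applies.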