mit-scheme.git
Chris Hanson [Sun, 19 Feb 2017 01:42:37 +0000 (17:42 -0800)]
Add a bunch of unit tests swiped from Larceny.

Chris Hanson [Sun, 19 Feb 2017 01:42:09 +0000 (17:42 -0800)]
Implement ustring-{lower,upper}-case?.

Chris Hanson [Sat, 18 Feb 2017 10:39:40 +0000 (02:39 -0800)]
First draft of NFD normalization.

Chris Hanson [Sat, 18 Feb 2017 09:14:09 +0000 (01:14 -0800)]
Refactor the converter to separate the value mapping from the dispatcher.

Chris Hanson [Sat, 18 Feb 2017 07:46:30 +0000 (23:46 -0800)]
Add "NFD_QC" table.

Chris Hanson [Sat, 18 Feb 2017 07:44:51 +0000 (23:44 -0800)]
Clean up code generators a bit.  Add "dm" property.

Chris Hanson [Sat, 18 Feb 2017 07:43:48 +0000 (23:43 -0800)]
Add mappings for _QC properties.

Chris Hanson [Sat, 18 Feb 2017 06:20:14 +0000 (22:20 -0800)]
Add ucd-table-ccc.

Chris Hanson [Sat, 18 Feb 2017 04:40:26 +0000 (20:40 -0800)]
Another round of substitutions.

Chris Hanson [Sat, 18 Feb 2017 03:58:14 +0000 (19:58 -0800)]
Begin process of replacing string operations with ustring equivalents.

Chris Hanson [Sat, 18 Feb 2017 03:42:05 +0000 (19:42 -0800)]
Implement "slices", which provide a restricted view of a string.

This helps avoid the need for providing substring arguments everywhere.
Also, implement vector->ustring.
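
A rough illustration of the slice idea, written in Python rather than taken from the Scheme source: a slice records a string plus start and end bounds, so an operation written once against the slice interface needs no separate substring arguments. The names Slice and index_of are hypothetical and do not appear in the commit.

```python
class Slice:
    """A restricted, read-only view of part of a string (illustrative only)."""

    def __init__(self, string, start=0, end=None):
        if end is None:
            end = len(string)
        if not 0 <= start <= end <= len(string):
            raise IndexError("slice bounds out of range")
        self.string = string
        self.start = start
        self.end = end

    def __len__(self):
        return self.end - self.start

    def __getitem__(self, i):
        # Indexes are relative to the start of the view.
        if not 0 <= i < len(self):
            raise IndexError("index outside slice")
        return self.string[self.start + i]


def index_of(s, ch):
    """Search a plain string or a Slice; no start/end parameters needed."""
    for i in range(len(s)):
        if s[i] == ch:
            return i
    return -1
```

With this shape, searching the tail of a string is index_of(Slice(text, 6), "o") instead of a variant of the search procedure that takes extra bounds.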

Chris Hanson [Sat, 18 Feb 2017 01:26:23 +0000 (17:26 -0800)]
Collapse ustring implementations together to save space and time.

Chris Hanson [Sat, 18 Feb 2017 00:15:51 +0000 (16:15 -0800)]
Reorder code in ustring; plus a few small tweaks.

Chris Hanson [Fri, 17 Feb 2017 22:58:04 +0000 (14:58 -0800)]
Rename make-legacy-string to legacy-string-allocate.

Chris Hanson [Fri, 17 Feb 2017 06:48:01 +0000 (22:48 -0800)]
Guarantee that incoming characters don't have bucky bits.

Chris Hanson [Fri, 17 Feb 2017 06:43:25 +0000 (22:43 -0800)]
Change full-width string to use 3 bytes instead of 4.
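
The arithmetic behind this change can be checked with a small Python sketch (not the Scheme implementation): the largest Unicode code point is 0x10FFFF, which needs 21 bits, so 3 bytes per character are enough. The helper names and the little-endian byte order are assumptions made for illustration; the commit does not state the layout.

```python
MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode


def pack3(code_point):
    """Store one code point in 3 bytes (byte order is an assumption)."""
    if not 0 <= code_point <= MAX_CODE_POINT:
        raise ValueError("not a Unicode code point")
    return code_point.to_bytes(3, "little")


def unpack3(data):
    """Recover the code point from its 3-byte form."""
    return int.from_bytes(data, "little")
```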

Chris Hanson [Fri, 17 Feb 2017 06:27:03 +0000 (22:27 -0800)]
Reorganize ustring around operations.

Chris Hanson [Fri, 17 Feb 2017 06:17:15 +0000 (22:17 -0800)]
Move all legacy-string definitions into ustring.

This is preparation for moving all the old string code elsewhere.

Matt Birkholz [Thu, 16 Feb 2017 20:17:05 +0000 (13:17 -0700)]
runtime/char.scm (unicode-char?): unreferenced bound variable: cp

Matt Birkholz [Thu, 16 Feb 2017 18:45:22 +0000 (11:45 -0700)]
Fix number-of-event-types constant; add missing event types.

Matt Birkholz [Thu, 16 Feb 2017 18:43:31 +0000 (11:43 -0700)]
x11, x11-screen: Remove references to deprecated bindings.

Chris Hanson [Thu, 16 Feb 2017 07:39:23 +0000 (23:39 -0800)]
Update EDITION, UPDATED, and regenerate detailed menu.

Chris Hanson [Thu, 16 Feb 2017 06:59:12 +0000 (22:59 -0800)]
Rewrite the Characters chapter to reflect the implementation.

Chris Hanson [Thu, 16 Feb 2017 06:55:36 +0000 (22:55 -0800)]
Clean up the character abstraction to be more consistent.

* Change unicode-char? to correspond to unicode-scalar-value?.
* Rename base-char? to bitless-char?.
* Eliminate char-integer-limit, unicode-char-code?, and char->scalar-value.

Chris Hanson [Thu, 16 Feb 2017 02:52:17 +0000 (18:52 -0800)]
Define xml-char? and use it in favor of unicode-char?.

Matt Birkholz [Wed, 15 Feb 2017 20:43:20 +0000 (13:43 -0700)]
Add support for deprecated bindings.

Matt Birkholz [Wed, 15 Feb 2017 20:46:09 +0000 (13:46 -0700)]
Punt warning about "copying large block"s (e.g. Edwin buffers).

Matt Birkholz [Wed, 15 Feb 2017 20:48:30 +0000 (13:48 -0700)]
ref-manual: Use an @xref to make Subprocess Options easier to find.

Matt Birkholz [Wed, 15 Feb 2017 23:00:37 +0000 (16:00 -0700)]
runtime/parse: typo/thinko

Chris Hanson [Wed, 15 Feb 2017 09:34:27 +0000 (01:34 -0800)]
Fix predicate relationship.

Chris Hanson [Wed, 15 Feb 2017 09:31:59 +0000 (01:31 -0800)]
Fix broken test.

Chris Hanson [Wed, 15 Feb 2017 09:27:38 +0000 (01:27 -0800)]
Change character sets to be defined over code points.

Chris Hanson [Wed, 15 Feb 2017 09:23:32 +0000 (01:23 -0800)]
Account for the fact that UCD procedures accept all code points.

Chris Hanson [Wed, 15 Feb 2017 09:02:10 +0000 (01:02 -0800)]
Change the UCD converter to pay attention to the remaining elements.

This guarantees that every code point is represented by the generated tables.

Chris Hanson [Wed, 15 Feb 2017 06:15:08 +0000 (22:15 -0800)]
Always use names for separator:space characters.

Chris Hanson [Wed, 15 Feb 2017 05:16:52 +0000 (21:16 -0800)]
Add support for R7RS string \<newline> escape.

Chris Hanson [Wed, 15 Feb 2017 05:16:24 +0000 (21:16 -0800)]
Fix thinko in recent change.

Chris Hanson [Wed, 15 Feb 2017 05:07:07 +0000 (21:07 -0800)]
Spaces should be considered normal printing characters.

Chris Hanson [Wed, 15 Feb 2017 04:16:49 +0000 (20:16 -0800)]
Change make-signal-combiner to be iterative.

Chris Hanson [Wed, 15 Feb 2017 04:06:37 +0000 (20:06 -0800)]
Simplify make-signal-combiner interface.

Chris Hanson [Wed, 15 Feb 2017 04:03:41 +0000 (20:03 -0800)]
Fix unit test broken by recent change.

Chris Hanson [Wed, 15 Feb 2017 02:14:50 +0000 (18:14 -0800)]
Change char-set-invert to be iterative.

Chris Hanson [Wed, 15 Feb 2017 02:08:54 +0000 (18:08 -0800)]
Fix missing tail section in make-signal-combiner.

Also some no-op tweaks.

Chris Hanson [Tue, 14 Feb 2017 08:05:40 +0000 (00:05 -0800)]
Eliminate unused and incorrectly implemented ustring-capitalize.

Chris Hanson [Tue, 14 Feb 2017 07:56:49 +0000 (23:56 -0800)]
Tweak comment.

Chris Hanson [Tue, 14 Feb 2017 07:54:02 +0000 (23:54 -0800)]
Rewrite make-signal-combiner to take advantage of signal structure.

Chris Hanson [Tue, 14 Feb 2017 06:28:04 +0000 (22:28 -0800)]
Change char-set implementation to use "signals" instead of "ranges".

Chris Hanson [Tue, 14 Feb 2017 05:17:52 +0000 (21:17 -0800)]
Major refactor to minimize size of character sets.

Chris Hanson [Mon, 13 Feb 2017 10:12:36 +0000 (02:12 -0800)]
Eliminate unused binding.

Chris Hanson [Mon, 13 Feb 2017 10:12:24 +0000 (02:12 -0800)]
Fix typos in previous change.

Chris Hanson [Sun, 12 Feb 2017 22:12:59 +0000 (14:12 -0800)]
Change is-X-of from compound to parametric predicates.

Chris Hanson [Sun, 12 Feb 2017 20:13:32 +0000 (12:13 -0800)]
Rewrite unparser to pass context rather than use parameters.

Also eliminate unparser-table abstraction.

Chris Hanson [Sun, 12 Feb 2017 09:25:56 +0000 (01:25 -0800)]
Reduce the size of character sets by computing the old format on demand.

Chris Hanson [Sun, 12 Feb 2017 06:06:50 +0000 (22:06 -0800)]
Change printer to be smarter about when quoting is needed.

Chris Hanson [Sun, 12 Feb 2017 05:51:34 +0000 (21:51 -0800)]
Add some additional useful character sets.

Chris Hanson [Sun, 12 Feb 2017 05:50:52 +0000 (21:50 -0800)]
Fix bug: missed package name change in cold load.

Chris Hanson [Sun, 12 Feb 2017 05:31:04 +0000 (21:31 -0800)]
Allow conjoin and disjoin to be used with unregistered predicates.

Chris Hanson [Sun, 12 Feb 2017 01:21:13 +0000 (17:21 -0800)]
Add tables for CWCF, CWL, and CWU.

Chris Hanson [Sun, 12 Feb 2017 01:20:17 +0000 (17:20 -0800)]
Change code generator for boolean sets to use standard names.

Chris Hanson [Sun, 12 Feb 2017 00:41:07 +0000 (16:41 -0800)]
Rename ucd-table-glue to ucd-glue.

Chris Hanson [Sun, 12 Feb 2017 00:37:10 +0000 (16:37 -0800)]
Change pattern-white-space to pattern-whitespace for consistency.

Chris Hanson [Sat, 11 Feb 2017 23:42:52 +0000 (15:42 -0800)]
Rename port/char-set to textual-port-char-set.

Make it work on all textual ports and default to iso-8859-1.

Chris Hanson [Sat, 11 Feb 2017 23:37:47 +0000 (15:37 -0800)]
Add character sets to textual ports.

This will help the printer decide what characters it should emit.

Chris Hanson [Sat, 11 Feb 2017 22:41:01 +0000 (14:41 -0800)]
Implement char-set:unicode.

Chris Hanson [Sat, 11 Feb 2017 22:40:18 +0000 (14:40 -0800)]
Implement unicode-char-code?.

Chris Hanson [Sat, 11 Feb 2017 22:39:47 +0000 (14:39 -0800)]
Clean up char->digit and digit->char.

Chris Hanson [Sat, 11 Feb 2017 21:56:03 +0000 (13:56 -0800)]
Implement digit-value.

Chris Hanson [Sat, 11 Feb 2017 21:03:44 +0000 (13:03 -0800)]
Change generated tables to use characters instead of integers.

Chris Hanson [Sat, 11 Feb 2017 21:02:57 +0000 (13:02 -0800)]
Rename "WSpace" full name to "whitespace".

Chris Hanson [Sat, 11 Feb 2017 20:39:25 +0000 (12:39 -0800)]
Remove timestamp from generated files.

It forces a new check-in when nothing else has changed.

Chris Hanson [Sat, 11 Feb 2017 08:32:54 +0000 (00:32 -0800)]
Change implementation of #\<char> to show all "graphic" characters.

This isn't quite right -- it doesn't support Unicode very well -- but will do
for now.

Chris Hanson [Sat, 11 Feb 2017 08:32:12 +0000 (00:32 -0800)]
Fix bug: use atom delimiters instead of symbol-constituents.

Proper handling of parser character sets needs review.

Chris Hanson [Sat, 11 Feb 2017 07:52:59 +0000 (23:52 -0800)]
Implement proper handling of symbol quoting and case folding in parser.

Disallows use of | in symbols except at beginning and end.
Disallows use of \ in symbols unless in ||.

Chris Hanson [Sat, 11 Feb 2017 07:52:19 +0000 (23:52 -0800)]
Implement char-{down,fold,up}case-full and use in ustring.

Chris Hanson [Sat, 11 Feb 2017 06:42:30 +0000 (22:42 -0800)]
Use correct case-folding algorithm for symbols.

Chris Hanson [Sat, 11 Feb 2017 06:40:58 +0000 (22:40 -0800)]
Change ustring implementation to simplify to 8-bit legacy strings.

This was happening anyway given the previous definition of char-ascii?.

Chris Hanson [Sat, 11 Feb 2017 06:06:34 +0000 (22:06 -0800)]
Fix char-ascii? to be 7-bit instead of 8.

Also create char-8-bit?.
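
The distinction this commit draws can be sketched in Python (an analogy only, not the Scheme definitions): ASCII covers code points below 0x80, while an 8-bit character covers code points below 0x100. The function names mirror the Scheme predicates but are hypothetical Python stand-ins.

```python
def char_ascii(ch):
    """True when the character fits in 7 bits (code point 0 to 0x7F)."""
    return ord(ch) < 0x80


def char_8_bit(ch):
    """True when the character fits in 8 bits (code point 0 to 0xFF)."""
    return ord(ch) < 0x100
```

For example, "\u00e9" (e with acute accent, code point 0xE9) is an 8-bit character but not an ASCII one.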

Chris Hanson [Sat, 11 Feb 2017 05:20:28 +0000 (21:20 -0800)]
Fix bug: typo meant value of utfX->string was wrong.

Also, consistently use the char decoding procedures.

Chris Hanson [Sat, 11 Feb 2017 04:54:35 +0000 (20:54 -0800)]
Character case mappers should preserve the bits.

Chris Hanson [Sat, 11 Feb 2017 04:40:57 +0000 (20:40 -0800)]
Fix parser case-folding to use ustring-foldcase.

Chris Hanson [Sat, 11 Feb 2017 04:40:46 +0000 (20:40 -0800)]
Implement char-foldcase and ustring-foldcase.

Also fix implementations of ustring-{up,down}case.

Chris Hanson [Sat, 11 Feb 2017 04:39:03 +0000 (20:39 -0800)]
Add tables and support for case folding and string case conversion.

Chris Hanson [Fri, 10 Feb 2017 08:14:02 +0000 (00:14 -0800)]
Use non-pointer hash tables for UCD tables.

Chris Hanson [Fri, 10 Feb 2017 08:11:39 +0000 (00:11 -0800)]
Implement non-pointer hash tables.

These are like strong eq? hash tables but they don't rehash after gc.

Chris Hanson [Fri, 10 Feb 2017 08:03:24 +0000 (00:03 -0800)]
Implement much smarter code generation for UCD tables.

New generator generates character sets for binary-valued properties.
For code-point valued properties, it uses fixnum hash tables.
It also uses fixnum hash tables for the numeric-type property.

The end result of this is a considerable reduction in code size.

Chris Hanson [Fri, 10 Feb 2017 06:18:45 +0000 (22:18 -0800)]
Add header and explanatory comment to names.

Chris Hanson [Fri, 10 Feb 2017 06:14:53 +0000 (22:14 -0800)]
Add metadata to all of the XML properties.

Chris Hanson [Thu, 9 Feb 2017 08:12:52 +0000 (00:12 -0800)]
Correctly implement character case conversions and R7RS char sets.

Chris Hanson [Thu, 9 Feb 2017 08:10:50 +0000 (00:10 -0800)]
Optimize the ucd tables a bit.

Need to reconsider the boolean tables, which will be smaller and might be faster
as char sets.

Chris Hanson [Thu, 9 Feb 2017 07:47:57 +0000 (23:47 -0800)]
Change the ucd converter to store raw prop files in a standard place.

These files are being checked in, so it shouldn't be necessary to regenerate
them until the UCD is updated to a new version.

Chris Hanson [Wed, 8 Feb 2017 08:27:07 +0000 (00:27 -0800)]
Fix typo in previous change.

Chris Hanson [Wed, 8 Feb 2017 08:21:45 +0000 (00:21 -0800)]
Implement "computed" character sets.

Also define Unicode symbol characters.

Chris Hanson [Wed, 8 Feb 2017 06:29:17 +0000 (22:29 -0800)]
Add value conversions to the UCD property code generator.

This translates the string values into something more sensible for Scheme.

Chris Hanson [Wed, 8 Feb 2017 04:39:08 +0000 (20:39 -0800)]
Implement char-general-category.

Chris Hanson [Wed, 8 Feb 2017 04:35:19 +0000 (20:35 -0800)]
Add in the first Unicode property table: gc.

Chris Hanson [Wed, 8 Feb 2017 04:34:37 +0000 (20:34 -0800)]
Change the way boot inits work to accommodate packages with multiple files.

Chris Hanson [Wed, 8 Feb 2017 04:30:02 +0000 (20:30 -0800)]
Refactor both the stratifier and the code generator.

The stratifier now avoids the use of bit strings and just manipulates the ranges
appropriately as it groups them.  At the end it expands all the ranges so that
the nodes have minimum structure.  The code generator was modified to accept the
new input form.

The code generator has been changed to put all the terminal nodes at the
beginning of the table, and to hash-cons new non-terminal nodes.  It turns out
that there was a lot of duplication in the nodes, so this saves a bunch of
space.
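
Hash-consing, as mentioned above, can be sketched in a few lines of Python (this is not the generator's Scheme code): before a new node is added to the table, its contents are looked up in a dictionary, and an existing entry is reused when the contents match. The NodeTable class and its method names are hypothetical.

```python
class NodeTable:
    """Interns structurally equal nodes so each distinct node is stored once."""

    def __init__(self):
        self._index = {}  # node contents -> position in self.nodes
        self.nodes = []   # unique nodes, in creation order

    def intern(self, children):
        """Return the table index for this node, adding it only if new."""
        key = tuple(children)
        if key not in self._index:
            self._index[key] = len(self.nodes)
            self.nodes.append(key)
        return self._index[key]


table = NodeTable()
a = table.intern([0, 0, 1])
b = table.intern([0, 0, 1])  # same contents, so the same index comes back
c = table.intern([a, b])     # a node whose children refer to earlier entries
```

Because duplicate nodes map to one index, the table grows only with the number of distinct nodes, which is where the space saving comes from.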

Chris Hanson [Wed, 8 Feb 2017 04:23:41 +0000 (20:23 -0800)]
Fix nasty bug: modifying a hash table could scramble its buckets.

Chris Hanson [Tue, 7 Feb 2017 05:49:15 +0000 (21:49 -0800)]
Fix bug: typo broke linear dispatch coding.

Chris Hanson [Mon, 6 Feb 2017 05:39:36 +0000 (21:39 -0800)]
Some efficiency and layout improvements.