mit-scheme.git
Chris Hanson [Sun, 19 Feb 2017 09:29:04 +0000 (01:29 -0800)]
Move char->string to ustring.

Chris Hanson [Sun, 19 Feb 2017 09:26:04 +0000 (01:26 -0800)]
Eliminate now-unused ascii-string-copy.

Chris Hanson [Sun, 19 Feb 2017 09:21:31 +0000 (01:21 -0800)]
Eliminate a bunch of operations that modify strings in place.

These generally save a little memory but are difficult to implement with Unicode
strings.  It's not worth the trouble to keep them since the copying procedures
can be used instead.

Chris Hanson [Sun, 19 Feb 2017 09:09:13 +0000 (01:09 -0800)]
Fix typo.

Chris Hanson [Sun, 19 Feb 2017 09:05:52 +0000 (01:05 -0800)]
Move split/join code and string-null?.

Chris Hanson [Sun, 19 Feb 2017 09:00:26 +0000 (01:00 -0800)]
Eliminate now-unused code.

Chris Hanson [Sun, 19 Feb 2017 08:49:55 +0000 (00:49 -0800)]
Huge wave of changes to rename remaining "ustring" to "string".

With the single exception of make-ustring, which needs some thought.

Chris Hanson [Sun, 19 Feb 2017 01:52:10 +0000 (17:52 -0800)]
Implement multiple args for char comparisons.
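
For illustration only, here is a minimal Python sketch of what an n-ary character comparison means, assuming R7RS-style semantics in which char<? with several arguments holds when the code points are strictly increasing. The name char_less is hypothetical and nothing here is taken from the MIT Scheme sources.

```python
def char_less(*chars: str) -> bool:
    """Return True when every character's code point is strictly
    below the code point of the character that follows it."""
    codes = [ord(c) for c in chars]
    # Compare each adjacent pair; an empty or single-element list is
    # trivially ordered.
    return all(a < b for a, b in zip(codes, codes[1:]))


print(char_less("a", "b", "c"))  # True
print(char_less("a", "c", "b"))  # False
```

The pairwise formulation is what makes the multi-argument form more than a convenience: (char<? a b c) is equivalent to checking both (char<? a b) and (char<? b c).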

Chris Hanson [Sun, 19 Feb 2017 01:42:53 +0000 (17:42 -0800)]
Fix bugs exposed by unit tests.

Chris Hanson [Sun, 19 Feb 2017 01:42:37 +0000 (17:42 -0800)]
Add a bunch of unit tests swiped from Larceny.

Chris Hanson [Sun, 19 Feb 2017 01:42:09 +0000 (17:42 -0800)]
Implement ustring-{lower,upper}-case?.

Chris Hanson [Sat, 18 Feb 2017 10:39:40 +0000 (02:39 -0800)]
First draft of NFD normalization.
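
As background on what NFD normalization does, the following Python sketch uses the standard unicodedata module rather than anything from the MIT Scheme sources. It shows the two steps involved: canonical decomposition, and reordering of combining marks by canonical combining class (ccc). The helper names nfd and combining_classes are hypothetical.

```python
import unicodedata


def nfd(text: str) -> str:
    """Return the NFD (canonical decomposition) form of text."""
    return unicodedata.normalize("NFD", text)


def combining_classes(text: str) -> list:
    """Return the canonical combining class of each code point."""
    return [unicodedata.combining(ch) for ch in text]


# U+00E9 decomposes into "e" followed by U+0301 (combining acute).
print(nfd("\u00e9") == "e\u0301")  # True

# Marks are sorted by combining class: U+0323 (ccc 220) moves in
# front of U+0301 (ccc 230).
print(nfd("a\u0301\u0323") == "a\u0323\u0301")  # True
print(combining_classes("a\u0323\u0301"))  # [0, 220, 230]
```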

Chris Hanson [Sat, 18 Feb 2017 09:14:09 +0000 (01:14 -0800)]
Refactor the converter to separate the value mapping from the dispatcher.

Chris Hanson [Sat, 18 Feb 2017 07:46:30 +0000 (23:46 -0800)]
Add "NFD_QC" table.

Chris Hanson [Sat, 18 Feb 2017 07:44:51 +0000 (23:44 -0800)]
Clean up code generators a bit.  Add "dm" property.

Chris Hanson [Sat, 18 Feb 2017 07:43:48 +0000 (23:43 -0800)]
Add mappings for _QC properties.

Chris Hanson [Sat, 18 Feb 2017 06:20:14 +0000 (22:20 -0800)]
Add ucd-table-ccc.

Chris Hanson [Sat, 18 Feb 2017 04:40:26 +0000 (20:40 -0800)]
Another round of substitutions.

Chris Hanson [Sat, 18 Feb 2017 03:58:14 +0000 (19:58 -0800)]
Begin process of replacing string operations with ustring equivalents.

Chris Hanson [Sat, 18 Feb 2017 03:42:05 +0000 (19:42 -0800)]
Implement "slices", which provide a restricted view of a string.

This helps avoid the need for providing substring arguments everywhere.
Also, implement vector->ustring.
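
The idea of a slice can be sketched in Python as follows. This is an illustration under stated assumptions, not the MIT Scheme implementation: the class Slice, its fields, and the helper count_char are all hypothetical names. The point is that a procedure written against the view needs no separate start/end arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """A restricted, non-copying view of the range [start, end) of a string."""
    source: str
    start: int
    end: int

    def __post_init__(self):
        if not 0 <= self.start <= self.end <= len(self.source):
            raise ValueError("slice bounds out of range")

    def __len__(self) -> int:
        return self.end - self.start

    def ref(self, index: int) -> str:
        """Return the character at index, relative to the slice start."""
        if not 0 <= index < len(self):
            raise IndexError("slice index out of range")
        return self.source[self.start + index]


def count_char(view: Slice, ch: str) -> int:
    """Count occurrences of ch; note there are no start/end parameters."""
    return sum(1 for i in range(len(view)) if view.ref(i) == ch)


text = "hello world"
print(count_char(Slice(text, 0, len(text)), "o"))  # 2
print(count_char(Slice(text, 6, 11), "o"))         # 1
```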

Chris Hanson [Sat, 18 Feb 2017 01:26:23 +0000 (17:26 -0800)]
Collapse ustring implementations together to save space and time.

Chris Hanson [Sat, 18 Feb 2017 00:15:51 +0000 (16:15 -0800)]
Reorder code in ustring; plus a few small tweaks.

Chris Hanson [Fri, 17 Feb 2017 22:58:04 +0000 (14:58 -0800)]
Rename make-legacy-string to legacy-string-allocate.

Chris Hanson [Fri, 17 Feb 2017 06:48:01 +0000 (22:48 -0800)]
Guarantee that incoming characters don't have bucky bits.

Chris Hanson [Fri, 17 Feb 2017 06:43:25 +0000 (22:43 -0800)]
Change full-width string to use 3 bytes instead of 4.
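
Three bytes are enough because the largest Unicode code point, U+10FFFF, needs only 21 bits. The Python sketch below illustrates that arithmetic with a hypothetical packed layout; the function names and the little-endian byte order are assumptions for illustration and say nothing about the actual MIT Scheme representation.

```python
MAX_CODE_POINT = 0x10FFFF  # largest Unicode code point
BYTES_PER_CHAR = 3         # 24 bits, enough for the 21 bits required


def pack_code_points(text: str) -> bytes:
    """Store each code point in exactly three bytes (byte order assumed)."""
    out = bytearray()
    for ch in text:
        out += ord(ch).to_bytes(BYTES_PER_CHAR, "little")
    return bytes(out)


def unpack_code_points(data: bytes) -> str:
    """Inverse of pack_code_points."""
    if len(data) % BYTES_PER_CHAR:
        raise ValueError("length is not a multiple of 3")
    return "".join(
        chr(int.from_bytes(data[i:i + BYTES_PER_CHAR], "little"))
        for i in range(0, len(data), BYTES_PER_CHAR)
    )


print(MAX_CODE_POINT.bit_length())          # 21
print(len(pack_code_points("a\u03bb\U0001F600")))  # 9
```

Compared with four bytes per character, this saves a quarter of the storage for strings that need the full-width representation.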

Chris Hanson [Fri, 17 Feb 2017 06:27:03 +0000 (22:27 -0800)]
Reorganize ustring around operations.

Chris Hanson [Fri, 17 Feb 2017 06:17:15 +0000 (22:17 -0800)]
Move all legacy-string definitions into ustring.

This is preparation for moving all the old string code elsewhere.

Matt Birkholz [Thu, 16 Feb 2017 20:17:05 +0000 (13:17 -0700)]
runtime/char.scm (unicode-char?): unreferenced bound variable: cp

Matt Birkholz [Thu, 16 Feb 2017 18:45:22 +0000 (11:45 -0700)]
Fix number-of-event-types constant; add missing event types.

Matt Birkholz [Thu, 16 Feb 2017 18:43:31 +0000 (11:43 -0700)]
x11, x11-screen: Remove references to deprecated bindings.

Chris Hanson [Thu, 16 Feb 2017 07:39:23 +0000 (23:39 -0800)]
Update EDITION, UPDATED, and regenerate detailed menu.

Chris Hanson [Thu, 16 Feb 2017 06:59:12 +0000 (22:59 -0800)]
Rewrite the Characters chapter to reflect the implementation.

Chris Hanson [Thu, 16 Feb 2017 06:55:36 +0000 (22:55 -0800)]
Clean up the character abstraction to be more consistent.

* Change unicode-char? to correspond to unicode-scalar-value?.
* Rename base-char? to bitless-char?.
* Eliminate char-integer-limit, unicode-char-code?, and char->scalar-value.

Chris Hanson [Thu, 16 Feb 2017 02:52:17 +0000 (18:52 -0800)]
Define xml-char? and use it in favor of unicode-char?.

Matt Birkholz [Wed, 15 Feb 2017 20:43:20 +0000 (13:43 -0700)]
Add support for deprecated bindings.

Matt Birkholz [Wed, 15 Feb 2017 20:46:09 +0000 (13:46 -0700)]
Punt warning about "copying large block"s (e.g. Edwin buffers).

Matt Birkholz [Wed, 15 Feb 2017 20:48:30 +0000 (13:48 -0700)]
ref-manual: Use an @xref to make Subprocess Options easier to find.

Matt Birkholz [Wed, 15 Feb 2017 23:00:37 +0000 (16:00 -0700)]
runtime/parse: typo/thinko

Chris Hanson [Wed, 15 Feb 2017 09:34:27 +0000 (01:34 -0800)]
Fix predicate relationship.

Chris Hanson [Wed, 15 Feb 2017 09:31:59 +0000 (01:31 -0800)]
Fix broken test.

Chris Hanson [Wed, 15 Feb 2017 09:27:38 +0000 (01:27 -0800)]
Change character sets to be defined over code points.

Chris Hanson [Wed, 15 Feb 2017 09:23:32 +0000 (01:23 -0800)]
Account for the fact that UCD procedures accept all code points.

Chris Hanson [Wed, 15 Feb 2017 09:02:10 +0000 (01:02 -0800)]
Change the UCD converter to pay attention to the remaining elements.

This guarantees that every code point is represented by the generated tables.

Chris Hanson [Wed, 15 Feb 2017 06:15:08 +0000 (22:15 -0800)]
Always use names for separator:space characters.

Chris Hanson [Wed, 15 Feb 2017 05:16:52 +0000 (21:16 -0800)]
Add support for R7RS string \<newline> escape.

Chris Hanson [Wed, 15 Feb 2017 05:16:24 +0000 (21:16 -0800)]
Fix thinko in recent change.

Chris Hanson [Wed, 15 Feb 2017 05:07:07 +0000 (21:07 -0800)]
Spaces should be considered normal printing characters.

Chris Hanson [Wed, 15 Feb 2017 04:16:49 +0000 (20:16 -0800)]
Change make-signal-combiner to be iterative.

Chris Hanson [Wed, 15 Feb 2017 04:06:37 +0000 (20:06 -0800)]
Simplify make-signal-combiner interface.

Chris Hanson [Wed, 15 Feb 2017 04:03:41 +0000 (20:03 -0800)]
Fix unit test broken by recent change.

Chris Hanson [Wed, 15 Feb 2017 02:14:50 +0000 (18:14 -0800)]
Change char-set-invert to be iterative.

Chris Hanson [Wed, 15 Feb 2017 02:08:54 +0000 (18:08 -0800)]
Fix missing tail section in make-signal-combiner.

Also some no-op tweaks.

Chris Hanson [Tue, 14 Feb 2017 08:05:40 +0000 (00:05 -0800)]
Eliminate unused and incorrectly implemented ustring-capitalize.

Chris Hanson [Tue, 14 Feb 2017 07:56:49 +0000 (23:56 -0800)]
Tweak comment.

Chris Hanson [Tue, 14 Feb 2017 07:54:02 +0000 (23:54 -0800)]
Rewrite make-signal-combiner to take advantage of signal structure.

Chris Hanson [Tue, 14 Feb 2017 06:28:04 +0000 (22:28 -0800)]
Change char-set implementation to use "signals" instead of "ranges".

Chris Hanson [Tue, 14 Feb 2017 05:17:52 +0000 (21:17 -0800)]
Major refactor to minimize size of character sets.

Chris Hanson [Mon, 13 Feb 2017 10:12:36 +0000 (02:12 -0800)]
Eliminate unused binding.

Chris Hanson [Mon, 13 Feb 2017 10:12:24 +0000 (02:12 -0800)]
Fix typos in previous change.

Chris Hanson [Sun, 12 Feb 2017 22:12:59 +0000 (14:12 -0800)]
Change is-X-of from compound to parametric predicates.

Chris Hanson [Sun, 12 Feb 2017 20:13:32 +0000 (12:13 -0800)]
Rewrite unparser to pass context rather than use parameters.

Also eliminate unparser-table abstraction.

Chris Hanson [Sun, 12 Feb 2017 09:25:56 +0000 (01:25 -0800)]
Reduce the size of character sets by computing the old format on demand.

Chris Hanson [Sun, 12 Feb 2017 06:06:50 +0000 (22:06 -0800)]
Change printer to be smarter about when quoting is needed.

Chris Hanson [Sun, 12 Feb 2017 05:51:34 +0000 (21:51 -0800)]
Add some additional useful character sets.

Chris Hanson [Sun, 12 Feb 2017 05:50:52 +0000 (21:50 -0800)]
Fix bug: missed package name change in cold load.

Chris Hanson [Sun, 12 Feb 2017 05:31:04 +0000 (21:31 -0800)]
Allow conjoin and disjoin to be used with unregistered predicates.

Chris Hanson [Sun, 12 Feb 2017 01:21:13 +0000 (17:21 -0800)]
Add tables for CWCF, CWL, and CWU.

Chris Hanson [Sun, 12 Feb 2017 01:20:17 +0000 (17:20 -0800)]
Change code generator for boolean sets to use standard names.

Chris Hanson [Sun, 12 Feb 2017 00:41:07 +0000 (16:41 -0800)]
Rename ucd-table-glue to ucd-glue.

Chris Hanson [Sun, 12 Feb 2017 00:37:10 +0000 (16:37 -0800)]
Change pattern-white-space to pattern-whitespace for consistency.

Chris Hanson [Sat, 11 Feb 2017 23:42:52 +0000 (15:42 -0800)]
Rename port/char-set to textual-port-char-set.

Make it work on all textual ports and default to iso-8859-1.

Chris Hanson [Sat, 11 Feb 2017 23:37:47 +0000 (15:37 -0800)]
Add character sets to textual ports.

This will help the printer decide what characters it should emit.

Chris Hanson [Sat, 11 Feb 2017 22:41:01 +0000 (14:41 -0800)]
Implement char-set:unicode.

Chris Hanson [Sat, 11 Feb 2017 22:40:18 +0000 (14:40 -0800)]
Implement unicode-char-code?.

Chris Hanson [Sat, 11 Feb 2017 22:39:47 +0000 (14:39 -0800)]
Clean up char->digit and digit->char.

Chris Hanson [Sat, 11 Feb 2017 21:56:03 +0000 (13:56 -0800)]
Implement digit-value.

Chris Hanson [Sat, 11 Feb 2017 21:03:44 +0000 (13:03 -0800)]
Change generated tables to use characters instead of integers.

Chris Hanson [Sat, 11 Feb 2017 21:02:57 +0000 (13:02 -0800)]
Rename "WSpace" full name to "whitespace".

Chris Hanson [Sat, 11 Feb 2017 20:39:25 +0000 (12:39 -0800)]
Remove timestamp from generated files.

It forces a new check-in when nothing else has changed.

Chris Hanson [Sat, 11 Feb 2017 08:32:54 +0000 (00:32 -0800)]
Change implementation of #\<char> to show all "graphic" characters.

This isn't quite right -- it doesn't support Unicode very well -- but will do
for now.

Chris Hanson [Sat, 11 Feb 2017 08:32:12 +0000 (00:32 -0800)]
Fix bug: use atom delimiters instead of symbol-constituents.

Proper handling of parser character sets needs review.

Chris Hanson [Sat, 11 Feb 2017 07:52:59 +0000 (23:52 -0800)]
Implement proper handling of symbol quoting and case folding in parser.

Disallows use of | in symbols except at beginning and end.
Disallows use of \ in symbols unless in ||.

Chris Hanson [Sat, 11 Feb 2017 07:52:19 +0000 (23:52 -0800)]
Implement char-{down,fold,up}case-full and use in ustring.
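
The "full" variants matter because a full Unicode case mapping can turn one character into several, which a char-to-char mapping cannot express. The Python sketch below illustrates this with Python's built-in string methods, which apply full case mappings; it is background only and does not reflect how MIT Scheme implements these procedures. The wrapper names are hypothetical.

```python
def upcase_full(text: str) -> str:
    """Full uppercase mapping; the result may be longer than the input."""
    return text.upper()


def foldcase_full(text: str) -> str:
    """Full case folding, intended for caseless comparison."""
    return text.casefold()


# U+00DF (sharp s) maps to two characters under full mappings.
print(upcase_full("\u00df"))    # SS
print(foldcase_full("\u00df"))  # ss

# U+FB01 (the "fi" ligature) also expands when uppercased.
print(upcase_full("\ufb01"))    # FI

# Folding makes these two spellings compare equal.
print(foldcase_full("Stra\u00dfe") == foldcase_full("STRASSE"))  # True
```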

Chris Hanson [Sat, 11 Feb 2017 06:42:30 +0000 (22:42 -0800)]
Use correct case-folding algorithm for symbols.

Chris Hanson [Sat, 11 Feb 2017 06:40:58 +0000 (22:40 -0800)]
Change ustring implementation to simplify to 8-bit legacy strings.

This was happening anyway given the previous definition of char-ascii?.

Chris Hanson [Sat, 11 Feb 2017 06:06:34 +0000 (22:06 -0800)]
Fix char-ascii? to be 7-bit instead of 8.

Also create char-8-bit?.
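
The distinction can be sketched in Python as follows, assuming the two predicates test code-point ranges: ASCII is a 7-bit code (0 to 127), while 8-bit characters cover 0 to 255. The function names mirror the Scheme procedures but the bodies are an illustration, not the MIT Scheme definitions.

```python
def char_ascii(ch: str) -> bool:
    """True for 7-bit characters, i.e. code points 0x00 through 0x7F."""
    return ord(ch) < 0x80


def char_8_bit(ch: str) -> bool:
    """True for characters that fit in one byte, 0x00 through 0xFF."""
    return ord(ch) < 0x100


# U+00E9 fits in 8 bits but is not ASCII; U+03BB fits in neither.
for ch in ("A", "\u00e9", "\u03bb"):
    print(hex(ord(ch)), char_ascii(ch), char_8_bit(ch))
```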

Chris Hanson [Sat, 11 Feb 2017 05:20:28 +0000 (21:20 -0800)]
Fix bug: typo meant value of utfX->string was wrong.

Also, consistently use the char decoding procedures.

Chris Hanson [Sat, 11 Feb 2017 04:54:35 +0000 (20:54 -0800)]
Character case mappers should preserve the bits.

Chris Hanson [Sat, 11 Feb 2017 04:40:57 +0000 (20:40 -0800)]
Fix parser case-folding to use ustring-foldcase.

Chris Hanson [Sat, 11 Feb 2017 04:40:46 +0000 (20:40 -0800)]
Implement char-foldcase and ustring-foldcase.

Also fix implementations of ustring-{up,down}case.

Chris Hanson [Sat, 11 Feb 2017 04:39:03 +0000 (20:39 -0800)]
Add tables and support for case folding and string case conversion.

Chris Hanson [Fri, 10 Feb 2017 08:14:02 +0000 (00:14 -0800)]
Use non-pointer hash tables for UCD tables.

Chris Hanson [Fri, 10 Feb 2017 08:11:39 +0000 (00:11 -0800)]
Implement non-pointer hash tables.

These are like strong eq? hash tables but they don't rehash after gc.

Chris Hanson [Fri, 10 Feb 2017 08:03:24 +0000 (00:03 -0800)]
Implement much smarter code generation for UCD tables.

New generator generates character sets for binary-valued properties.
For code-point valued properties, it uses fixnum hash tables.
It also uses fixnum hash tables for the numeric-type property.

The end result of this is a considerable reduction in code size.

Chris Hanson [Fri, 10 Feb 2017 06:18:45 +0000 (22:18 -0800)]
Add header and explanatory comment to names.

Chris Hanson [Fri, 10 Feb 2017 06:14:53 +0000 (22:14 -0800)]
Add metadata to all of the XML properties.

Chris Hanson [Thu, 9 Feb 2017 08:12:52 +0000 (00:12 -0800)]
Correctly implement character case conversions and R7RS char sets.

Chris Hanson [Thu, 9 Feb 2017 08:10:50 +0000 (00:10 -0800)]
Optimize the ucd tables a bit.

Need to reconsider the boolean tables, which will be smaller and might be faster
as char sets.

Chris Hanson [Thu, 9 Feb 2017 07:47:57 +0000 (23:47 -0800)]
Change the ucd converter to store raw prop files in a standard place.

These files are being checked in, so it shouldn't be necessary to regenerate
them until the UCD is updated to a new version.

Chris Hanson [Wed, 8 Feb 2017 08:27:07 +0000 (00:27 -0800)]
Fix typo in previous change.