mit-scheme.git
Chris Hanson [Sun, 23 Apr 2017 01:45:49 +0000 (18:45 -0700)]
Redefine substring as different from string-copy.

They are different in only one respect: string-copy always returns a mutable
string, while substring always returns an immutable string.

Chris Hanson [Sun, 23 Apr 2017 01:17:37 +0000 (18:17 -0700)]
Convert list->string, vector->string to use string-builder.

Chris Hanson [Sun, 23 Apr 2017 01:14:39 +0000 (18:14 -0700)]
Fix call to string-builder that was missed.

Chris Hanson [Sun, 23 Apr 2017 00:54:10 +0000 (17:54 -0700)]
Simplify string, string*, string-append, string-append*.

Chris Hanson [Sun, 23 Apr 2017 00:53:53 +0000 (17:53 -0700)]
Fix typo causing memory corruption.

Chris Hanson [Sat, 22 Apr 2017 07:20:30 +0000 (00:20 -0700)]
Change string-copy to return legacy string only if arg is also legacy.

Chris Hanson [Sat, 22 Apr 2017 07:17:19 +0000 (00:17 -0700)]
Move NFC marking from canonical-composition to string->nfc.

Chris Hanson [Sat, 22 Apr 2017 07:05:56 +0000 (00:05 -0700)]
Significantly simplify string-builder.

* Eliminate options; now just optional buffer-length.
* Result type is specified at build rather than up front.
* Eliminate never-exported make-string-builder.

Chris Hanson [Fri, 21 Apr 2017 23:48:44 +0000 (16:48 -0700)]
Change string->nfc to return immutable value, and optimize a bit.

Chris Hanson [Fri, 21 Apr 2017 23:48:03 +0000 (16:48 -0700)]
Support TEST environment variable in "make check".

Also clean up output slightly.

Chris Hanson [Fri, 21 Apr 2017 23:22:11 +0000 (16:22 -0700)]
string->nfd: also convert mutable strings already in NFD.

Chris Hanson [Fri, 21 Apr 2017 23:03:18 +0000 (16:03 -0700)]
Change string->nfd to return immutable value.

Chris Hanson [Fri, 21 Apr 2017 22:33:19 +0000 (15:33 -0700)]
Change builder options to distinguish between mutable and legacy results.

Chris Hanson [Fri, 21 Apr 2017 22:04:17 +0000 (15:04 -0700)]
Rearrange and optimize.  Also make ustring1 be zero-terminated.

Chris Hanson [Fri, 21 Apr 2017 22:03:49 +0000 (15:03 -0700)]
Mark ignored binding.

Chris Hanson [Fri, 21 Apr 2017 07:22:29 +0000 (00:22 -0700)]
Change Edwin's implementation of strings to work for all "string-ish" types.

Chris Hanson [Fri, 21 Apr 2017 07:21:41 +0000 (00:21 -0700)]
Add tagging support for unicode-string.

Also generate better error for unknown type codes.

Chris Hanson [Fri, 21 Apr 2017 07:21:14 +0000 (00:21 -0700)]
Change string primitives to uniformly support all "string-ish" types.

Chris Hanson [Fri, 21 Apr 2017 05:32:27 +0000 (22:32 -0700)]
Change string-builder to generate immutable strings by default.

Also fix bug in string->list, which assumed mutable inputs.

Chris Hanson [Thu, 20 Apr 2017 06:00:54 +0000 (23:00 -0700)]
Now that legacy string has the same layout as ustring1, merge handling of both.

Chris Hanson [Thu, 20 Apr 2017 00:44:44 +0000 (17:44 -0700)]
Allow string operations to take Unicode strings with 1 byte per CP.

Chris Hanson [Wed, 19 Apr 2017 05:18:24 +0000 (22:18 -0700)]
Change string comparisons to normalize to NFC prior to comparing.

The procedures that return index values have not been updated since it's not
obvious what to do with them.  Comparison is meaningless for non-normalized
strings, so it's necessary that all comparisons be done between normalized
strings.  This means either (a) require compared strings to be normalized before
calling the comparator, or (b) have the comparator do normalization on the
arguments.  If (b) is chosen, then the returned index value will be wrong in the
case where the arguments aren't normalized, as it will refer to the normalized
strings, not the arguments.

I'm considering choosing (b) and changing the definitions of these procedures to
return a slice into the normalized strings instead of an index.  However, the
upcoming implementation of immutable strings may make it simple for every
immutable string to be normalized, which may make (a) feasible.

For now I'm going to ignore this, which is fine as long as only ASCII strings
are compared.
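
The index problem described above can be seen in a minimal Python sketch. It uses Python's standard `unicodedata` module rather than MIT Scheme's own procedures, and `mismatch_index` and `nfc` are hypothetical helpers written only for this illustration.

```python
import unicodedata


def mismatch_index(a: str, b: str) -> int:
    """Index of the first differing code point, or -1 if the strings are equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return -1 if len(a) == len(b) else min(len(a), len(b))


def nfc(s: str) -> str:
    """Normalize to NFC using the standard library."""
    return unicodedata.normalize("NFC", s)


decomposed = "cafe\u0301s"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "caf\u00e9t"     # precomposed LATIN SMALL LETTER E WITH ACUTE

# Option (b): the comparator normalizes its arguments before comparing.
raw_index = mismatch_index(decomposed, composed)
nfc_index = mismatch_index(nfc(decomposed), nfc(composed))
```

The index computed on the normalized strings points at the differing final letter there, but the same index in the non-normalized argument lands on the combining accent, which is the mismatch the message warns about.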

Chris Hanson [Wed, 19 Apr 2017 04:57:52 +0000 (21:57 -0700)]
Rewrite string-builder for performance.

Chris Hanson [Wed, 19 Apr 2017 04:25:03 +0000 (21:25 -0700)]
Rewrite string copying for performance.

Chris Hanson [Wed, 19 Apr 2017 03:17:47 +0000 (20:17 -0700)]
More refactoring of unicode-string layout.

Taylor R Campbell [Tue, 18 Apr 2017 18:59:01 +0000 (18:59 +0000)]
Teach top-level clean target to clean tools too.

Chris Hanson [Mon, 17 Apr 2017 04:49:40 +0000 (21:49 -0700)]
A round of small changes in preparation for supporting immutable strings.

Chris Hanson [Mon, 17 Apr 2017 03:17:43 +0000 (20:17 -0700)]
Implement compiler support for new primitives.

Chris Hanson [Mon, 17 Apr 2017 02:08:22 +0000 (19:08 -0700)]
Change Unicode strings to store flag in type bits of length.

Chris Hanson [Mon, 17 Apr 2017 02:08:12 +0000 (19:08 -0700)]
D'oh! Hook up printer to new string type.

Chris Hanson [Mon, 17 Apr 2017 01:47:37 +0000 (18:47 -0700)]
Implement primitives to read and write type/datum of object in memory.

Chris Hanson [Mon, 17 Apr 2017 01:47:28 +0000 (18:47 -0700)]
Return end-index of TO from bytevector-copy!.
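
Returning the end index of the destination makes consecutive copies easy to chain. A rough Python analogue, assuming the R7RS argument order (to, at, from, start, end); `copy_into` is a hypothetical name and this is not the MIT Scheme implementation:

```python
from typing import Optional


def copy_into(to: bytearray, at: int, source: bytes,
              start: int = 0, end: Optional[int] = None) -> int:
    """Copy source[start:end] into `to` beginning at `at`.

    Returns the index in `to` just past the last byte written.
    """
    if end is None:
        end = len(source)
    if not 0 <= start <= end <= len(source):
        raise ValueError("bad source range")
    count = end - start
    if at < 0 or at + count > len(to):
        raise ValueError("destination too small")
    to[at:at + count] = source[start:end]
    return at + count


# Each call's result is the `at` argument of the next call.
buffer = bytearray(8)
position = copy_into(buffer, 0, b"abc")
position = copy_into(buffer, position, b"de")
```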

Taylor R Campbell [Sat, 15 Apr 2017 18:57:34 +0000 (18:57 +0000)]
No need for X in the liarc bootstrap build.

Taylor R Campbell [Sat, 15 Apr 2017 18:55:48 +0000 (18:55 +0000)]
Splice shell arguments with ${1+"$@"}.

Leave as "${@}" only where it is absolutely obvious there must be at
least one parameter anyway, e.g. because it is a full command line.

Chris Hanson [Fri, 14 Apr 2017 05:19:05 +0000 (22:19 -0700)]
Fix bug: primitive-byte-ref returns a fixnum, not a raw number.

Also clean up and reorganize open-coding of memory references.

Chris Hanson [Fri, 14 Apr 2017 05:18:57 +0000 (22:18 -0700)]
Fix typo.

Chris Hanson [Thu, 13 Apr 2017 06:21:29 +0000 (23:21 -0700)]
Change unicode string representation to be more compact and flexible.

The new design is more densely coded and provides for immutable strings with
different coding, as well as memoization of NFC/NFD status.  However, in this
change only the standard 3-byte mutable representation is implemented.

Chris Hanson [Thu, 13 Apr 2017 05:24:20 +0000 (22:24 -0700)]
Implement select-on-bytes-per-word for generating word-length-specific code.

Chris Hanson [Thu, 13 Apr 2017 05:23:52 +0000 (22:23 -0700)]
Eliminate condition for open-coding integer->char.

Chris Hanson [Thu, 13 Apr 2017 05:23:28 +0000 (22:23 -0700)]
Make sure that unicode strings are self-evaluating.

Chris Hanson [Thu, 13 Apr 2017 04:18:27 +0000 (21:18 -0700)]
Strip down code generated for primitive memory references.

Chris Hanson [Wed, 12 Apr 2017 05:35:10 +0000 (22:35 -0700)]
Implement open-coding of byte-ref primitives.

Chris Hanson [Wed, 12 Apr 2017 05:34:32 +0000 (22:34 -0700)]
Implement more primitive refs, and restrict to pointers only.

Chris Hanson [Wed, 12 Apr 2017 04:46:43 +0000 (21:46 -0700)]
Fix compilation issue.

Chris Hanson [Wed, 12 Apr 2017 04:46:38 +0000 (21:46 -0700)]
Implement allocate-nm-vector.

Chris Hanson [Wed, 12 Apr 2017 04:21:07 +0000 (21:21 -0700)]
Allocate new type unicode-string.

Chris Hanson [Wed, 12 Apr 2017 04:20:41 +0000 (21:20 -0700)]
Implement bytes-per-object.

Chris Hanson [Mon, 10 Apr 2017 04:08:57 +0000 (21:08 -0700)]
Eliminate unused multi-byte procedures.

No need to support a bunch of code that may never be used.

Chris Hanson [Sat, 1 Apr 2017 05:17:20 +0000 (22:17 -0700)]
Add 'copy? option to string-builder.

Chris Hanson [Fri, 31 Mar 2017 04:31:39 +0000 (21:31 -0700)]
Merge branch 'master' of git.sv.gnu.org:/srv/git/mit-scheme

Chris Hanson [Fri, 31 Mar 2017 04:30:55 +0000 (21:30 -0700)]
Fix bug: string output port must copy input strings.

Chris Hanson [Thu, 30 Mar 2017 06:31:37 +0000 (23:31 -0700)]
Fix bugs: typos caught by the macOS compiler.

Chris Hanson [Wed, 29 Mar 2017 05:17:35 +0000 (22:17 -0700)]
Add documentation for a few of the more recent string procedures.

Chris Hanson [Wed, 29 Mar 2017 05:02:22 +0000 (22:02 -0700)]
Fix string-for-primitive: it wasn't handling slices.

Chris Hanson [Wed, 29 Mar 2017 04:57:20 +0000 (21:57 -0700)]
Optimize string-in-nfX? since it's important that these be fast.
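
A fast "already normalized?" predicate matters because it lets callers skip the full normalization pass in the common case. A small Python sketch of that pattern, using the standard library's `unicodedata.is_normalized` (Python 3.8 or later) as a stand-in for the Scheme predicates; `to_nfc` is a hypothetical helper, and the commit's actual optimizations are not shown here:

```python
import unicodedata


def to_nfc(s: str) -> str:
    """Return s in NFC, doing the expensive work only when needed."""
    # ASCII-only strings are always in NFC, so this is the cheapest test.
    if s.isascii():
        return s
    # Otherwise ask the normalization check before converting.
    if unicodedata.is_normalized("NFC", s):
        return s
    return unicodedata.normalize("NFC", s)


plain = "hello"
already_composed = "caf\u00e9"
needs_work = "cafe\u0301"
```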

Chris Hanson [Wed, 29 Mar 2017 04:52:44 +0000 (21:52 -0700)]
Normalize strings prior to hashing so equivalent sequences hash the same.

I've arbitrarily chosen NFD because it's faster than NFC, but a case could be
made that NFC is preferable.
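
The effect of normalizing before hashing can be sketched in Python with the standard `unicodedata` module. `normalized_hash` is a hypothetical helper for this illustration, not the MIT Scheme hash procedure.

```python
import unicodedata


def normalized_hash(s: str) -> int:
    """Hash the NFD form so canonically equivalent strings collide on purpose."""
    return hash(unicodedata.normalize("NFD", s))


composed = "\u00e9"     # precomposed LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT

# The two spellings differ code point by code point,
# yet they hash identically once normalized.
same_bucket = normalized_hash(composed) == normalized_hash(decomposed)
```

Without this step, a hash table keyed on strings could hold two entries for what a reader sees as the same text.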

Chris Hanson [Wed, 29 Mar 2017 03:15:11 +0000 (20:15 -0700)]
Eliminate Hangul Jamo from canonical cm/dm tables.

This makes the bands about 1 MB smaller.

Chris Hanson [Wed, 29 Mar 2017 01:16:07 +0000 (18:16 -0700)]
Implement algorithmic Hangul Jamo compose/decompose.
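
For reference, the arithmetic Hangul mapping defined by the Unicode Standard can be sketched in Python as follows. The constants come from the standard; the function names are hypothetical, and the actual MIT Scheme code may be organized differently.

```python
from typing import List

# Constants from the Unicode Standard's Hangul syllable algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT  # 11172 precomposed syllables


def decompose_hangul(cp: int) -> List[int]:
    """Split a precomposed syllable into its jamo; pass other code points through."""
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [cp]
    lead = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trail = T_BASE + s_index % T_COUNT
    return [lead, vowel] if trail == T_BASE else [lead, vowel, trail]


def compose_hangul(cps: List[int]) -> List[int]:
    """Recombine L+V and LV+T jamo sequences into precomposed syllables."""
    out: List[int] = []
    for cp in cps:
        if out:
            last = out[-1]
            l_index, v_index = last - L_BASE, cp - V_BASE
            if 0 <= l_index < L_COUNT and 0 <= v_index < V_COUNT:
                out[-1] = S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
                continue
            s_index, t_index = last - S_BASE, cp - T_BASE
            if (0 <= s_index < S_COUNT and s_index % T_COUNT == 0
                    and 0 < t_index < T_COUNT):
                out[-1] = last + t_index
                continue
        out.append(cp)
    return out
```

Because the mapping is pure arithmetic, none of the 11172 syllables needs an entry in a composition or decomposition table.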

Chris Hanson [Tue, 28 Mar 2017 06:47:03 +0000 (23:47 -0700)]
Fix code-generation bug in fast-division.

Apparently this code was insufficiently tested.

Chris Hanson [Mon, 27 Mar 2017 03:59:27 +0000 (20:59 -0700)]
Change NFC_QC to be a boolean-valued table and exploit that.

Chris Hanson [Mon, 27 Mar 2017 03:46:57 +0000 (20:46 -0700)]
Have string builder track max code point written.

This is used for two distinct purposes in the finisher.
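
The message does not say what the two purposes are. One plausible use, shown here purely as an assumption, is choosing the narrowest storage width for the result. The Python class below is a hypothetical illustration and does not reflect the real string-builder interface.

```python
class StringBuilder:
    """Collects characters and remembers the largest code point seen."""

    def __init__(self) -> None:
        self._code_points = []
        self._max_cp = 0

    def append(self, ch: str) -> None:
        cp = ord(ch)
        self._code_points.append(cp)
        if cp > self._max_cp:
            self._max_cp = cp

    def bytes_per_code_point(self) -> int:
        # Assumed policy: pick the smallest width that fits every code point.
        if self._max_cp < 0x100:
            return 1
        if self._max_cp < 0x10000:
            return 2
        return 3

    def build(self) -> bytes:
        # The "finisher": encode every code point at the chosen fixed width.
        width = self.bytes_per_code_point()
        return b"".join(cp.to_bytes(width, "little") for cp in self._code_points)


builder = StringBuilder()
for ch in "abc":
    builder.append(ch)
narrow = builder.build()
```

Tracking the maximum while appending avoids a second scan over the contents at build time.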

Chris Hanson [Sun, 26 Mar 2017 23:12:04 +0000 (16:12 -0700)]
Change string-builder to normalize to NFC by default.

Chris Hanson [Sun, 26 Mar 2017 20:50:46 +0000 (13:50 -0700)]
Change symbols to be in NFC.

Chris Hanson [Sun, 26 Mar 2017 20:45:13 +0000 (13:45 -0700)]
Working NFC implementation.

Chris Hanson [Sat, 25 Mar 2017 22:19:56 +0000 (15:19 -0700)]
Initial draft of NFC support; still need to write composition.

Chris Hanson [Sat, 25 Mar 2017 22:19:21 +0000 (15:19 -0700)]
Add NFC_QC and Comp_EX tables.

Chris Hanson [Mon, 20 Mar 2017 03:22:29 +0000 (20:22 -0700)]
Synthesize canonical-dm table and use it to speed up decomposition.

Chris Hanson [Mon, 20 Mar 2017 00:53:51 +0000 (17:53 -0700)]
Fix bug in canonical-ordering algorithm.

Chris Hanson [Mon, 20 Mar 2017 00:53:25 +0000 (17:53 -0700)]
Refactor test to make it easier to see the failures.

Chris Hanson [Mon, 20 Mar 2017 00:52:38 +0000 (17:52 -0700)]
Boost default stack size -- I'm tired of blowing out the stack.

Chris Hanson [Sun, 19 Mar 2017 20:20:31 +0000 (13:20 -0700)]
D'oh!  String normalization tests were broken, which explains why they pass.

Chris Hanson [Sun, 19 Mar 2017 08:16:22 +0000 (01:16 -0700)]
Squeeze a little more space out of the tables.

Chris Hanson [Sun, 19 Mar 2017 08:03:54 +0000 (01:03 -0700)]
Implement decomposition-type table and use it for correct NFD conversion.

Chris Hanson [Sun, 19 Mar 2017 03:49:04 +0000 (20:49 -0700)]
Further compress the size of the UCD tables.

As of this latest set of changes, the total size seems to be about a megabyte,
which is much better than the 4-5 megabytes of earlier revisions.

Chris Hanson [Sun, 19 Mar 2017 03:46:59 +0000 (20:46 -0700)]
Add a bunch of converters to/from bytevectors.

Chris Hanson [Sun, 19 Mar 2017 02:47:29 +0000 (19:47 -0700)]
Fix some bugs in vector->string.

Chris Hanson [Sun, 19 Mar 2017 02:34:17 +0000 (19:34 -0700)]
Add hack to force printing chars in old format; can be eliminated after 9.3.

Chris Hanson [Sun, 19 Mar 2017 02:13:29 +0000 (19:13 -0700)]
More simplification.

Chris Hanson [Sun, 19 Mar 2017 02:08:25 +0000 (19:08 -0700)]
Simplify parse-atom to not fold case.

Chris Hanson [Sun, 19 Mar 2017 00:08:31 +0000 (17:08 -0700)]
Use ucd-X-value directly in ustring.

Chris Hanson [Sat, 18 Mar 2017 21:34:38 +0000 (14:34 -0700)]
Convert all of the UCD tables to use bitwise tries.
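
The general idea of a bitwise trie for code point properties is to index by bit fields of the code point and share identical leaf blocks. A two-level Python sketch of that idea follows. The 8-bit split, the function names, and the table layout are assumptions made for illustration; the log does not describe the layout MIT Scheme actually generates.

```python
MAX_CP = 0x10FFFF
LOW_BITS = 8                      # assumed split: low 8 bits select within a block
BLOCK_SIZE = 1 << LOW_BITS
LOW_MASK = BLOCK_SIZE - 1


def build_trie(values, default=0):
    """Build (index, leaves) from a dict mapping code point -> property value."""
    by_high = {}
    for cp, value in values.items():
        by_high.setdefault(cp >> LOW_BITS, {})[cp & LOW_MASK] = value
    default_block = (default,) * BLOCK_SIZE
    leaves = [default_block]
    seen = {default_block: 0}
    index = [0] * ((MAX_CP >> LOW_BITS) + 1)
    for high, entries in by_high.items():
        block = tuple(entries.get(low, default) for low in range(BLOCK_SIZE))
        if block not in seen:     # identical blocks are stored only once
            seen[block] = len(leaves)
            leaves.append(block)
        index[high] = seen[block]
    return index, leaves


def lookup(trie, cp):
    """Two array reads: high bits choose the block, low bits the entry."""
    index, leaves = trie
    return leaves[index[cp >> LOW_BITS]][cp & LOW_MASK]


# Sample data: canonical combining classes of three combining marks.
sample = {0x0300: 230, 0x0301: 230, 0x0327: 202}
trie = build_trie(sample)
```

Most of the code space maps to the single shared default block, which is where the space savings of this kind of structure come from.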

Chris Hanson [Sat, 18 Mar 2017 21:34:15 +0000 (14:34 -0700)]
Rework the character parser to handle backslash reasonably.

Chris Hanson [Sat, 18 Mar 2017 04:41:18 +0000 (21:41 -0700)]
Add u16/u32 equivalents to bytevector.

Chris Hanson [Wed, 15 Mar 2017 05:49:00 +0000 (22:49 -0700)]
Add draft of inversion-map code generator.

Chris Hanson [Mon, 13 Mar 2017 01:57:45 +0000 (18:57 -0700)]
Update explanation of HIGH range.

Chris Hanson [Mon, 13 Mar 2017 01:53:53 +0000 (18:53 -0700)]
Rename "signal" to "inversion list" since that's the accepted name.
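
An inversion list represents a set of code points as a sorted list of the boundaries at which membership flips. A minimal Python sketch of the technique, with hypothetical names unrelated to the MIT Scheme source:

```python
from bisect import bisect_right


def in_inversion_list(inversions, cp):
    """True if cp belongs to the set described by the sorted boundary list.

    Boundaries at even positions start a range (inclusive) and boundaries
    at odd positions end it (exclusive), so membership is the parity of
    the number of boundaries at or below cp.
    """
    return bisect_right(inversions, cp) % 2 == 1


# ASCII letters as two half-open ranges: [0x41, 0x5B) and [0x61, 0x7B).
ASCII_LETTERS = [0x41, 0x5B, 0x61, 0x7B]
```

A membership test is a single binary search, and the list stays small for sets made of a few long runs.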

Chris Hanson [Sat, 11 Mar 2017 09:12:25 +0000 (01:12 -0800)]
Change normalization test to use characters instead of integers.

Chris Hanson [Sat, 11 Mar 2017 09:10:01 +0000 (01:10 -0800)]
Speed up reading of #\x... characters.

Chris Hanson [Sat, 11 Mar 2017 08:42:21 +0000 (00:42 -0800)]
Use string-builder instead of call-with-output-string.

Chris Hanson [Sat, 11 Mar 2017 08:34:39 +0000 (00:34 -0800)]
Implement test case for string->nfd.

Chris Hanson [Fri, 10 Mar 2017 07:37:19 +0000 (23:37 -0800)]
Fix symbols using now-illegal syntax.

Chris Hanson [Fri, 10 Mar 2017 07:07:23 +0000 (23:07 -0800)]
Rewrite parser so that it supports Unicode input.

Chris Hanson [Fri, 10 Mar 2017 04:43:23 +0000 (20:43 -0800)]
Fix missed references to parser.

Chris Hanson [Thu, 9 Mar 2017 06:59:15 +0000 (22:59 -0800)]
Major refactoring of the parser.

* Eliminate kludge that makes the parser environment sensitive.
* Eliminate most of the undocumented dynamic parameters.
* Eliminate the ability to change the character sets used in parsing.
* Eliminate never-used parse-objects.
* Don't export parse-object -- it's basically the same as read.
* Convert parser to use define-deferred instead of an explicit initializer.
* Streamline internals somewhat.

Chris Hanson [Wed, 8 Mar 2017 06:18:08 +0000 (22:18 -0800)]
Add file-attributes tests that test the parser's interface.

Chris Hanson [Wed, 8 Mar 2017 06:11:03 +0000 (22:11 -0800)]
Add input-line operation to input strings.

Chris Hanson [Wed, 8 Mar 2017 05:59:18 +0000 (21:59 -0800)]
Implement port-properties.

Chris Hanson [Wed, 8 Mar 2017 05:37:27 +0000 (21:37 -0800)]
Reimplement interface between parser and file-attributes parser.

New interface just collects the comment and passes it to the parser.

Chris Hanson [Wed, 8 Mar 2017 05:29:58 +0000 (21:29 -0800)]
Reindent test cases for easier reading.

Chris Hanson [Wed, 8 Mar 2017 04:20:15 +0000 (20:20 -0800)]
Fix char-in-set? so it works with all characters.