Chris Hanson [Wed, 29 Mar 2017 04:52:44 +0000 (21:52 -0700)]
Normalize strings prior to hashing so equivalent sequences hash the same.
I've arbitrarily chosen NFD because it's faster than NFC, but a case could be
made that NFC is preferable.
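As a rough illustration of the idea (not the MIT/GNU Scheme code itself), the
sketch below uses Python's standard unicodedata module. The helper name
normalized_hash is hypothetical and exists only for this example.

```python
import unicodedata


def normalized_hash(s: str) -> int:
    """Hash a string after converting it to NFD.

    Hypothetical helper for illustration only: canonically equivalent
    strings normalize to the same code-point sequence, so they hash alike.
    """
    return hash(unicodedata.normalize("NFD", s))


composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

# The raw strings differ code point by code point, yet the hashes agree.
same_hash = normalized_hash(composed) == normalized_hash(decomposed)
```

Normalizing to NFC instead would give the same guarantee; the choice only
affects which canonical form is hashed.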
Chris Hanson [Wed, 29 Mar 2017 03:15:11 +0000 (20:15 -0700)]
Eliminate Hangul Jamo from canonical cm/dm tables.
This makes the bands about 1 MB smaller.
Chris Hanson [Wed, 29 Mar 2017 01:16:07 +0000 (18:16 -0700)]
Implement algorithmic Hangul Jamo compose/decompose.
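For reference, the arithmetic mapping between Hangul syllables and Jamo that
the Unicode Standard defines can be sketched as follows. This is a minimal
Python sketch, not the code from this commit; the function names
decompose_hangul and compose_hangul are assumptions made for the example.

```python
# Constants from the Unicode Standard's Hangul syllable algorithm.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT   # 588
S_COUNT = L_COUNT * N_COUNT   # 11172


def decompose_hangul(cp):
    """Return the Jamo code points for a precomposed syllable.

    Code points outside the syllable block are returned unchanged.
    """
    s_index = cp - S_BASE
    if not 0 <= s_index < S_COUNT:
        return [cp]
    parts = [L_BASE + s_index // N_COUNT,
             V_BASE + (s_index % N_COUNT) // T_COUNT]
    t_index = s_index % T_COUNT
    if t_index:  # index 0 means "no trailing consonant"
        parts.append(T_BASE + t_index)
    return parts


def compose_hangul(first, second):
    """Compose L+V into an LV syllable, or LV+T into an LVT syllable.

    Returns None when the pair does not compose algorithmically.
    """
    if (L_BASE <= first < L_BASE + L_COUNT
            and V_BASE <= second < V_BASE + V_COUNT):
        l_index, v_index = first - L_BASE, second - V_BASE
        return S_BASE + (l_index * V_COUNT + v_index) * T_COUNT
    if (S_BASE <= first < S_BASE + S_COUNT
            and (first - S_BASE) % T_COUNT == 0
            and T_BASE < second < T_BASE + T_COUNT):
        return first + (second - T_BASE)
    return None
```

Because every syllable is computed from these few constants, no per-syllable
table entries are needed for the Hangul block.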
Chris Hanson [Tue, 28 Mar 2017 06:47:03 +0000 (23:47 -0700)]
Fix code-generation bug in fast-division.
Apparently this code was insufficiently tested.
Chris Hanson [Mon, 27 Mar 2017 03:59:27 +0000 (20:59 -0700)]
Change NFC_QC to be a boolean-valued table and exploit that.
Chris Hanson [Mon, 27 Mar 2017 03:46:57 +0000 (20:46 -0700)]
Have string builder track max code point written.
This is used for two distinct purposes in the finisher.
Chris Hanson [Sun, 26 Mar 2017 23:12:04 +0000 (16:12 -0700)]
Change string-builder to normalize to NFC by default.
Chris Hanson [Sun, 26 Mar 2017 20:50:46 +0000 (13:50 -0700)]
Change symbols to be in NFC.
Chris Hanson [Sun, 26 Mar 2017 20:45:13 +0000 (13:45 -0700)]
Working NFC implementation.
Chris Hanson [Sat, 25 Mar 2017 22:19:56 +0000 (15:19 -0700)]
Initial draft of NFC support; still need to write composition.
Chris Hanson [Sat, 25 Mar 2017 22:19:21 +0000 (15:19 -0700)]
Add NFC_QC and Comp_EX tables.
Chris Hanson [Mon, 20 Mar 2017 03:22:29 +0000 (20:22 -0700)]
Synthesize canonical-dm table and use it to speed up decomposition.
Chris Hanson [Mon, 20 Mar 2017 00:53:51 +0000 (17:53 -0700)]
Fix bug in canonical-ordering algorithm.
Chris Hanson [Mon, 20 Mar 2017 00:53:25 +0000 (17:53 -0700)]
Refactor test to make it easier to see the failures.
Chris Hanson [Mon, 20 Mar 2017 00:52:38 +0000 (17:52 -0700)]
Boost default stack size -- I'm tired of blowing out the stack.
Chris Hanson [Sun, 19 Mar 2017 20:20:31 +0000 (13:20 -0700)]
D'oh! String normalization tests were broken, which explains why they pass.
Chris Hanson [Sun, 19 Mar 2017 08:16:22 +0000 (01:16 -0700)]
Squeeze a little more space out of the tables.
Chris Hanson [Sun, 19 Mar 2017 08:03:54 +0000 (01:03 -0700)]
Implement decomposition-type table and use it for correct NFD conversion.
Chris Hanson [Sun, 19 Mar 2017 03:49:04 +0000 (20:49 -0700)]
Further compress the size of the UCD tables.
As of this latest set of changes the total size seems to be around a megabyte,
which is much better than the 4-5 megabytes of earlier revisions.
Chris Hanson [Sun, 19 Mar 2017 03:46:59 +0000 (20:46 -0700)]
Add a bunch of converters to/from bytevectors.
Chris Hanson [Sun, 19 Mar 2017 02:47:29 +0000 (19:47 -0700)]
Fix some bugs in vector->string.
Chris Hanson [Sun, 19 Mar 2017 02:34:17 +0000 (19:34 -0700)]
Add hack to force printing chars in old format; can be eliminated after 9.3.
Chris Hanson [Sun, 19 Mar 2017 02:13:29 +0000 (19:13 -0700)]
More simplification.
Chris Hanson [Sun, 19 Mar 2017 02:08:25 +0000 (19:08 -0700)]
Simplify parse-atom to not fold case.
Chris Hanson [Sun, 19 Mar 2017 00:08:31 +0000 (17:08 -0700)]
Use ucd-X-value directly in ustring.
Chris Hanson [Sat, 18 Mar 2017 21:34:38 +0000 (14:34 -0700)]
Convert all of the UCD tables to use bitwise tries.
Chris Hanson [Sat, 18 Mar 2017 21:34:15 +0000 (14:34 -0700)]
Rework the character parser to handle backslash reasonably.
Chris Hanson [Sat, 18 Mar 2017 04:41:18 +0000 (21:41 -0700)]
Add u16/u32 equivalents to bytevector.
Chris Hanson [Wed, 15 Mar 2017 05:49:00 +0000 (22:49 -0700)]
Add draft of inversion-map code generator.
Chris Hanson [Mon, 13 Mar 2017 01:57:45 +0000 (18:57 -0700)]
Update explanation of HIGH range.
Chris Hanson [Mon, 13 Mar 2017 01:53:53 +0000 (18:53 -0700)]
Rename "signal" to "inversion list" since that's the accepted name.
Chris Hanson [Sat, 11 Mar 2017 09:12:25 +0000 (01:12 -0800)]
Change normalization test to use characters instead of integers.
Chris Hanson [Sat, 11 Mar 2017 09:10:01 +0000 (01:10 -0800)]
Speed up reading of #\x... characters.
Chris Hanson [Sat, 11 Mar 2017 08:42:21 +0000 (00:42 -0800)]
Use string-builder instead of call-with-output-string.
Chris Hanson [Sat, 11 Mar 2017 08:34:39 +0000 (00:34 -0800)]
Implement test case for string->nfd.
Chris Hanson [Fri, 10 Mar 2017 07:37:19 +0000 (23:37 -0800)]
Fix symbols using now-illegal syntax.
Chris Hanson [Fri, 10 Mar 2017 07:07:23 +0000 (23:07 -0800)]
Rewrite parser so that it supports Unicode input.
Chris Hanson [Fri, 10 Mar 2017 04:43:23 +0000 (20:43 -0800)]
Fix missed references to parser.
Chris Hanson [Thu, 9 Mar 2017 06:59:15 +0000 (22:59 -0800)]
Major refactoring of the parser.
* Eliminate kludge that makes the parser environment sensitive.
* Eliminate most of the undocumented dynamic parameters.
* Eliminate the ability to change the character sets used in parsing.
* Eliminate never-used parse-objects.
* Don't export parse-object -- it's basically the same as read.
* Convert parser to use define-deferred instead of an explicit initializer.
* Streamline internals somewhat.
Chris Hanson [Wed, 8 Mar 2017 06:18:08 +0000 (22:18 -0800)]
Add file-attributes tests that test the parser's interface.
Chris Hanson [Wed, 8 Mar 2017 06:11:03 +0000 (22:11 -0800)]
Add input-line operation to input strings.
Chris Hanson [Wed, 8 Mar 2017 05:59:18 +0000 (21:59 -0800)]
Implement port-properties.
Chris Hanson [Wed, 8 Mar 2017 05:37:27 +0000 (21:37 -0800)]
Reimplement interface between parser and file-attributes parser.
New interface just collects the comment and passes it to the parser.
Chris Hanson [Wed, 8 Mar 2017 05:29:58 +0000 (21:29 -0800)]
Reindent test cases for easier reading.
Chris Hanson [Wed, 8 Mar 2017 04:20:15 +0000 (20:20 -0800)]
Fix char-in-set? so it works with all characters.
Chris Hanson [Wed, 8 Mar 2017 04:11:26 +0000 (20:11 -0800)]
Small tweaks to file-attributes.
Chris Hanson [Wed, 8 Mar 2017 04:08:52 +0000 (20:08 -0800)]
Add file-attributes test to make check.
Chris Hanson [Tue, 7 Mar 2017 09:12:34 +0000 (01:12 -0800)]
Fix some issues with file-attribute parser; temporarily disable.
Chris Hanson [Tue, 7 Mar 2017 09:09:26 +0000 (01:09 -0800)]
Eliminate unused binding.
Chris Hanson [Tue, 7 Mar 2017 09:06:32 +0000 (01:06 -0800)]
Change host-adapter to be ignored except on 9.2.
Also fix typo in tagged-object type name.
Chris Hanson [Tue, 7 Mar 2017 06:11:45 +0000 (22:11 -0800)]
Eliminate support for custom parser tables.
Chris Hanson [Tue, 7 Mar 2017 05:55:15 +0000 (21:55 -0800)]
Merge branch 'master' of git.sv.gnu.org:/srv/git/mit-scheme
Chris Hanson [Tue, 7 Mar 2017 05:52:18 +0000 (21:52 -0800)]
Eliminate need for file-attributes parser to use custom parser table.
Also add tests for the parser using the conveniently-provided test strings.
Chris Hanson [Tue, 7 Mar 2017 05:02:46 +0000 (21:02 -0800)]
Don't save boot inits if there are none.
This exposed some packages with inits that weren't doing anything.
Matt Birkholz [Tue, 7 Mar 2017 04:13:40 +0000 (21:13 -0700)]
plugin ChangeLogs: Add missing cd command line.
Matt Birkholz [Tue, 7 Mar 2017 04:01:53 +0000 (21:01 -0700)]
doc: Use default htmldir, pdfdir, etc. Rename updated manpage.
Define docdir, part of the default htmldir, pdfdir, etc.
Replace \- (minus) with - (hyphen) in the manpage. (This was an old,
aesthetic choice?)
Matt Birkholz [Tue, 7 Mar 2017 03:49:40 +0000 (20:49 -0700)]
Load necessary options (not loaded when Edwin is not loaded).
Matt Birkholz [Tue, 7 Mar 2017 03:46:39 +0000 (20:46 -0700)]
Generalize load-ffi-quietly to use when loading other options.
Matt Birkholz [Tue, 7 Mar 2017 03:43:56 +0000 (20:43 -0700)]
edwin: Add input-event unparser. Fix inferior unparser.
Matt Birkholz [Tue, 7 Mar 2017 03:41:24 +0000 (20:41 -0700)]
doc/ref-manual/graphics.texi: typo
Matt Birkholz [Tue, 7 Mar 2017 03:32:32 +0000 (20:32 -0700)]
cref/conpkg.scm: Fourth slot of import links: 'deprecated, not #t.
Chris Hanson [Tue, 7 Mar 2017 01:25:46 +0000 (17:25 -0800)]
Change sequence builders to copy small sequences.
Chris Hanson [Tue, 7 Mar 2017 01:17:17 +0000 (17:17 -0800)]
Change char-XXX-full converters to store strings.
Chris Hanson [Tue, 7 Mar 2017 00:33:42 +0000 (16:33 -0800)]
Fix test, now that I understand what's going on.
Chris Hanson [Mon, 6 Mar 2017 08:12:53 +0000 (00:12 -0800)]
Add Unicode segmentation tests and fix implementation to pass.*
*Except for two examples, marked in the test, that are inconsistent with my
model for how this should work.
Chris Hanson [Sun, 5 Mar 2017 22:32:21 +0000 (14:32 -0800)]
Merge branch 'master' of git.sv.gnu.org:/srv/git/mit-scheme
Matt Birkholz [Sun, 5 Mar 2017 19:26:28 +0000 (12:26 -0700)]
svm: Punt unnecessary (global-definitions "../cref/cref").
Matt Birkholz [Sun, 5 Mar 2017 19:06:53 +0000 (12:06 -0700)]
Match up 9.2 and current definitions of GUARANTEE.
Fix the 9.2 host adapter to agree with expected behavior (returning
the object), after changing the new definition so that it is easier to
continue from (error...) with a substitute.
Matt Birkholz [Sun, 5 Mar 2017 18:10:48 +0000 (11:10 -0700)]
Speed up SVM cross-build with new finish-cross-compilation:files.
Undo 0ee3b64 <compiler/base/crsend.scm: Use a compiled compress
procedure ASAP.>. Delay compressing info files until after the .mocs
are processed and a compiled runtime can be booted.
Matt Birkholz [Sun, 5 Mar 2017 16:02:10 +0000 (09:02 -0700)]
Assign &lambda-components before (runtime ustring) needs it.
Chris Hanson [Sun, 5 Mar 2017 08:48:50 +0000 (00:48 -0800)]
Eliminate long-obsolete lexpr lambdas.
Chris Hanson [Sun, 5 Mar 2017 07:20:27 +0000 (23:20 -0800)]
Fix design flaws in segmentation state machines.
Chris Hanson [Sun, 5 Mar 2017 07:18:06 +0000 (23:18 -0800)]
Must load host-adapter when compiling svm1.
Chris Hanson [Sun, 5 Mar 2017 00:24:48 +0000 (16:24 -0800)]
Save ucd-segment-states for future reference.
Chris Hanson [Sun, 5 Mar 2017 00:20:50 +0000 (16:20 -0800)]
Finish documenting the remaining string procedures.
Chris Hanson [Sun, 5 Mar 2017 00:20:27 +0000 (16:20 -0800)]
Change the default of 'copy? in string-trimmer to #f.
Chris Hanson [Sat, 4 Mar 2017 09:01:48 +0000 (01:01 -0800)]
Finish changing string-ci-hash to string-hash-ci.
Chris Hanson [Sat, 4 Mar 2017 08:35:01 +0000 (00:35 -0800)]
Merge branch 'master' of git.sv.gnu.org:/srv/git/mit-scheme
Chris Hanson [Sat, 4 Mar 2017 08:34:37 +0000 (00:34 -0800)]
Add a bunch more documentation for strings.
Chris Hanson [Sat, 4 Mar 2017 08:34:15 +0000 (00:34 -0800)]
Use @acronym{ASCII}.
Chris Hanson [Sat, 4 Mar 2017 08:33:42 +0000 (00:33 -0800)]
Add hacks to make URLs more like web pages.
Chris Hanson [Sat, 4 Mar 2017 08:33:25 +0000 (00:33 -0800)]
Export string-hash-ci.
Chris Hanson [Sat, 4 Mar 2017 08:32:57 +0000 (00:32 -0800)]
Redefine the string-find-X procedures to take substring indices.
Chris Hanson [Sat, 4 Mar 2017 08:32:32 +0000 (00:32 -0800)]
Move substring? to be near its relatives.
Chris Hanson [Sat, 4 Mar 2017 08:31:51 +0000 (00:31 -0800)]
Fix bug: string-padder was adding the wrong number of clusters.
Chris Hanson [Sat, 4 Mar 2017 04:32:17 +0000 (20:32 -0800)]
Document string->vector and vector->string.
Chris Hanson [Sat, 4 Mar 2017 04:31:49 +0000 (20:31 -0800)]
Remove redundant description of {,sub}string->list.
Chris Hanson [Sat, 4 Mar 2017 04:31:31 +0000 (20:31 -0800)]
Document string-hash-ci.
Chris Hanson [Sat, 4 Mar 2017 04:30:13 +0000 (20:30 -0800)]
Change string-search-X interface back to its original design.
Matt Birkholz [Fri, 3 Mar 2017 23:59:01 +0000 (16:59 -0700)]
runtime/chrsyn: Pass TABLE through to char->syntax-code.
Matt Birkholz [Fri, 3 Mar 2017 23:08:15 +0000 (16:08 -0700)]
Load runtime/host-adapter when building a cross-compiler.
Define GUARANTEE which is now used in the new compiler/sf/cref.
Collect a couple other existing hacks to the host runtime.
Increment the CREF version since it grew deprecated bindings.
Chris Hanson [Fri, 3 Mar 2017 05:52:06 +0000 (21:52 -0800)]
Merge branch 'master' of git.sv.gnu.org:/srv/git/mit-scheme
Chris Hanson [Fri, 3 Mar 2017 05:51:32 +0000 (21:51 -0800)]
Use canonical caseless matching for symbols.
Matt Birkholz [Thu, 2 Mar 2017 22:47:09 +0000 (15:47 -0700)]
x11: Remove reference to deprecated char->string.
Matt Birkholz [Thu, 2 Mar 2017 22:46:40 +0000 (15:46 -0700)]
Fix char-set-predicate to accept non-characters.
Chris Hanson [Thu, 2 Mar 2017 07:46:38 +0000 (23:46 -0800)]
Another round of eliminations.
Chris Hanson [Thu, 2 Mar 2017 07:33:34 +0000 (23:33 -0800)]
Giant edit to remove most of the now-obsolete guarantee-FOO bindings.
Chris Hanson [Thu, 2 Mar 2017 05:12:50 +0000 (21:12 -0800)]
Change string-joiner and string-splitter to use keyword options.
Also enhance keyword-option-parser.
Chris Hanson [Wed, 1 Mar 2017 09:42:28 +0000 (01:42 -0800)]
Implement dumb Unicode string search, and eliminate old implementation.
It looks like the KMP string-search algorithm is better for Unicode than BM, so
that will need to be implemented soon-ish.
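For readers unfamiliar with KMP, here is a minimal Python sketch of the
algorithm working over code points. It is not the planned MIT/GNU Scheme
implementation, and the name kmp_search is hypothetical. One property worth
noting is that the failure table is sized by the pattern length rather than
by the alphabet.

```python
def kmp_search(pattern, text):
    """Return the index of the first match of pattern in text, or -1."""
    if not pattern:
        return 0

    # fail[i] is the length of the longest proper prefix of
    # pattern[:i + 1] that is also a suffix of it.
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k

    # Scan the text once, never moving backwards in it.
    k = 0
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            return i - k + 1
    return -1
```

Python strings are sequences of code points, so the comparisons above are per
code point; matching on grapheme clusters or normalized text would need extra
handling that this sketch leaves out.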
Chris Hanson [Wed, 1 Mar 2017 02:13:35 +0000 (18:13 -0800)]
Eliminate guarantee-substring.