Chris Hanson [Fri, 3 Feb 2017 01:37:59 +0000 (17:37 -0800)]
Reorganize and curate standard Scheme indentation rules.
Chris Hanson [Tue, 31 Jan 2017 05:20:12 +0000 (21:20 -0800)]
Update XML code to use Unicode strings throughout.
I need this to be able to read the Unicode Character Database.
Chris Hanson [Tue, 31 Jan 2017 03:15:43 +0000 (19:15 -0800)]
Fix bug: ranges aren't necessarily code points.
Matt Birkholz [Tue, 31 Jan 2017 01:39:32 +0000 (18:39 -0700)]
svm: typo
Matt Birkholz [Tue, 31 Jan 2017 00:33:40 +0000 (17:33 -0700)]
Undo d7f390f now that LIAR/svm is compiling constants properly(?).
Matt Birkholz [Tue, 31 Jan 2017 00:31:22 +0000 (17:31 -0700)]
svm: Fix handling of machine-constants that are larger than 32 bits.
Matt Birkholz [Tue, 31 Jan 2017 00:26:39 +0000 (17:26 -0700)]
svm: Stub out bogus rtl:constant-cost copied from i386.
Matt Birkholz [Tue, 31 Jan 2017 00:21:19 +0000 (17:21 -0700)]
svm: Remove imports from (cross-reference).
Matt Birkholz [Mon, 30 Jan 2017 18:47:27 +0000 (11:47 -0700)]
Replace unbound ascii-char? with char->... stolen from LIAR/x86-64.
Matt Birkholz [Mon, 30 Jan 2017 17:52:00 +0000 (10:52 -0700)]
Fix infinite string input ports; add missing increment.
Chris Hanson [Mon, 30 Jan 2017 09:42:20 +0000 (01:42 -0800)]
Rework the UTF-8 codecs:
* Allow any scalar value to be used, as required by Unicode.
* Implement strict decoding as described in Unicode document.
* Change test cases to match new behavior.
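A rough illustration of what "any scalar value" and "strict decoding" mean in practice. This is a sketch using Python's built-in UTF-8 codec as a stand-in, not the MIT Scheme codec itself; the helper name decodes_strictly is hypothetical.

```python
# Illustration only: Python's built-in UTF-8 codec stands in for a strict
# decoder. This is not the MIT Scheme implementation.

def decodes_strictly(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 under strict rules."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False

# A noncharacter such as U+FFFE is still a scalar value, so it encodes
# and decodes like any other character.
noncharacter = "\ufffe".encode("utf-8")

# Ill-formed inputs that a strict decoder rejects:
overlong_nul = b"\xc0\x80"           # overlong encoding of U+0000
encoded_surrogate = b"\xed\xa0\x80"  # UTF-8-style encoding of U+D800
truncated = b"\xe2\x82"              # three-byte sequence cut short

for sample in (noncharacter, overlong_nul, encoded_surrogate, truncated):
    print(sample, decodes_strictly(sample))
```

Only the first sample is accepted; the other three are ill-formed byte sequences.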
Chris Hanson [Mon, 30 Jan 2017 09:41:13 +0000 (01:41 -0800)]
Change bucky-bit prefixes to prefer upper-case for output.
Also make sure that upper-case is accepted when case-folding is off.
Chris Hanson [Mon, 30 Jan 2017 09:40:19 +0000 (01:40 -0800)]
Implement char->scalar-value.
Chris Hanson [Mon, 30 Jan 2017 04:42:28 +0000 (20:42 -0800)]
Update documentation for param:parser-fold-case?.
Chris Hanson [Mon, 30 Jan 2017 04:41:20 +0000 (20:41 -0800)]
Implement #!fold-case and #!no-fold-case.
Chris Hanson [Mon, 30 Jan 2017 03:16:35 +0000 (19:16 -0800)]
Fix bug: ustrings may be equal but still have different type codes.
Also simplify implementations of eqv? and equal?, and remove eqv? handling of
empty vectors.
Chris Hanson [Mon, 30 Jan 2017 03:12:05 +0000 (19:12 -0800)]
Change string printer to generate R7RS-compatible strings.
Chris Hanson [Mon, 30 Jan 2017 03:08:41 +0000 (19:08 -0800)]
Change parser to respect fold-case? in various places.
Chris Hanson [Mon, 30 Jan 2017 03:00:38 +0000 (19:00 -0800)]
Change some of the parser's parameter names:
* Rename param:parser-canonicalize-symbols? to param:parser-fold-case?.
* Rename param:parser-enable-file-attributes-parsing? to
param:parser-enable-attributes?.
* Eliminate unnecessary *parser-enable-file-attributes-parsing?*
and *parser-keyword-style*.
* Change port properties to eliminate *...* and use new names.
Chris Hanson [Mon, 30 Jan 2017 02:40:53 +0000 (18:40 -0800)]
Refactor the character set abstraction:
* Clarify the use of "code point" versus "scalar value".
* Rename well-formed-scalar-value-list? to code-point-list? and broaden its
scope to allow characters, strings, and character sets.
* Rename scalar-values->char-set to char-set* and broaden its domain to include
any code-point-list?.
* Rename char-set->scalar-values to char-set->code-points.
* Implement char-in-set? which is char-member? with the args reversed. This
makes it consistent with scalar-value-in-char-set?. Deprecate char-member?.
* Implement char-set-union* and char-set-intersection*.
* Eliminate all of the "alphabet" names which are obsolete.
* Eliminate guarantee-char-set and error:not-char-set.
Chris Hanson [Mon, 30 Jan 2017 02:39:57 +0000 (18:39 -0800)]
Add substring indices to prefix/suffix tests.
Also simplify the implementations and fix a thinko in the suffix
implementations.
Chris Hanson [Mon, 30 Jan 2017 02:06:21 +0000 (18:06 -0800)]
Rewrite the character-name support to handle Unicode and case folding.
Also simplify the code a bit.
Chris Hanson [Mon, 30 Jan 2017 02:06:02 +0000 (18:06 -0800)]
Use boot inits in char.scm.
Chris Hanson [Mon, 30 Jan 2017 02:02:38 +0000 (18:02 -0800)]
Adjust tests to match changes to unicode-scalar-value?.
Also add checks of unicode-code-point?.
Chris Hanson [Mon, 30 Jan 2017 01:56:53 +0000 (17:56 -0800)]
Fix implementation of unicode-scalar-value? to not exclude non-characters.
Also implement unicode-code-point?.
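For readers unfamiliar with the distinction, here is a minimal sketch of the two predicates as Unicode defines the terms. The Python function names are hypothetical analogues, not the MIT Scheme definitions.

```python
# Sketch of the Unicode definitions; names are illustrative analogues of
# unicode-code-point? and unicode-scalar-value?, not the actual Scheme code.

def is_code_point(n) -> bool:
    """Any integer in the Unicode codespace, 0 through 0x10FFFF."""
    return isinstance(n, int) and not isinstance(n, bool) and 0 <= n <= 0x10FFFF

def is_scalar_value(n) -> bool:
    """A code point that is not a surrogate (0xD800 through 0xDFFF).

    Noncharacters such as 0xFFFE and 0xFFFF are NOT excluded: they are
    scalar values even though they are not assigned to characters.
    """
    return is_code_point(n) and not (0xD800 <= n <= 0xDFFF)
```

Under these definitions 0xFFFE satisfies both predicates, 0xD800 is a code point but not a scalar value, and 0x110000 is neither.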
Chris Hanson [Mon, 30 Jan 2017 01:53:36 +0000 (17:53 -0800)]
Implement \x<hex>; syntax for strings.
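A small sketch of how the R7RS \x<hex>; escape maps to characters. It handles only this one escape form, and rejecting surrogates is an assumption of the sketch rather than a statement about the MIT Scheme parser. The name expand_hex_escapes is hypothetical.

```python
import re

# Matches \x followed by one or more hex digits and a terminating semicolon.
_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]+);")

def expand_hex_escapes(text: str) -> str:
    """Replace each \\x<hex>; escape with the character it denotes.

    Other escapes (\\n, \\t, \\\\ and so on) are deliberately left alone.
    """
    def replace(match):
        value = int(match.group(1), 16)
        # Assumption: only Unicode scalar values are accepted.
        if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
            raise ValueError("not a Unicode scalar value: %#x" % value)
        return chr(value)
    return _HEX_ESCAPE.sub(replace, text)

print(expand_hex_escapes(r"\x41;BC"))
print(expand_hex_escapes(r"\x3bb;x"))
```

The terminating semicolon is what lets the escape be followed directly by characters that happen to be hex digits, as in the first call.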
Chris Hanson [Sun, 29 Jan 2017 08:50:20 +0000 (00:50 -0800)]
Implement #\x... syntax for characters.
Chris Hanson [Sun, 29 Jan 2017 08:42:13 +0000 (00:42 -0800)]
Eliminate char->ascii and ascii->char, which were misnomers.
Change char-ascii? to be true only for 7-bit chars. Also change char-ascii? to
return a boolean and implement ascii-char?.
Chris Hanson [Sun, 29 Jan 2017 06:00:21 +0000 (22:00 -0800)]
Fix bug: would-block value only returned if nothing has been read.
Chris Hanson [Sun, 29 Jan 2017 04:26:35 +0000 (20:26 -0800)]
Simplify logic for printing generic I/O ports.
Chris Hanson [Sat, 28 Jan 2017 23:38:50 +0000 (15:38 -0800)]
Upgrade compound-predicate implementation with latest from book.
Also clean up the initialization sequence.
Chris Hanson [Sat, 28 Jan 2017 22:36:55 +0000 (14:36 -0800)]
Move tests from test-predicate-lattice -> test-compound-predicate.
Chris Hanson [Sat, 28 Jan 2017 11:20:29 +0000 (03:20 -0800)]
Eliminate use of obsolete get-if-available method.
Chris Hanson [Sat, 28 Jan 2017 11:19:45 +0000 (03:19 -0800)]
Move non-{top,bottom}-tag? to be near {top,bottom}-tag?.
Chris Hanson [Sat, 28 Jan 2017 11:19:02 +0000 (03:19 -0800)]
Implement simple-{list,lset}-memoizer to capture common pattern.
Chris Hanson [Sat, 28 Jan 2017 11:18:09 +0000 (03:18 -0800)]
Some tests had undefined assertions; use new assertions instead.
Chris Hanson [Sat, 28 Jan 2017 11:15:42 +0000 (03:15 -0800)]
Improve the unit-testing framework in a few ways.
* Simplified the creation of new assertions.
* Added ability to have templated failure messages.
* Made it easy to make negated assertions.
* Added a handful of new assertions.
Chris Hanson [Sat, 28 Jan 2017 05:06:37 +0000 (21:06 -0800)]
Rename predicate constructor/accessor to tagger/untagger.
Chris Hanson [Sat, 28 Jan 2017 04:46:57 +0000 (20:46 -0800)]
Fix regexp bug in previous change. Add run-time diagnostics.
Chris Hanson [Sat, 28 Jan 2017 04:36:30 +0000 (20:36 -0800)]
Normalize .gitignore directory patterns.
Chris Hanson [Sat, 28 Jan 2017 02:21:48 +0000 (18:21 -0800)]
Fix up STAGE0 handling, which failed to use the specified build.
Also add -n arg for testing and generalized n-stages arg.
Chris Hanson [Fri, 27 Jan 2017 21:51:10 +0000 (13:51 -0800)]
Fix compiler crash on SVM with a constant that doesn't fit in a signed 32-bit integer.
Chris Hanson [Fri, 27 Jan 2017 20:58:56 +0000 (12:58 -0800)]
Summarize test results at end of run.
Chris Hanson [Fri, 27 Jan 2017 20:58:39 +0000 (12:58 -0800)]
Don't return legacy strings containing UTF-8.
Chris Hanson [Fri, 27 Jan 2017 20:25:05 +0000 (12:25 -0800)]
Deprecate symbol-name.
Chris Hanson [Fri, 27 Jan 2017 16:17:31 +0000 (08:17 -0800)]
Fix bugs: fixnum sizes must be measured at runtime.
Otherwise cross-compiling on a host that's wider than the target will not work.
Chris Hanson [Fri, 27 Jan 2017 10:54:09 +0000 (02:54 -0800)]
Fix typo.
Chris Hanson [Fri, 27 Jan 2017 10:31:37 +0000 (02:31 -0800)]
Major refactor to use ustring in important places.
There is much more work to do, but this converts all the textual I/O, parser
buffers, pathnames, URIs, and a bunch of the XML code. The older Unicode
support in (runtime unicode) is completely gone now. Outside of Edwin, it
should be fairly safe to assume that legacy strings are *NOT* UTF-8 encoded.
Some specific work items remaining:
* Eliminate symbol-name, which violates the non-utf8-legacy rule.
* Finish converting the XML code to consistently use ustrings.
* Implement real Unicode casing, ordering, and character sets.
* Change the parser to use the R7RS-defined character classes.
* Isolate Edwin from the runtime system's string implementation, since porting
it to Unicode is not worth the trouble. It should be frozen to use only
ASCII, not ISO 8859-1 as at present.
And last of all:
* Once Edwin is isolated, convert the runtime system to use ustrings everywhere,
then rename them from "ustring" to "string".
Chris Hanson [Fri, 27 Jan 2017 07:09:16 +0000 (23:09 -0800)]
Fix thinko.
Chris Hanson [Fri, 27 Jan 2017 06:34:23 +0000 (22:34 -0800)]
Change string I/O to use ustrings.
Chris Hanson [Fri, 27 Jan 2017 06:34:03 +0000 (22:34 -0800)]
Convert generic I/O to support ustring.
Chris Hanson [Fri, 27 Jan 2017 06:08:09 +0000 (22:08 -0800)]
Implement converters between utf8-string and ustring.
These are temporary: both utf8-string and wide-string are going to be
eliminated. Until then, we need some scaffolding to incrementally rewrite code
that uses them.
Chris Hanson [Fri, 27 Jan 2017 05:53:07 +0000 (21:53 -0800)]
Tweak to use bytevector.
Chris Hanson [Fri, 27 Jan 2017 05:52:44 +0000 (21:52 -0800)]
Tweak to use ustrings.
Chris Hanson [Fri, 27 Jan 2017 04:36:36 +0000 (20:36 -0800)]
Add Unicode support to equal?.
Chris Hanson [Fri, 27 Jan 2017 04:34:05 +0000 (20:34 -0800)]
Change printer to support Unicode.
Chris Hanson [Fri, 27 Jan 2017 04:26:05 +0000 (20:26 -0800)]
A handful of tweaks.
Chris Hanson [Fri, 27 Jan 2017 03:44:32 +0000 (19:44 -0800)]
Change string hash tables to support Unicode strings.
Chris Hanson [Fri, 27 Jan 2017 03:40:00 +0000 (19:40 -0800)]
Make sure that strings being passed to primitives are converted.
Chris Hanson [Fri, 27 Jan 2017 01:55:57 +0000 (17:55 -0800)]
Change pathname abstraction to use Unicode strings.
Chris Hanson [Fri, 27 Jan 2017 01:55:17 +0000 (17:55 -0800)]
Add support for running fewer than three stages.
Chris Hanson [Fri, 27 Jan 2017 01:45:54 +0000 (17:45 -0800)]
Fix typo.
Chris Hanson [Fri, 27 Jan 2017 01:23:49 +0000 (17:23 -0800)]
Eliminate large swath of unused exports from (runtime unicode) package.
Chris Hanson [Fri, 27 Jan 2017 01:00:18 +0000 (17:00 -0800)]
Eliminate use of xstring in IMAIL.
Chris Hanson [Fri, 27 Jan 2017 00:53:37 +0000 (16:53 -0800)]
Eliminate use of xstring in Edwin.
Chris Hanson [Fri, 27 Jan 2017 00:30:33 +0000 (16:30 -0800)]
Refactor symbol implementation to use UTF-8 bytevectors for names.
Primitives handle this correctly since they accept either a legacy string or a
bytevector. As long as no one peeks behind the abstraction this should be
transparent.
However, symbols with non-ASCII names will produce non-legacy strings when
asked. AFAIK there are none currently in use.
Chris Hanson [Fri, 27 Jan 2017 00:30:13 +0000 (16:30 -0800)]
Eliminate incorrect registration of legacy-string?.
Chris Hanson [Thu, 26 Jan 2017 23:51:34 +0000 (15:51 -0800)]
Change bytevectors to use Unicode strings.
Chris Hanson [Thu, 26 Jan 2017 23:45:25 +0000 (15:45 -0800)]
Merge branch 'master' of git.sv.gnu.org:/srv/git/mit-scheme
Chris Hanson [Thu, 26 Jan 2017 23:44:58 +0000 (15:44 -0800)]
Implement a Unicode string abstraction.
Chris Hanson [Thu, 26 Jan 2017 23:41:51 +0000 (15:41 -0800)]
Implement a Unicode string abstraction.
Chris Hanson [Thu, 26 Jan 2017 23:37:57 +0000 (15:37 -0800)]
Implement char=-predicate and char-ci=-predicate.
Chris Hanson [Thu, 26 Jan 2017 23:30:13 +0000 (15:30 -0800)]
Add "legacy" names for standard string operations.
Also deprecate "vector-8b" names.
Chris Hanson [Thu, 26 Jan 2017 23:21:55 +0000 (15:21 -0800)]
bytevectors: Implement bytevector-hash; fix a couple of bugs and simplify.
Chris Hanson [Thu, 26 Jan 2017 22:57:37 +0000 (14:57 -0800)]
Implement fix:end-index and fix:start-index.
Chris Hanson [Thu, 26 Jan 2017 22:16:02 +0000 (14:16 -0800)]
Revert "Initial draft of new string implementation."
This reverts commit aafeee81eea3921e043d0332314eb4e44da176fa.
Chris Hanson [Thu, 26 Jan 2017 21:43:26 +0000 (13:43 -0800)]
Eliminate call to now-undefined simple-predicate?.
Chris Hanson [Wed, 25 Jan 2017 19:16:23 +0000 (11:16 -0800)]
Fix thinko: caller argument in wrong place.
Chris Hanson [Wed, 25 Jan 2017 08:40:54 +0000 (00:40 -0800)]
Initial draft of new string implementation.
Chris Hanson [Wed, 25 Jan 2017 05:01:29 +0000 (21:01 -0800)]
Tweak pagination.
Chris Hanson [Wed, 25 Jan 2017 04:57:16 +0000 (20:57 -0800)]
Create synchronize-output-port and make it generic over all output ports.
Chris Hanson [Wed, 25 Jan 2017 04:25:23 +0000 (20:25 -0800)]
Restrict most genio exports. A couple of renames.
Chris Hanson [Wed, 25 Jan 2017 03:54:51 +0000 (19:54 -0800)]
Plumb genio to pass caller name down to operations.
Chris Hanson [Wed, 25 Jan 2017 03:15:03 +0000 (19:15 -0800)]
Major refactor of textual I/O ports.
The new design uses a binary port to do the actual I/O, so the textual layer is
mostly about character coding.
Chris Hanson [Wed, 25 Jan 2017 03:14:07 +0000 (19:14 -0800)]
Export fix:iota.
Chris Hanson [Tue, 24 Jan 2017 22:10:30 +0000 (14:10 -0800)]
Implement fix:iota.
Chris Hanson [Tue, 24 Jan 2017 21:19:40 +0000 (13:19 -0800)]
Add comment for return value of write-bytevector.
Chris Hanson [Tue, 24 Jan 2017 20:37:30 +0000 (12:37 -0800)]
Remove unused xstring-byte-* procedures.
Chris Hanson [Tue, 24 Jan 2017 16:58:23 +0000 (08:58 -0800)]
Fix broken indent.
Chris Hanson [Tue, 24 Jan 2017 16:57:38 +0000 (08:57 -0800)]
Change reload-save-string/reload-retrieve-string to preserve type.
Chris Hanson [Mon, 23 Jan 2017 05:46:36 +0000 (21:46 -0800)]
Implement find-map.
Chris Hanson [Mon, 23 Jan 2017 05:41:58 +0000 (21:41 -0800)]
Allow undo in debugger detail buffers.
Chris Hanson [Fri, 20 Jan 2017 09:48:51 +0000 (01:48 -0800)]
Add new char tests to standard checks.
Chris Hanson [Fri, 20 Jan 2017 09:46:10 +0000 (01:46 -0800)]
Implement tests for characters, particularly UTF-8 codec.
Chris Hanson [Fri, 20 Jan 2017 09:45:51 +0000 (01:45 -0800)]
Allow assert-error to be used without explicit error conditions.
Chris Hanson [Fri, 20 Jan 2017 09:44:58 +0000 (01:44 -0800)]
Implement #\alarm and change #\u+00 to print as #\null.
Chris Hanson [Thu, 19 Jan 2017 08:28:43 +0000 (00:28 -0800)]
Make binary ports work independently of their buffer size.
They still require a minimum size of 1 so that single-byte ops work. Also
rejigger names in preparation for reusing the sources and sinks for textual
ports.
Chris Hanson [Wed, 18 Jan 2017 11:00:08 +0000 (03:00 -0800)]
Implement UTF-X codecs for chars and strings.
Chris Hanson [Wed, 18 Jan 2017 07:47:10 +0000 (23:47 -0800)]
Implement character encoders for UTF-16 and UTF-32.
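As background, a minimal sketch of the standard UTF-16 encoding step for a single scalar value, including the surrogate-pair case. It follows the Unicode definition and is not taken from the MIT Scheme sources; the function name utf16_code_units is hypothetical.

```python
def utf16_code_units(scalar_value: int) -> list:
    """Return the UTF-16 code units (as integers) for one scalar value."""
    if not (0 <= scalar_value <= 0x10FFFF) or 0xD800 <= scalar_value <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %#x" % scalar_value)
    if scalar_value < 0x10000:
        # Basic Multilingual Plane: one code unit, the value itself.
        return [scalar_value]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = scalar_value - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]

print([hex(u) for u in utf16_code_units(0x1F600)])
```

UTF-32 needs no such step, since every scalar value fits in a single 32-bit code unit.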
Chris Hanson [Wed, 18 Jan 2017 07:31:33 +0000 (23:31 -0800)]
Rearrange to put new accessors prior to string converters.