Chris Hanson [Mon, 28 Jun 2004 03:26:20 +0000 (03:26 +0000)]
Implement XML-MISC-CONTENT-ITEM?.
Chris Hanson [Sun, 27 Jun 2004 06:26:33 +0000 (06:26 +0000)]
Fix valid-content tests on output of processing instructions to
correspond to those in xml-struct.
Chris Hanson [Wed, 23 Jun 2004 03:45:50 +0000 (03:45 +0000)]
Add support for fractional seconds in ISO 8601 times.
Chris Hanson [Wed, 16 Jun 2004 01:55:18 +0000 (01:55 +0000)]
Update menu: delete missing section in hash-table docs.
Chris Hanson [Sun, 13 Jun 2004 04:14:22 +0000 (04:14 +0000)]
Must lock table during REHASH-TABLE!.
Chris Hanson [Sat, 12 Jun 2004 03:46:22 +0000 (03:46 +0000)]
Make sure hashing operations integrate as I intended. Reduce table
locking to protect against abort but not simultaneous access.
Chris Hanson [Sat, 12 Jun 2004 02:15:48 +0000 (02:15 +0000)]
Reimplement PRIME-NUMBERS-STREAM to use less space.
Chris Hanson [Sat, 12 Jun 2004 02:14:56 +0000 (02:14 +0000)]
Implement SMALLEST-FIXNUM and LARGEST-FIXNUM.
Chris Hanson [Mon, 7 Jun 2004 19:54:30 +0000 (19:54 +0000)]
Reflect new hash-table implementation.
Chris Hanson [Mon, 7 Jun 2004 19:47:57 +0000 (19:47 +0000)]
New hash-table implementation.
Chris Hanson [Thu, 27 May 2004 16:06:31 +0000 (16:06 +0000)]
When closing a port, don't try to flush output if the channel is
already closed.
Chris Hanson [Thu, 27 May 2004 14:04:32 +0000 (14:04 +0000)]
Export UTF-xx input ports.
Chris Hanson [Thu, 27 May 2004 14:03:06 +0000 (14:03 +0000)]
Add missing error checking to UTF-8 decoder: was allowing illegal code
points. Simplify code that checks for illegal code points; some of
the checks were redundant. Implement object buffering, and use it to
reimplement wide-string format conversions and ports. Implement input
ports for UTF-xx strings.
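For illustration only, the kind of checks a strict UTF-8 decoder has to make can be sketched in Python. This is not the Scheme code from this change; the name decode_utf8_strict is hypothetical, and the set of rejected inputs (overlong forms, surrogates, values above U+10FFFF) is an assumption about what "illegal code points" covers.

```python
def decode_utf8_strict(data):
    """Decode UTF-8 bytes into a list of code points, rejecting illegal input.

    Hypothetical helper for illustration; not taken from the commit.
    """
    # (lead-byte mask, lead-byte pattern, continuation bytes, smallest legal value)
    forms = [
        (0x80, 0x00, 0, 0x0),
        (0xE0, 0xC0, 1, 0x80),
        (0xF0, 0xE0, 2, 0x800),
        (0xF8, 0xF0, 3, 0x10000),
    ]
    points = []
    i = 0
    while i < len(data):
        lead = data[i]
        for mask, pattern, extra, minimum in forms:
            if lead & mask == pattern:
                break
        else:
            raise ValueError(f"illegal lead byte 0x{lead:02X} at offset {i}")
        if i + extra >= len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        # Payload bits of the lead byte are the bits outside the mask.
        cp = lead & (~mask & 0xFF)
        for b in data[i + 1 : i + 1 + extra]:
            if b & 0xC0 != 0x80:
                raise ValueError(f"bad continuation byte at offset {i}")
            cp = (cp << 6) | (b & 0x3F)
        if cp < minimum:
            raise ValueError(f"overlong encoding at offset {i}")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"surrogate code point at offset {i}")
        if cp > 0x10FFFF:
            raise ValueError(f"code point out of range at offset {i}")
        points.append(cp)
        i += extra + 1
    return points
```

The three range checks after the loop are independent of the sequence length, which is one way such a decoder can avoid repeating the same test in each branch.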
Chris Hanson [Wed, 26 May 2004 17:43:18 +0000 (17:43 +0000)]
Implement byte sources.
Chris Hanson [Wed, 26 May 2004 17:05:56 +0000 (17:05 +0000)]
Add procedures to do output directly to UTF-xx strings.
Chris Hanson [Wed, 26 May 2004 17:03:14 +0000 (17:03 +0000)]
Fix bug in handling of wide strings.
Chris Hanson [Wed, 26 May 2004 15:26:29 +0000 (15:26 +0000)]
Use new procedure PORT/SUPPORTS-CODING? to eliminate error when
writing XML to string.
Chris Hanson [Wed, 26 May 2004 15:20:22 +0000 (15:20 +0000)]
Add new procedure PORT/SUPPORTS-CODING?.
Chris Hanson [Wed, 26 May 2004 10:52:11 +0000 (10:52 +0000)]
When deciding whether it is legal to associate an IRI with a name,
distinguish between a name with no prefix and a name that is not
namespace well formed. The former may have an IRI, and the latter may
not.
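The distinction drawn above can be sketched in Python. The function name may_have_iri is hypothetical, and the sketch assumes the name has already passed ordinary XML name syntax; it only counts colons and does not check the characters of the prefix or local part.

```python
def may_have_iri(name):
    """Return True if an IRI may be associated with this XML name.

    Hypothetical helper; assumes `name` is already a valid XML name.
    """
    colons = name.count(":")
    if colons == 0:
        # No prefix: the name may still carry an IRI.
        return True
    if colons == 1:
        prefix, local = name.split(":")
        # One colon with non-empty parts on both sides: a prefixed name.
        return bool(prefix) and bool(local)
    # Anything else is not namespace well formed.
    return False
```

For example, "para" and "xsl:template" are accepted, while "a:b:c" and ":x" are legal XML names that cannot be given an IRI.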
Chris Hanson [Tue, 30 Mar 2004 04:45:01 +0000 (04:45 +0000)]
Generalize code to toggle Dired sort order.
Chris Hanson [Tue, 30 Mar 2004 04:27:52 +0000 (04:27 +0000)]
New port abstraction is hiding unread characters from the underlying
port operations; consequently, the buffer-input implementation was
returning the wrong value for the current mark. This has been kludged
around.
Chris Hanson [Wed, 24 Mar 2004 21:16:55 +0000 (21:16 +0000)]
Allow "utf7" and "utf8" character sets.
Chris Hanson [Tue, 9 Mar 2004 06:26:50 +0000 (06:26 +0000)]
Change PAGE_READWRITE to PAGE_EXECUTE_READWRITE, so that XP SP2
doesn't invalidate all execution in the heap.
Chris Hanson [Tue, 9 Mar 2004 03:46:42 +0000 (03:46 +0000)]
Don't try to allocate zero-length string in RELOAD-SAVE-STRING.
Chris Hanson [Thu, 26 Feb 2004 19:05:06 +0000 (19:05 +0000)]
INPUT-PORT/READ-STRING wasn't returning an EOF object when needed.
Chris Hanson [Thu, 26 Feb 2004 19:03:58 +0000 (19:03 +0000)]
Fix typo that prevented EOF from being properly detected.
Chris Hanson [Thu, 26 Feb 2004 18:31:41 +0000 (18:31 +0000)]
Update version number to reflect changes.
Chris Hanson [Thu, 26 Feb 2004 04:52:03 +0000 (04:52 +0000)]
Allow a name to contain colons as specified by the XML standard.
However, don't allow association of an IRI with the name unless the
name uses a single colon as specified by the namespace standard.
Chris Hanson [Thu, 26 Feb 2004 04:50:14 +0000 (04:50 +0000)]
Fix thinko in handling of name parsing.
Chris Hanson [Thu, 26 Feb 2004 01:58:53 +0000 (01:58 +0000)]
Restore colon as name-initial char.
Chris Hanson [Thu, 26 Feb 2004 01:52:24 +0000 (01:52 +0000)]
Remove now-obsolete code that forces output coding to UTF-8.
Chris Hanson [Wed, 25 Feb 2004 21:00:52 +0000 (21:00 +0000)]
Generate BOM on output for those encodings that require it.
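A rough Python sketch of the idea follows. Which encodings "require" a BOM is not stated in this log; the table below assumes it means UTF-16 and UTF-32 when no byte order is named, and the choice of big-endian output is arbitrary. The names BOM_FOR_CODING and encode_with_bom are hypothetical.

```python
import codecs

# Assumed table: codings without an explicit byte order get a BOM.
# Big-endian is an arbitrary choice for this sketch.
BOM_FOR_CODING = {
    "utf-16": (codecs.BOM_UTF16_BE, "utf-16-be"),
    "utf-32": (codecs.BOM_UTF32_BE, "utf-32-be"),
}


def encode_with_bom(text, coding):
    """Encode text, prefixing a BOM only for codings listed in the table."""
    key = coding.lower()
    if key in BOM_FOR_CODING:
        bom, concrete = BOM_FOR_CODING[key]
        return bom + text.encode(concrete)
    # Codings with a fixed byte order (or none) are written without a BOM.
    return text.encode(coding)
```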
Chris Hanson [Wed, 25 Feb 2004 20:59:29 +0000 (20:59 +0000)]
Fix bugs in implementation of UTF-32 coding.
Chris Hanson [Wed, 25 Feb 2004 20:59:02 +0000 (20:59 +0000)]
Add name for BOM character.
Chris Hanson [Tue, 24 Feb 2004 20:59:09 +0000 (20:59 +0000)]
Fix thinko.
Chris Hanson [Tue, 24 Feb 2004 20:49:08 +0000 (20:49 +0000)]
Use temporary file as intermediary for write/re-read test. This tests
the character coding as well as the plain I/O.
Chris Hanson [Tue, 24 Feb 2004 20:48:32 +0000 (20:48 +0000)]
Fix typo.
Chris Hanson [Tue, 24 Feb 2004 20:36:42 +0000 (20:36 +0000)]
Implement support for character coding.
Chris Hanson [Tue, 24 Feb 2004 20:35:48 +0000 (20:35 +0000)]
Implement operations to detect known codings and line endings of a
port. Add support for US-ASCII, UTF-16, and UTF-32 codings.
Chris Hanson [Tue, 24 Feb 2004 20:34:50 +0000 (20:34 +0000)]
Don't read more characters than are needed. The XML character-coding
detection depends on this.
Chris Hanson [Tue, 24 Feb 2004 05:51:12 +0000 (05:51 +0000)]
Export DISCARD-CHAR.
Chris Hanson [Tue, 24 Feb 2004 05:50:44 +0000 (05:50 +0000)]
Export UNREAD-CHAR.
Chris Hanson [Tue, 24 Feb 2004 04:23:12 +0000 (04:23 +0000)]
Canonicalize UTF-16 and UTF-32 names.
Chris Hanson [Tue, 24 Feb 2004 01:51:00 +0000 (01:51 +0000)]
Clean up output a little.
Chris Hanson [Tue, 24 Feb 2004 01:45:53 +0000 (01:45 +0000)]
When using XML line ending on I/O port, treat output side as TEXT.
Chris Hanson [Mon, 23 Feb 2004 20:56:21 +0000 (20:56 +0000)]
Eliminate PARSE-XML-DOCUMENT. Merge STRING->XML and SUBSTRING->XML.
Force input coding to UTF-8 (for now). Force input line ending to
XML-1.0.
Chris Hanson [Mon, 23 Feb 2004 20:55:11 +0000 (20:55 +0000)]
Some tweaks to handle changes in I/O subsystem. Force UTF-8 coding on
output (for now).
Chris Hanson [Mon, 23 Feb 2004 20:53:22 +0000 (20:53 +0000)]
Use STRING->PARSER-BUFFER rather than WIDE-STRING->PARSER-BUFFER,
since the former has replaced the latter.
Chris Hanson [Mon, 23 Feb 2004 20:52:49 +0000 (20:52 +0000)]
Use wide string to test re-reading of document.
Chris Hanson [Mon, 23 Feb 2004 20:51:47 +0000 (20:51 +0000)]
Eliminate SOURCE->PARSER-BUFFER. Merge procedures
*STRING->PARSER-BUFFER into a single procedure.
Chris Hanson [Mon, 23 Feb 2004 20:50:33 +0000 (20:50 +0000)]
Rewrite STRING->WIDE-STRING to make it more efficient.
Chris Hanson [Mon, 23 Feb 2004 20:49:32 +0000 (20:49 +0000)]
Add support for UTF-32.
Chris Hanson [Wed, 18 Feb 2004 19:52:06 +0000 (19:52 +0000)]
Fix problems with parsing of element content.
Chris Hanson [Tue, 17 Feb 2004 05:53:31 +0000 (05:53 +0000)]
Use new arguments for OPEN-TCP-STREAM-SOCKET.
Chris Hanson [Tue, 17 Feb 2004 05:46:20 +0000 (05:46 +0000)]
Fix some bugs in the parser buffer.
Chris Hanson [Tue, 17 Feb 2004 05:35:46 +0000 (05:35 +0000)]
Fix typo.
Chris Hanson [Tue, 17 Feb 2004 05:00:18 +0000 (05:00 +0000)]
Add kludge to define MATCH-UTF8-CHAR-IN-ALPHABET.
Chris Hanson [Tue, 17 Feb 2004 04:59:54 +0000 (04:59 +0000)]
Add line-ending support.
Chris Hanson [Tue, 17 Feb 2004 04:59:29 +0000 (04:59 +0000)]
Add NEWLINE line-ending.
Chris Hanson [Mon, 16 Feb 2004 05:50:43 +0000 (05:50 +0000)]
Changes required by reimplementation of I/O subsystem.
Chris Hanson [Mon, 16 Feb 2004 05:40:46 +0000 (05:40 +0000)]
Bump version to reflect major change.
Chris Hanson [Mon, 16 Feb 2004 05:39:37 +0000 (05:39 +0000)]
The I/O subsystem has once again been redesigned. The primary goal of
this large change is to integrate support for Unicode and character
coding directly into the I/O subsystem. Secondary goals are to
improve I/O performance, to simplify the design, and to provide
flexibility for future enhancement.
This change set has received cursory testing, and no doubt a number of
problems remain. Additionally, there are several unfinished aspects
to the change. But this version works well enough to run Edwin.
Detailed changes
----------------
The term "line translation" is everywhere replaced with "line ending".
A line ending is now specified by a symbol, such as 'crlf or 'lf;
previously it was a string. I/O files now support a single line
ending for both input and output sides; previously there were two
independent line translations.
The I/O buffers have been completely redesigned. They now operate in
three stages: one stage does byte-stream I/O, the second manages
coding (e.g. UTF-8), and the third manages line endings. Only bytes
are buffered. As a consequence, READ-CHAR and WRITE-CHAR will now
handle any Unicode character, provided the port's coding is set to an
appropriate value.
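The three stages on the input side can be sketched in Python as a small pipeline. This is only an analogy to the design described above, not the Scheme implementation: the name read_chars and its parameters are hypothetical, and the sketch carries one pending character across chunks, which the real buffers (where only bytes are buffered) may well handle differently.

```python
import codecs


def read_chars(byte_stream, coding="utf-8", line_ending="lf", chunk_size=4096):
    """Yield characters from a byte stream, one stage at a time.

    Hypothetical sketch: line_ending is "lf" or "crlf", loosely mirroring
    the symbols mentioned in the log.
    """
    decoder = codecs.getincrementaldecoder(coding)()
    held_cr = ""
    while True:
        # Stage 1: byte-stream I/O.
        chunk = byte_stream.read(chunk_size)
        at_eof = not chunk
        # Stage 2: coding. The incremental decoder keeps partial
        # multi-byte sequences until the rest arrives.
        text = held_cr + decoder.decode(chunk, final=at_eof)
        held_cr = ""
        # Stage 3: line endings. A trailing CR is held back in case the
        # matching LF is in the next chunk.
        if line_ending == "crlf":
            if text.endswith("\r") and not at_eof:
                text, held_cr = text[:-1], "\r"
            text = text.replace("\r\n", "\n")
        yield from text
        if at_eof:
            return
```

Because decoding happens after the byte stage, a single read can return any Unicode character as long as the coding is able to represent it.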
The READ-SUBSTRING port operation can now assume that its START
argument is strictly less than its END argument. Likewise for the new
operations READ-WIDE-SUBSTRING and READ-EXTERNAL-SUBSTRING.
The WRITE-SUBSTRING port operation now returns either #F or a
non-negative integer. It can also now assume that its START argument
is strictly less than its END argument. Both of these properties are
true for the new WRITE-WIDE-SUBSTRING and WRITE-EXTERNAL-SUBSTRING.
The WRITE-CHAR port operation now returns either #F, 0, or 1, as if it
were a call to WRITE-SUBSTRING with a one-char string.
The CHAR-READY? port operation and the INPUT-PORT/CHAR-READY?
procedure no longer accept a second "interval" argument. Handling of
the timeout interval is instead implemented directly in the
CHAR-READY? procedure.
Strings are always considered to be encoded using ISO-8859-1.
The parser-buffer datatype has been widened to handle all Unicode
characters.
All ports now support the FRESH-LINE operation, which is implemented
as a layer on top of the supplied operations. Similarly, the
PEEK-CHAR, DISCARD-CHAR, and new UNREAD-CHAR operations are
implemented for all ports.
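As a rough illustration of layering these operations on top of a small set of supplied ones, here is a Python sketch. The class LayeredPort and its method names are hypothetical, and it assumes the supplied operations are just "read one character" and "write a string"; the actual port abstraction may be organized quite differently.

```python
class LayeredPort:
    """Hypothetical wrapper deriving extra operations from two supplied ones."""

    def __init__(self, read_char, write_string):
        self._read_char = read_char        # returns one char, or "" at end of file
        self._write_string = write_string  # writes a string
        self._unread = None                # at most one pushed-back character
        self._at_line_start = True

    def read_char(self):
        if self._unread is not None:
            ch, self._unread = self._unread, None
            return ch
        return self._read_char()

    def unread_char(self, ch):
        if self._unread is not None:
            raise RuntimeError("only one character can be unread")
        self._unread = ch

    def peek_char(self):
        # Peek is read followed by unread, except at end of file.
        ch = self.read_char()
        if ch != "":
            self.unread_char(ch)
        return ch

    def discard_char(self):
        self.read_char()

    def write_string(self, s):
        if s:
            self._write_string(s)
            self._at_line_start = s.endswith("\n")

    def fresh_line(self):
        # Emit a newline only if the last output did not end a line.
        if not self._at_line_start:
            self.write_string("\n")
```

Note that the pushed-back character lives in the wrapper, so the underlying read operation never sees it.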
End-of-file objects now have an associated port.
RUN-SHELL-COMMAND and RUN-SYNCHRONOUS-SUBPROCESS now accept a keyword
argument LINE-ENDING, which replaces the old options
INPUT-LINE-TRANSLATION and OUTPUT-LINE-TRANSLATION.
Transcript support has been moved into the core port abstraction.
Consequently, it is no longer necessary to encapsulate a port in order
to get transcript support. Encapsulated ports have been eliminated,
as this was their only use.
The procedures OPEN-TCP-STREAM-SOCKET, OPEN-UNIX-STREAM-SOCKET,
SUBPROCESS-I/O-PORT, and TCP-SERVER-CONNECTION-ACCEPT have changed
their argument structure. All arguments dealing with buffer size and
line translation have been eliminated. In the new implementation, the
buffer size is fixed, and handling of line endings is changed by
calling PORT/SET-LINE-ENDING.
The following variables have been eliminated:
CHANNEL-WRITE-CHAR-BLOCK
CHANNEL-WRITE-STRING-BLOCK
ENCAPSULATED-PORT/PORT
ENCAPSULATED-PORT/STATE
ENCAPSULATED-PORT?
GUARANTEE-ENCAPSULATED-PORT
INPUT-PORT/CHANNEL
INPUT-PORT/COPY
INPUT-PORT/CUSTOM-OPERATION
INPUT-PORT/OPERATION
INPUT-PORT/OPERATION-NAMES
INPUT-PORT/STATE
MAKE-ENCAPSULATED-PORT
MAKE-GENERIC-INPUT-PORT
MAKE-GENERIC-OUTPUT-PORT
MAKE-I/O-PORT
MAKE-INPUT-PORT
MAKE-OUTPUT-PORT
MATCH-UTF8-CHAR-IN-ALPHABET
OUTPUT-PORT/CHANNEL
OUTPUT-PORT/COPY
OUTPUT-PORT/CUSTOM-OPERATION
OUTPUT-PORT/OPERATION
OUTPUT-PORT/OPERATION-NAMES
OUTPUT-PORT/STATE
PATHNAME-END-OF-LINE-STRING
PATHNAME-NEWLINE-TRANSLATION
SET-ENCAPSULATED-PORT/STATE!
SET-INPUT-PORT/STATE!
SET-OUTPUT-PORT/STATE!
The following port operations have been eliminated:
BUFFERED-INPUT-CHARS
BUFFERED-OUTPUT-CHARS
CHARS-REMAINING
DISCARD-CHAR
DISCARD-CHARS
FRESH-LINE
INPUT-BUFFER-SIZE
OUTPUT-BUFFER-SIZE
PEEK-CHAR
READ-STRING
REST->STRING
SET-INPUT-BUFFER-SIZE
SET-OUTPUT-BUFFER-SIZE
To do:
* locking
* column tracking
* convert parser from peek/discard to read/unread
* [?] integrate parser-buffer support (port.scm/input.scm)
* change buffer I/O ports to handle line endings as needed
Change arg structure of:
char-ready? port operation
input-port/char-ready?
make-generic-i/o-port
make-input-buffer
make-output-buffer
open-tcp-stream-socket
open-unix-stream-socket
subprocess-i/o-port
tcp-server-connection-accept
Renamed variables:
os/default-end-of-line-translation => default-line-ending
os/file-end-of-line-translation => file-line-ending
New variables:
channel-has-input?
channel-write-byte-block
condition-type:char-decoding-error
condition-type:char-encoding-error
condition-type:not-8-bit-char
console-i/o-port?
eof-object-port
error:char-decoding
error:char-encoding
error:not-8-bit-char
guarantee-wide-substring
input-port/read-external-substring
input-port/read-wide-substring
input-port/unread-char
match-parser-buffer-char-in-alphabet
match-parser-buffer-char-in-alphabet-no-advance
match-parser-buffer-char-not-in-alphabet
match-parser-buffer-char-not-in-alphabet-no-advance
match-parser-buffer-char-not-in-set
match-parser-buffer-char-not-in-set-no-advance
output-port/write-external-substring
output-port/write-wide-substring
port/coding
port/line-ending
port/set-coding
port/set-line-ending
port=?
set-channel-port!
unread-char
wide-string->parser-buffer
wide-substring
wide-substring->parser-buffer
New port operations:
coding
line-ending
read-external-substring
read-wide-substring
set-coding
set-line-ending
write-external-substring
write-wide-substring
Chris Hanson [Fri, 6 Feb 2004 18:15:40 +0000 (18:15 +0000)]
Fix typo.
Chris Hanson [Wed, 4 Feb 2004 05:02:12 +0000 (05:02 +0000)]
Fix typos.
Chris Hanson [Wed, 4 Feb 2004 05:01:32 +0000 (05:01 +0000)]
Fix CLOSE-ENOUGH?.
Chris Hanson [Tue, 3 Feb 2004 18:46:50 +0000 (18:46 +0000)]
Don't set super/hyper bucky bits based on modifier keys.
Chris Hanson [Sat, 31 Jan 2004 02:16:53 +0000 (02:16 +0000)]
Don't specially handle control/meta-modified alphabetic keys; this
appears to be a broken optimization from long ago. Thanks to Joe
Marshall for figuring it out.
Chris Hanson [Mon, 19 Jan 2004 21:14:56 +0000 (21:14 +0000)]
Update CVS access information.
Chris Hanson [Mon, 19 Jan 2004 05:06:22 +0000 (05:06 +0000)]
Implement support for associating input-port "position" with each
pointer object in the output of the parser. This is useful for
mapping s-expressions back to positions in the source code, for
example. Also, rearrange the code a bit to make it clearer.
Chris Hanson [Mon, 19 Jan 2004 04:37:14 +0000 (04:37 +0000)]
Rewrite the CHAR-READY? operation to use TEST-SELECT-DESCRIPTOR rather
than a non-blocking read. The latter used five system calls, while
the former uses one to achieve the same effect. Also, the
INPUT-BUFFER/READ-UNTIL-DELIMITER and
INPUT-BUFFER/DISCARD-UNTIL-DELIMITER procedures were eliminated.
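The general technique of asking the operating system whether a descriptor is readable, instead of attempting a non-blocking read, can be sketched in Python with select. This is an analogy only: the function name char_ready and its timeout parameter are hypothetical, and TEST-SELECT-DESCRIPTOR itself is not shown here.

```python
import select
import socket


def char_ready(sock, timeout=0.0):
    """Return True if a read on `sock` would not block.

    Hypothetical sketch: a single select call with the given timeout
    (in seconds) replaces switching the descriptor to non-blocking mode,
    reading, and switching back.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)
```

A timeout of zero gives a pure poll; a positive value plays the role of a wait interval.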
Chris Hanson [Mon, 19 Jan 2004 04:30:57 +0000 (04:30 +0000)]
Eliminate the READ-STRING and DISCARD-CHARS operations.
Chris Hanson [Mon, 19 Jan 2004 04:30:41 +0000 (04:30 +0000)]
Deal gracefully with EOF in READ-FINISH operation.
Chris Hanson [Sun, 18 Jan 2004 06:04:49 +0000 (06:04 +0000)]
Use getpt() if available.
Chris Hanson [Sat, 17 Jan 2004 13:55:46 +0000 (13:55 +0000)]
Combine TABLE and DB parameters.
Chris Hanson [Sat, 17 Jan 2004 13:49:49 +0000 (13:49 +0000)]
Simplify table-lookup mechanism.
Chris Hanson [Sat, 17 Jan 2004 01:40:27 +0000 (01:40 +0000)]
Add "autom4te.cache" to cleanup.
Chris Hanson [Fri, 16 Jan 2004 21:07:33 +0000 (21:07 +0000)]
Add ssp.
Chris Hanson [Fri, 16 Jan 2004 21:05:12 +0000 (21:05 +0000)]
Add ssp.
Chris Hanson [Fri, 16 Jan 2004 20:59:05 +0000 (20:59 +0000)]
Fix quoting.
Chris Hanson [Fri, 16 Jan 2004 20:47:22 +0000 (20:47 +0000)]
Eliminate obsolete references to INPUT-PORT/OPERATION and
OUTPUT-PORT/OPERATION.
Chris Hanson [Fri, 16 Jan 2004 20:43:16 +0000 (20:43 +0000)]
Bump component version to reflect changes since last release.
Chris Hanson [Fri, 16 Jan 2004 20:32:40 +0000 (20:32 +0000)]
Eliminate use of obsolete OUTPUT-PORT/OPERATION.
Chris Hanson [Fri, 16 Jan 2004 20:31:06 +0000 (20:31 +0000)]
Eliminate use of obsolete INPUT-PORT/OPERATION.
Chris Hanson [Fri, 16 Jan 2004 19:43:52 +0000 (19:43 +0000)]
Provide BASE-PORT to parser.
Chris Hanson [Fri, 16 Jan 2004 19:39:53 +0000 (19:39 +0000)]
Fix handling of quote within strings.
Chris Hanson [Fri, 16 Jan 2004 19:26:06 +0000 (19:26 +0000)]
Fix syntax definitions to reflect what the parser does, and simplify
them for clarity.
Chris Hanson [Fri, 16 Jan 2004 19:11:14 +0000 (19:11 +0000)]
Quote some more prefixed atom delimiters.
Chris Hanson [Fri, 16 Jan 2004 19:07:15 +0000 (19:07 +0000)]
Now that comma is an atom delimiter, it's necessary to quote it in
prefixed character constants.
Chris Hanson [Fri, 16 Jan 2004 19:04:38 +0000 (19:04 +0000)]
Pass the shared objects database as an argument to all the handlers,
rather than using a dynamically-bound variable. Pass an additional
argument to indicate when close-paren and close-bracket are allowed.
Fix long-standing bug in handling of unmatched close parens at top
level: the port comparison was never true because of encapsulation.
Chris Hanson [Fri, 16 Jan 2004 06:33:47 +0000 (06:33 +0000)]
Fix some minor bugs. Considerably simplify parsing of characters.
Chris Hanson [Fri, 16 Jan 2004 05:48:23 +0000 (05:48 +0000)]
Compensate for a change to the definition of CHAR-SET/ATOM-DELIMITERS.
Chris Hanson [Fri, 16 Jan 2004 05:44:21 +0000 (05:44 +0000)]
Add name for non-breaking space.
Chris Hanson [Thu, 15 Jan 2004 21:00:16 +0000 (21:00 +0000)]
Initial draft of new parser. Needs more testing, and at least one
feature is missing.
Chris Hanson [Thu, 15 Jan 2004 20:59:12 +0000 (20:59 +0000)]
Implement %STRING->SYMBOL to eliminate unnecessary copying in
parser.
Chris Hanson [Thu, 15 Jan 2004 20:58:36 +0000 (20:58 +0000)]
Fix incorrect package references for files loaded at the very
beginning of the boot.
Chris Hanson [Sun, 11 Jan 2004 07:18:05 +0000 (07:18 +0000)]
Eliminate INPUT-BUFFER/DISCARD-CHAR, which couldn't be used with
non-blocking input ports because there was no way to tell whether the
char was discarded. Instead, use INPUT-BUFFER/READ-CHAR in its place,
which is only slightly slower and does provide this indication.
Chris Hanson [Sun, 11 Jan 2004 05:25:57 +0000 (05:25 +0000)]
Fix problem: some uses of terminated-region-matcher must behave as
they did prior to revision 1.51.
Chris Hanson [Fri, 9 Jan 2004 21:12:19 +0000 (21:12 +0000)]
Implement REVERSE* and REVERSE*!, like REVERSE and REVERSE! but a
non-null tail element can be specified.
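The behavior described can be illustrated in Python by modelling Scheme lists as nested (head, rest) pairs ending in None. The name reverse_star and this pair representation are assumptions made for the sketch; the operation is similar in spirit to SRFI 1's APPEND-REVERSE.

```python
def reverse_star(items, tail=None):
    """Reverse a pair-style list onto `tail`.

    Hypothetical sketch: lists are nested (head, rest) tuples ending in
    None. With tail=None this is an ordinary reverse.
    """
    result = tail
    while items is not None:
        head, items = items
        # Each element is pushed onto the front of the growing result.
        result = (head, result)
    return result
```

For example, reversing the list 1 2 3 onto the tail 4 5 yields 3 2 1 4 5, with the tail shared rather than copied.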
Chris Hanson [Fri, 9 Jan 2004 20:22:22 +0000 (20:22 +0000)]
Fix bug: RANDOM-BYTE-VECTOR has to supply a default state object if
none is given.
Chris Hanson [Thu, 8 Jan 2004 17:52:34 +0000 (17:52 +0000)]
Fix thinko in previous change.