Chris Hanson [Tue, 10 Aug 2004 01:09:41 +0000 (01:09 +0000)]
In CONVERT-XML-STRING-VALUE, make sure error message has "XML" in it.
Chris Hanson [Tue, 10 Aug 2004 01:03:02 +0000 (01:03 +0000)]
Export FLATTEN-XML-ELEMENT-CONTENTS.
Chris Hanson [Sat, 24 Jul 2004 04:39:49 +0000 (04:39 +0000)]
Fix definitions of entities so that they work with all character sets.
Chris Hanson [Sat, 24 Jul 2004 04:29:45 +0000 (04:29 +0000)]
Fix bug: DTD can't have namespace on its root element name.
Chris Hanson [Sat, 24 Jul 2004 04:21:58 +0000 (04:21 +0000)]
Fix broken character definitions. (Arrgh.)
Chris Hanson [Sat, 24 Jul 2004 04:03:09 +0000 (04:03 +0000)]
Fix thinko in call to MAKE-XML-!ENTITY.
Chris Hanson [Sat, 24 Jul 2004 03:45:54 +0000 (03:45 +0000)]
Add support for XHTML predefined entities. These are available only
when the document has an XHTML DTD.
Chris Hanson [Sat, 24 Jul 2004 03:19:23 +0000 (03:19 +0000)]
Add predicates to identify XHTML DTDs.
Chris Hanson [Sat, 24 Jul 2004 03:03:24 +0000 (03:03 +0000)]
Change HTML-EXTERNAL-DTD to HTML-EXTERNAL-ID.
Chris Hanson [Sat, 24 Jul 2004 02:26:24 +0000 (02:26 +0000)]
Add constructors to aid in building conformant XHTML documents.
Chris Hanson [Sat, 24 Jul 2004 02:12:20 +0000 (02:12 +0000)]
Add support for XHTML 1.1.
Chris Hanson [Thu, 22 Jul 2004 03:01:50 +0000 (03:01 +0000)]
Fix some text that isn't right for Edwin. (closes: [bugs #7233])
Chris Hanson [Mon, 19 Jul 2004 17:36:48 +0000 (17:36 +0000)]
Move generic XML convenience procedures from "xhtml.scm" to
"xml-struct.scm". Add new procedures STANDARD-XML-ELEMENT-CONSTRUCTOR
and STANDARD-XML-ELEMENT-PREDICATE.
Chris Hanson [Mon, 19 Jul 2004 17:20:40 +0000 (17:20 +0000)]
Export FLATTEN-XML-ELEMENT-CONTENTS.
Chris Hanson [Mon, 19 Jul 2004 04:45:20 +0000 (04:45 +0000)]
Update list of element names to cover exactly those elements defined
by XHTML 1.0 strict, and no others. Add some context information, for
use in styling and analysis.
New procedures GUARANTEE-HTML-ELEMENT, HTML-ELEMENT-NAME?,
GUARANTEE-HTML-ELEMENT-NAME, HTML-ELEMENT-CONTEXT,
HTML-ELEMENT-NAME-CONTEXT, HTML-ELEMENT-NAMES. Rename HTML-ATTRS to
XML-ATTRS. Rename HTML:COMMENT to XML-COMMENT and move it to
"xml-struct".
Chris Hanson [Sun, 18 Jul 2004 04:34:06 +0000 (04:34 +0000)]
Allow HTML:COMMENT to take anything that satisfies XML-CHAR-DATA? as
an argument. Also, be a little smarter about when to add leading or
trailing whitespace.
Chris Hanson [Thu, 15 Jul 2004 19:50:43 +0000 (19:50 +0000)]
Add support for NMTOKENS values.
Chris Hanson [Thu, 15 Jul 2004 18:25:07 +0000 (18:25 +0000)]
Generalize HTML-ATTRS to allow xml-attribute objects as arguments,
interspersed with keyword pairs.
Chris Hanson [Thu, 15 Jul 2004 18:16:49 +0000 (18:16 +0000)]
Add XHTML support.
Chris Hanson [Thu, 15 Jul 2004 04:07:40 +0000 (04:07 +0000)]
Allow SYMBOL to accept characters as arguments.
Chris Hanson [Thu, 15 Jul 2004 04:05:39 +0000 (04:05 +0000)]
Implement SYMBOL.
Chris Hanson [Mon, 12 Jul 2004 19:08:36 +0000 (19:08 +0000)]
Implement HTML-ELEMENT?.
Chris Hanson [Mon, 12 Jul 2004 19:05:36 +0000 (19:05 +0000)]
Move xhtml support into this package. Change names to contain "html"
so they don't conflict with others.
Chris Hanson [Mon, 5 Jul 2004 03:59:36 +0000 (03:59 +0000)]
New macro RULE-MATCHER. Rewrite rule-matching mechanism to make it
more abstract.
Chris Hanson [Sun, 4 Jul 2004 05:37:25 +0000 (05:37 +0000)]
Fix typo.
Chris Hanson [Sun, 4 Jul 2004 05:28:56 +0000 (05:28 +0000)]
Add new operations to categorize type codes.
Chris Hanson [Sun, 4 Jul 2004 05:23:43 +0000 (05:23 +0000)]
Add new primitive TYPE->GC-TYPE.
Chris Hanson [Fri, 2 Jul 2004 01:00:46 +0000 (01:00 +0000)]
OBJECT-GC-TYPE is no longer a primitive.
Chris Hanson [Fri, 2 Jul 2004 00:54:07 +0000 (00:54 +0000)]
Fix definitions of OBJECT-POINTER? and OBJECT-NON-POINTER? so they are
more accurate.
Chris Hanson [Fri, 2 Jul 2004 00:51:53 +0000 (00:51 +0000)]
Use OBJECT-NON-POINTER? rather than NON-POINTER-OBJECT?.
Chris Hanson [Thu, 1 Jul 2004 15:23:56 +0000 (15:23 +0000)]
Fix typo in previous change.
Chris Hanson [Thu, 1 Jul 2004 01:19:59 +0000 (01:19 +0000)]
Move REGISTER-TYPES-COMPATIBLE? to arch-independent file.
Chris Hanson [Mon, 28 Jun 2004 03:27:04 +0000 (03:27 +0000)]
Implement XML-PROCESSING-INSTRUCTIONS-HANDLERS.
Chris Hanson [Mon, 28 Jun 2004 03:26:20 +0000 (03:26 +0000)]
Implement XML-MISC-CONTENT-ITEM?.
Chris Hanson [Sun, 27 Jun 2004 06:26:33 +0000 (06:26 +0000)]
Fix valid-content tests on output of processing instructions to
correspond to those in xml-struct.
Chris Hanson [Wed, 23 Jun 2004 03:45:50 +0000 (03:45 +0000)]
Add support for fractional seconds in ISO 8601 times.
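For illustration only (Python, not the MIT/GNU Scheme code), a minimal
sketch of accepting an optional fractional part on the seconds field of
an ISO 8601 time. The name parse_iso8601_time and the returned tuple are
hypothetical. ISO 8601 allows either a period or a comma as the decimal
mark.

```python
import re
from decimal import Decimal

# hh:mm:ss with an optional fraction introduced by "." or ","
_TIME = re.compile(r"(\d{2}):(\d{2}):(\d{2})(?:[.,](\d+))?")

def parse_iso8601_time(text):
    """Return (hour, minute, seconds) where seconds is an exact Decimal."""
    m = _TIME.fullmatch(text)
    if m is None:
        raise ValueError("not an ISO 8601 extended-format time: %r" % text)
    hour, minute, sec, frac = m.groups()
    seconds = Decimal(sec)
    if frac:
        # Keep the fraction exact rather than rounding through a float.
        seconds += Decimal("0." + frac)
    return int(hour), int(minute), seconds
```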
Chris Hanson [Wed, 16 Jun 2004 01:55:18 +0000 (01:55 +0000)]
Update menu: delete missing section in hash-table docs.
Chris Hanson [Sun, 13 Jun 2004 04:14:22 +0000 (04:14 +0000)]
Must lock table during REHASH-TABLE!.
Chris Hanson [Sat, 12 Jun 2004 03:46:22 +0000 (03:46 +0000)]
Make sure hashing operations integrate as I intended. Reduce table
locking to protect against abort but not simultaneous access.
Chris Hanson [Sat, 12 Jun 2004 02:15:48 +0000 (02:15 +0000)]
Reimplement PRIME-NUMBERS-STREAM to use less space.
Chris Hanson [Sat, 12 Jun 2004 02:14:56 +0000 (02:14 +0000)]
Implement SMALLEST-FIXNUM and LARGEST-FIXNUM.
Chris Hanson [Mon, 7 Jun 2004 19:54:30 +0000 (19:54 +0000)]
Reflect new hash-table implementation.
Chris Hanson [Mon, 7 Jun 2004 19:47:57 +0000 (19:47 +0000)]
New hash-table implementation.
Chris Hanson [Thu, 27 May 2004 16:06:31 +0000 (16:06 +0000)]
When closing a port, don't try to flush output if the channel is
already closed.
Chris Hanson [Thu, 27 May 2004 14:04:32 +0000 (14:04 +0000)]
Export UTF-xx input ports.
Chris Hanson [Thu, 27 May 2004 14:03:06 +0000 (14:03 +0000)]
Add missing error checking to UTF-8 decoder: was allowing illegal code
points. Simplify code that checks for illegal code points; some of
the checks were redundant. Implement object buffering, and use it to
reimplement wide-string format conversions and ports. Implement input
ports for UTF-xx strings.
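For illustration only (Python, not the code in this change), a sketch of
the kind of checking a strict UTF-8 decoder needs in order to reject
illegal code points: overlong forms, surrogates, and values above
U+10FFFF. The name decode_utf8_char is hypothetical.

```python
def decode_utf8_char(data, i=0):
    """Decode one code point from bytes `data` at index i.

    Returns (code_point, next_index); raises ValueError on illegal input.
    """
    b0 = data[i]
    if b0 < 0x80:
        return b0, i + 1
    # Lead byte gives the number of continuation bytes, the payload bits,
    # and the smallest code point that legitimately needs this length.
    if 0xC2 <= b0 <= 0xDF:
        n, cp, minimum = 1, b0 & 0x1F, 0x80
    elif 0xE0 <= b0 <= 0xEF:
        n, cp, minimum = 2, b0 & 0x0F, 0x800
    elif 0xF0 <= b0 <= 0xF4:
        n, cp, minimum = 3, b0 & 0x07, 0x10000
    else:
        raise ValueError("illegal lead byte")
    if i + n >= len(data):
        raise ValueError("truncated sequence")
    for b in data[i + 1:i + 1 + n]:
        if (b & 0xC0) != 0x80:
            raise ValueError("illegal continuation byte")
        cp = (cp << 6) | (b & 0x3F)
    # One combined check covers overlong forms, surrogates, and overflow.
    if cp < minimum or 0xD800 <= cp <= 0xDFFF or cp > 0x10FFFF:
        raise ValueError("illegal code point")
    return cp, i + 1 + n
```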
Chris Hanson [Wed, 26 May 2004 17:43:18 +0000 (17:43 +0000)]
Implement byte sources.
Chris Hanson [Wed, 26 May 2004 17:05:56 +0000 (17:05 +0000)]
Add procedures to do output directly to UTF-xx strings.
Chris Hanson [Wed, 26 May 2004 17:03:14 +0000 (17:03 +0000)]
Fix bug in handling of wide strings.
Chris Hanson [Wed, 26 May 2004 15:26:29 +0000 (15:26 +0000)]
Use new procedure PORT/SUPPORTS-CODING? to eliminate error when
writing XML to string.
Chris Hanson [Wed, 26 May 2004 15:20:22 +0000 (15:20 +0000)]
Add new procedure PORT/SUPPORTS-CODING?.
Chris Hanson [Wed, 26 May 2004 10:52:11 +0000 (10:52 +0000)]
When deciding whether it is legal to associate an IRI with a name,
distinguish between a name with no prefix and a name that is not
namespace well formed. The former may have an IRI, and the latter may
not.
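For illustration only (Python, not the code in this change), a sketch of
the distinction described above. The name may_have_iri is hypothetical,
and the test is deliberately simplified: it looks only at colons and
does not validate the name characters themselves.

```python
def may_have_iri(name):
    """Return True if an IRI may be associated with this XML name."""
    parts = name.split(":")
    if len(parts) == 1:
        # No prefix at all: legal, and may carry a (default) namespace IRI.
        return True
    # Exactly one colon with a non-empty prefix and local part is
    # namespace well formed; anything else is a legal XML name that
    # cannot be given an IRI.
    return len(parts) == 2 and all(parts)
```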
Chris Hanson [Tue, 30 Mar 2004 04:45:01 +0000 (04:45 +0000)]
Generalize code to toggle Dired sort order.
Chris Hanson [Tue, 30 Mar 2004 04:27:52 +0000 (04:27 +0000)]
New port abstraction is hiding unread characters from the underlying
port operations; consequently, the buffer-input implementation was
returning the wrong value for the current mark. This has been kludged
around.
Chris Hanson [Wed, 24 Mar 2004 21:16:55 +0000 (21:16 +0000)]
Allow "utf7" and "utf8" character sets.
Chris Hanson [Tue, 9 Mar 2004 06:26:50 +0000 (06:26 +0000)]
Change PAGE_READWRITE to PAGE_EXECUTE_READWRITE, so that XP SP2
doesn't invalidate all execution in the heap.
Chris Hanson [Tue, 9 Mar 2004 03:46:42 +0000 (03:46 +0000)]
Don't try to allocate zero-length string in RELOAD-SAVE-STRING.
Chris Hanson [Thu, 26 Feb 2004 19:05:06 +0000 (19:05 +0000)]
INPUT-PORT/READ-STRING wasn't returning an EOF object when needed.
Chris Hanson [Thu, 26 Feb 2004 19:03:58 +0000 (19:03 +0000)]
Fix typo that prevented EOF from being properly detected.
Chris Hanson [Thu, 26 Feb 2004 18:31:41 +0000 (18:31 +0000)]
Update version number to reflect changes.
Chris Hanson [Thu, 26 Feb 2004 04:52:03 +0000 (04:52 +0000)]
Allow a name to contain colons as specified by the XML standard.
However, don't allow association of an IRI with the name unless the
name uses a single colon as specified by the namespace standard.
Chris Hanson [Thu, 26 Feb 2004 04:50:14 +0000 (04:50 +0000)]
Fix thinko in handling of name parsing.
Chris Hanson [Thu, 26 Feb 2004 01:58:53 +0000 (01:58 +0000)]
Restore colon as name-initial char.
Chris Hanson [Thu, 26 Feb 2004 01:52:24 +0000 (01:52 +0000)]
Remove now-obsolete code that forces output coding to UTF-8.
Chris Hanson [Wed, 25 Feb 2004 21:00:52 +0000 (21:00 +0000)]
Generate BOM on output for those encodings that require it.
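For illustration only (Python, not the code in this change), a sketch of
writing a byte-order mark for some encodings and not others. The log
does not say which encodings are treated as requiring a BOM; the sketch
assumes it is the byte-order-neutral names "utf-16" and "utf-32", and
that they are written big-endian. The name encode_with_bom is
hypothetical.

```python
import codecs

# Assumption: only the byte-order-neutral names get a BOM, written big-endian.
_BOMS = {
    "utf-16": (codecs.BOM_UTF16_BE, "utf-16-be"),
    "utf-32": (codecs.BOM_UTF32_BE, "utf-32-be"),
}

def encode_with_bom(text, coding):
    """Encode text, prefixing a BOM when the coding name calls for one."""
    key = coding.lower()
    if key in _BOMS:
        bom, concrete = _BOMS[key]
        return bom + text.encode(concrete)
    # Codings with a fixed byte order, or none, are written without a BOM.
    return text.encode(coding)
```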
Chris Hanson [Wed, 25 Feb 2004 20:59:29 +0000 (20:59 +0000)]
Fix bugs in implementation of UTF-32 coding.
Chris Hanson [Wed, 25 Feb 2004 20:59:02 +0000 (20:59 +0000)]
Add name for BOM character.
Chris Hanson [Tue, 24 Feb 2004 20:59:09 +0000 (20:59 +0000)]
Fix thinko.
Chris Hanson [Tue, 24 Feb 2004 20:49:08 +0000 (20:49 +0000)]
Use temporary file as intermediary for write/re-read test. This tests
the character coding as well as the plain I/O.
Chris Hanson [Tue, 24 Feb 2004 20:48:32 +0000 (20:48 +0000)]
Fix typo.
Chris Hanson [Tue, 24 Feb 2004 20:36:42 +0000 (20:36 +0000)]
Implement support for character coding.
Chris Hanson [Tue, 24 Feb 2004 20:35:48 +0000 (20:35 +0000)]
Implement operations to detect known codings and line endings of a
port. Add support for US-ASCII, UTF-16, and UTF-32 codings.
Chris Hanson [Tue, 24 Feb 2004 20:34:50 +0000 (20:34 +0000)]
Don't read more characters than are needed. The XML character-coding
detection depends on this.
Chris Hanson [Tue, 24 Feb 2004 05:51:12 +0000 (05:51 +0000)]
Export DISCARD-CHAR.
Chris Hanson [Tue, 24 Feb 2004 05:50:44 +0000 (05:50 +0000)]
Export UNREAD-CHAR.
Chris Hanson [Tue, 24 Feb 2004 04:23:12 +0000 (04:23 +0000)]
Canonicalize UTF-16 and UTF-32 names.
Chris Hanson [Tue, 24 Feb 2004 01:51:00 +0000 (01:51 +0000)]
Clean up output a little.
Chris Hanson [Tue, 24 Feb 2004 01:45:53 +0000 (01:45 +0000)]
When using XML line ending on I/O port, treat output side as TEXT.
Chris Hanson [Mon, 23 Feb 2004 20:56:21 +0000 (20:56 +0000)]
Eliminate PARSE-XML-DOCUMENT. Merge STRING->XML and SUBSTRING->XML.
Force input coding to UTF-8 (for now). Force input line ending to
XML-1.0.
Chris Hanson [Mon, 23 Feb 2004 20:55:11 +0000 (20:55 +0000)]
Some tweaks to handle changes in I/O subsystem. Force UTF-8 coding on
output (for now).
Chris Hanson [Mon, 23 Feb 2004 20:53:22 +0000 (20:53 +0000)]
Use STRING->PARSER-BUFFER rather than WIDE-STRING->PARSER-BUFFER,
since the former has replaced the latter.
Chris Hanson [Mon, 23 Feb 2004 20:52:49 +0000 (20:52 +0000)]
Use wide string to test re-reading of document.
Chris Hanson [Mon, 23 Feb 2004 20:51:47 +0000 (20:51 +0000)]
Eliminate SOURCE->PARSER-BUFFER. Merge procedures
*STRING->PARSER-BUFFER into a single procedure.
Chris Hanson [Mon, 23 Feb 2004 20:50:33 +0000 (20:50 +0000)]
Rewrite STRING->WIDE-STRING to make it more efficient.
Chris Hanson [Mon, 23 Feb 2004 20:49:32 +0000 (20:49 +0000)]
Add support for UTF-32.
Chris Hanson [Wed, 18 Feb 2004 19:52:06 +0000 (19:52 +0000)]
Fix problems with parsing of element content.
Chris Hanson [Tue, 17 Feb 2004 05:53:31 +0000 (05:53 +0000)]
Use new arguments for OPEN-TCP-STREAM-SOCKET.
Chris Hanson [Tue, 17 Feb 2004 05:46:20 +0000 (05:46 +0000)]
Fix some bugs in the parser buffer.
Chris Hanson [Tue, 17 Feb 2004 05:35:46 +0000 (05:35 +0000)]
Fix typo.
Chris Hanson [Tue, 17 Feb 2004 05:00:18 +0000 (05:00 +0000)]
Add kludge to define MATCH-UTF8-CHAR-IN-ALPHABET.
Chris Hanson [Tue, 17 Feb 2004 04:59:54 +0000 (04:59 +0000)]
Add line-ending support.
Chris Hanson [Tue, 17 Feb 2004 04:59:29 +0000 (04:59 +0000)]
Add NEWLINE line-ending.
Chris Hanson [Mon, 16 Feb 2004 05:50:43 +0000 (05:50 +0000)]
Changes required by reimplementation of I/O subsystem.
Chris Hanson [Mon, 16 Feb 2004 05:40:46 +0000 (05:40 +0000)]
Bump version to reflect major change.
Chris Hanson [Mon, 16 Feb 2004 05:39:37 +0000 (05:39 +0000)]
The I/O subsystem has once again been redesigned. The primary goal of
this large change is to integrate support for Unicode and character
coding directly into the I/O subsystem. Secondary goals are to
improve I/O performance, to simplify the design, and to provide
flexibility for future enhancement.
This change set has received cursory testing, and no doubt a number of
problems remain. Additionally, there are several unfinished aspects
to the change. But this version works well enough to run Edwin.
Detailed changes
----------------
The term "line translation" is everywhere replaced with "line ending".
A line ending is now specified by a symbol, such as 'crlf or 'lf;
previously it was a string. I/O ports now support a single line
ending for both input and output sides; previously there were two
independent line translations.
The I/O buffers have been completely redesigned. They now operate in
three stages: one stage does byte-stream I/O, the second manages
coding (e.g. UTF-8), and the third manages line endings. Only bytes
are buffered. As a consequence, READ-CHAR and WRITE-CHAR will now
handle any Unicode character, provided the port's coding is set to an
appropriate value.
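For illustration only (Python, not the Scheme implementation), a sketch
of the three-stage input path described above: raw bytes, then
character decoding, then line-ending translation. Only bytes arrive in
chunks; a multi-byte character or a CR LF pair may be split across two
chunks. The name decode_stream and the string values "lf" and "crlf"
are hypothetical stand-ins for the real port settings.

```python
import codecs

def decode_stream(byte_chunks, coding="utf-8", line_ending="lf"):
    """Yield decoded text from an iterable of byte chunks."""
    decoder = codecs.getincrementaldecoder(coding)()  # stage 2: coding
    pending_cr = False  # a CR seen at the end of the previous chunk

    def translate(text, final):
        # Stage 3: map CR LF to a single newline when asked to.
        nonlocal pending_cr
        if line_ending != "crlf":
            return text
        if pending_cr:
            text, pending_cr = "\r" + text, False
        if text.endswith("\r") and not final:
            # Hold the CR back until we know whether an LF follows it.
            text, pending_cr = text[:-1], True
        return text.replace("\r\n", "\n")

    for chunk in byte_chunks:  # stage 1: byte-stream I/O
        text = translate(decoder.decode(chunk), final=False)
        if text:
            yield text
    tail = translate(decoder.decode(b"", final=True), final=True)
    if tail:
        yield tail
```

With the chunks b"a\r", b"\nb\xc3", b"\xa9", coding "utf-8", and line
ending "crlf", the pieces join to "a", a newline, "b", and the
character U+00E9.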
The READ-SUBSTRING port operation can now assume that its START
argument is strictly less than its END argument. Likewise for the new
operations READ-WIDE-SUBSTRING and READ-EXTERNAL-SUBSTRING.
The WRITE-SUBSTRING port operation now returns either #F or a
non-negative integer. It can also now assume that its START argument
is strictly less than its END argument. Both of these properties are
true for the new WRITE-WIDE-SUBSTRING and WRITE-EXTERNAL-SUBSTRING.
The WRITE-CHAR port operation now returns either #F, 0, or 1, as if it
were a call to WRITE-SUBSTRING with a one-char string.
The CHAR-READY? port operation and the INPUT-PORT/CHAR-READY?
procedure no longer accept a second "interval" argument. Handling of
the timeout interval is instead implemented directly in the
CHAR-READY? procedure.
Strings are always considered to be encoded using ISO-8859-1.
The parser-buffer datatype has been widened to handle all Unicode
characters.
All ports now support the FRESH-LINE operation, which is implemented
as a layer on top of the supplied operations. Similarly, the
PEEK-CHAR, DISCARD-CHAR, and new UNREAD-CHAR operations are
implemented for all ports.
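For illustration only (Python, not the Scheme implementation), a sketch
of how such operations can be layered on top of two supplied
primitives, so that every port gets them for free. The class
LayeredPort and its method names are hypothetical; end of file is
modeled here as an empty string.

```python
class LayeredPort:
    """Adds peek/unread/discard and fresh-line over two primitives."""

    def __init__(self, read_char, write_string):
        self._read_char = read_char        # supplied: returns "" at EOF
        self._write_string = write_string  # supplied: writes a string
        self._unread = None                # at most one pushed-back char
        self._at_line_start = True

    def read_char(self):
        if self._unread is not None:
            ch, self._unread = self._unread, None
            return ch
        return self._read_char()

    def unread_char(self, ch):
        if self._unread is not None:
            raise ValueError("only one character may be unread")
        self._unread = ch

    def peek_char(self):
        ch = self.read_char()
        if ch:
            self.unread_char(ch)
        return ch

    def discard_char(self):
        self.read_char()

    def write_string(self, text):
        self._write_string(text)
        if text:
            self._at_line_start = text.endswith("\n")

    def fresh_line(self):
        # Emit a newline only if the output is not already at line start.
        if not self._at_line_start:
            self.write_string("\n")
```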
End-of-file objects now have an associated port.
RUN-SHELL-COMMAND and RUN-SYNCHRONOUS-SUBPROCESS now accept a keyword
argument LINE-ENDING, which replaces the old options
INPUT-LINE-TRANSLATION and OUTPUT-LINE-TRANSLATION.
Transcript support has been moved into the core port abstraction.
Consequently, it is no longer necessary to encapsulate a port in order
to get transcript support. Encapsulated ports have been eliminated,
as this was their only use.
The procedures OPEN-TCP-STREAM-SOCKET, OPEN-UNIX-STREAM-SOCKET,
SUBPROCESS-I/O-PORT, and TCP-SERVER-CONNECTION-ACCEPT have changed
their argument structure. All arguments dealing with buffer size and
line translation have been eliminated. In the new implementation, the
buffer size is fixed, and handling of line endings is changed by
calling PORT/SET-LINE-ENDING.
The following variables have been eliminated:
CHANNEL-WRITE-CHAR-BLOCK
CHANNEL-WRITE-STRING-BLOCK
ENCAPSULATED-PORT/PORT
ENCAPSULATED-PORT/STATE
ENCAPSULATED-PORT?
GUARANTEE-ENCAPSULATED-PORT
INPUT-PORT/CHANNEL
INPUT-PORT/COPY
INPUT-PORT/CUSTOM-OPERATION
INPUT-PORT/OPERATION
INPUT-PORT/OPERATION-NAMES
INPUT-PORT/STATE
MAKE-ENCAPSULATED-PORT
MAKE-GENERIC-INPUT-PORT
MAKE-GENERIC-OUTPUT-PORT
MAKE-I/O-PORT
MAKE-INPUT-PORT
MAKE-OUTPUT-PORT
MATCH-UTF8-CHAR-IN-ALPHABET
OUTPUT-PORT/CHANNEL
OUTPUT-PORT/COPY
OUTPUT-PORT/CUSTOM-OPERATION
OUTPUT-PORT/OPERATION
OUTPUT-PORT/OPERATION-NAMES
OUTPUT-PORT/STATE
PATHNAME-END-OF-LINE-STRING
PATHNAME-NEWLINE-TRANSLATION
SET-ENCAPSULATED-PORT/STATE!
SET-INPUT-PORT/STATE!
SET-OUTPUT-PORT/STATE!
The following port operations have been eliminated:
BUFFERED-INPUT-CHARS
BUFFERED-OUTPUT-CHARS
CHARS-REMAINING
DISCARD-CHAR
DISCARD-CHARS
FRESH-LINE
INPUT-BUFFER-SIZE
OUTPUT-BUFFER-SIZE
PEEK-CHAR
READ-STRING
REST->STRING
SET-INPUT-BUFFER-SIZE
SET-OUTPUT-BUFFER-SIZE
To do:
* locking
* column tracking
* convert parser from peek/discard to read/unread
* [?] integrate parser-buffer support (port.scm/input.scm)
* change buffer I/O ports to handle line endings as needed
Change arg structure of:
char-ready? port operation
input-port/char-ready?
make-generic-i/o-port
make-input-buffer
make-output-buffer
open-tcp-stream-socket
open-unix-stream-socket
subprocess-i/o-port
tcp-server-connection-accept
Renamed variables:
os/default-end-of-line-translation => default-line-ending
os/file-end-of-line-translation => file-line-ending
New variables:
channel-has-input?
channel-write-byte-block
condition-type:char-decoding-error
condition-type:char-encoding-error
condition-type:not-8-bit-char
console-i/o-port?
eof-object-port
error:char-decoding
error:char-encoding
error:not-8-bit-char
guarantee-wide-substring
input-port/read-external-substring
input-port/read-wide-substring
input-port/unread-char
match-parser-buffer-char-in-alphabet
match-parser-buffer-char-in-alphabet-no-advance
match-parser-buffer-char-not-in-alphabet
match-parser-buffer-char-not-in-alphabet-no-advance
match-parser-buffer-char-not-in-set
match-parser-buffer-char-not-in-set-no-advance
output-port/write-external-substring
output-port/write-wide-substring
port/coding
port/line-ending
port/set-coding
port/set-line-ending
port=?
set-channel-port!
unread-char
wide-string->parser-buffer
wide-substring
wide-substring->parser-buffer
New port operations:
coding
line-ending
read-external-substring
read-wide-substring
set-coding
set-line-ending
write-external-substring
write-wide-substring
Chris Hanson [Fri, 6 Feb 2004 18:15:40 +0000 (18:15 +0000)]
Fix typo.
Chris Hanson [Wed, 4 Feb 2004 05:02:12 +0000 (05:02 +0000)]
Fix typos.
Chris Hanson [Wed, 4 Feb 2004 05:01:32 +0000 (05:01 +0000)]
Fix CLOSE-ENOUGH?.
Chris Hanson [Tue, 3 Feb 2004 18:46:50 +0000 (18:46 +0000)]
Don't set super/hyper bucky bits based on modifier keys.
Chris Hanson [Sat, 31 Jan 2004 02:16:53 +0000 (02:16 +0000)]
Don't specially handle control/meta-modified alphabetic keys; this
appears to be a broken optimization from long ago. Thanks to Joe
Marshall for figuring it out.