The I/O subsystem has once again been redesigned.  The primary goal of
this large change is to integrate support for Unicode and character
coding directly into the I/O subsystem.  Secondary goals are to
improve I/O performance, to simplify the design, and to provide
flexibility for future enhancement.

This change set has received cursory testing, and no doubt a number of
problems remain.  Additionally, there are several unfinished aspects
to the change.  But this version works well enough to run Edwin.

Detailed changes
----------------

The term "line translation" is everywhere replaced with "line ending".
A line ending is now specified by a symbol, such as 'crlf or 'lf;
previously it was a string.  I/O files now support a single line
ending for both input and output sides; previously there were two
independent line translations.

The I/O buffers have been completely redesigned.  They now operate in
three stages: one stage does byte-stream I/O, the second manages
coding (e.g. UTF-8), and the third manages line endings.  Only bytes
are buffered.  As a consequence, READ-CHAR and WRITE-CHAR will now
handle any Unicode character, provided the port's coding is set to an
appropriate value.
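
As a rough illustration of the three stages, here is a small Python model.
It is not the MIT Scheme implementation; the names LINE_ENDINGS, read_text,
and write_text are invented for this sketch, and it handles a whole buffer
at once rather than incrementally.

```python
# Illustrative model only; none of these names belong to MIT Scheme.
LINE_ENDINGS = {"lf": "\n", "cr": "\r", "crlf": "\r\n"}


def read_text(raw: bytes, coding: str = "utf-8", line_ending: str = "lf") -> str:
    """Input side: stage 1 supplies raw bytes, stage 2 decodes them,
    and stage 3 maps the external line ending to a bare newline."""
    chars = raw.decode(coding)                              # stage 2: coding
    return chars.replace(LINE_ENDINGS[line_ending], "\n")   # stage 3: line ending


def write_text(text: str, coding: str = "utf-8", line_ending: str = "lf") -> bytes:
    """Output side: the same stages in reverse order."""
    external = text.replace("\n", LINE_ENDINGS[line_ending])  # stage 3
    return external.encode(coding)                            # stage 2


if __name__ == "__main__":
    data = write_text("\u03bb\nx", "utf-8", "crlf")
    print(data)
    print(repr(read_text(data, "utf-8", "crlf")))
```

Because only bytes are buffered in this model, changing the coding or the
line ending affects how the same bytes are interpreted, not what is stored.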

The READ-SUBSTRING port operation can now assume that its START
argument is strictly less than its END argument.  Likewise for the new
operations READ-WIDE-SUBSTRING and READ-EXTERNAL-SUBSTRING.

The WRITE-SUBSTRING port operation now returns either #F or a
non-negative integer.  It can also now assume that its START argument
is strictly less than its END argument.  Both of these properties are
true for the new WRITE-WIDE-SUBSTRING and WRITE-EXTERNAL-SUBSTRING.

The WRITE-CHAR port operation now returns either #F, 0, or 1, as if it
were a call to WRITE-SUBSTRING with a one-char string.
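
The return convention can be modelled in Python as follows.  This is a
sketch under stated assumptions: SketchPort and its fields are hypothetical,
None stands in for #F, and #F is assumed here to mean that the write could
not proceed without blocking (the text above does not spell this out).

```python
from typing import List, Optional


class SketchPort:
    """Hypothetical port with a fixed amount of free buffer space."""

    def __init__(self, capacity: int, would_block: bool = False) -> None:
        self.capacity = capacity
        self.would_block = would_block
        self.written: List[str] = []

    def write_substring(self, s: str, start: int, end: int) -> Optional[int]:
        # The caller guarantees START < END, so no empty-range check is needed.
        assert start < end
        if self.would_block:
            return None                      # stands in for #F (assumption)
        n = min(end - start, self.capacity)  # may be 0 if the buffer is full
        if n > 0:
            self.written.append(s[start:start + n])
            self.capacity -= n
        return n

    def write_char(self, ch: str) -> Optional[int]:
        # Same contract as a one-char WRITE-SUBSTRING: None, 0, or 1.
        return self.write_substring(ch, 0, 1)
```

A caller that receives 0 or None knows the character was not consumed and
can retry later.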

The CHAR-READY? port operation and the INPUT-PORT/CHAR-READY?
procedure no longer accept a second "interval" argument.  Handling of
the timeout interval is instead implemented directly in the
CHAR-READY? procedure.

Strings are always considered to be encoded using ISO-8859-1.
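
In practical terms, each string character corresponds to one byte with the
same code point (0 through 255).  A minimal Python model follows; the
function names are invented here, and using Python's UnicodeEncodeError as a
loose analogue of the new not-8-bit-char error is an assumption.

```python
def string_to_bytes(s: str) -> bytes:
    """Each character maps to the single byte with the same code point.
    Raises UnicodeEncodeError for any character above U+00FF."""
    return s.encode("iso-8859-1")


def bytes_to_string(raw: bytes) -> str:
    """Every byte value is a valid ISO-8859-1 character, so this never fails."""
    return raw.decode("iso-8859-1")
```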

The parser-buffer datatype has been widened to handle all Unicode
characters.

All ports now support the FRESH-LINE operation, which is implemented
as a layer on top of the supplied operations.  Similarly, the
PEEK-CHAR, DISCARD-CHAR, and new UNREAD-CHAR operations are
implemented for all ports.
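
One way such layering can work is sketched below in Python.  This is a
model, not the code in port.scm: LayeredPort and its attributes are
hypothetical, and it assumes the port type supplies only a character reader
(returning None at end of file) and a string writer.

```python
class LayeredPort:
    """Derives peek/discard/unread and fresh-line from two supplied operations."""

    def __init__(self, read_char, write_string) -> None:
        self._read_char = read_char        # supplied: returns a char, or None at EOF
        self._write_string = write_string  # supplied: writes a string
        self._unread = None                # at most one pushed-back character
        self._at_line_start = True         # True until something other than "\n" ends the output

    def read_char(self):
        if self._unread is not None:
            ch, self._unread = self._unread, None
            return ch
        return self._read_char()

    def unread_char(self, ch) -> None:
        if self._unread is not None:
            raise RuntimeError("only one character can be unread")
        self._unread = ch

    def peek_char(self):
        # Peek is a read followed by an unread.
        ch = self.read_char()
        if ch is not None:
            self.unread_char(ch)
        return ch

    def discard_char(self) -> None:
        # Discard is a read whose result is ignored.
        self.read_char()

    def write_string(self, s: str) -> None:
        if s:
            self._write_string(s)
            self._at_line_start = s.endswith("\n")

    def fresh_line(self) -> None:
        # Emit a newline only if the output is not already at a line start.
        if not self._at_line_start:
            self.write_string("\n")
```

Since the derived operations use only the two supplied ones, a new port type
gets them without any extra work.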

End-of-file objects now have an associated port.

RUN-SHELL-COMMAND and RUN-SYNCHRONOUS-SUBPROCESS now accept a keyword
argument LINE-ENDING, which replaces the old options
INPUT-LINE-TRANSLATION and OUTPUT-LINE-TRANSLATION.

Transcript support has been moved into the core port abstraction.
Consequently, it is no longer necessary to encapsulate a port in order
to get transcript support.  Encapsulated ports have been eliminated,
as this was their only use.

The procedures OPEN-TCP-STREAM-SOCKET, OPEN-UNIX-STREAM-SOCKET,
SUBPROCESS-I/O-PORT, and TCP-SERVER-CONNECTION-ACCEPT have changed
their argument structure.  All arguments dealing with buffer size and
line translation have been eliminated.  In the new implementation, the
buffer size is fixed, and handling of line endings is changed by
calling PORT/SET-LINE-ENDING.

The following variables have been eliminated:

CHANNEL-WRITE-CHAR-BLOCK
CHANNEL-WRITE-STRING-BLOCK
ENCAPSULATED-PORT/PORT
ENCAPSULATED-PORT/STATE
ENCAPSULATED-PORT?
GUARANTEE-ENCAPSULATED-PORT
INPUT-PORT/CHANNEL
INPUT-PORT/COPY
INPUT-PORT/CUSTOM-OPERATION
INPUT-PORT/OPERATION
INPUT-PORT/OPERATION-NAMES
INPUT-PORT/STATE
MAKE-ENCAPSULATED-PORT
MAKE-GENERIC-INPUT-PORT
MAKE-GENERIC-OUTPUT-PORT
MAKE-I/O-PORT
MAKE-INPUT-PORT
MAKE-OUTPUT-PORT
MATCH-UTF8-CHAR-IN-ALPHABET
OUTPUT-PORT/CHANNEL
OUTPUT-PORT/COPY
OUTPUT-PORT/CUSTOM-OPERATION
OUTPUT-PORT/OPERATION
OUTPUT-PORT/OPERATION-NAMES
OUTPUT-PORT/STATE
PATHNAME-END-OF-LINE-STRING
PATHNAME-NEWLINE-TRANSLATION
SET-ENCAPSULATED-PORT/STATE!
SET-INPUT-PORT/STATE!
SET-OUTPUT-PORT/STATE!

The following port operations have been eliminated:

BUFFERED-INPUT-CHARS
BUFFERED-OUTPUT-CHARS
CHARS-REMAINING
DISCARD-CHAR
DISCARD-CHARS
FRESH-LINE
INPUT-BUFFER-SIZE
OUTPUT-BUFFER-SIZE
PEEK-CHAR
READ-STRING
REST->STRING
SET-INPUT-BUFFER-SIZE
SET-OUTPUT-BUFFER-SIZE

To do:

* locking
* column tracking
* convert parser from peek/discard to read/unread
* [?] integrate parser-buffer support (port.scm/input.scm)
* change buffer I/O ports to handle line endings as needed

Changed argument structure of:
char-ready? port operation
input-port/char-ready?
make-generic-i/o-port
make-input-buffer
make-output-buffer
open-tcp-stream-socket
open-unix-stream-socket
subprocess-i/o-port
tcp-server-connection-accept

Renamed variables:
os/default-end-of-line-translation => default-line-ending
os/file-end-of-line-translation => file-line-ending

New variables:
channel-has-input?
channel-write-byte-block
condition-type:char-decoding-error
condition-type:char-encoding-error
condition-type:not-8-bit-char
console-i/o-port?
eof-object-port
error:char-decoding
error:char-encoding
error:not-8-bit-char
guarantee-wide-substring
input-port/read-external-substring
input-port/read-wide-substring
input-port/unread-char
match-parser-buffer-char-in-alphabet
match-parser-buffer-char-in-alphabet-no-advance
match-parser-buffer-char-not-in-alphabet
match-parser-buffer-char-not-in-alphabet-no-advance
match-parser-buffer-char-not-in-set
match-parser-buffer-char-not-in-set-no-advance
output-port/write-external-substring
output-port/write-wide-substring
port/coding
port/line-ending
port/set-coding
port/set-line-ending
port=?
set-channel-port!
unread-char
wide-string->parser-buffer
wide-substring
wide-substring->parser-buffer

New port operations:
coding
line-ending
read-external-substring
read-wide-substring
set-coding
set-line-ending
write-external-substring
write-wide-substring

author	Chris Hanson <org/chris-hanson/cph>
	Mon, 16 Feb 2004 05:39:37 +0000 (05:39 +0000)
committer	Chris Hanson <org/chris-hanson/cph>
	Mon, 16 Feb 2004 05:39:37 +0000 (05:39 +0000)
commit	d125b052fc813686a5d1333a1126589629b5efeb
tree	d0a8f761ed28da8b1cfabc425bf24c0c7691ce29
parent	fd19785c25c06583e18d55ae409409264eb5bd7d

v7/src/runtime/dosprm.scm
v7/src/runtime/dospth.scm
v7/src/runtime/emacs.scm
v7/src/runtime/error.scm
v7/src/runtime/fileio.scm
v7/src/runtime/genio.scm
v7/src/runtime/input.scm
v7/src/runtime/io.scm
v7/src/runtime/mime-codec.scm
v7/src/runtime/ntprm.scm
v7/src/runtime/os2prm.scm
v7/src/runtime/output.scm
v7/src/runtime/parse.scm
v7/src/runtime/parser-buffer.scm
v7/src/runtime/pathnm.scm
v7/src/runtime/port.scm
v7/src/runtime/process.scm
v7/src/runtime/rep.scm
v7/src/runtime/runtime.pkg
v7/src/runtime/socket.scm
v7/src/runtime/string.scm
v7/src/runtime/strnin.scm
v7/src/runtime/strott.scm
v7/src/runtime/strout.scm
v7/src/runtime/syncproc.scm
v7/src/runtime/tscript.scm
v7/src/runtime/ttyio.scm
v7/src/runtime/unicode.scm
v7/src/runtime/unxprm.scm
v7/src/runtime/unxpth.scm