Input Procedures (MIT/GNU Scheme Reference Manual)

14.5 Input Procedures

This section describes the procedures that read input. Input procedures can read either from the current input port or from a given port. Remember that to read from a file, you must first open a port to the file.

Input ports can be divided into two types, called interactive and non-interactive. Interactive input ports are ports that read input from a source that is time-dependent; for example, a port that reads input from a terminal or from another program. Non-interactive input ports read input from a time-independent source, such as an ordinary file or a character string.

In this section, all optional arguments called port default to the current input port.

standard procedure: read [port]

The read procedure converts external representations of Scheme objects into the objects themselves. It returns the next object parsable from the given textual input port, updating port to point to the first character past the end of the external representation of the object.

Implementations may support extended syntax to represent record types or other types that do not have datum representations.

If an end of file is encountered in the input before any characters are found that can begin an object, then an end-of-file object is returned. The port remains open, and further attempts to read will also return an end-of-file object. If an end of file is encountered after the beginning of an object’s external representation, but the external representation is incomplete and therefore not parsable, an error that satisfies read-error? is signaled.
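
As a rough illustration of the end-of-file behavior described above, the following sketch collects every object that read can parse from a string port. The helper name read-all-data and the sample input are hypothetical; open-input-string is the standard string-port constructor.

```scheme
;; Hypothetical helper: collect every datum readable from PORT,
;; stopping when read returns an end-of-file object.
(define (read-all-data port)
  (let loop ((data '()))
    (let ((datum (read port)))
      (if (eof-object? datum)
          (reverse data)
          (loop (cons datum data))))))

;; Three external representations: a list, a number, and a string.
(define sample-data
  (read-all-data (open-input-string "(a b) 42 \"str\"")))
```

Here sample-data should be the list ((a b) 42 "str"), since each call to read consumes exactly one external representation.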

standard procedure: read-char [port]

Returns the next character available from the textual input port, updating port to point to the following character. If no more characters are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port and no characters are immediately available, read-char will hang waiting for input, even if the port is in non-blocking mode.

procedure: read-char-no-hang [port]

This procedure behaves exactly like read-char except when port is an interactive port in non-blocking mode, and there are no characters immediately available. In that case this procedure returns #f without blocking.

procedure: unread-char char [port]

The given char must be the most-recently read character from the textual input port. This procedure “unreads” the character, updating port as if the character had never been read.

Note that this only works with characters returned by read-char or read-char-no-hang.
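
A minimal sketch of the read-then-unread pattern, assuming a string port supports unread-char in the same way as other textual input ports. The variable names are hypothetical.

```scheme
;; unread-char is an MIT/GNU Scheme procedure, not part of R7RS.
(define unread-port (open-input-string "xyz"))

(define first-char (read-char unread-port))    ; consumes #\x
(unread-char first-char unread-port)           ; put #\x back
(define reread-char (read-char unread-port))   ; reads #\x again
(define next-char (read-char unread-port))     ; reads #\y
```

After the call to unread-char, the port behaves as if the first read-char had never happened, so reread-char is the same character as first-char.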

standard procedure: peek-char [port]

Returns the next character available from the textual input port, without updating port to point to the following character. If no more characters are available, an end-of-file object is returned.

Note: The value returned by a call to peek-char is the same as the value that would have been returned by a call to read-char on the same port. The only difference is that the very next call to read-char or peek-char on that port will return the value returned by the preceding call to peek-char. In particular, a call to peek-char on an interactive port will hang waiting for input whenever a call to read-char would have hung.
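
The relationship between peek-char and read-char can be sketched on a string port as follows. The variable names are hypothetical.

```scheme
(define peek-port (open-input-string "ab"))

(define peeked (peek-char peek-port))      ; #\a, port is not advanced
(define consumed (read-char peek-port))    ; #\a again, port now at #\b
(define following (read-char peek-port))   ; #\b
(define at-end (peek-char peek-port))      ; an end-of-file object
```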

standard procedure: read-line [port]

Returns the next line of text available from the textual input port, updating the port to point to the following character. If an end of line is read, a string containing all of the text up to (but not including) the end of line is returned, and the port is updated to point just past the end of line. If an end of file is encountered before any end of line is read, but some characters have been read, a string containing those characters is returned. If an end of file is encountered before any characters are read, an end-of-file object is returned. For the purpose of this procedure, an end of line consists of either a linefeed character, a carriage return character, or a sequence of a carriage return character followed by a linefeed character. Implementations may also recognize other end of line characters or sequences.

In MIT/GNU Scheme, if port is an interactive input port and no characters are immediately available, read-line will hang waiting for input, even if the port is in non-blocking mode.
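
As an illustration, this sketch reads a port line by line until read-line returns an end-of-file object. The helper name read-all-lines and the sample text are hypothetical; note that the final line has no terminating end of line and is still returned.

```scheme
;; Hypothetical helper: return all lines of PORT as a list of strings.
(define (read-all-lines port)
  (let loop ((lines '()))
    (let ((line (read-line port)))
      (if (eof-object? line)
          (reverse lines)
          (loop (cons line lines))))))

(define sample-lines
  (read-all-lines
   (open-input-string "first\nsecond\nlast, unterminated")))
```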

standard procedure: eof-object? object

Returns #t if object is an end-of-file object, otherwise returns #f. The precise set of end-of-file objects will vary among implementations, but in any case no end-of-file object will ever be an object that can be read in using read.

standard procedure: eof-object

Returns an end-of-file object, not necessarily unique.

standard procedure: char-ready? [port]

Returns #t if a character is ready on the textual input port and returns #f otherwise. If char-ready? returns #t then the next read-char operation on the given port is guaranteed not to hang. If the port is at end of file then char-ready? returns #t.

Rationale: The char-ready? procedure exists to make it possible for a program to accept characters from interactive ports without getting stuck waiting for input. Any input editors associated with such ports must ensure that characters whose existence has been asserted by char-ready? cannot be removed from the input. If char-ready? were to return #f at end of file, a port at end of file would be indistinguishable from an interactive port that has no ready characters.
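
One way to use char-ready? is to guard a call to read-char, as in the sketch below. The helper name read-char-if-ready is hypothetical, and the example assumes that a string port always reports a character (or end of file) as ready.

```scheme
;; Hypothetical helper: return the next character if it can be read
;; without waiting, otherwise #f.  Because char-ready? returns #t at
;; end of file, the end-of-file object is passed through to the caller.
(define (read-char-if-ready port)
  (if (char-ready? port)
      (read-char port)
      #f))

(define ready-port (open-input-string "q"))
(define ready-first (read-char-if-ready ready-port))    ; #\q
(define ready-second (read-char-if-ready ready-port))   ; end of file
```

A caller can therefore distinguish three outcomes: a character, #f (nothing ready yet), and an end-of-file object.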

standard procedure: read-string k [port]

Reads the next k characters, or as many as are available before the end of file, from the textual input port into a newly allocated string in left-to-right order and returns the string. If no characters are available before the end of file, an end-of-file object is returned.

Note: MIT/GNU Scheme previously defined this procedure differently, and this alternate usage is deprecated; please use read-delimited-string instead. For now, read-string will redirect to read-delimited-string as needed, but this redirection will be eliminated in a future release.
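
The following sketch uses the standard form of read-string to split a port's contents into fixed-size chunks. The helper name read-chunks is hypothetical; the last chunk may be shorter than k when end of file is reached first.

```scheme
;; Hypothetical helper: read PORT in chunks of at most K characters.
(define (read-chunks k port)
  (let loop ((chunks '()))
    (let ((chunk (read-string k port)))
      (if (eof-object? chunk)
          (reverse chunks)
          (loop (cons chunk chunks))))))

(define sample-chunks
  (read-chunks 4 (open-input-string "abcdefghij")))
```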

procedure: read-string! string [port [start [end]]]

Reads the next end-start characters, or as many as are available before the end of file, from the textual input port into string in left-to-right order beginning at the start position. If end is not supplied, reads until the end of string has been reached. If start is not supplied, reads beginning at position 0. Returns the number of characters read. If no characters are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive port in non-blocking mode and no characters are immediately available, #f is returned without any modification of string.

However, if one or more characters are immediately available, the region is filled using the available characters. The procedure then returns the number of characters filled in, without waiting for further characters, even if the number of filled characters is less than the size of the region.
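
A small sketch of filling part of an existing string, assuming the argument order given in the entry above (string, then port, then start and end). The variable names are hypothetical.

```scheme
;; An 8-character buffer filled with dashes.
(define char-buffer (make-string 8 #\-))

;; Ask for up to 4 characters (positions 2 through 5); only 3 are
;; available before end of file.
(define chars-read
  (read-string! char-buffer (open-input-string "abc") 2 6))
```

Per the description above, chars-read should be 3 and char-buffer should now hold "--abc---", with the positions outside the filled region left unchanged.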

obsolete procedure: read-substring! string start end [port]

This procedure is deprecated; use read-string! instead.

standard procedure: read-u8 [port]

Returns the next byte available from the binary input port, updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, read-u8 will return #f.

standard procedure: peek-u8 [port]

Returns the next byte available from the binary input port, but without updating the port to point to the following byte. If no more bytes are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, peek-u8 will return #f.

standard procedure: u8-ready? [port]

Returns #t if a byte is ready on the binary input port and returns #f otherwise. If u8-ready? returns #t then the next read-u8 operation on the given port is guaranteed not to hang. If the port is at end of file then u8-ready? returns #t.
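
The byte-level procedures mirror their character counterparts. The sketch below exercises them on a bytevector port created with the standard open-input-bytevector; the variable names are hypothetical.

```scheme
(define byte-port (open-input-bytevector (bytevector 10 20)))

(define byte-ready (u8-ready? byte-port))    ; #t, a byte is available
(define peeked-byte (peek-u8 byte-port))     ; 10, port is not advanced
(define first-byte (read-u8 byte-port))      ; 10
(define second-byte (read-u8 byte-port))     ; 20
(define end-marker (read-u8 byte-port))      ; an end-of-file object
(define ready-at-end (u8-ready? byte-port))  ; #t at end of file
```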

standard procedure: read-bytevector k [port]

Reads the next k bytes, or as many as are available before the end of file, from the binary input port into a newly allocated bytevector in left-to-right order and returns the bytevector. If no bytes are available before the end of file, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, read-bytevector will return #f.

However, if one or more bytes are immediately available, they are read and returned as a bytevector, without waiting for further bytes, even if the number of bytes is less than k.
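
For a non-interactive port, the short read at end of file can be sketched as follows. The variable names are hypothetical.

```scheme
(define chunk-port
  (open-input-bytevector (bytevector 1 2 3 4 5 6)))

(define first-chunk (read-bytevector 4 chunk-port))   ; 4 bytes
(define second-chunk (read-bytevector 4 chunk-port))  ; only 2 remain
(define third-chunk (read-bytevector 4 chunk-port))   ; end of file
```

The second call returns a bytevector shorter than k because end of file is reached first; only the third call, with no bytes left at all, returns an end-of-file object.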

standard procedure: read-bytevector! bytevector [port [start [end]]]

Reads the next end-start bytes, or as many as are available before the end of file, from the binary input port into bytevector in left-to-right order beginning at the start position. If end is not supplied, reads until the end of bytevector has been reached. If start is not supplied, reads beginning at position 0. Returns the number of bytes read. If no bytes are available, an end-of-file object is returned.

In MIT/GNU Scheme, if port is an interactive input port in non-blocking mode and no bytes are immediately available, read-bytevector! will return #f.

However, if one or more bytes are immediately available, the region is filled using the available bytes. The procedure then returns the number of bytes filled in, without waiting for further bytes, even if the number of filled bytes is less than the size of the region.
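
A minimal sketch of filling a region of an existing bytevector, using the standard argument order shown in the entry above. The variable names are hypothetical.

```scheme
;; An 8-byte buffer initialized to zeros.
(define byte-buffer (make-bytevector 8 0))

;; Ask for up to 4 bytes (positions 2 through 5); only 3 are
;; available before end of file.
(define bytes-read
  (read-bytevector! byte-buffer
                    (open-input-bytevector (bytevector 7 8 9))
                    2 6))
```

The return value counts the bytes actually stored, so a caller that needs the whole region filled must compare it against end minus start and read again if necessary.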

procedure: read-delimited-string char-set [port]

Reads characters from port until it finds a terminating character that is a member of char-set (see Character Sets) or encounters end of file. The port is updated to point to the terminating character, or to end of file if no terminating character was found. read-delimited-string returns the characters, up to but excluding the terminating character, as a newly allocated string.

This procedure ignores the blocking mode of the port, blocking unconditionally until it sees either a delimiter or end of file. If end of file is encountered before any characters are read, an end-of-file object is returned.

On many input ports, this operation is significantly faster than the following equivalent code using peek-char and read-char:

(define (read-delimited-string char-set port)
  (let ((char (peek-char port)))
    (if (eof-object? char)
        char
        (list->string
         (let loop ((char char))
           (if (or (eof-object? char)
                   (char-in-set? char char-set))
               '()
               (begin
                 (read-char port)
                 (cons char
                       (loop (peek-char port))))))))))
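
Because the port is left pointing at the terminating character, a caller that wants successive fields must consume the delimiter itself. The sketch below splits comma-separated text this way; the helper name read-comma-fields is hypothetical, and the delimiter set is built with the MIT/GNU Scheme char-set procedure.

```scheme
;; Hypothetical helper: return the comma-separated fields of PORT.
(define (read-comma-fields port)
  (let ((delimiters (char-set #\,)))
    (let loop ((fields '()))
      (let ((field (read-delimited-string delimiters port)))
        (if (eof-object? field)
            (reverse fields)
            (begin
              ;; Consume the delimiter the port now points at.  At end
              ;; of file this just returns an end-of-file object.
              (read-char port)
              (loop (cons field fields))))))))

(define sample-fields
  (read-comma-fields (open-input-string "a,b,,c")))
```

Two adjacent delimiters produce an empty string rather than an end-of-file object, since characters are still available on the port at that point.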

14.5.1 Reader Controls

The following parameters control the behavior of the read procedure.

parameter: param:reader-radix

This parameter defines the radix used by the reader when it parses numbers. This is similar to passing a radix argument to string->number. The value of the parameter must be one of 2, 8, 10, or 16; an error is signaled if the parameter is bound to any other value.

Note that much of the number syntax is invalid for radixes other than 10. The reader detects cases where such invalid syntax is used and signals an error. However, problems can still occur when param:reader-radix is bound to 16, because syntax that normally denotes symbols can now denote numbers (e.g. abc). Because of this, it is usually undesirable to bind this parameter to anything other than the default.

The default value of this parameter is 10.
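
As a sketch of how this parameter is typically bound, the following uses parameterize to read one datum in radix 16, then reads the same text under the default radix. The variable names are hypothetical, and the results noted in the comments follow from the description above rather than from a verified session.

```scheme
;; Under radix 16 the text "ff" is expected to denote the number 255.
(define hex-datum
  (parameterize ((param:reader-radix 16))
    (read (open-input-string "ff"))))

;; Under the default radix the same text denotes a symbol.
(define default-datum
  (read (open-input-string "ff")))
```

Keeping the binding inside parameterize limits the change to a single call to read, which avoids the symbol-versus-number ambiguity elsewhere in the program.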

parameter: param:reader-fold-case?

This parameter controls whether the parser folds the case of symbols, character names, and certain other syntax. If it is bound to its default value of #t, symbols read by the parser are case-folded prior to being interned. Otherwise, symbols are interned without folding.

At present, it is a bad idea to use this feature, as it doesn’t really make Scheme case-sensitive, and therefore can break features of the Scheme runtime that depend on case-folded symbols. Instead, use the #!fold-case or #!no-fold-case markers in your code.
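
The markers can also appear in text passed to read. The sketch below assumes that the reader treats each marker as a comment that switches folding for the data that follow, as R7RS specifies; the variable names are hypothetical.

```scheme
;; With folding on, the symbol is expected to be interned as hello.
(define folded-symbol
  (read (open-input-string "#!fold-case Hello")))

;; With folding off, the symbol is expected to keep its case.
(define unfolded-symbol
  (read (open-input-string "#!no-fold-case Hello")))
```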

obsolete variable: *parser-radix*
obsolete variable: *parser-canonicalize-symbols?*

These variables are deprecated; use the corresponding parameter objects, param:reader-radix and param:reader-fold-case?, instead.