@node Strings, Lists, Characters, Top
@chapter Strings
-@cindex string, character (defn)
-@findex char-ascii?
-A @dfn{string} is a mutable sequence of characters. In the current
-implementation of MIT/GNU Scheme, the elements of a string must all
-satisfy the predicate @code{char-ascii?}; if someone ports MIT/GNU
-Scheme to a non-@acronym{ASCII} operating system this requirement will
-change.
+@menu
+* Searching Strings::
+* Matching Strings::
+* Regular Expressions::
+@end menu
+@cindex string, character (defn)
@cindex external representation, for string
@cindex " as external representation
@cindex double quote, as external representation
@cindex backslash, as escape character in string
@cindex escape character, for string
@findex "
-@findex \
-A string is written as a sequence of characters enclosed within double
-quotes @code{" "}. To include a double quote inside a string, precede
-the double quote with a backslash @code{\} (escape it), as in
-
-@example
-"The word \"recursion\" has many meanings."
-@end example
-
-@noindent
-The printed representation of this string is
+Strings are sequences of characters. Strings are written as sequences
+of characters enclosed within quotation marks (@code{"}). Within a
+string literal, various escape sequences represent characters other
+than themselves. Escape sequences always start with a backslash
+(@code{\}):
-@example
-The word "recursion" has many meanings.
-@end example
+@display
+@group
+@code{\a} : alarm, U+0007
+@code{\b} : backspace, U+0008
+@code{\t} : character tabulation, U+0009
+@code{\n} : linefeed, U+000A
+@code{\r} : return, U+000D
+@code{\"} : double quote, U+0022
+@code{\\} : backslash, U+005C
+@code{\|} : vertical line, U+007C
+@code{\}@var{intraline-whitespace}* @var{line-ending} @var{intraline-whitespace}*
+ : nothing
+@code{\x}@var{hex-scalar-value}@code{;}
+ : specified character (note the terminating semicolon).
+@end group
+@end display
+@findex \a
+@findex \b
+@findex \t
+@findex \n
+@findex \r
+@findex \"
+@findex \\
+@findex \|
+@findex \x
-@noindent
-To include a backslash inside a string, precede it with another
-backslash; for example,
+The result is unspecified if any other character in a string occurs
+after a backslash.
-@example
-"Use #\\Control-q to quit."
-@end example
+Except for a line ending, any character outside of an escape sequence
+stands for itself in the string literal. A line ending which is
+preceded by @code{\}@var{intraline-whitespace} expands to nothing
+(along with any trailing intraline whitespace), and can be used to
+indent strings for improved legibility. Any other line ending has the
+same effect as inserting a @code{\n} character into the string.
-@noindent
-The printed representation of this string is
+Examples:
@example
-Use #\Control-q to quit.
+@group
+"The word \"recursion\" has many meanings."
+"Another example:\ntwo lines of text"
+"Here's text \
+ containing just one line"
+"\x03B1; is named GREEK SMALL LETTER ALPHA."
+@end group
@end example
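+
+As a brief illustration of the escape sequences listed above (a sketch
+that relies only on the standard procedures @code{string-length} and
+@code{string=?}), each escape sequence contributes a single character
+to the string:
+
+@example
+@group
+(string-length "a\tb")       @result{}  3
+(string-length "\x03B1;")    @result{}  1
+(string=? "\x41;" "A")       @result{}  #t
+@end group
+@end example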
-@findex \t
-@findex \n
-@findex \f
-@findex #\tab
-@findex #\newline
-@findex #\page
-The effect of a backslash that doesn't precede a double quote or
-backslash is unspecified in standard Scheme, but MIT/GNU Scheme specifies
-the effect for three other characters: @code{\t}, @code{\n}, and
-@code{\f}. These escape sequences are respectively translated into the
-following characters: @code{#\tab}, @code{#\newline}, and @code{#\page}.
-Finally, a backslash followed by exactly three octal digits is
-translated into the character whose @acronym{ISO-8859-1} code is those
-digits.
-
-If a string literal is continued from one line to another, the string
-will contain the newline character (@code{#\newline}) at the line break.
-Standard Scheme does not specify what appears in a string literal at a
-line break.
-
@cindex length, of string (defn)
@cindex index, of string (defn)
@cindex valid index, of string (defn)
@cindex string length (defn)
@cindex string index (defn)
-The @dfn{length} of a string is the number of characters that it
-contains. This number is an exact non-negative integer that is
-established when the string is created
-(but @pxref{Variable-Length Strings}).
-Each character in a string has an @dfn{index}, which is a
-number that indicates the character's position in the string. The index
-of the first (leftmost) character in a string is 0, and the index of the
-last character is one less than the length of the string. The
-@dfn{valid indexes} of a string are the exact non-negative integers less
-than the length of the string.
-
-@cindex substring (defn)
-@cindex start, of substring (defn)
-@cindex end, of substring (defn)
-A number of the string procedures operate on substrings. A
-@dfn{substring} is a segment of a @var{string}, which is specified by
-two integers @var{start} and @var{end} satisfying these relationships:
-
-@example
-0 <= @var{start} <= @var{end} <= (string-length @var{string})
-@end example
-
-@noindent
-@var{Start} is the index of the first character in the substring, and
-@var{end} is one greater than the index of the last character in the
-substring. Thus if @var{start} and @var{end} are equal, they refer to
-an empty substring, and if @var{start} is zero and @var{end} is the
-length of @var{string}, they refer to all of @var{string}.
+The @dfn{length} of a string is the number of characters that it
+contains. This number is an exact, non-negative integer that is fixed
+when the string is created. The @dfn{valid indexes} of a string are
+the exact non-negative integers less than the length of the string.
+The first character of a string has index 0, the second has index 1,
+and so on.
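+
+For instance, the valid indexes of a five-character string are 0
+through 4:
+
+@example
+@group
+(string-length "Hello")      @result{}  5
+(string-ref "Hello" 0)       @result{}  #\H
+(string-ref "Hello" 4)       @result{}  #\o
+@end group
+@end example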
@cindex case sensitivity, of string operations
@cindex -ci, in string procedure name
Some of the procedures that operate on strings ignore the difference
-between uppercase and lowercase. The versions that ignore case include
-@samp{-ci} (for ``case insensitive'') in their names.
-
-@menu
-* Construction of Strings::
-* Selecting String Components::
-* Comparison of Strings::
-* Alphabetic Case in Strings::
-* Cutting and Pasting Strings::
-* Searching Strings::
-* Matching Strings::
-* Regular Expressions::
-* Modification of Strings::
-* Variable-Length Strings::
-* Byte Vectors::
-@end menu
+between upper and lower case. The names of the versions that ignore
+case end with @samp{-ci} (for ``case insensitive'').
-@node Construction of Strings, Selecting String Components, Strings, Strings
-@section Construction of Strings
-@cindex construction, of string
+Implementations may forbid certain characters from appearing in
+strings. However, with the exception of @code{#\null}, ASCII
+characters must not be forbidden. For example, an implementation
+might support the entire Unicode repertoire, but only allow characters
+U+0001 to U+00FF (the Latin-1 repertoire without @code{#\null}) in
+strings.
-@deffn {procedure} make-string k [char]
-Returns a newly allocated string of length @var{k}. If you specify
-@var{char}, all elements of the string are initialized to @var{char},
-otherwise the contents of the string are unspecified. @var{Char} must
-satisfy the predicate @code{char-ascii?}.
-
-@example
-(make-string 10 #\x) @result{} "xxxxxxxxxx"
-@end example
-@end deffn
-
-@deffn procedure string char @dots{}
-Returns a newly allocated string consisting of the specified characters.
-The arguments must all satisfy @code{char-ascii?}.
-
-@example
-@group
-(string #\a) @result{} "a"
-(string #\a #\b #\c) @result{} "abc"
-(string #\a #\space #\b #\space #\c) @result{} "a b c"
-(string) @result{} ""
-@end group
-@end example
-@end deffn
+Implementation note: MIT/GNU Scheme allows any ``bitless'' character
+to be stored in a string. In effect this means any character with a
+Unicode code point, including surrogates.
-@deffn procedure list->string char-list
-@cindex list, converting to string
-@findex string->list
-@var{Char-list} must be a list of @acronym{ISO-8859-1} characters.
-@code{list->string} returns a newly allocated string formed from the
-elements of @var{char-list}. This is equivalent to @code{(apply string
-@var{char-list})}. The inverse of this operation is
-@code{string->list}.
+It is an error to pass such a forbidden character to
+@code{make-string}, @code{string}, @code{string-set!}, or
+@code{string-fill!}, as part of the list passed to
+@code{list->string}, or as part of the vector passed to
+@code{vector->string}, or in UTF-8 encoded form within a bytevector
+passed to @code{utf8->string}. It is also an error for a procedure
+passed to @code{string-map} to return a forbidden character, or for
+@code{read-string} to attempt to read one.
-@example
-@group
-(list->string '(#\a #\b)) @result{} "ab"
-(string->list "Hello") @result{} (#\H #\e #\l #\l #\o)
-@end group
-@end example
+@deffn {standard procedure} string? obj
+Returns @code{#t} if @var{obj} is a string, otherwise returns @code{#f}.
@end deffn
-@deffn {procedure} string-copy string
-@cindex copying, of string
-Returns a newly allocated copy of @var{string}.
-
-Note regarding variable-length strings: the maximum length of the result
-depends only on the length of @var{string}, not its maximum length. If
-you wish to copy a string and preserve its maximum length, do the
-following:
-
-@example
-@group
-(define (string-copy-preserving-max-length string)
- (let ((length))
- (dynamic-wind
- (lambda ()
- (set! length (string-length string))
- (set-string-length! string
- (string-maximum-length string)))
- (lambda ()
- (string-copy string))
- (lambda ()
- (set-string-length! string length)))))
-@end group
-@end example
+@deffn {standard procedure} make-string k [char]
+The @code{make-string} procedure returns a newly allocated string of
+length @var{k}. If @var{char} is given, then all the characters of the string
+are initialized to @var{char}, otherwise the contents of the
+string are unspecified.
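+
+For example (when @var{char} is omitted, only the length of the result
+can be relied upon):
+
+@example
+@group
+(make-string 3 #\x)                  @result{}  "xxx"
+(string-length (make-string 5))      @result{}  5
+@end group
+@end example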
@end deffn
-@node Selecting String Components, Comparison of Strings, Construction of Strings, Strings
-@section Selecting String Components
-@cindex selection, of string component
-@cindex component selection, of string
-
-@deffn procedure string? object
-@cindex type predicate, for string
-Returns @code{#t} if @var{object} is a string; otherwise returns
-@code{#f}.
-
-@example
-@group
-(string? "Hi") @result{} #t
-(string? 'Hi) @result{} #f
-@end group
-@end example
+@deffn {standard procedure} string char @dots{}
+Returns a newly allocated string composed of the arguments. It is
+analogous to @code{list}.
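+
+For example, with character arguments:
+
+@example
+@group
+(string #\a #\b #\c)         @result{}  "abc"
+(string)                     @result{}  ""
+@end group
+@end example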
@end deffn
-@deffn procedure string-length string
-Returns the length of @var{string} as an exact non-negative integer.
-
-@example
-@group
-(string-length "") @result{} 0
-(string-length "The length") @result{} 10
-@end group
-@end example
+@deffn {standard procedure} string-length string
+Returns the number of characters in the given @var{string}.
@end deffn
-@deffn procedure string-null? string
-@cindex empty string, predicate for
-@cindex null string, predicate for
-Returns @code{#t} if @var{string} has zero length; otherwise returns
-@code{#f}.
+@deffn {standard procedure} string-ref string k
+It is an error if @var{k} is not a valid index of @var{string}.
-@example
-@group
-(string-null? "") @result{} #t
-(string-null? "Hi") @result{} #f
-@end group
-@end example
+The @code{string-ref} procedure returns character @var{k} of
+@var{string} using zero-origin indexing. There is no requirement for
+this procedure to execute in constant time.
@end deffn
-@deffn procedure string-ref string k
-Returns character @var{k} of @var{string}. @var{K} must be a valid index
-of @var{string}.
-
-@example
-@group
-(string-ref "Hello" 1) @result{} #\e
-(string-ref "Hello" 5) @error{} 5 not in correct range
-@end group
-@end example
-@end deffn
+@deffn {standard procedure} string-set! string k char
+It is an error if @var{k} is not a valid index of @var{string}.
-@deffn {procedure} string-set! string k char
-Stores @var{char} in element @var{k} of @var{string} and returns an
-unspecified value. @var{K} must be a valid index of @var{string}, and
-@var{char} must satisfy the predicate @code{char-ascii?}.
+The @code{string-set!} procedure stores @var{char} in element @var{k} of @var{string}.
+There is no requirement for this procedure to execute in constant time.
@example
@group
-(define str "Dog") @result{} @r{unspecified}
-(string-set! str 0 #\L) @result{} @r{unspecified}
-str @result{} "Log"
-(string-set! str 3 #\t) @error{} 3 not in correct range
+(define (f) (make-string 3 #\*))
+(define (g) "***")
+(string-set! (f) 0 #\?) @result{} @r{@i{unspecified}}
+(string-set! (g) 0 #\?) @result{} @r{@i{error}}
+(string-set! (symbol->string 'immutable) 0 #\?) @result{} @r{@i{error}}
@end group
@end example
@end deffn
-@need 1000
-@node Comparison of Strings, Alphabetic Case in Strings, Selecting String Components, Strings
-@section Comparison of Strings
-@cindex ordering, of strings
-@cindex comparison, of strings
-
-@deffn procedure string=? string1 string2
-@deffnx procedure substring=? string1 start end string2 start end
-@deffnx {procedure} string-ci=? string1 string2
-@deffnx procedure substring-ci=? string1 start end string2 start end
-@cindex equivalence predicate, for strings
-Returns @code{#t} if the two strings (substrings) are the same length
-and contain the same characters in the same (relative) positions;
-otherwise returns @code{#f}. @code{string-ci=?} and
-@code{substring-ci=?} don't distinguish uppercase and lowercase letters,
-but @code{string=?} and @code{substring=?} do.
-
-@example
-@group
-(string=? "PIE" "PIE") @result{} #t
-(string=? "PIE" "pie") @result{} #f
-(string-ci=? "PIE" "pie") @result{} #t
-(substring=? "Alamo" 1 3 "cola" 2 4) @result{} #t @r{; compares "la"}
-@end group
-@end example
+@deffn {standard procedure} string=? string1 string2 string @dots{}
+Returns @code{#t} if all the strings are the same length and contain
+exactly the same characters in the same positions, otherwise returns
+@code{#f}.
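+
+For example, including a call with more than two arguments:
+
+@example
+@group
+(string=? "PIE" "PIE")       @result{}  #t
+(string=? "PIE" "pie")       @result{}  #f
+(string=? "a" "a" "a")       @result{}  #t
+(string=? "a" "a" "ab")      @result{}  #f
+@end group
+@end example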
@end deffn
-@deffn procedure string<? string1 string2
-@deffnx procedure substring<? string1 start1 end1 string2 start2 end2
-@deffnx procedure string>? string1 string2
-@deffnx procedure string<=? string1 string2
-@deffnx procedure string>=? string1 string2
-@deffnx {procedure} string-ci<? string1 string2
-@deffnx procedure substring-ci<? string1 start1 end1 string2 start2 end2
-@deffnx {procedure} string-ci>? string1 string2
-@deffnx {procedure} string-ci<=? string1 string2
-@deffnx {procedure} string-ci>=? string1 string2
-These procedures compare strings (substrings) according to the order of
-the characters they contain (also @pxref{Characters}).
-The arguments are compared using a lexicographic (or dictionary) order.
-If two strings differ in length but are the same up to the length of the
-shorter string, the shorter string is considered to be less than the
-longer string.
-
-@example
-@group
-(string<? "cat" "dog") @result{} #t
-(string<? "cat" "DOG") @result{} #f
-(string-ci<? "cat" "DOG") @result{} #t
-(string>? "catkin" "cat") @result{} #t @r{; shorter is lesser}
-@end group
-@end example
+@deffn {standard procedure} string-ci=? string1 string2 string @dots{}
+Returns @code{#t} if, after case-folding, all the strings are the same
+length and contain the same characters in the same positions,
+otherwise returns @code{#f}. Specifically, these procedures behave as
+if @code{string-foldcase} were applied to their arguments before
+comparing them.
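+
+A short sketch of the difference from @code{string=?}, together with
+the equivalent formulation in terms of @code{string-foldcase}:
+
+@example
+@group
+(string-ci=? "PIE" "pie")                    @result{}  #t
+(string=? "PIE" "pie")                       @result{}  #f
+(string=? (string-foldcase "PIE")
+          (string-foldcase "pie"))           @result{}  #t
+@end group
+@end example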
+@end deffn
+
+@deffn {standard procedure} string<? string1 string2 string @dots{}
+@deffnx {standard procedure} string-ci<? string1 string2 string @dots{}
+@deffnx {standard procedure} string>? string1 string2 string @dots{}
+@deffnx {standard procedure} string-ci>? string1 string2 string @dots{}
+@deffnx {standard procedure} string<=? string1 string2 string @dots{}
+@deffnx {standard procedure} string-ci<=? string1 string2 string @dots{}
+@deffnx {standard procedure} string>=? string1 string2 string @dots{}
+@deffnx {standard procedure} string-ci>=? string1 string2 string @dots{}
+These procedures return @code{#t} if their arguments are (respectively):
+monotonically increasing, monotonically decreasing,
+monotonically non-decreasing, or monotonically non-increasing.
+
+These predicates are required to be transitive.
+
+These procedures compare strings in an implementation-defined way.
+One approach is to make them the lexicographic extensions to strings
+of the corresponding orderings on characters. In that case,
+@code{string<?} would be the lexicographic ordering on strings
+induced by the ordering @code{char<?} on characters, and if the two
+strings differ in length but are the same up to the length of the
+shorter string, the shorter string would be considered to be
+lexicographically less than the longer string. However, it is also
+permitted to use the natural ordering imposed by the implementation's
+internal representation of strings, or a more complex locale-specific
+ordering.
+
+In all cases, a pair of strings must satisfy exactly one of
+@code{string<?}, @code{string=?}, and @code{string>?}, and must satisfy
+@code{string<=?} if and only if they do not satisfy @code{string>?} and
+@code{string>=?} if and only if they do not satisfy @code{string<?}.
+
+The @samp{-ci} procedures behave as if they applied
+@code{string-foldcase} to their arguments before invoking the
+corresponding procedures without @samp{-ci}.
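+
+The following sketch assumes the lexicographic approach described
+above, with characters ordered by @code{char<?}; an implementation
+that uses a locale-specific ordering could give different results:
+
+@example
+@group
+(string<? "cat" "dog")       @result{}  #t
+(string<? "cat" "catkin")    @result{}  #t
+(string<? "a" "b" "c")       @result{}  #t
+(string<? "a" "c" "b")       @result{}  #f
+(string<? "cat" "DOG")       @result{}  #f
+(string-ci<? "cat" "DOG")    @result{}  #t
+@end group
+@end example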
@end deffn
@deffn procedure string-compare string1 string2 if-eq if-lt if-gt
@end example
@end deffn
-@deffn procedure string-hash string
-@deffnx procedure string-hash-mod string k
-@cindex hashing, of string
-@findex string=?
-@findex =
-@code{string-hash} returns an exact non-negative integer that can be used
-for storing the specified @var{string} in a hash table. Equal strings
-(in the sense of @code{string=?}) return equal (@code{=}) hash codes,
-and non-equal but similar strings are usually mapped to distinct hash
-codes.
+@deffn {standard procedure} string-upcase string
+@deffnx {standard procedure} string-downcase string
+@deffnx {standard procedure} string-foldcase string
+These procedures apply the Unicode full string uppercasing,
+lowercasing, and case-folding algorithms to their arguments and return
+the result. In certain cases, the result differs in length from the
+argument. If the result is equal to the argument in the sense of
+@code{string=?}, the argument may be returned. Note that
+language-sensitive mappings and foldings are not used.
-@code{string-hash-mod} is like @code{string-hash}, except that it limits
-the result to a particular range based on the exact non-negative integer
-@var{k}. The following are equivalent:
+The Unicode Standard prescribes special treatment of the Greek letter
+@math{\Sigma}, whose normal lower-case form is @math{\sigma} but which
+becomes @math{\varsigma} at the end of a word. See
+@uref{http://www.unicode.org/reports/tr44/, UAX #44} (part of the
+Unicode Standard) for details. However, implementations of
+@code{string-downcase} are not required to provide this behavior, and may
+choose to change @math{\Sigma} to @math{\sigma} in all cases.
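+
+For plain ASCII text, where the three algorithms involve no special
+cases, the behavior can be sketched as:
+
+@example
+@group
+(string-upcase "Hello, World")       @result{}  "HELLO, WORLD"
+(string-downcase "Hello, World")     @result{}  "hello, world"
+(string-foldcase "Hello, World")     @result{}  "hello, world"
+@end group
+@end example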
+@end deffn
+
+@deffn procedure string-upper-case? string
+@deffnx procedure string-lower-case? string
+These procedures return @code{#t} if all the letters in the string are
+upper case or lower case, respectively; otherwise they return @code{#f}.
+If the string contains no letters, the procedures return @code{#f}.
@example
@group
-(string-hash-mod @var{string} @var{k})
-(modulo (string-hash @var{string}) @var{k})
+(map string-upper-case? '("" "A" "art" "Art" "ART"))
+ @result{} (#f #t #f #f #t)
@end group
@end example
@end deffn
-@node Alphabetic Case in Strings, Cutting and Pasting Strings, Comparison of Strings, Strings
-@section Alphabetic Case in Strings
-@cindex alphabetic case, of string
-@cindex case, of string
-@cindex capitalization, of string
-@cindex uppercase, in string
-@cindex lowercase, in string
-
-@deffn procedure string-capitalized? string
-@deffnx procedure substring-capitalized? string start end
-These procedures return @code{#t} if the first word in the string
-(substring) is capitalized, and any subsequent words are either lower
-case or capitalized. Otherwise, they return @code{#f}. A word is
-defined as a non-null contiguous sequence of alphabetic characters,
-delimited by non-alphabetic characters or the limits of the string
-(substring). A word is capitalized if its first letter is upper case
-and all its remaining letters are lower case.
+@deffn {standard procedure} substring string start end
+The @code{substring} procedure returns a newly allocated string formed
+from the characters of @var{string} beginning with index @var{start}
+(inclusive) and ending with index @var{end} (exclusive).
-@example
-@group
-(map string-capitalized? '("" "A" "art" "Art" "ART"))
- @result{} (#f #t #f #t #f)
-@end group
-@end example
+This is equivalent to calling @code{string-copy} with the same
+arguments, but is provided for backward compatibility and stylistic
+flexibility.
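+
+For example, with the character at index @var{end} excluded from the
+result:
+
+@example
+@group
+(substring "arduous" 2 5)    @result{}  "duo"
+(substring "" 0 0)           @result{}  ""
+@end group
+@end example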
@end deffn
-@deffn procedure string-upper-case? string
-@deffnx procedure substring-upper-case? string start end
-@deffnx procedure string-lower-case? string
-@deffnx procedure substring-lower-case? string start end
-These procedures return @code{#t} if all the letters in the string
-(substring) are of the correct case, otherwise they return @code{#f}.
-The string (substring) must contain at least one letter or the
-procedures return @code{#f}.
+@deffn {standard procedure} string-append string @dots{}
+@deffnx procedure string-append* strings
+Returns a newly allocated string whose characters are the
+concatenation of the characters in the given strings.
-@example
-@group
-(map string-upper-case? '("" "A" "art" "Art" "ART"))
- @result{} (#f #t #f #f #t)
-@end group
-@end example
+The non-standard procedure @code{string-append*} is identical to
+@code{string-append} but takes a single argument that's a list of
+strings, rather than multiple string arguments.
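+
+A brief sketch of both procedures; the @code{string-append*} call
+follows the single-list usage described above:
+
+@example
+@group
+(string-append "*" "ace" "*")            @result{}  "*ace*"
+(string-append)                          @result{}  ""
+(string-append* (list "foo" "bar"))      @result{}  "foobar"
+@end group
+@end example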
@end deffn
-@deffn procedure string-capitalize string
-@deffnx procedure string-capitalize! string
-@deffnx procedure substring-capitalize! string start end
-@code{string-capitalize} returns a newly allocated copy of @var{string}
-in which the first alphabetic character is uppercase and the remaining
-alphabetic characters are lowercase. For example, @code{"abcDEF"}
-becomes @code{"Abcdef"}. @code{string-capitalize!} is the destructive
-version of @code{string-capitalize}: it alters @var{string} and returns
-an unspecified value. @code{substring-capitalize!} destructively
-capitalizes the specified part of @var{string}.
-@end deffn
-
-@deffn procedure string-downcase string
-@deffnx procedure string-downcase! string
-@deffnx procedure substring-downcase! string start end
-@code{string-downcase} returns a newly allocated copy of @var{string} in
-which all uppercase letters are changed to lowercase.
-@code{string-downcase!} is the destructive version of
-@code{string-downcase}: it alters @var{string} and returns an
-unspecified value. @code{substring-downcase!} destructively changes the
-case of the specified part of @var{string}.
+@deffn {standard procedure} string->list string [start [end]]
+@deffnx {standard procedure} list->string list
+It is an error if any element of @var{list} is not a character.
-@example
-@group
-(define str "ABCDEFG") @result{} @r{unspecified}
-(substring-downcase! str 3 5) @result{} @r{unspecified}
-str @result{} "ABCdeFG"
-@end group
-@end example
+The @code{string->list} procedure returns a newly allocated list of
+the characters of @var{string} between @var{start} and @var{end}.
+@code{list->string} returns a newly allocated string formed from the
+elements in the list @var{list}. In both procedures, order is
+preserved. @code{string->list} and @code{list->string} are inverses
+so far as @code{equal?} is concerned.
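+
+For example, with @var{start} and @var{end} selecting the characters
+at indexes 1 and 2 in the second call:
+
+@example
+@group
+(string->list "Hello")           @result{}  (#\H #\e #\l #\l #\o)
+(string->list "Hello" 1 3)       @result{}  (#\e #\l)
+(list->string '(#\a #\b))        @result{}  "ab"
+(list->string
+ (string->list "Hello"))         @result{}  "Hello"
+@end group
+@end example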
@end deffn
-@deffn procedure string-upcase string
-@deffnx procedure string-upcase! string
-@deffnx procedure substring-upcase! string start end
-@code{string-upcase} returns a newly allocated copy of @var{string} in
-which all lowercase letters are changed to uppercase.
-@code{string-upcase!} is the destructive version of
-@code{string-upcase}: it alters @var{string} and returns an unspecified
-value. @code{substring-upcase!} destructively changes the case of the
-specified part of @var{string}.
+@deffn {standard procedure} string-copy string [start [end]]
+Returns a newly allocated copy of the part of the given @var{string}
+between @var{start} and @var{end}.
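+
+For example, where an omitted @var{end} defaults to the length of
+@var{string}:
+
+@example
+@group
+(string-copy "arduous")          @result{}  "arduous"
+(string-copy "arduous" 2)        @result{}  "duous"
+(string-copy "arduous" 2 5)      @result{}  "duo"
+@end group
+@end example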
@end deffn
-@node Cutting and Pasting Strings, Searching Strings, Alphabetic Case in Strings, Strings
-@section Cutting and Pasting Strings
-@cindex cutting, of string
-@cindex pasting, of strings
+@deffn {standard procedure} string-copy! to at from [start [end]]
+It is an error if @var{at} is less than zero or greater than the
+length of @var{to}. It is also an error if @code{(- (string-length
+@var{to}) @var{at})} is less than @code{(- @var{end} @var{start})}.
-@deffn {procedure} string-append string @dots{}
-@cindex appending, of strings
-Returns a newly allocated string made from the concatenation of the given
-strings. With no arguments, @code{string-append} returns the empty
-string (@code{""}).
+Copies the characters of string @var{from} between @var{start} and
+@var{end} to string @var{to}, starting at @var{at}. The order in
+which characters are copied is unspecified, except that if the source
+and destination overlap, copying takes place as if the source is first
+copied into a temporary string and then into the destination. This
+can be achieved without allocating storage by making sure to copy in
+the correct direction in such circumstances.
@example
@group
-(string-append) @result{} ""
-(string-append "*" "ace" "*") @result{} "*ace*"
-(string-append "" "" "") @result{} ""
-(eq? str (string-append str)) @result{} #f @r{; newly allocated}
+(define a "12345")
+(define b (string-copy "abcde"))
+(string-copy! b 1 a 0 2)
+b @result{} "a12de"
@end group
@end example
@end deffn
-@deffn procedure substring string start end
-Returns a newly allocated string formed from the characters of
-@var{string} beginning with index @var{start} (inclusive) and ending
-with @var{end} (exclusive).
+@deffn {standard procedure} string-fill! string fill [start [end]]
+It is an error if @var{fill} is not a character.
+
+The @code{string-fill!} procedure stores @var{fill} in the elements of
+@var{string} between @var{start} and @var{end}.
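+
+A small sketch, using a string built with @code{make-string} so that
+it is mutable:
+
+@example
+@group
+(define s (make-string 5 #\a))
+(string-fill! s #\z 1 3)
+s                                @result{}  "azzaa"
+(string-fill! s #\q)
+s                                @result{}  "qqqqq"
+@end group
+@end example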
+@end deffn
+
+@deffn procedure string-slice string [start [end]]
+@cindex slice, of string
+@cindex string slice
+Returns a @dfn{slice} of @var{string}, restricted to the range of
+characters specified by @var{start} and @var{end}.
+
+A slice is a kind of string that provides a view into another string.
+The slice behaves like any other string, but changes to a slice are
+reflected in the original string and vice versa.
@example
@group
-(substring "" 0 0) @result{} ""
-(substring "arduous" 2 5) @result{} "duo"
-(substring "arduous" 2 8) @error{} 8 not in correct range
+(define foo (string #\a #\b #\c #\d #\e))
+foo @result{} "abcde"
+
+(define bar (string-slice foo 1 4))
+bar @result{} "bcd"
-(define (string-copy s)
- (substring s 0 (string-length s)))
+(string-set! foo 2 #\z)
+foo @result{} "abzde"
+bar @result{} "bzd"
+
+(string-set! bar 1 #\y)
+bar @result{} "byd"
+foo @result{} "abyde"
@end group
@end example
@end deffn
-@deffn procedure string-head string end
-Returns a newly allocated copy of the initial substring of @var{string},
-up to but excluding @var{end}. It could have been defined by:
+@ignore
+
+@deffn string object @dots{}
+@deffn string* objects
+@deffn string->vector string [start [end]]
+@deffn vector->string vector [start [end]]
+
+@deffn string-joiner infix [prefix [suffix]]
+@deffn string-joiner* infix [prefix [suffix]]
+@deffn string-splitter delimiter [allow-runs?]
+
+@deffn string-any proc string1 string @dots{}
+@deffn string-count proc string1 string @dots{}
+@deffn string-every proc string1 string @dots{}
+@deffn string-find-first-index proc string1 string @dots{}
+@deffn string-find-last-index proc string1 string @dots{}
+@deffn string-for-each proc string1 string @dots{}
+@deffn string-map proc string1 string @dots{}
+
+@end ignore
+
+@deffn procedure string-null? string
+@cindex empty string, predicate for
+@cindex null string, predicate for
+Returns @code{#t} if @var{string} has zero length; otherwise returns
+@code{#f}.
@example
@group
-(define (string-head string end)
- (substring string 0 end))
+(string-null? "") @result{} #t
+(string-null? "Hi") @result{} #f
@end group
@end example
@end deffn
-@deffn procedure string-tail string start
-Returns a newly allocated copy of the final substring of @var{string},
-starting at index @var{start} and going to the end of @var{string}. It
-could have been defined by:
+@deffn procedure string-hash string [modulus]
+@cindex hashing, of string
+@findex string=?
+@findex =
+@code{string-hash} returns an exact non-negative integer that can be used
+for storing the specified @var{string} in a hash table. Equal strings
+(in the sense of @code{string=?}) return equal (@code{=}) hash codes,
+and non-equal but similar strings are usually mapped to distinct hash
+codes.
-@example
-@group
-(define (string-tail string start)
- (substring string start (string-length string)))
+If the optional argument @var{modulus} is specified, it must be an
+exact positive integer, and the result of @code{string-hash} is
+restricted to be less than that value. This is equivalent to calling
+@code{modulo} on the result, but may be faster.
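+
+Given the equivalence stated above, the two expressions below are
+expected to produce the same value; the string and the modulus 10 are
+arbitrary choices made for illustration:
+
+@example
+@group
+(string-hash "pirate" 10)
+(modulo (string-hash "pirate") 10)
+@end group
+@end example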
+@end deffn
-(string-tail "uncommon" 2) @result{} "common"
-@end group
-@end example
+@deffn procedure string-head string end
+Equivalent to @code{(string-copy @var{string} 0 @var{end})}.
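+
+@example
+@group
+(string-head "uncommon" 2)       @result{}  "un"
+@end group
+@end example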
+@end deffn
+
+@deffn procedure string-tail string start
+Equivalent to @code{(string-copy @var{string} @var{start})}.
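+
+@example
+@group
+(string-tail "uncommon" 2)       @result{}  "common"
+@end group
+@end example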
@end deffn
@deffn procedure string-pad-left string k [char]
@end example
@end deffn
+@deffn procedure string-replace string char1 char2
+Returns a newly allocated string containing the same characters as
+@var{string} except that all instances of @var{char1} have been
+replaced by @var{char2}.
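+
+For example, following the description above (the argument string is
+left unchanged, since the result is newly allocated):
+
+@example
+@group
+(string-replace "banana" #\a #\o)    @result{}  "bonono"
+(string-replace "banana" #\x #\o)    @result{}  "banana"
+@end group
+@end example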
+@end deffn
+
+@deffn procedure reverse-string string
+Returns a newly allocated string with the same characters as
+@var{string} but in the reverse order.
+
+@example
+@group
+(reverse-string "foo bar baz") @result{} "zab rab oof"
+(reverse-string (string-slice "foo bar baz" 4 7)) @result{} "rab"
+@end group
+@end example
+@end deffn
+
@node Searching Strings, Matching Strings, Cutting and Pasting Strings, Strings
@section Searching Strings
@cindex searching, of string
algorithm is used. For longer patterns, the Boyer-Moore string-search
algorithm is used.
-@deffn procedure string-search-forward pattern string
-@deffnx procedure substring-search-forward pattern string start end
+@deffn procedure string-search-forward pattern string [start [end]]
@var{Pattern} must be a string. Searches @var{string} for the leftmost
occurrence of the substring @var{pattern}. If successful, the index of
the first character of the matched substring is returned; otherwise,
@code{#f} is returned.
-@code{substring-search-forward} limits its search to the specified
-substring of @var{string}; @code{string-search-forward} searches all of
-@var{string}.
-
@example
@group
(string-search-forward "rat" "pirate")
@result{} 2
(string-search-forward "rat" "pirate rating")
@result{} 2
-(substring-search-forward "rat" "pirate rating" 4 13)
+(string-search-forward "rat" "pirate rating" 4 13)
@result{} 7
-(substring-search-forward "rat" "pirate rating" 9 13)
+(string-search-forward "rat" "pirate rating" 9 13)
@result{} #f
@end group
@end example
@end deffn
-@deffn procedure string-search-backward pattern string
-@deffnx procedure substring-search-backward pattern string start end
+@deffn procedure string-search-backward pattern string [start [end]]
@var{Pattern} must be a string. Searches @var{string} for the rightmost
occurrence of the substring @var{pattern}. If successful, the index to
the right of the last character of the matched substring is returned;
otherwise, @code{#f} is returned.
-@code{substring-search-backward} limits its search to the specified
-substring of @var{string}; @code{string-search-backward} searches all of
-@var{string}.
-
@example
@group
(string-search-backward "rat" "pirate")
@result{} 5
(string-search-backward "rat" "pirate rating")
@result{} 10
-(substring-search-backward "rat" "pirate rating" 1 8)
+(string-search-backward "rat" "pirate rating" 1 8)
@result{} 5
-(substring-search-backward "rat" "pirate rating" 9 13)
+(string-search-backward "rat" "pirate rating" 9 13)
@result{} #f
@end group
@end example
@end deffn
-@deffn procedure string-search-all pattern string
-@deffnx procedure substring-search-all pattern string start end
+@deffn procedure string-search-all pattern string [start [end]]
@var{Pattern} must be a string. Searches @var{string} to find all
occurrences of the substring @var{pattern}. Returns a list of the
occurrences; each element of the list is an index pointing to the first
character of an occurrence.
-@code{substring-search-all} limits its search to the specified substring
-of @var{string}; @code{string-search-all} searches all of @var{string}.
-
@example
@group
(string-search-all "rat" "pirate")
@result{} (2)
(string-search-all "rat" "pirate rating")
@result{} (2 7)
-(substring-search-all "rat" "pirate rating" 4 13)
+(string-search-all "rat" "pirate rating" 4 13)
@result{} (7)
-(substring-search-all "rat" "pirate rating" 9 13)
+(string-search-all "rat" "pirate rating" 9 13)
@result{} ()
@end group
@end example
@cindex matching, of strings
@deffn procedure string-match-forward string1 string2
-@deffnx procedure substring-match-forward string1 start end string2 start end
@deffnx procedure string-match-forward-ci string1 string2
-@deffnx procedure substring-match-forward-ci string1 start end string2 start end
-Compares the two strings (substrings), starting from the beginning, and
-returns the number of characters that are the same. If the two strings
-(substrings) start differently, returns 0. The @code{-ci} procedures
-don't distinguish uppercase and lowercase letters.
+Compares the two strings, starting from the beginning, and returns the
+number of leading characters that are the same. If the two strings
+start differently, returns 0. The @code{-ci} procedure doesn't
+distinguish uppercase and lowercase letters.
@example
@group
@end deffn
@deffn procedure string-match-backward string1 string2
-@deffnx procedure substring-match-backward string1 start end string2 start end
@deffnx procedure string-match-backward-ci string1 string2
-@deffnx procedure substring-match-backward-ci string1 start end string2 start end
-Compares the two strings (substrings), starting from the end and
-matching toward the front, returning the number of characters that are
-the same. If the two strings (substrings) end differently, returns 0.
-The @code{-ci} procedures don't distinguish uppercase and lowercase
-letters.
+Compares the two strings, starting from the end and matching toward
+the front, and returns the number of trailing characters that are the
+same. If the two strings end differently, returns 0. The @code{-ci}
+procedure doesn't distinguish uppercase and lowercase letters.
@example
@group
@end deffn
@deffn procedure string-prefix? string1 string2
-@deffnx procedure substring-prefix? string1 start1 end1 string2 start2 end2
@deffnx procedure string-prefix-ci? string1 string2
-@deffnx procedure substring-prefix-ci? string1 start1 end1 string2 start2 end2
@cindex prefix, of string
-These procedures return @code{#t} if the first string (substring) forms
-the prefix of the second; otherwise returns @code{#f}. The @code{-ci}
-procedures don't distinguish uppercase and lowercase letters.
+These procedures return @code{#t} if the first string forms the prefix
+of the second; otherwise they return @code{#f}. The @code{-ci}
+procedure doesn't distinguish uppercase and lowercase letters.
@example
@group
@end deffn
@deffn procedure string-suffix? string1 string2
-@deffnx procedure substring-suffix? string1 start1 end1 string2 start2 end2
@deffnx procedure string-suffix-ci? string1 string2
-@deffnx procedure substring-suffix-ci? string1 start1 end1 string2 start2 end2
@cindex suffix, of string
-These procedures return @code{#t} if the first string (substring) forms
-the suffix of the second; otherwise returns @code{#f}. The @code{-ci}
-procedures don't distinguish uppercase and lowercase letters.
+These procedures return @code{#t} if the first string forms the suffix
+of the second; otherwise they return @code{#f}. The @code{-ci}
+procedure doesn't distinguish uppercase and lowercase letters.
@example
@group
but is insensitive to character case. This has no equivalent in
standard regular-expression notation.
@end deffn
-
-@node Modification of Strings, Variable-Length Strings, Regular Expressions, Strings
-@section Modification of Strings
-@cindex modification, of string
-@cindex replacement, of string component
-@cindex filling, of string
-@cindex moving, of string elements
-
-@deffn procedure string-replace string char1 char2
-@deffnx procedure substring-replace string start end char1 char2
-@deffnx procedure string-replace! string char1 char2
-@deffnx procedure substring-replace! string start end char1 char2
-These procedures replace all occurrences of @var{char1} with @var{char2}
-in the original string (substring). @code{string-replace} and
-@code{substring-replace} return a newly allocated string containing the
-result. @code{string-replace!} and @code{substring-replace!}
-destructively modify @var{string} and return an unspecified value.
-
-@example
-@group
-(define str "a few words") @result{} @r{unspecified}
-(string-replace str #\space #\-) @result{} "a-few-words"
-(substring-replace str 2 9 #\space #\-) @result{} "a few-words"
-str @result{} "a few words"
-(string-replace! str #\space #\-) @result{} @r{unspecified}
-str @result{} "a-few-words"
-@end group
-@end example
-@end deffn
-
-@deffn {procedure} string-fill! string char
-Stores @var{char} in every element of @var{string} and returns an
-unspecified value.
-@end deffn
-
-@deffn procedure substring-fill! string start end char
-Stores @var{char} in elements @var{start} (inclusive) to @var{end}
-(exclusive) of @var{string} and returns an unspecified value.
-
-@example
-@group
-(define s (make-string 10 #\space)) @result{} @r{unspecified}
-(substring-fill! s 2 8 #\*) @result{} @r{unspecified}
-s @result{} " ****** "
-@end group
-@end example
-@end deffn
-
-@deffn procedure substring-move-left! string1 start1 end1 string2 start2
-@deffnx procedure substring-move-right! string1 start1 end1 string2 start2
-@findex eqv?
-Copies the characters from @var{start1} to @var{end1} of @var{string1}
-into @var{string2} at the @var{start2}-th position. The characters are
-copied as follows (note that this is only important when @var{string1}
-and @var{string2} are @code{eqv?}):
-
-@table @code
-@item substring-move-left!
-The copy starts at the left end and moves toward the right (from smaller
-indices to larger). Thus if @var{string1} and @var{string2} are the
-same, this procedure moves the characters toward the left inside the
-string.
-
-@item substring-move-right!
-The copy starts at the right end and moves toward the left (from larger
-indices to smaller). Thus if @var{string1} and @var{string2} are the
-same, this procedure moves the characters toward the right inside the
-string.
-@end table
-
-The following example shows how these procedures can be used to build up
-a string (it would have been easier to use @code{string-append}):
-@example
-@group
-(define answer (make-string 9 #\*)) @result{} @r{unspecified}
-answer @result{} "*********"
-(substring-move-left! "start" 0 5 answer 0) @result{} @r{unspecified}
-answer @result{} "start****"
-(substring-move-left! "-end" 0 4 answer 5) @result{} @r{unspecified}
-answer @result{} "start-end"
-@end group
-@end example
-@end deffn
-
-@deffn procedure reverse-string string
-@deffnx procedure reverse-substring string start end
-@deffnx procedure reverse-string! string
-@deffnx procedure reverse-substring! string start end
-Reverses the order of the characters in the given string or substring.
-@code{reverse-string} and @code{reverse-substring} return newly
-allocated strings; @code{reverse-string!} and @code{reverse-substring!}
-modify their argument strings and return an unspecified value.
-
-@example
-@group
-(reverse-string "foo bar baz") @result{} "zab rab oof"
-(reverse-substring "foo bar baz" 4 7) @result{} "rab"
-(let ((foo "foo bar baz"))
- (reverse-string! foo)
- foo) @result{} "zab rab oof"
-(let ((foo "foo bar baz"))
- (reverse-substring! foo 4 7)
- foo) @result{} "foo rab baz"
-@end group
-@end example
-@end deffn
-
-@node Variable-Length Strings, Byte Vectors, Modification of Strings, Strings
-@section Variable-Length Strings
-
-@cindex length, of string
-@cindex maximum length, of string (defn)
-MIT/GNU Scheme allows the length of a string to be dynamically adjusted in a
-limited way. When a new string is allocated, by whatever method, it has
-a specific length. At the time of allocation, it is also given a
-@dfn{maximum length}, which is guaranteed to be at least as large as the
-string's length. (Sometimes the maximum length will be slightly larger
-than the length, but it is a bad idea to count on this. Programs should
-assume that the maximum length is the same as the length at the time of
-the string's allocation.) After the string is allocated, the operation
-@code{set-string-length!} can be used to alter the string's length to
-any value between 0 and the string's maximum length, inclusive.
-
-@deffn procedure string-maximum-length string
-Returns the maximum length of @var{string}. The following is
-guaranteed:
-
-@example
-@group
-(<= (string-length string)
- (string-maximum-length string)) @result{} #t
-@end group
-@end example
-@findex string-length
-
-The maximum length of a string never changes.
-@end deffn
-
-@deffn procedure set-string-length! string k
-Alters the length of @var{string} to be @var{k}, and returns an
-unspecified value. @var{K} must be less than or equal to the maximum
-length of @var{string}. @code{set-string-length!} does not change the
-maximum length of @var{string}.
-@end deffn
-
-@node Byte Vectors, , Variable-Length Strings, Strings
-@section Byte Vectors
-@cindex byte vector
-@cindex vector, byte
-
-@findex string-ref
-MIT/GNU Scheme implements strings as packed vectors of 8-bit
-@acronym{ISO-8859-1} bytes. Most of the string operations, such as
-@code{string-ref}, coerce these 8-bit codes into character objects.
-However, some lower-level operations are made available for use.
-
-@deffn procedure vector-8b-ref string k
-Returns character @var{k} of @var{string} as an @acronym{ISO-8859-1}
-code. @var{K} must be a valid index of @var{string}.
-
-@example
-@group
-(vector-8b-ref "abcde" 2) @result{} 99 @r{;c}
-@end group
-@end example
-@end deffn
-
-@deffn procedure vector-8b-set! string k code
-Stores @var{code} in element @var{k} of @var{string} and returns an
-unspecified value. @var{K} must be a valid index of @var{string}, and
-@var{code} must be a valid @acronym{ISO-8859-1} code.
-@end deffn
-
-@deffn procedure vector-8b-fill! string start end code
-Stores @var{code} in elements @var{start} (inclusive) to @var{end}
-(exclusive) of @var{string} and returns an unspecified value.
-@var{Code} must be a valid @acronym{ISO-8859-1} code.
-@end deffn
-
-@deffn procedure vector-8b-find-next-char string start end code
-@deffnx procedure vector-8b-find-next-char-ci string start end code
-Returns the index of the first occurrence of @var{code} in the given
-substring; returns @code{#f} if @var{code} does not appear. The index
-returned is relative to the entire string, not just the substring.
-@var{Code} must be a valid @acronym{ISO-8859-1} code.
-
-@code{vector-8b-find-next-char-ci} doesn't distinguish uppercase and
-lowercase letters.
-@end deffn
-
-@deffn procedure vector-8b-find-previous-char string start end code
-@deffnx procedure vector-8b-find-previous-char-ci string start end code
-Returns the index of the last occurrence of @var{code} in the given
-substring; returns @code{#f} if @var{code} does not appear. The index
-returned is relative to the entire string, not just the substring.
-@var{Code} must be a valid @acronym{ISO-8859-1} code.
-
-@code{vector-8b-find-previous-char-ci} doesn't distinguish uppercase and
-lowercase letters.
-@end deffn