From: Chris Hanson
Date: Sat, 6 May 2017 20:55:07 +0000 (-0700)
Subject: Clarify some details about the regsexp implementation.
X-Git-Tag: mit-scheme-pucked-9.2.12~14^2~76
X-Git-Url: https://birchwood-abbey.net/git?a=commitdiff_plain;h=7639be138daf0099e7291c2f8cb66507e69d1522;p=mit-scheme.git

Clarify some details about the regsexp implementation.
---

diff --git a/doc/ref-manual/strings.texi b/doc/ref-manual/strings.texi
index d09044a88..568bf91f0 100644
--- a/doc/ref-manual/strings.texi
+++ b/doc/ref-manual/strings.texi
@@ -1170,9 +1170,10 @@ s-expression syntax, which we call a @dfn{regular s-expression},
 abbreviated as @dfn{regsexp}.
 
 Previous releases of MIT/GNU Scheme provided a regular-expression
-mechanism nearly identical to that of GNU Emacs version 18. This
-mechanism still exists but is deprecated and will be removed in a
-future release.
+implementation nearly identical to that of GNU Emacs version 18. This
+implementation supported only 8-bit strings, which made it unsuitable
+for use with Unicode strings. This implementation still exists but is
+deprecated and will be removed in a future release.
 
 @menu
 * Regular S-Expressions::
@@ -1535,6 +1536,11 @@ match ends, and each @var{register} is a pair
 @code{(@var{key} . @var{contents})} where @var{key} is the register's
 name and @var{contents} is the contents of that register as a string.
 
+In order to get reliable results, the string arguments to these
+procedures must be in Unicode Normalization Form C. The string
+implementation keeps most strings in this form by default; in other
+cases the caller must convert the string using @code{string->nfc}.
+
 @deffn procedure regsexp-match-string crse string [start [end]]
 The @var{crse} argument must be a value returned by
 @code{compile-regsexp}. The @var{string} argument must satisfy