Unicode standard for the details and for information about why one
normalization form is preferable for a specific purpose.
+When doing search and match operations, it is recommended that the
+argument strings be in @acronym{NFC}. Without normalization, strings
+that should match may fail to match when one or more characters are
+encoded inconsistently, for example precomposed in one string and
+decomposed in the other.
+
@deffn procedure string-in-nfc? string
@deffnx procedure string-in-nfd? string
These procedures return @code{#t} if @var{string} is in Unicode
character or a substring, and matching two strings to one another.
@deffn procedure string-search-forward pattern string [start [end]]
-The arguments @var{pattern} and @var{string} must satisfy
-@code{string-in-nfc?}.
-
Searches @var{string} for the leftmost occurrence of the substring
@var{pattern}. If successful, the index of the first character of the
matched substring is returned; otherwise, @code{#f} is returned.
@end deffn
@deffn procedure string-search-backward pattern string [start [end]]
-The arguments @var{pattern} and @var{string} must satisfy
-@code{string-in-nfc?}.
-
Searches @var{string} for the rightmost occurrence of the substring
@var{pattern}. If successful, the index to the right of the last
character of the matched substring is returned; otherwise, @code{#f}
@end deffn
@deffn procedure string-search-all pattern string [start [end]]
-The arguments @var{pattern} and @var{string} must satisfy
-@code{string-in-nfc?}.
-
Searches @var{string} to find all occurrences of the substring
@var{pattern}. Returns a list of the occurrences; each element of the
list is an index pointing to the first character of an occurrence.
@deffn procedure string-find-first-index proc string string @dots{}
@deffnx procedure string-find-last-index proc string string @dots{}
-Each @var{string} must satisfy @code{string-in-nfc?}, and @var{proc}
-must accept as many arguments as there are @var{string}s.
+The @var{proc} argument must accept as many arguments as there are
+@var{string}s.
These procedures apply @var{proc} element-wise to the elements of the
@var{string}s and return the first or last index for which @var{proc}
@deffn procedure string-find-next-char string char [start [end]]
@deffnx procedure string-find-next-char-ci string char [start [end]]
@deffnx procedure string-find-next-char-in-set string char-set [start [end]]
-The argument @var{string} must satisfy @code{string-in-nfc?}.
-
These procedures search @var{string} for a matching character,
starting from @var{start} and moving forwards to @var{end}. If there
is a matching character, the procedures stop the search and return the
@deffn procedure string-find-previous-char string char [start [end]]
@deffnx procedure string-find-previous-char-ci string char [start [end]]
@deffnx procedure string-find-previous-char-in-set string char-set [start [end]]
-The argument @var{string} must satisfy @code{string-in-nfc?}.
-
These procedures search @var{string} for a matching character,
starting from @var{end} and moving backwards to @var{start}. If there
is a matching character, the procedures stop the search and return the
@end deffn
@deffn procedure string-match-forward string1 string2
-The arguments @var{string1} and @var{string2} must satisfy
-@code{string-in-nfc?}.
-
Compares the two strings, starting from the beginning, and returns the
number of characters that are the same. If the two strings start
differently, returns 0.
@end deffn
@deffn procedure string-match-backward string1 string2
-The arguments @var{string1} and @var{string2} must satisfy
-@code{string-in-nfc?}.
-
Compares the two strings, starting from the end and matching toward
the front, returning the number of characters that are the same. If
the two strings end differently, returns 0.
@deffn procedure regsexp-match-string crse string [start [end]]
The @var{crse} argument must be a value returned by
-@code{compile-regsexp}. The @var{string} argument must satisfy
-@code{string-in-nfc?}.
+@code{compile-regsexp}.
Matches @var{string} against @var{crse} and returns the result.
@end deffn
@deffn procedure regsexp-search-string-forward crse string [start [end]]
The @var{crse} argument must be a value returned by
-@code{compile-regsexp}. The @var{string} argument must satisfy
-@code{string-in-nfc?}.
+@code{compile-regsexp}.
Searches @var{string} from left to right for a match against
@var{crse} and returns the result.
(%regexp-matches? re #f string start end 'regexp-matches-some?))
(define-integrable (%regexp-matches? re match-all? string start end caller)
- (guarantee nfc-string? string caller)
(let* ((end (fix:end-index end (string-length string) caller))
(start (fix:start-index start end caller)))
(and (run-matcher (regexp-impl (regexp re)) match-all? #f start
(%regexp-matches re #f string start end 'regexp-matches-some))
(define-integrable (%regexp-matches re match-all? string start end caller)
- (guarantee nfc-string? string caller)
(let* ((end (fix:end-index end (string-length string) caller))
(start (fix:start-index start end caller)))
(%regexp-match (regexp re) match-all? #t start string start end)))
(regexp-submatch-keys regexp)))))
(define (regexp-search re string #!optional start end)
- (guarantee nfc-string? string 'regexp-search)
(let* ((end (fix:end-index end (string-length string) 'regexp-search))
(start (fix:start-index start end 'regexp-search)))
(%regexp-search (regexp re) start string start end)))
;;;; Fold
(define (regexp-fold re kons knil string #!optional finish start end ignore?)
- (guarantee nfc-string? string 'regexp-fold)
(let ((regexp (regexp re))
(end (fix:end-index end (string-length string) 'regexp-fold))
(ignore? (if (default-object? ignore?) #f ignore?)))
(define (regexp-fold-right re kons knil string
#!optional finish start end ignore?)
- (guarantee nfc-string? string 'regexp-fold-right)
(let ((regexp (regexp re))
(end (fix:end-index end (string-length string) 'regexp-fold-right))
(ignore? (if (default-object? ignore?) #f ignore?)))
;;;; Match
(define (string-match-forward string1 string2)
- (guarantee nfc-string? string1 'string-match-forward)
- (guarantee nfc-string? string2 'string-match-forward)
(let ((end1 (string-length string1))
(end2 (string-length string2)))
(let ((end (fix:min end1 end2)))
i)))))
(define (string-match-backward string1 string2)
- (guarantee nfc-string? string1 'string-match-backward)
- (guarantee nfc-string? string2 'string-match-backward)
(let ((s1 (fix:- (string-length string1) 1)))
(let loop ((i s1) (j (fix:- (string-length string2) 1)))
(if (and (fix:>= i 0)
(define (string->nfc string)
(if (and (ustring? string)
(%ustring-immutable? string))
- (if (ustring-in-nfc-set? string)
+ (if (and (ustring-in-nfc-set? string)
+ (ustring-in-nfc? string))
string
(let ((nfc
(case (string-nfc-qc string 'string->nfc)
(define-integrable (string-matcher caller naive kmp)
(lambda (pattern text #!optional start end)
- (guarantee nfc-string? pattern caller)
- (guarantee nfc-string? text caller)
(let ((pend (string-length pattern)))
(if (fix:= 0 pend)
(error:bad-range-argument pend caller))
(proc i)))))))
(define (string-find-first-index proc string . strings)
- (guarantee nfc-string? string 'string-find-first-index)
- (guarantee-list-of nfc-string? strings 'string-find-first-index)
(receive (n proc) (mapper-values proc string strings)
(let loop ((i 0))
(and (fix:< i n)
(loop (fix:+ i 1)))))))
(define (string-find-last-index proc string . strings)
- (guarantee nfc-string? string 'string-find-last-index)
- (guarantee-list-of nfc-string? strings 'string-find-last-index)
(receive (n proc) (mapper-values proc string strings)
(let loop ((i (fix:- n 1)))
(and (fix:>= i 0)
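
The following is a minimal sketch, not part of the patch, of why the documentation now recommends NFC arguments rather than requiring them. It uses only procedures that appear in the diff above (`string->nfc`, `string-search-forward`). The names `composed`, `decomposed`, and `text` are hypothetical and exist only for illustration. It assumes an MIT/GNU Scheme build that includes this change, so the search procedures no longer reject non-NFC arguments.

```scheme
;; Two encodings of the same user-perceived character "e with acute accent".
(define composed   (string (integer->char #xE9)))        ; U+00E9, precomposed
(define decomposed (string #\e (integer->char #x301)))   ; U+0065 U+0301, decomposed

;; A text that spells the accented letter in decomposed form.
(define text (string-append "caf" decomposed))

;; With the NFC guarantee removed, the search compares the strings as given.
;; The pattern and the text encode the accented letter differently, so a
;; match should not be expected here.
(string-search-forward composed text 0)

;; Normalizing both arguments first gives them a consistent encoding, so the
;; accented letter in the text can be found.
(string-search-forward (string->nfc composed) (string->nfc text) 0)
```

Normalizing at the call site leaves the choice to the caller: code that already holds NFC strings pays nothing extra, and code handling text of unknown origin can call `string->nfc` once before searching.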