Characters are objects that represent printed characters such as letters and digits. MIT/GNU Scheme supports the full Unicode character repertoire.
Characters are written using the notation #\character or #\character-name or #\xhex-scalar-value.
The following standard character names are supported:
    #\alarm       ; U+0007
    #\backspace   ; U+0008
    #\delete      ; U+007F
    #\escape      ; U+001B
    #\newline     ; the linefeed character, U+000A
    #\null        ; the null character, U+0000
    #\return      ; the return character, U+000D
    #\space       ; the preferred way to write a space, U+0020
    #\tab         ; the tab character, U+0009
Here are some additional examples:
    #\a           ; lowercase letter
    #\A           ; uppercase letter
    #\(           ; left parenthesis
    #\            ; the space character
Case is significant in #\character, and in #\character-name, but not in #\xhex-scalar-value. If character in #\character is alphabetic, then any character immediately following character cannot be one that can appear in an identifier. This rule resolves the ambiguous case where, for example, the sequence of characters ‘#\space’ could be taken to be either a representation of the space character or a representation of the character ‘#\s’ followed by a representation of the symbol ‘pace’.
Characters written in the #\ notation are self-evaluating. That is, they do not have to be quoted in programs.
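As a minimal sketch of the notation rules above, the following expressions can be evaluated at the REPL. The particular characters are chosen only for illustration.

```scheme
;; Hex scalar values are case-insensitive: both spell U+03BB.
(char=? #\x3bb #\x3BB)        ; ⇒ #t

;; Case is significant in the plain #\character notation.
(char=? #\a #\A)              ; ⇒ #f

;; A standard character name and its hex spelling denote the same character.
(char=? #\space #\x20)        ; ⇒ #t
(char=? #\newline #\xA)       ; ⇒ #t

;; Characters are self-evaluating, so no quote is needed here.
(list #\a #\space)
```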
Some of the procedures that operate on characters ignore the difference between upper case and lower case. The procedures that ignore case have ‘-ci’ (for “case insensitive”) embedded in their names.
MIT/GNU Scheme allows a character name to include one or more bucky bit prefixes to indicate that the character includes one or more of the keyboard shift keys Control, Meta, Super, or Hyper (note that the Control bucky bit prefix is not the same as the ASCII control key). The bucky bit prefixes and their meanings are as follows (case is not significant):
    Key       Bucky bit prefix    Bucky bit
    ---       ----------------    ---------
    Meta      M- or Meta-         1
    Control   C- or Control-      2
    Super     S- or Super-        4
    Hyper     H- or Hyper-        8
For example,
    #\c-a         ; Control-a
    #\meta-b      ; Meta-b
    #\c-s-m-h-A   ; Control-Meta-Super-Hyper-A
The char->name procedure returns a string corresponding to the printed representation of char. This is the character, character-name, or xhex-scalar-value component of the external representation, combined with the appropriate bucky bit prefixes.
    (char->name #\a)            ⇒ "a"
    (char->name #\space)        ⇒ "space"
    (char->name #\c-a)          ⇒ "C-a"
    (char->name #\control-a)    ⇒ "C-a"
The name->char procedure converts a string that names a character into the character specified. If string does not name any character, name->char signals an error.
    (name->char "a")            ⇒ #\a
    (name->char "space")        ⇒ #\space
    (name->char "SPACE")        ⇒ #\space
    (name->char "c-a")          ⇒ #\C-a
    (name->char "control-a")    ⇒ #\C-a
The char? procedure returns #t if object is a character, otherwise returns #f.
The procedures char=?, char<?, char>?, char<=?, and char>=? return #t if the results of passing their arguments to char->integer are respectively equal, monotonically increasing, monotonically decreasing, monotonically non-decreasing, or monotonically non-increasing.
These predicates are transitive.
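The ordering rule above can be sketched with a few comparisons. The helper char-before? is a hypothetical name introduced only to show the equivalent formulation in terms of char->integer.

```scheme
(char<? #\a #\b #\c)      ; ⇒ #t, scalar values 97 < 98 < 99
(char<? #\a #\c #\b)      ; ⇒ #f, not monotonically increasing
(char<? #\A #\a)          ; ⇒ #t, 65 < 97
(char<=? #\a #\a #\b)     ; ⇒ #t, non-decreasing
(char>=? #\z #\z #\a)     ; ⇒ #t, non-increasing

;; Hypothetical two-argument helper: same answer as char<? on two
;; characters, written directly in terms of char->integer.
(define (char-before? c1 c2)
  (< (char->integer c1) (char->integer c2)))

(char-before? #\A #\a)    ; ⇒ #t
```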
The procedures char-ci=?, char-ci<?, char-ci>?, char-ci<=?, and char-ci>=? are similar to char=? et cetera, but they treat upper case and lower case letters as the same. For example, (char-ci=? #\A #\a) returns #t.
Specifically, these procedures behave as if char-foldcase were applied to their arguments before they were compared.
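A small sketch of the difference between the case-sensitive and case-insensitive comparisons follows. The procedure folded-char=? is a hypothetical helper that spells out the "as if char-foldcase were applied" behavior for two arguments; it is not part of the system.

```scheme
(char=? #\A #\a)          ; ⇒ #f
(char-ci=? #\A #\a)       ; ⇒ #t

;; Ordering changes once case is ignored:
(char<? #\a #\B)          ; ⇒ #f, 97 is not less than 66
(char-ci<? #\a #\B)       ; ⇒ #t, folded values compare as a < b

;; Hypothetical helper: fold both arguments, then compare exactly.
(define (folded-char=? c1 c2)
  (char=? (char-foldcase c1) (char-foldcase c2)))

(folded-char=? #\A #\a)   ; ⇒ #t
```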
The procedures char-alphabetic?, char-numeric?, char-whitespace?, char-upper-case?, and char-lower-case? return #t if their arguments are alphabetic, numeric, whitespace, upper case, or lower case characters respectively, otherwise they return #f.
Specifically, they return #t when applied to characters with the Unicode properties Alphabetic, Numeric_Decimal, White_Space, Uppercase, or Lowercase respectively, and #f when applied to any other Unicode characters. Note that many Unicode characters are alphabetic but neither upper nor lower case.
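The following sketch exercises each predicate once. The Hebrew letter alef (U+05D0) is used here as an assumed illustration of a character that is alphabetic but has no case.

```scheme
(char-alphabetic? #\a)        ; ⇒ #t
(char-numeric? #\7)           ; ⇒ #t
(char-whitespace? #\tab)      ; ⇒ #t
(char-upper-case? #\A)        ; ⇒ #t
(char-lower-case? #\A)        ; ⇒ #f

;; U+05D0 HEBREW LETTER ALEF: alphabetic, but neither upper nor lower case.
(define alef #\x5D0)
(char-alphabetic? alef)       ; ⇒ #t
(char-upper-case? alef)       ; ⇒ #f
(char-lower-case? alef)       ; ⇒ #f
```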
The char-alphanumeric? procedure returns #t if char is either alphabetic or numeric, otherwise it returns #f.
The digit-value procedure returns the numeric value (0 to 9) of its argument if it is a numeric digit (that is, if char-numeric? returns #t), or #f on any other character.
    (digit-value #\3)        ⇒ 3
    (digit-value #\x0664)    ⇒ 4
    (digit-value #\x0AE6)    ⇒ 0
    (digit-value #\x0EA6)    ⇒ #f
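As one possible use of digit-value, the sketch below accumulates a string of decimal digits into an integer. The name decimal-digits->integer is hypothetical and introduced only for this example; it returns #f as soon as a non-digit character is seen.

```scheme
;; Hypothetical helper: convert a string of decimal digits (from any
;; Unicode decimal digit block) to an exact integer, or #f on failure.
(define (decimal-digits->integer string)
  (let loop ((chars (string->list string)) (result 0))
    (if (null? chars)
        result
        (let ((value (digit-value (car chars))))
          (and value
               (loop (cdr chars) (+ (* 10 result) value)))))))

(decimal-digits->integer "2024")                      ; ⇒ 2024
(decimal-digits->integer "12a")                       ; ⇒ #f
;; Arabic-Indic digits four and two:
(decimal-digits->integer (string #\x0664 #\x0662))    ; ⇒ 42
```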
Given a Unicode character, char->integer returns an exact integer between 0 and #xD7FF or between #xE000 and #x10FFFF which is equal to the Unicode scalar value of that character. Given a non-Unicode character, it returns an exact integer greater than #x10FFFF.
Given an exact integer that is the value returned by a character when char->integer is applied to it, integer->char returns that character.
Implementation note: MIT/GNU Scheme allows any Unicode code point, not just scalar values.
Implementation note: If the argument to char->integer or integer->char is a constant, the MIT/GNU Scheme compiler will constant-fold the call, replacing it with the corresponding result. This is a very useful way to denote unusual character constants or ASCII codes.
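A short sketch of the round trip between characters and their scalar values follows. The helper next-char is a hypothetical name, and it assumes its argument is not the last character of a valid range.

```scheme
(char->integer #\A)                        ; ⇒ 65
(integer->char 65)                         ; ⇒ #\A
(char->integer (integer->char #x1F600))    ; ⇒ 128512

;; Writing a character constant by its code; with constant arguments the
;; compiler can fold this call, as noted above.
(char=? (integer->char #x3BB) #\x3BB)      ; ⇒ #t

;; Hypothetical helper: the character whose scalar value is one greater.
(define (next-char char)
  (integer->char (+ (char->integer char) 1)))

(next-char #\a)                            ; ⇒ #\b
```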
The char-upcase procedure, given an argument that is the lowercase part of a Unicode casing pair, returns the uppercase member of the pair. Note that language-sensitive casing pairs are not used. If the argument is not the lowercase member of such a pair, it is returned.
The char-downcase procedure, given an argument that is the uppercase part of a Unicode casing pair, returns the lowercase member of the pair. Note that language-sensitive casing pairs are not used. If the argument is not the uppercase member of such a pair, it is returned.
The char-foldcase procedure applies the Unicode simple case-folding algorithm to its argument and returns the result. Note that language-sensitive folding is not used. See UAX #44 (part of the Unicode Standard) for details.
Note that many Unicode lowercase characters do not have uppercase equivalents.
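The case-conversion rules above can be sketched as follows. The sharp s (U+00DF) is used as an assumed example of a lowercase character without a single-character uppercase equivalent, so char-upcase is expected to return it unchanged.

```scheme
(char-upcase #\a)                     ; ⇒ #\A
(char-upcase #\A)                     ; ⇒ #\A, already upper case
(char-upcase #\5)                     ; ⇒ #\5, not part of a casing pair
(char-downcase #\A)                   ; ⇒ #\a
(char-foldcase #\A)                   ; ⇒ #\a

;; U+00DF LATIN SMALL LETTER SHARP S is lowercase but is not the
;; lowercase member of a simple casing pair.
(char-lower-case? #\xDF)              ; ⇒ #t
(char=? (char-upcase #\xDF) #\xDF)    ; ⇒ #t
```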
If char is a character representing a digit in the given radix, char->digit returns the corresponding integer value. If radix is specified, it must be an exact integer between 2 and 36 inclusive, and the conversion is done in that base; otherwise it is done in base 10. If char doesn’t represent a digit in base radix, char->digit returns #f.
Note that this procedure is insensitive to the alphabetic case of char.
    (char->digit #\8)       ⇒ 8
    (char->digit #\e 16)    ⇒ 14
    (char->digit #\e)       ⇒ #f
Returns a character that represents digit in the radix given by radix. The radix argument, if given, must be an exact integer between 2 and 36 (inclusive); it defaults to 10. The digit argument must be an exact non-negative integer strictly less than radix.
    (digit->char 8)         ⇒ #\8
    (digit->char 14 16)     ⇒ #\E
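To show how digit->char and char->digit fit together, here is a minimal sketch of a radix round trip. The names integer->digit-chars and digit-chars->integer are hypothetical helpers written for this example. Because char->digit ignores alphabetic case, the round trip does not depend on which case digit->char produces for digits above 9.

```scheme
;; Hypothetical helper: list of digit characters for an exact
;; non-negative integer N in RADIX, most significant digit first.
(define (integer->digit-chars n radix)
  (let loop ((n n) (chars '()))
    (let ((chars (cons (digit->char (remainder n radix) radix) chars))
          (rest (quotient n radix)))
      (if (= rest 0)
          chars
          (loop rest chars)))))

;; Hypothetical helper: inverse of the above; returns #f if any
;; character is not a digit in RADIX.
(define (digit-chars->integer chars radix)
  (let loop ((chars chars) (result 0))
    (if (null? chars)
        result
        (let ((digit (char->digit (car chars) radix)))
          (and digit
               (loop (cdr chars) (+ (* result radix) digit)))))))

(integer->digit-chars 2024 10)                               ; ⇒ (#\2 #\0 #\2 #\4)
(digit-chars->integer (integer->digit-chars 255 16) 16)      ; ⇒ 255
(digit-chars->integer (string->list "FF") 16)                ; ⇒ 255
(digit-chars->integer (string->list "12") 2)                 ; ⇒ #f
```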