Characters are objects that represent printed characters, such as letters and digits.(5)
Characters are written using the notation #\character or #\character-name. For example:

     #\a          ; lowercase letter
     #\A          ; uppercase letter
     #\(          ; left parenthesis
     #\space      ; the space character
     #\newline    ; the newline character
Case is significant in #\character, but not in #\character-name. If character in #\character is a letter, character must be followed by a delimiter character such as a space or parenthesis. Characters written in the #\ notation are self-evaluating; you don't need to quote them.
A character name may include one or more bucky bit prefixes to indicate that the character includes one or more of the keyboard shift keys Control, Meta, Super, Hyper, or Top (note that the Control bucky bit prefix is not the same as the ASCII control key). The bucky bit prefixes and their meanings are as follows (case is not significant):
     Key       Bucky bit prefix    Bucky bit
     ---       ----------------    ---------
     Meta      M- or Meta-         1
     Control   C- or Control-      2
     Super     S- or Super-        4
     Hyper     H- or Hyper-        8
     Top       T- or Top-          16
For example,
     #\c-a        ; Control-a
     #\meta-b     ; Meta-b
     #\c-s-m-h-a  ; Control-Meta-Super-Hyper-A
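As a rough illustration of how the prefixes combine, the following Python sketch (a model, not MIT Scheme code) strips bucky bit prefixes from a character name and ORs together the bit values from the table above. The names BUCKY_PREFIXES and split_bucky_name are hypothetical, and the sketch assumes that a prefix is recognized only when something follows its hyphen.

```python
# Bucky bit values taken from the prefix table; prefixes are case-insensitive.
BUCKY_PREFIXES = {
    "m": 1, "meta": 1,
    "c": 2, "control": 2,
    "s": 4, "super": 4,
    "h": 8, "hyper": 8,
    "t": 16, "top": 16,
}


def split_bucky_name(name):
    """Split a name such as 'c-s-m-h-a' into (base_name, bucky_bits)."""
    bits = 0
    rest = name
    while True:
        prefix, sep, tail = rest.partition("-")
        # Stop when there is no hyphen, nothing after it, or no known prefix.
        if not sep or not tail or prefix.lower() not in BUCKY_PREFIXES:
            break
        bits |= BUCKY_PREFIXES[prefix.lower()]
        rest = tail
    return rest, bits


print(split_bucky_name("c-a"))        # ('a', 2)
print(split_bucky_name("meta-b"))     # ('b', 1)
print(split_bucky_name("c-s-m-h-a"))  # ('a', 15)
```

The order of the prefixes does not matter in this model, because each prefix only sets one bit.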
The following character-names are supported, shown here with their ASCII equivalents:
     Character Name    ASCII Name
     --------------    ----------
     altmode           ESC
     backnext          US
     backspace         BS
     call              SUB
     linefeed          LF
     page              FF
     return            CR
     rubout            DEL
     space
     tab               HT
In addition, #\newline is either #\linefeed or #\return, depending on the operating system that Scheme is running under. All of the standard ASCII names for non-printing characters are supported:

     NUL SOH STX ETX EOT ENQ ACK BEL
     BS  HT  LF  VT  FF  CR  SO  SI
     DLE DC1 DC2 DC3 DC4 NAK SYN ETB
     CAN EM  SUB ESC FS  GS  RS  US
     DEL
procedure+: char->name char [slashify?]
Returns a string corresponding to the printed representation of char. This is the character or character-name component of the external representation, combined with the appropriate bucky bit prefixes.
     (char->name #\a)          =>  "a"
     (char->name #\space)      =>  "Space"
     (char->name #\c-a)        =>  "C-a"
     (char->name #\control-a)  =>  "C-a"

Slashify?, if specified and true, says to insert the necessary backslash characters in the result so that read will parse it correctly. In other words, the following generates the external representation of char:

     (string-append "#\\" (char->name char #t))

If slashify? is not specified, it defaults to #f.
procedure+: name->char string
Converts a string that names a character into the character specified. If string does not name any character, name->char signals an error.

     (name->char "a")          =>  #\a
     (name->char "space")      =>  #\Space
     (name->char "c-a")        =>  #\C-a
     (name->char "control-a")  =>  #\C-a
procedure: char=? char1 char2
procedure: char<? char1 char2
procedure: char>? char1 char2
procedure: char<=? char1 char2
procedure: char>=? char1 char2
procedure: char-ci=? char1 char2
procedure: char-ci<? char1 char2
procedure: char-ci>? char1 char2
procedure: char-ci<=? char1 char2
procedure: char-ci>=? char1 char2
Returns #t if the specified characters have the appropriate order relationship to one another; otherwise returns #f. The -ci procedures don't distinguish uppercase and lowercase letters.
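Character ordering follows these rules:
The digits are in order; for example, (char<? #\0 #\9) returns #t.
The uppercase letters are in order; for example, (char<? #\A #\B) returns #t.
The lowercase letters are in order; for example, (char<? #\a #\b) returns #t.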
In addition, MIT Scheme orders those characters that satisfy char-standard? the same way that ASCII does. Specifically, all the digits precede all the uppercase letters, and all the uppercase letters precede all the lowercase letters.
Characters are ordered by first comparing their bucky bits part and then their code part. In particular, characters without bucky bits come before characters with bucky bits.
procedure: char? object
Returns #t if object is a character; otherwise returns #f.
procedure: char-upcase char
procedure: char-downcase char
Returns the uppercase or lowercase equivalent of char if char is a letter; otherwise returns char. These procedures return a character char2 such that (char-ci=? char char2).
procedure+: char->digit char [radix]
If char is a character representing a digit in the given radix, returns the corresponding integer value. If you specify radix (which must be an exact integer between 2 and 36 inclusive), the conversion is done in that base, otherwise it is done in base 10. If char doesn't represent a digit in base radix, char->digit returns #f.
Note that this procedure is insensitive to the alphabetic case of char.

     (char->digit #\8)     =>  8
     (char->digit #\e 16)  =>  14
     (char->digit #\e)     =>  #f
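The lookup rule described above can be modelled in a few lines of Python. This is only a sketch of the documented behavior, not MIT Scheme code: the names DIGITS and char_to_digit are hypothetical, None stands in for #f, and raising ValueError for a bad radix is an assumption about error handling that the text does not specify.

```python
import string

# Digit characters for radixes up to 36: '0'..'9' followed by 'a'..'z'.
DIGITS = string.digits + string.ascii_lowercase


def char_to_digit(char, radix=10):
    """Return the value of char as a digit in radix, or None if it is not one."""
    if not 2 <= radix <= 36:
        raise ValueError("radix must be between 2 and 36 inclusive")
    if len(char) != 1:
        raise ValueError("expected a single character")
    value = DIGITS.find(char.lower())  # case-insensitive lookup, -1 if absent
    return value if 0 <= value < radix else None


print(char_to_digit("8"))      # 8
print(char_to_digit("e", 16))  # 14
print(char_to_digit("e"))      # None
```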
procedure+: digit->char digit [radix]
Returns a character that represents digit in the radix given by radix. Radix must be an exact integer between 2 and 36 (inclusive), and defaults to 10. Digit, which must be an exact non-negative integer, should be less than radix; if digit is greater than or equal to radix, digit->char returns #f.

     (digit->char 8)      =>  #\8
     (digit->char 14 16)  =>  #\E
An MIT Scheme character consists of a code part and a bucky bits part. The MIT Scheme set of characters can represent more characters than ASCII can; it includes characters with Super, Hyper, and Top bucky bits, as well as Control and Meta. Every ASCII character corresponds to some MIT Scheme character, but not vice versa.(6)
MIT Scheme uses a 7-bit ASCII character code with 5 bucky bits. The least significant bucky bit, Meta, is stored adjacent to the MSB of the character code, allowing the least significant 8 bits of a character object to be interpreted as ordinary ASCII with a meta bit. This is compatible with standard practice for 8-bit characters when meta bits are employed.
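The layout described in this paragraph can be pictured as a 12-bit integer with the 7-bit code in the low bits and the 5 bucky bits above it. The Python sketch below models only that arithmetic; it is not MIT Scheme code, it says nothing about how character objects are actually tagged or stored, and the names pack and unpack are hypothetical.

```python
CODE_BITS = 7    # 7-bit ASCII character code
BUCKY_BITS = 5   # Meta (1), Control (2), Super (4), Hyper (8), Top (16)


def pack(code, bucky_bits):
    """Place the bucky bits directly above the 7-bit character code."""
    if not 0 <= code < (1 << CODE_BITS):
        raise ValueError("code out of range")
    if not 0 <= bucky_bits < (1 << BUCKY_BITS):
        raise ValueError("bucky bits out of range")
    return (bucky_bits << CODE_BITS) | code


def unpack(packed):
    """Recover (code, bucky_bits) from a packed value."""
    return packed & ((1 << CODE_BITS) - 1), packed >> CODE_BITS


# Meta is the lowest bucky bit, so it lands in bit 7 of the low byte.
print(pack(97, 1) & 0xFF)   # 225: 'a' (97) with the meta bit (128) set
print(pack(97, 2) & 0xFF)   # 97: Control lies outside the low 8 bits
print(unpack(pack(97, 3)))  # (97, 3)
```

Because Meta occupies bit 7, masking a packed value with 0xFF yields an ordinary 8-bit character with a meta bit, which is the compatibility property the paragraph describes.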
procedure+: make-char code bucky-bits
Builds a character from code and bucky-bits. Both code and bucky-bits must be exact non-negative integers in the appropriate range. Use char-code and char-bits to extract the code and bucky bits from the character. If 0 is specified for bucky-bits, make-char produces an ordinary character; otherwise, the appropriate bits are turned on as follows:

     1               Meta
     2               Control
     4               Super
     8               Hyper
     16              Top

For example,

     (make-char 97 0)  =>  #\a
     (make-char 97 1)  =>  #\M-a
     (make-char 97 2)  =>  #\C-a
     (make-char 97 3)  =>  #\C-M-a
procedure+: char-bits char
Returns the exact integer representation of char's bucky bits. For example,

     (char-bits #\a)      =>  0
     (char-bits #\m-a)    =>  1
     (char-bits #\c-a)    =>  2
     (char-bits #\c-m-a)  =>  3
procedure+: char-code char
Returns the character code of char, an exact integer. For example,

     (char-code #\a)    =>  97
     (char-code #\c-a)  =>  97
variable+: char-code-limit
variable+: char-bits-limit
These variables define the (exclusive) upper limits for the character code and bucky bits (respectively). The character code and bucky bits are always exact non-negative integers, and are strictly less than the value of their respective limit variable.
procedure: char->integer char
procedure: integer->char k
char->integer returns the character code representation for char. integer->char returns the character whose character code representation is k.
In MIT Scheme, if (char-ascii? char) is true, then

     (eqv? (char->ascii char) (char->integer char))

However, this behavior is not required by the Scheme standard, and code that depends on it is not portable to other implementations.
These procedures implement order isomorphisms between the set of characters under the char<=? ordering and some subset of the integers under the <= ordering. That is, if

     (char<=? a b)  =>  #t    and    (<= x y)  =>  #t

and x and y are in the range of char->integer, then

     (<= (char->integer a) (char->integer b))          =>  #t
     (char<=? (integer->char x) (integer->char y))     =>  #t
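To make the order-isomorphism property concrete, the Python sketch below checks it on a small sample of characters. It is a model under stated assumptions, not MIT Scheme code: characters are represented as (code, bucky_bits) pairs, char_le follows the rule given earlier that bucky bits are compared before the code, and to_integer is a hypothetical stand-in for char->integer that assumes the bucky bits sit directly above a 7-bit code.

```python
from itertools import product


def to_integer(char):
    """Hypothetical stand-in for char->integer: bucky bits above a 7-bit code."""
    code, bucky_bits = char
    return (bucky_bits << 7) | code


def char_le(a, b):
    """Model of char<=?: compare bucky bits first, then character codes."""
    return (a[1], a[0]) <= (b[1], b[0])


# A sample of codes and bucky bit combinations, including the extremes.
sample = list(product([0, 48, 65, 97, 127], [0, 1, 2, 3, 16, 31]))

# The mapping is an order isomorphism if both orderings always agree.
order_preserved = all(
    char_le(a, b) == (to_integer(a) <= to_integer(b))
    for a in sample
    for b in sample
)
print(order_preserved)  # True
```

Under these assumptions the two orderings agree because the bucky bits occupy the more significant positions of the integer.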
variable+: char-integer-limit
The range of char->integer is defined to be the exact non-negative integers that are less than the value of this variable (exclusive).
MIT Scheme internally uses ASCII codes for I/O, and stores character objects in a fashion that makes it convenient to convert between ASCII codes and characters. Also, character strings are implemented as byte vectors whose elements are ASCII codes; these codes are converted to character objects when accessed. For these reasons it is sometimes desirable to be able to convert between ASCII codes and characters.
Not all characters can be represented as ASCII codes. A character that has an equivalent ASCII representation is called an ASCII character.
procedure+: char-ascii? char
Returns the ASCII code for char if char has an ASCII representation; otherwise returns #f.
In the current implementation, the characters that satisfy this predicate are those in which the Control, Super, Hyper, and Top bucky bits are turned off. All characters for which the char-bits procedure returns 0 or 1 (i.e. no bucky bits, or just Meta) count as legal ASCII characters.
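The rule in this paragraph can be sketched in Python as follows. This is a model rather than MIT Scheme code: the function name ascii_code_or_none is hypothetical, None stands in for #f, and the value returned for a Meta character (the code with bit 7 set) is an assumption inferred from the 8-bit layout described earlier in this chapter, not something the text states outright.

```python
META = 1  # bucky bit value for Meta


def ascii_code_or_none(code, bucky_bits):
    """Return an 8-bit ASCII code, or None if any non-Meta bucky bit is set."""
    if bucky_bits & ~META:
        return None  # Control, Super, Hyper, or Top is turned on
    # Assumption: the Meta bit becomes bit 7 of the resulting code.
    return code | ((bucky_bits & META) << 7)


print(ascii_code_or_none(97, 0))  # 97
print(ascii_code_or_none(97, 1))  # 225
print(ascii_code_or_none(97, 2))  # None
```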
procedure+: char->ascii char
Returns the ASCII code for char. An error condition-type:bad-range-argument is signalled if char doesn't have an ASCII representation.
procedure+: ascii->char code
Code must be the exact integer representation of an ASCII code. This procedure returns the character corresponding to code.
MIT Scheme's character-set abstraction is used to represent groups of characters, such as the letters or digits. Character sets may contain only ASCII characters; in the future this may be changed to allow the full range of characters.
There is no meaningful external representation for character sets; use char-set-members to examine their contents. There is (at present) no specific equivalence predicate for character sets; use equal? for this purpose.
procedure+: char-set? object
Returns #t if object is a character set; otherwise returns #f.(7)
variable+: char-set:upper-case
variable+: char-set:lower-case
variable+: char-set:alphabetic
variable+: char-set:numeric
variable+: char-set:alphanumeric
variable+: char-set:whitespace
variable+: char-set:not-whitespace
variable+: char-set:graphic
variable+: char-set:not-graphic
variable+: char-set:standard
These variables contain predefined character sets. To see the contents of one of these sets, use char-set-members.
Alphabetic characters are the 52 upper and lower case letters. Numeric characters are the 10 decimal digits. Alphanumeric characters are those in the union of these two sets. Whitespace characters are #\space, #\tab, #\page, #\linefeed, and #\return. Graphic characters are the printing characters and #\space. Standard characters are the printing characters, #\space, and #\newline.
These are the printing characters:
! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~
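The definitions above translate directly into set arithmetic. The Python sketch below builds each class from those definitions so that the sizes can be checked; it is a model, not MIT Scheme code. The set names are hypothetical, the printing characters are taken to be ASCII codes 33 through 126 (which matches the list above), and #\newline is modelled as linefeed, which is an assumption because the text says it depends on the operating system.

```python
import string

# '!' (33) through '~' (126): the 94 printing characters listed above.
PRINTING = {chr(code) for code in range(0x21, 0x7F)}

ALPHABETIC = set(string.ascii_letters)        # 52 letters
NUMERIC = set(string.digits)                  # 10 decimal digits
ALPHANUMERIC = ALPHABETIC | NUMERIC           # union of the two
WHITESPACE = {" ", "\t", "\f", "\n", "\r"}    # space, tab, page, linefeed, return
GRAPHIC = PRINTING | {" "}                    # printing characters and space
STANDARD = GRAPHIC | {"\n"}                   # assumption: newline is linefeed

print(len(PRINTING), len(ALPHANUMERIC), len(GRAPHIC), len(STANDARD))
# 94 62 95 96
```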
procedure: char-upper-case? char
procedure: char-lower-case? char
procedure: char-alphabetic? char
procedure: char-numeric? char
procedure+: char-alphanumeric? char
procedure: char-whitespace? char
procedure+: char-graphic? char
procedure+: char-standard? object
These predicates are defined in terms of the respective character sets defined above.
procedure+: char-set-members char-set
Returns a newly allocated list of the characters in char-set.
procedure+: char-set-member? char-set char
Returns #t if char is in char-set; otherwise returns #f.
procedure+: char-set char ...
Returns a character set consisting of the specified ASCII characters. With no arguments, char-set returns an empty character set.
procedure+: chars->char-set chars
Returns a character set consisting of chars, which must be a list of ASCII characters. This is equivalent to (apply char-set chars).
procedure+: ascii-range->char-set lower upper
Lower and upper must be exact non-negative integers representing ASCII character codes, and lower must be less than or equal to upper. This procedure creates and returns a new character set consisting of the characters whose ASCII codes are between lower (inclusive) and upper (exclusive).
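The half-open range convention is easy to get wrong, so here is a small Python model of it. This is a sketch, not MIT Scheme code: the name ascii_range_to_char_set is hypothetical, a frozenset of one-character strings stands in for a character set, and the sketch does not enforce an upper limit on ASCII codes because the text does not state one here.

```python
def ascii_range_to_char_set(lower, upper):
    """Model a set of the characters with codes in [lower, upper)."""
    if not 0 <= lower <= upper:
        raise ValueError("need 0 <= lower <= upper")
    # range() is half-open too: lower is included, upper is excluded.
    return frozenset(chr(code) for code in range(lower, upper))


digits = ascii_range_to_char_set(48, 58)  # codes 48..57 are '0'..'9'
print(sorted(digits))
# ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
print(len(ascii_range_to_char_set(65, 65)))  # 0: equal bounds give an empty set
```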
procedure+: predicate->char-set predicate
Predicate must be a procedure of one argument. predicate->char-set creates and returns a character set consisting of the ASCII characters for which predicate is true.
procedure+: char-set-difference char-set1 char-set2
Returns a character set consisting of the characters that are in char-set1 but aren't in char-set2.
procedure+: char-set-intersection char-set1 char-set2
Returns a character set consisting of the characters that are in both char-set1 and char-set2.
procedure+: char-set-union char-set1 char-set2
Returns a character set consisting of the characters that are in one or both of char-set1 and char-set2.
procedure+: char-set-invert char-set
Returns a character set consisting of the ASCII characters that are not in char-set.
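The set-building and set-combining procedures above behave like ordinary set algebra over a fixed universe of ASCII characters. The Python sketch below models them with frozenset; it is not MIT Scheme code and all function names are hypothetical. The universe is assumed to be the 128 codes 0 through 127; the real universe may differ (for example, if characters with the Meta bit are included), which would change the result of inversion.

```python
# Assumption: the universe of ASCII characters is codes 0..127.
UNIVERSE = frozenset(chr(code) for code in range(128))


def predicate_to_char_set(predicate):
    """Collect the characters of the universe for which predicate is true."""
    return frozenset(ch for ch in UNIVERSE if predicate(ch))


def char_set_difference(set1, set2):
    return set1 - set2


def char_set_intersection(set1, set2):
    return set1 & set2


def char_set_union(set1, set2):
    return set1 | set2


def char_set_invert(char_set):
    """Everything in the universe that is not in char_set."""
    return UNIVERSE - char_set


digits = predicate_to_char_set(str.isdigit)
upper = predicate_to_char_set(str.isupper)

print(len(char_set_union(digits, upper)))         # 36
print(len(char_set_intersection(digits, upper)))  # 0
print(len(char_set_invert(digits)))               # 118
```

Note that inversion is the only operation whose result depends on the choice of universe; the other three depend only on their arguments.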