Next: , Previous: , Up: Miscellaneous Datatypes   [Contents][Index]


10.2 Symbols

MIT/GNU Scheme provides two types of symbols: interned and uninterned. Interned symbols are far more common than uninterned symbols, and there are more ways to create them. Interned symbols have an external representation that is recognized by the procedure read; uninterned symbols do not.8

Interned symbols have an extremely useful property: any two interned symbols whose names are the same, in the sense of string=?, are the same object (i.e. they are eq? to one another). The term interned refers to the process of interning by which this is accomplished. Uninterned symbols do not share this property.

The names of interned symbols are not distinguished by their alphabetic case. Because of this, MIT/GNU Scheme converts all alphabetic characters in the name of an interned symbol to a specific case (lower case) when the symbol is created. When the name of an interned symbol is referenced (using symbol->string) or written (using write) it appears in this case. It is a bad idea to depend on the name being lower case. In fact, it is preferable to take this one step further: don’t depend on the name of a symbol being in a uniform case.

The rules for writing an interned symbol are the same as the rules for writing an identifier (see Identifiers). Any interned symbol that has been returned as part of a literal expression, or read using the read procedure and subsequently written out using the write procedure, will read back in as the identical symbol (in the sense of eq?).

Usually it is also true that reading in an interned symbol that was previously written out produces the same symbol. An exception are symbols created by the procedures string->symbol and intern; they can create symbols for which this write/read invariance may not hold because the symbols’ names contain special characters or letters in the non-standard case.9

The external representation for uninterned symbols is special, to distinguish them from interned symbols and prevent them from being recognized by the read procedure:

(string->uninterned-symbol "foo")
     ⇒  #[uninterned-symbol 30 foo]

In this section, the procedures that return symbols as values will either always return interned symbols, or always return uninterned symbols. The procedures that accept symbols as arguments will always accept either interned or uninterned symbols, and do not distinguish the two.

procedure: symbol? object

Returns #t if object is a symbol, otherwise returns #f.

(symbol? 'foo)                                  ⇒  #t
(symbol? (car '(a b)))                          ⇒  #t
(symbol? "bar")                                 ⇒  #f
procedure: symbol->string symbol

Returns the name of symbol as a string. If symbol was returned by string->symbol, the value of this procedure will be identical (in the sense of string=?) to the string that was passed to string->symbol. It is an error to apply mutation procedures such as string-set! to strings returned by this procedure.

(symbol->string 'flying-fish)           ⇒  "flying-fish"
(symbol->string 'Martin)                ⇒  "martin"
(symbol->string (string->symbol "Malvina"))
                                        ⇒  "Malvina"

Note that two distinct uninterned symbols can have the same name.

procedure: intern string

Returns the interned symbol whose name is string. Converts string to the standard alphabetic case before generating the symbol. This is the preferred way to create interned symbols, as it guarantees the following independent of which case the implementation uses for symbols’ names:

(eq? 'bitBlt (intern "bitBlt")) ⇒     #t

The user should take care that string obeys the rules for identifiers (see Identifiers), otherwise the resulting symbol cannot be read as itself.

procedure: intern-soft string

Returns the interned symbol whose name is string. Converts string to the standard alphabetic case before generating the symbol. If no such interned symbol exists, returns #f.

This is exactly like intern, except that it will not create an interned symbol, but only returns symbols that already exist.

procedure: string->symbol string

Returns the interned symbol whose name is string. Although you can use this procedure to create symbols with names containing special characters or lowercase letters, it’s usually a bad idea to create such symbols because they cannot be read as themselves. See symbol->string.

(eq? 'mISSISSIppi 'mississippi)         ⇒  #t
(string->symbol "mISSISSIppi")
     ⇒  the symbol with the name "mISSISSIppi"
(eq? 'bitBlt (string->symbol "bitBlt")) ⇒  #f
(eq? 'JollyWog
      (string->symbol
        (symbol->string 'JollyWog)))    ⇒  #t
(string=? "K. Harper, M.D."
           (symbol->string
             (string->symbol
               "K. Harper, M.D.")))     ⇒  #t
procedure: string->uninterned-symbol string

Returns a newly allocated uninterned symbol whose name is string. It is unimportant what case or characters are used in string.

Note: this is the fastest way to make a symbol.

procedure: generate-uninterned-symbol [object]

Returns a newly allocated uninterned symbol that is guaranteed to be different from any other object. The symbol’s name consists of a prefix string followed by the (exact non-negative integer) value of an internal counter. The counter is initially zero, and is incremented after each call to this procedure.

The optional argument object is used to control how the symbol is generated. It may take one of the following values:

(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 31 G0]
(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 32 G1]
(generate-uninterned-symbol 'this)
     ⇒  #[uninterned-symbol 33 this2]
(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 34 G3]
(generate-uninterned-symbol 100)
     ⇒  #[uninterned-symbol 35 G100]
(generate-uninterned-symbol)
     ⇒  #[uninterned-symbol 36 G101]
procedure: symbol-append symbol …

Returns the interned symbol whose name is formed by concatenating the names of the given symbols. This procedure preserves the case of the names of its arguments, so if one or more of the arguments’ names has non-standard case, the result will also have non-standard case.

(symbol-append 'foo- 'bar)              ⇒  foo-bar
;; the arguments may be uninterned:
(symbol-append 'foo- (string->uninterned-symbol "baz"))
                                        ⇒  foo-baz
;; the result has the same case as the arguments:
(symbol-append 'foo- (string->symbol "BAZ"))    ⇒  foo-BAZ
procedure: symbol-hash symbol

Returns a hash number for symbol, which is computed by calling string-hash on symbol’s name. The hash number is an exact non-negative integer.

procedure: symbol-hash-mod symbol modulus

Modulus must be an exact positive integer. Equivalent to

(modulo (symbol-hash symbol) modulus)

This procedure is provided for convenience in constructing hash tables. However, it is normally preferable to use make-strong-eq-hash-table to build hash tables keyed by symbols, because eq? hash tables are much faster.

procedure: symbol<? symbol1 symbol2

This procedure computes a total order on symbols. It is equivalent to

(string<? (symbol->string symbol1)
          (symbol->string symbol2))

Footnotes

(8)

In older dialects of Lisp, uninterned symbols were fairly important. This was true because symbols were complicated data structures: in addition to having value cells (and sometimes, function cells), these structures contained property lists. Because of this, uninterned symbols were often used merely for their property lists — sometimes an uninterned symbol used this way was referred to as a disembodied property list. In MIT/GNU Scheme, symbols do not have property lists, or any other components besides their names. There is a different data structure similar to disembodied property lists: one-dimensional tables (see 1D Tables). For these reasons, uninterned symbols are not very useful in MIT/GNU Scheme. In fact, their primary purpose is to simplify the generation of unique variable names in programs that generate Scheme code.

(9)

MIT/GNU Scheme reserves a specific set of interned symbols for its own use. If you use these reserved symbols it is possible that you could break specific pieces of software that depend on them. The reserved symbols all have names beginning with the characters ‘#[’ and ending with the character ‘]’; thus none of these symbols can be read by the procedure read and hence are not likely to be used by accident. For example, (intern "#[unnamed-procedure]") produces a reserved symbol.


Next: , Previous: , Up: Miscellaneous Datatypes   [Contents][Index]