
3. Basic Concepts

3.1 Variables and Regions

An identifier that names a location where a value can be stored is called a variable. Variables are said to be bound to their locations. The set of all visible bindings in effect at some point in a program is known as the environment in effect at that point. The value stored in the location to which a variable is bound is called the variable's value. By abuse of terminology, the variable is sometimes said to name the value or to be bound to the value. This is not quite accurate, but confusion rarely results from this practice.

Certain expression types create new locations and bind variables to those locations. These expression types are called binding constructs. The most fundamental of the variable binding constructs are the lambda and mu expressions, because all other variable binding constructs can be explained in terms of them. Some other variable binding constructs are let, let*, letrec, and do expressions.

Like Algol and Pascal, and unlike most other dialects of Lisp except for Common Lisp and Scheme, Better Scheme is a statically scoped language with block structure. To each place where an identifier is bound in a program there corresponds a region of the program text within which the binding is visible. The region is determined by the particular binding construct that establishes the binding; if the binding is established by a lambda expression, for example, then its region is the entire lambda expression. Every mention of an identifier refers to the binding of the identifier that established the innermost of the regions containing the use. If there is no binding of the identifier whose region contains the use, then the use refers to the binding for the variable in the top level environment, if any (see sections 4. Expressions and 5.1 Binding Constructs); if there is no binding for the identifier, it is said to be unbound.
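
The scoping rule can be illustrated with a small sketch. It is written in standard Scheme syntax and assumes that Better Scheme accepts define, let, and lambda in the same form; the names x, add-x, and scope-demo are hypothetical and chosen only for this example.

```scheme
; Top-level binding of x. Its region is the whole program,
; except where an inner binding of x shadows it.
(define x 10)

(define (scope-demo)
  (let ((x 1))                            ; new binding; its region is the body of this let
    (let ((add-x (lambda (y) (+ x y))))   ; this x refers to the innermost enclosing binding, x = 1
      (let ((x 100))                      ; another binding; its region does not contain the lambda
        (add-x 5)))))

(scope-demo) ; evaluates to 6 under static scoping
```

The call yields 6 rather than 105 because the use of x inside the lambda expression is resolved by the program text surrounding the lambda, not by the bindings in effect where add-x happens to be called.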

3.2 Disjointness of Types

No object satisfies more than one of the following predicates:

boolean?        pair?
symbol?         number?
char?           string?
vector?         macro?
function?       continuation?
null?           void?

These predicates define the types boolean, pair, symbol, number, char (or character), string, vector, macro, function, continuation, null and void. The empty list is the only object of type null and the value #void is the only value of type void.

Although there is a separate boolean type, any Better Scheme value can be used as a boolean value for the purpose of a conditional test. As explained in section 5.6.1 Booleans, all values count as true in such a test except for #f. This report uses the word "true" to refer to any Better Scheme value except #f, and the word "false" to refer to #f.
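
As a minimal sketch of the rule above, the following uses only predicates and forms that also exist in standard Scheme; truth-of is a hypothetical helper introduced for this example, and define is assumed to behave as in standard Scheme.

```scheme
; Hypothetical helper: reports how a value is treated by a conditional test.
(define (truth-of v)
  (if v 'true 'false))

(truth-of #f)    ; false, the only value that counts as false
(truth-of #t)    ; true
(truth-of 0)     ; true
(truth-of "")    ; true
(truth-of '())   ; true, the empty list is not #f

; The values above count as true without being booleans.
(boolean? 0)     ; #f
(boolean? '())   ; #f
(null? '())      ; #t
```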

3.3 External Representations

An important concept in Better Scheme (and Lisp) is that of the external representation of an object as a sequence of characters. For example, an external representation of the integer 28 is the sequence of characters "28", and an external representation of a list consisting of the integers 8 and 13 is the sequence of characters "(8 13)".

The external representation of an object is not necessarily unique. The integer 28 also has representations "#e28.000" and "#x1c", and the list in the previous paragraph also has the representations "( 08 13 )" and "(8 . (13 . ()))" (see section 5.6.2 Pairs and Lists).
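
The alternative representations listed above can be checked against one another. This is a sketch in standard Scheme syntax; it assumes the Better Scheme reader accepts the same numeric prefixes and dotted-pair notation that the paragraph describes.

```scheme
; Three external representations of the integer 28.
(= 28 #e28.000 #x1c)                 ; #t

; Three external representations of the same two-element list.
(equal? '(8 13) '( 08 13 ))          ; #t
(equal? '(8 13) '(8 . (13 . ())))    ; #t
```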

Many objects have standard external representations, but some, such as functions, do not have standard representations (although particular implementations may define representations for them).

An external representation may be written in a program to obtain the corresponding object (see section 4.2 Literal Expressions).

External representations can also be used for input and output. The procedure 'read' (see section 6.6.2 Input) parses external representations, and the procedure 'write' (see section 6.6.3 Output) generates them. Together, they provide an elegant and powerful input/output facility.

Note that the sequence of characters "(+ 2 6)" is not an external representation of the integer 8, even though it is an expression evaluating to the integer 8; rather, it is an external representation of a three-element list, the elements of which are the symbol + and the integers 2 and 6. Better Scheme's syntax has the property that any sequence of characters that is an expression is also the external representation of some object. This can lead to confusion, since it may not be obvious out of context whether a given sequence of characters is intended to denote data or program, but it is also a source of power, since it facilitates writing programs such as interpreters and compilers that treat programs as data (or vice versa).
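
The distinction between the two readings of "(+ 2 6)" can be sketched as follows. The name datum is hypothetical, and quote and define are assumed to behave as in standard Scheme.

```scheme
; Read as data: quoting yields the three-element list itself.
(define datum '(+ 2 6))

(length datum)          ; 3
(symbol? (car datum))   ; #t, the first element is the symbol +
(cadr datum)            ; 2
(caddr datum)           ; 6

; Read as program: the same characters, unquoted, are an expression.
(+ 2 6)                 ; 8
```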

The syntax of external representations of various kinds of objects is described in section 2.3 Literals.

3.4 Storage Model

Variables and objects such as pairs, vectors, and strings implicitly denote locations or sequences of locations. A string, for example, denotes as many locations as there are characters in the string. (These locations need not correspond to a full machine word.) A new value may be stored into one of these locations using the string-set! function, but the string continues to denote the same locations as before.

An object fetched from a location, by a variable reference or by a function such as car, vector-ref, or string-ref, is the same object as the object last stored in the location before the fetch and so is equal to it in the sense of eq? (see section 5.4 Equivalence Predicates).
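
The two paragraphs above can be illustrated with a short sketch in standard Scheme syntax. The names s, same-s, p, and v are hypothetical; make-string, vector, and vector-set! are assumed to be available in Better Scheme as they are in standard Scheme.

```scheme
; A string of three characters denotes three locations.
(define s (make-string 3 #\a))
(define same-s s)

; Storing a new value changes the contents of one location,
; but s still denotes the same locations as before.
(string-set! s 1 #\b)
s                           ; "aba"
(eq? s same-s)              ; #t

; An object fetched from a location is the object last stored there.
(define p (cons 1 2))
(define v (vector 'x 'y))
(vector-set! v 0 p)
(eq? (vector-ref v 0) p)    ; #t
```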

Every location is marked to show whether it is in use. No variable or object ever refers to a location that is not in use. Whenever this report speaks of storage being allocated for a variable or object, what is meant is that an appropriate number of locations are chosen from the set of locations that are not in use, and the chosen locations are marked to indicate that they are now in use before the variable or object is made to denote them.

3.5 Proper Tail Recursion

Implementations of Better Scheme are required to be properly tail-recursive. Function calls that occur in certain syntactic contexts defined below are tail calls. A Better Scheme implementation is properly tail-recursive if it supports an unbounded number of active tail calls. A call is active if the called function may still return. Note that this includes calls that may be returned from either by the current continuation or by continuations captured earlier by call/cc that are later invoked. In the absence of captured continuations, calls could return at most once and the active calls would be those that had not yet returned.

Rationale:

Intuitively, no space is needed for an active tail call because the continuation that is used in the tail call has the same semantics as the continuation passed to the function containing the call. Although an improper implementation might use a new continuation in the call, a return to this new continuation would be followed immediately by a return to the continuation passed to the function. A properly tail-recursive implementation returns to that continuation directly.

Proper tail recursion was one of the central ideas in Steele and Sussman's original version of Scheme. Their first Scheme interpreter implemented both functions and actors. Control flow was expressed using actors, which differed from functions in that they passed their results on to another actor instead of returning to a caller. In the terminology of this section, each actor finished with a tail call to another actor.

Steele and Sussman later observed that in their interpreter the code for dealing with actors was identical to that for functions and thus there was no need to include both in the language.

A tail call is a procedure call that occurs in a tail context. Tail contexts are defined inductively. Note that a tail context is always determined with respect to a particular lambda expression.

Certain built-in procedures are also required to perform tail calls. The first argument passed to apply and to call/cc, and the second argument passed to call-with-values, must be called via a tail call. Similarly, eval must evaluate its argument as if it were in tail position within the eval procedure.

In the following example the only tail call is the call to f. None of the calls to g or h are tail calls. The reference to x is in a tail context, but it is not a call and thus is not a tail call.

(lambda () (if (g) (let ((x (h))) x) (and (g) (f))))

Note: Implementations are allowed, but not required, to recognize that some non-tail calls, such as the call to h above, can be evaluated as though they were tail calls. In the example above, the let expression could be compiled as a tail call to h. (The possibility of h returning an unexpected number of values can be ignored, because in that case the effect of the let is explicitly unspecified and implementation-dependent.)
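
To make the distinction concrete, the following sketch contrasts a recursive call that is a tail call with one that is not. It uses standard Scheme syntax and assumes, as the example above suggests, that the branches of an if expression in tail position are themselves tail contexts. The names sum-to and sum-to-non-tail are hypothetical.

```scheme
; The recursive call is the last thing the function does, so it is a
; tail call. A properly tail-recursive implementation runs this loop
; without accumulating pending calls.
(define (sum-to n acc)
  (if (= n 0)
      acc
      (sum-to (- n 1) (+ acc n))))

; Here the recursive call is an operand of +. The addition remains to
; be done after the call returns, so the call is not a tail call and
; each level of recursion stays active.
(define (sum-to-non-tail n)
  (if (= n 0)
      0
      (+ n (sum-to-non-tail (- n 1)))))

(sum-to 100000 0)        ; 5000050000
(sum-to-non-tail 100)    ; 5050
```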

3.6 Currying

Currying is a method of implementing multi-argument functions in languages that allow only first-class, single-argument functions. Essentially, it involves nesting functions to create a single function. Each nested function takes one more argument until the body of the innermost function is reached, at which point there is an execution context in which all the arguments are available. Currying also produces a number of useful effects for the programmer, which is why Better Scheme includes it. Its primary benefit is that it allows a programmer to invoke a function with fewer arguments than expected, thereby creating a new, possibly very useful function. For example, a function add that takes two arguments and adds them could be invoked with a single argument as (add 1) to create a new function that adds one to its argument.
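
The nesting described above can be written out by hand. The sketch below uses standard Scheme, where the nesting must be spelled out explicitly with one lambda per argument; it is an illustration of the idea, not a description of how Better Scheme implements it. The name add-one is hypothetical.

```scheme
; A two-argument add expressed as nested single-argument functions.
(define add
  (lambda (a)
    (lambda (b)
      (+ a b))))       ; innermost body: both a and b are available here

; Supplying only the first argument yields a new function.
(define add-one (add 1))

(add-one 41)           ; 42
((add 2) 3)            ; 5
```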

Better Scheme extends currying concepts to support zero argument and unrestricted functions. For zero argument functions it isn't really possible or interesting to pass fewer than zero arguments. However, for unrestricted functions it is useful to pass a partial list of arguments. A special syntax is provided for this. An unrestricted function call can be left "open" (i.e. still awaiting arguments) by terminating its argument collection improperly with void, as (function arg-1 arg-2 ... arg-n . #void).

While currying is most useful when fewer arguments are passed than expected, in order to make nested functions and multi-argument functions equivalent we must consider what happens when more arguments are passed than expected. In languages which rely on currying, such as the lambda calculus, invocation is usually written are simple




jwalker@cs.oberlin.edu