On S-expressions and the utility of the programming language Scheme in their manipulation
In many cases it would be nice to have a concise and unambiguous notation
for symmetry and other operations involving the structures presented in the
previous chapter. Also, when examining their orbit-profiles it's useful
to have some empirical data before trying to derive mathematical conclusions.
Naturally, such data can be obtained with computer, by programming the
rotations, reflections and other operations as procedures that act on
data structures implementing various manifestations presented in Chapter I.
E.g., a simple-minded aficionado of the object-oriented programming (OOP)
paradigm might devise a separate class for each such manifestation, with
certain inheritance between the classes, e.g. a class of binary trees might
be a subclass of general trees, and non-crossing handshakes and set-partitions
might diverge from a common class of chord-diagrams.
The objects instantiating the said classes would then hide each
its own set of private elements, for whose manipulation specific
methods would be needed.
Clearly this approach is far too baroque, when in the previous chapter
I have already shown that these various interpretations for
Catalan numbers are just different manifestations of the one and
same combinatorial structure. This means that to effect, say a rotation
of Euler's polygon divisions, it's enough to define a certain
operation on binary trees, as long as those two manifestations
are unambiguously bound together in some way, the most natural
of which was explained in the chapter I.
So, instead of tailoring multiple custom data types, a single
universal one should suffice.
It turns out that S-expressions, introduced by John McCarthy
in his programming language Lisp [JMC, 1960] naturally fit the bill.
Being originally just designed for the manipulation of such
data structures, it contains a powerful set of primitives
for their manipulation, which have in turn been inherited
by its dialect Scheme (invented by Guy Steele and Gerald Sussman),
and to a lesser extent also by Prolog, and certain more recent functional
programming languages derived from these.
S-expressions
Definition. There exists various definitions for S-expressions
(see e.g. [Holm], [Sottile]),
however in this paper we adopt a convention that an
internal, "dotted" representation of an S-expression can be any
of the following:
- a symbol. For example, define, pair?, else and + are symbols.
- an integer. For example, 144 and -1 are integers.
- an empty list ( )
or
- an ordered pair of the form (X . Y)
where the element X (traditionally called car) can be any
of the cases (a-d), while the element Y (traditionally called cdr)
can be either the case (c) (an empty list) or the case (d), other such
pair.
This definition is isomorphic with the external, "dotless" representation,
in which an S-expression can be any of the following:
- a symbol.
- an integer.
or
- (<dotfree-sexpr>*)
The last case says that a parenthesis-delimited list of zero or more
other such external representations is also an external, dot-free representation.
From now on, when it's not the question of a single symbol or integer, we shall refer
to these dotless representations of S-expressions as lists.
The isomorphism between these two representations is realized
by the function f which maps from the "internal" to "external"
representation as follows:
| symbol | --> | symbol |
| integer | --> | integer |
| ( ) | --> | ( ) |
| (X . ( )) |
--> |
( f(X) ) |
| (X . Y) |
--> |
a list obtained by inserting f(X) as the first element
to a front of a list obtained with f(Y) |
For example, the function f would map the internal, dotted S-expression
(a . ((( ) . (2 . ( ))) . (b . ( ))))
to external, dotless list
(a (( ) 2) b)
This definition is sufficient for explaining how to define
functions in our minimal subset of Scheme.
Furthermore, we define a subset of symbolless S-expressions. Their
internal, dotted representation is defined that they are either:
- an empty list ( )
or
- an ordered pair of the form (X . Y)
where either element, X or Y can be either an empty list
or other such pair.
When we apply the above map f to this subset, we realize
we have a missing link connecting our two sub-families
of the Catalan clan, represented in the previous section.
Namely, the former definition is equivalent (*) to the
definition of planar binary trees and binary bracketings,
and the corresponding external, dot-free representations are lists
composed of parentheses only, i.e. parenthesizations
surrounded by an extra pair of parentheses.
A brief introduction to the programming language Scheme
Note: Most of what is said below applies also to Lisp,
with just minor changes in the names of some primitives and functions.
We might gloss over some details, as our purpose is just to describe
a subset of the language sufficient for implementing gatomorphisms.
The syntax of the language is itself based on S-expressions:
A program written in Scheme is a set of functions invoking
each other and/or built-in functions, where each function
defined by the programmer is represented as a single
S-expression of the special form, beginning with the
symbol define.
Almost every computation in Scheme is effected
by evaluating a function call,
and each such function call is represented
as a list of length n+1, where n
is the arity of the function invoked.
For example, a function call with three arguments,
like f(a, b, c) is represented as a list
of four elements: (f a b c),
and a nested call like f(g(a, b), c)
in turn represented by a nested list structure (S-expression):
(f (g a b) c).
Here is a list functions and primitives we need to implement
gatomorphisms in Scheme.
- (cons Sexpr1 Sexpr2)
- Returns a new cons cell whose car-side contains Sexpr1 and cdr-side Sexpr2.
- Examples:
(cons 'a 'b) = (a . b)
(cons '() '(() . ()))
= (() . (() . ())) [Result in the internal, dotted-pair format.]
= (() ()) [Result in the external format.]
- (car Pair)
- Returns the car-side of the cons-cell Pair. An error results if car is applied to an empty list ( ).
- Examples:
(car '(a . b)) = a
(car '((a b) c d)) = (a b)
- (cdr Pair)
- Returns the cdr-side of the cons-cell Pair. An error results if cdr is applied to an empty list ( ).
- Examples:
(cdr '(a . b)) = b
(cdr '((a b) c d)) = (c d)
- (list Sexpr1 Sexpr2 ... Sexprn)
- Returns a list constructed from its arguments.
- Examples:
(list) = () [With no arguments gives an empty list.]
(list '(a b)) = ((a b)) [Unary form always surrounds with extra parentheses.]
(list 'a 'b) = (a b)
- (null? Sexpr)
- Returns #t (true) if its argument is an empty list ( ), otherwise #f (false).
- Examples:
(null? (cons 'a 'b)) = #f
(null? (list)) = #t
- (pair? Sexpr)
- Returns #t (true) if its argument is a cons cell, otherwise #f (false).
- Examples:
(pair? (cons 'a 'b)) = #t
(pair? (list)) = #f
- (not Sexpr)
- Returns #t (true) if its argument is #f (false), and for all other values returns #f (false).
- Examples:
(not (pair? (cons 'a 'b))) = #f
(not (pair? (list))) = #t
- Note: As long as we restrict our attention to symbolless S-expressions
(as in the inplementations of gatomorphisms below),
(null? Sexpr) is equivalent to (not (pair? Sexpr))
and vice versa,
(pair? Sexpr) is equivalent to (not (null? Sexpr))
- (if Expr1 Expr2 Expr3)
- Returns the value of Expr2 if Expr1 evaluates to a non-false value, otherwise evaluates Expr3 and returns its value. Note that if always evaluates either Expr2 or Expr3, never both.
- Examples:
(if (null? (list)) (list 'it 'is 'empty) (list 'it 'is 'not 'empty))
= (it is empty)
(if (pair? (list)) (list 'it 'is 'a 'pair) (list 'it 'is 'not 'pair))
= (it is not pair)
- (cond (TestExprA ExprA1 ... ExprAn)
(TestExprB ExprB1 ... ExprBn)
...
(else Expr_else1 ... Expr_elsen)
)
- If TestExprA evaluates to a non-false value, executes expressions
ExprA1 ... ExprAn listed after it, returning the value
of the last one as the value of whole cond-expression, without
evaluating any expressions.
Otherwise, checks whether TestExprB evaluates to a non-false value,
and if so, executes expressions ExprB1 ... ExprBn listed
after it, again returning the value of the last one as the value of
whole cond-expression. Et cetera.
If none of the test-expressions evaluate to a non-false value before else,
then the expressions Expr_else1 ... Expr_elsen are evaluated,
with the value of the last one returned as the value of the whole cond-expression.
- Examples:
(cond ((null? (list)) (list 'it 'is 'empty))
(else (list 'it 'is 'not 'empty))
)
-> (it is empty)
(cond ((pair? (list)) (list 'it 'is 'a 'pair))
(else (list 'it 'is 'not 'pair))
)
-> (it is not pair)
- Note: As can seen from above examples
(if Expr1 Expr2 Expr3)
is equivalent to
(cond (Expr1 Expr2) (else Expr3))
- (define (funcname arg1 arg2 ... argn)
expr1
...
exprn
)
- Defines a new function with name funcname with formal argument names
arg1 arg2 ... argn, which when invoked, will bind the values
of the actual arguments to the formal arguments, and then execute each of
the expressions expr1 .. exprn in turn, returning the value
of the last one as the result of the function invocation.
- Examples:
(define (xcons x y)
(cons y x)
)
(xcons 'a 'b) -> (b . a) [We defined a "crossed" cons.]
- (let ((var1 init1)
...
(varn initn)
)
expr1
...
exprn
)
- Evaluates the expressions init1 ... initn and bounds variables
var1 ... varn temporarily to their values,
then executes each of the expressions expr1 .. exprn in turn,
returning the value of the last one as the result of the whole let-expression.
Note that this is the standard way to define temporary variables inside
function definitions.
- (append Sexpr1 Sexpr2 ... Sexprn)
- Returns a list concatenated together from its arguments.
- Examples:
(append) = () [With no arguments gives an empty list.]
(append '(a b) '(c)) = (a b c)
(append '() '(a b) '() '(c d) '()) = (a b c d) [Empty lists are silently ignored.]
- (set-car! Pair Sexpr)
- Modifies physically the contents of cons cell Pair, such that
its car-side is overwritten with Sexpr. An error results if the first argument is not a cons cell.
- (set-cdr! Pair Sexpr)
- Modifies physically the contents of cons cell Pair, such that
its cdr-side is overwritten with Sexpr. An error results if the first argument is not a cons cell.
Difference between constructive and destructive implemetations of gatomorphisms
If we exclude set-car! and set-cdr! from the
set of primitives listed above, we can only define gatomorphisms
and other functions that leave intact the physical structure of
the S-expression(s) given as their argument(s).
In case of gatomorphisms, if it is not a question about
a mere identity function, then at least on some values
of its argument it must return an S-expression with different
structure than the argument has. Because by its definition
is a bijection on the set of structures of the same size,
merely invoking the accessor-functions car and/or cdr
is impossible, as then the result's size (in cons cell nodes)
would be less than that of the argument.
So to effect changes which return an S-expression of the same size
but of different structure, it must invoke the function cons,
or some other functions like append or list which calls
cons implicitly.
For this reason these are called constructive implementations
of gatomorphisms.
On the other hand, if an implementation of a gatomorphism invokes
set-car! and/or set-cdr! or any other function
invoking them, it is called destructive.
This is because such functions make direct modifications
to their argument's physical structure, and thus destroy
information about their original structure.
Traditionally, use of such destructive functions has been
considered extremely dangerous, and thus Scheme has adopted
the convention of suffixing the names of such functions
with an exclamation mark (!). For example, there is
a function called append! which is a destructive
version of the function append described above.
However, it turns out that in our application the use
of strictly destructive implementations of gatomorphisms
(or functions of suspected of being ones)
have certain merits, when they are subjected to mathematical
analysis. Namely, the bijectivity of such a function is much
easier to proive, if we know that it does neither "lose"
any existing cons cells, nor allocate any new ones
with constructive operations. More about this in the last
section of this paper.
For most of the gatomorphisms represented below we thus
give both constructive and destructive definition. We follow
the Scheme practice of suffixing the names of the latter
with '!'.
For example, here is a destructive version of
gatomorphisms A069770 whose constructive version
was defined above:
(define (gma069770! s)
(if (null? s)
s
(let ((org-car (car s)))
(set-car! s (cdr s))
(set-cdr! s org-car)
s
)
)
)
Note that we first check whether the argument is
an empty S-expression ( ), which is returned back intact,
while for real S-expressions consisting of cons cells we first
have to store the original contents of the car-side of the
argument to a temporary variable org-car, as otherwise
its contents would be lost immediately after set-car! has
overwritten it with the cdr-side of the argument.
The destructive version of A057163 can then be written
simply as:
(define (gma057163! s)
(cond ((pair? s)
(gma069770! s)
(gma057163! (car s))
(gma057163! (cdr s))
)
)
s
)
i.e. we apply A069770 to every cons cell we encounter
(i.e. swap their sides), and recurse down to both car- and cdr-side,
each branch as long as we encounter ( ).