Common Lisp Basics

The material in this chapter will serve as an introduction to Common Lisp. I have attempted to make this book a self-contained resource for learning Common Lisp and to provide code examples to perform common tasks. If you already know Common Lisp and bought this book for the code examples in later chapters, then you can probably skip this chapter.

For working through this chapter we will be using the interactive shell, or repl, built into SBCL and other Common Lisp systems. For this chapter it is sufficient for you to download and install SBCL. Please install SBCL right now, if you have not already done so.

Getting Started with SBCL

When we start SBCL, we see an introductory message and then an input prompt. We will start with a short tutorial, walking you through a session using the SBCL repl (other Common Lisp systems are very similar). A repl is an interactive console where you type expressions and see the results of evaluating them. You can paste a large block of code into the repl, use the load function to load Lisp code from a file, call functions to test them, and so on. Assuming that SBCL is installed on your system, start SBCL by running the sbcl program:

 1 % sbcl
 2 (running SBCL from: /Users/markw/sbcl)
 3 This is SBCL 2.0.2, an implementation of ANSI Common Lisp.
 4 More information about SBCL is available at <http://www.sbcl.org/>.
 5 
 6 SBCL is free software, provided as is, with absolutely no warranty.
 7 It is mostly in the public domain; some portions are provided under
 8 BSD-style licenses.  See the CREDITS and COPYING files in the
 9 distribution for more information.
10 
11 * (defvar x 1.0)
12 
13 X
14 * x
15 
16 1.0
17 * (+ x 1)
18 
19 2.0
20 * x
21 
22 1.0
23 * (setq x (+ x 1))
24 
25 2.0
26 * x
27 
28 2.0
29 * (setq x "the dog chased the cat")
30 
31 "the dog chased the cat"
32 * x
33 
34 "the dog chased the cat"
35 * (quit)

We started by defining a new variable x in line 11. Notice how the value of the defvar macro is the symbol that is defined. The repl prints X in upper case because the Lisp reader converts symbol names to upper case by default (we will look at the exception later).

In Lisp, a variable can reference any data type. We start by assigning a floating point value to the variable x, then use the + function to add 1 to x in line 17 (which does not change x, as line 20 shows). We use the setq special operator to change the value of x in lines 23 and 29, first to another floating point value and finally to a string value. One thing that you will have noticed: the function name always comes first, followed by the arguments to the function. Also, parentheses are used to delimit expressions.
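
To make the prefix notation concrete, here is a small sketch that is not part of the session above. The variable name *price* is just an assumption for illustration; the point is that nested expressions are evaluated from the inside out:

```lisp
;; Prefix notation: the operator comes first, followed by its arguments.
(defvar *price* 10.0)          ; hypothetical variable for this sketch

(+ 1 2 3)                      ; + accepts any number of arguments: 6
(* 2 (+ 3 4))                  ; the inner (+ 3 4) is evaluated first: 14
(setq *price* (* *price* 2))   ; *price* is now 20.0
```

You can paste these forms into the repl one at a time to see each value.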

I learned to program Lisp in 1976 and my professor half-jokingly told us that Lisp was an acronym for “Lots-of Irritating Superfluous Parentheses.” There may be some truth in this when you are just starting with Lisp programming, but you will quickly get used to the parentheses, especially if you use an editor like Emacs that automatically indents Lisp code for you and highlights the opening parenthesis for every closing parenthesis that you type. Many other editors support coding in Lisp, but I personally use Emacs or sometimes VSCode (with Common Lisp plugins) to edit Lisp code.

Before you proceed to the next chapter, please take the time to install SBCL on your computer and try typing some expressions into the Lisp listener. If you get errors, or want to quit, try using the quit function:

* (+ 1 2 3 4)

10
* (quit)
Bye.

If you get an error, you can enter help to get options for handling it. When I get an error and have a good idea of what caused it, I just enter :a to abort out of the error.

As we discussed in the introduction, there are many different Lisp programming environments that you can choose from. I recommend a free set of tools: Emacs, Quicklisp, Slime, and SBCL. Emacs is a fine text editor that is extensible to work well with many programming languages and document types (e.g., HTML and XML). Slime is an Emacs extension package that greatly facilitates Lisp development. SBCL is a robust Common Lisp compiler and runtime system that is often used in production.

We will cover the Quicklisp package manager and using Quicklisp to set up Slime and Emacs in a later chapter.

I will not spend much time covering the use of Emacs as a text editor in this book, since you can try most of the example code snippets in the book text by copying and pasting them into an SBCL repl and by loading the book example source files directly into a repl. If you already use Emacs, then I recommend that you set up Slime sooner rather than later and start using it for development. If you are not already an Emacs user and do not mind spending the effort to learn Emacs, then search the web first for an Emacs tutorial. That said, you will easily be able to use the example code from this book using any text editor you like with an SBCL repl. I don’t use the vi or vim editors, but if vi is your weapon of choice for editing text, then a web search for “common lisp vi vim repl” should get you going for developing Common Lisp code with vi or vim. If you are not already an Emacs or vi user, then I recommend using VSCode with a Common Lisp plugin.

Here, we will assume that under Windows, Unix, Linux, or Mac OS X you will use one command window to run SBCL and a separate editor that can edit plain text files.

Making the repl Nicer using rlwrap

While reading the last section you (hopefully!) played with the SBCL interactive repl. If you haven’t played with the repl, I won’t get too judgmental except to say that if you do not play with the examples as you read you will not get the full benefit from this book.

Did you notice that the backspace key does not work in the SBCL repl? The way to fix this is to install the GNU rlwrap utility. On OS X, assuming that you have homebrew installed, install rlwrap with:

brew install rlwrap

If you are running Ubuntu Linux, install rlwrap using:

sudo apt-get install rlwrap

You can then create an alias for bash or zsh using something like the following to define a command rsbcl:

alias rsbcl='rlwrap sbcl'

This is fine, just remember to run sbcl if you don’t need rlwrap command line editing or run rsbcl when you do need command line editing. That said, I find that I always want to run SBCL with command line editing, so I redefine sbcl on my computers using:

->  ~  which sbcl
/Users/markw/sbcl/sbcl
->  ~  alias sbcl='rlwrap /Users/markw/sbcl/sbcl'

This alias is different on my laptops and servers, since I don’t usually install SBCL in the default installation directory. For each of my computers, I add an appropriate alias in my .zshrc file (if I am running zsh) or my .bashrc file (if I am running bash).

The Basics of Lisp Programming

Although we will use SBCL in this book, any Common Lisp environment will do fine. In previous sections, we saw the top-level Lisp prompt and how we could type any expression that would be evaluated:

 1 * 1
 2 1
 3 * 3.14159
 4 3.14159
 5 * "the dog bit the cat"
 6 "the dog bit the cat"
 7 * (defun my-add-one (x)
 8 (+ x 1))
 9 MY-ADD-ONE
10 * (my-add-one -10)
11 -9

Notice that when we defined the function my-add-one in lines 7 and 8, we split the definition over two lines, and on line 8 you don’t see the “*” prompt from SBCL. This lets you know that you have not yet entered a complete expression. The top level Lisp evaluator counts parentheses and considers a form to be complete when the number of closing parentheses equals the number of opening parentheses. I tend to count in my head, adding one for every opening parenthesis and subtracting one for every closing parenthesis; when I get back down to zero, the expression is complete. When we evaluate a number (or a variable), there are no parentheses, so evaluation proceeds when we hit a new line (or carriage return).

By default, the repl evaluates any form that you enter. The reader macro ' (a single quote character) prevents the evaluation of an expression. You can use either the ' character or the quote special operator:

* (+ 1 2)
3
* '(+ 1 2)
(+ 1 2)
* (quote (+ 1 2))
(+ 1 2)
*
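
As a small sketch of why quoting is useful, a quoted form is ordinary list data that we can take apart. The functions first, rest, and length used here are standard Common Lisp functions that are introduced later in this chapter:

```lisp
;; A quoted form is not evaluated; it is just a list of three elements.
(first '(+ 1 2))     ; the symbol +
(rest '(+ 1 2))      ; the list (1 2)
(length '(+ 1 2))    ; 3

;; Without the quote, the same form is evaluated as a function call.
(+ 1 2)              ; 3
```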

Lisp supports both global and local variables. Global variables can be declared using defvar:

* (defvar *x* "cat")
*X*
* *x*
"cat"
* (setq *x* "dog")
"dog"
* *x*
"dog"
* (setq *x* 3.14159)
3.14159
* *x*
3.14159

One thing to be careful of when defining global variables with defvar: the declared global variable is dynamically scoped. We will discuss dynamic versus lexical scoping later, but for now a warning: if you define a global variable avoid redefining the same variable name inside functions. Lisp programmers usually use a global variable naming convention of beginning and ending dynamically scoped global variables with the * character. If you follow this naming convention and also do not use the * character in local variable names, you will stay out of trouble. For convenience, I do not always follow this convention in short examples in this book.
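
The warning above can be illustrated with a minimal sketch. The names *level*, current-level, and call-with-rebinding are hypothetical and exist only for this illustration. Rebinding a defvar variable with let affects every function called while the let is active, which is what dynamic scoping means:

```lisp
(defvar *level* 0)        ; hypothetical dynamically scoped global variable

(defun current-level ()
  *level*)

(defun call-with-rebinding ()
  ;; This let creates a temporary dynamic binding of *level* that is
  ;; visible to every function called while the let is active.
  (let ((*level* 99))
    (current-level)))

(call-with-rebinding)     ; 99: current-level sees the temporary binding
(current-level)           ; 0: the global value is back once the let exits
```

If *level* were an ordinary lexical variable, current-level could not see the binding made inside call-with-rebinding. This is why the naming convention with * characters is helpful: it makes such rebinding visible at a glance.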

Lisp variables have no type. Rather, values assigned to variables have a type. In this last example, the variable *x* was set to a string, then to a floating-point number. Lisp types support inheritance and can be thought of as a hierarchical tree with the type t at the top. (Actually, the type hierarchy is a DAG, but we can ignore that for now.) Common Lisp also has powerful object oriented programming facilities in the Common Lisp Object System (CLOS) that we will discuss in a later chapter.

Here is a partial list of types (note that indentation denotes being a subtype of the preceding type):

t  [top level type (all other types are a sub-type)]
     sequence
          list
          vector  [also a sub-type of array]
               string
     array
     number
          float
          rational
               integer
               ratio
          complex
     character
     symbol
     structure
     function
     hash-table

We can use the typep function to test the type of the value of any variable or expression, or use type-of to get type information for any value:

* (setq x '(1 2 3))
(1 2 3)
* (typep x 'list)
T
* (typep x 'sequence)
T
* (typep x 'number)
NIL
* (typep (+ 1 2 3) 'number)
T
* (type-of 3.14159)
SINGLE-FLOAT
* (type-of "the dog ran quickly")
(SIMPLE-ARRAY CHARACTER (19))
* (type-of 100193)
(INTEGER 0 4611686018427387903)
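
Type tests are often used to choose what to do with a value. The following is a minimal sketch using the standard typecase macro; the function name describe-kind and the keywords it returns are assumptions made for this illustration:

```lisp
(defun describe-kind (value)
  "Return a keyword naming a broad category for VALUE."
  (typecase value
    (integer :integer)        ; checked before the more general number clause
    (number  :other-number)
    (string  :string)
    (list    :list)
    (t       :something-else)))

(describe-kind 42)            ; :INTEGER
(describe-kind 3.14159)       ; :OTHER-NUMBER
(describe-kind "cat")         ; :STRING
(describe-kind '(1 2 3))      ; :LIST
(describe-kind #\a)           ; :SOMETHING-ELSE

;; subtypep asks about the type hierarchy itself.
(subtypep 'integer 'number)   ; first returned value is T
(subtypep 'string 'vector)    ; first returned value is T
```

The clauses of typecase are tried in order, so more specific types should be listed before more general ones.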

A useful feature of all ANSI standard Common Lisp implementations’ top-level listener is that it sets * to the value of the last expression evaluated. For example:

1 * (+ 1 2 3 4 5)
2 15
3 * *
4 15
5 * (setq x *)
6 15
7 * x
8 15

This example may be slightly confusing because * is also the prompt character in the SBCL repl that indicates that you can enter a new expression for evaluation. For example, in line 3 the first * character is the repl prompt and the second * is what we type in to see the value of the previous expression that we typed into the repl.

Frequently, when you are interactively testing new code, you will call a function that you just wrote with test arguments; it is useful to save intermediate results for later testing. It is the ability to create complex data structures and then experiment with code that uses or changes these data structures that makes Lisp programming environments so effective.

Common Lisp is a lexically scoped language. This means that variable declarations and function definitions can be nested and that the same variable names can be used in nested let forms. When a variable is used, the current let form is searched for a definition of that variable, and if it is not found, then the next outer let form is searched. Of course, this search for the correct declaration of a variable is done at compile time, so there need not be extra runtime overhead. We should not nest defun forms inside each other or inside let expressions. Instead, we use the special operators flet and labels to define functions inside a scoped environment. Functions defined inside a labels form can be recursive, while functions defined inside a flet form cannot call themselves. Consider the following example in the file nested.lisp (all example files are in the src directory):

 1 (flet ((add-one (x)
 2          (+ x 1))
 3        (add-two (x)
 4          (+ x 2)))
 5   (format t "redefined variables: ~A  ~A~%" (add-one 100) (add-two 100)))
 6 
 7 (let ((a 3.14))
 8   (defun test2 (x) ; this works, but don't do it!
 9     (print x))
10   (test2 a))
11 
12 (test2 50)
13 
14 (let ((x 1)
15       (y 2))
16   ;; properly define a test function nested inside a let statement:
17   (flet ((test (a b)
18            (let ((z (+ a b)))
19              ;; define a helper function nested inside a let/function/let:
20              (flet ((nested-function (a)
21                       (+ a a)))
22                (nested-function z)))))
23     ;; call nested function 'test':
24     (format t "test result is ~A~%" (test x y))))
25 
26 (let ((z 10))
27   (labels ((test-recursion (a)
28              (format t "test-recursion ~A~%" (+ a z))
29              (if (> a 0)
30                  (test-recursion (- a 1)))))
31     (test-recursion 5)))

We define a top level flet form in lines 1-5 that defines two local functions, add-one and add-two, and then calls each of them in the body of the flet. For many years I used defun forms nested inside let expressions to define local functions, but I now try to avoid doing this. Functions defined with defun have global visibility, so they are not hidden in the local context where they are defined. The example of a nested defun in lines 7-12 shows that the function test2 is visible globally within the current package: the call in line 12 is outside the let expression.

Functions defined inside a flet form have access to variables defined in the outer scope containing the flet (this also applies to labels). In lines 14-24, the local variables x and y defined in the let expression are in scope inside the functions test and nested-function, although this example passes their values as arguments. The labels example in lines 26-31 uses the outer variable z directly.

The final example in lines 26-31 shows a recursive function defined inside a labels special form.
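
As one more sketch of a recursive local function, here is a helper defined with labels that both calls itself and uses a variable from the enclosing function. The names sum-to and helper are hypothetical and are not in the book's example files:

```lisp
(defun sum-to (n)
  "Return 1 + 2 + ... + N using a local recursive helper."
  (labels ((helper (i total)
             ;; helper can call itself because it is defined with labels,
             ;; and it can see N from the enclosing function.
             (if (> i n)
                 total
                 (helper (+ i 1) (+ total i)))))
    (helper 1 0)))

(sum-to 10)   ; 55
```

Replacing labels with flet here would not work, because inside an flet definition the name helper does not refer to the function being defined.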

Assuming that we started SBCL in the src directory, we can use the load function to evaluate the contents of the file nested.lisp in the sub-directory code_snippets_for_book:

* (load "./code_snippets_for_book/nested.lisp")
redefined variables: 101  102

3.14 
50 test result is 6
test-recursion 15
test-recursion 14
test-recursion 13
test-recursion 12
test-recursion 11
test-recursion 10
T
*

The function load returned a value of t (prints in upper case as T) after successfully loading the file.

We will use Common Lisp vectors and arrays frequently in later chapters, but will also briefly introduce them here. A singly dimensioned array is also called a vector. Although there are often more efficient functions for handling vectors, we will just look at generic functions that handle any type of array, including vectors. Common Lisp provides support for functions with the same name that take different argument types; we will discuss this in some detail in the later chapter on CLOS. We will start by defining three vectors v1, v2, and v3:

1 * (setq v1 (make-array '(3)))
2 #(NIL NIL NIL)
3 * (setq v2 (make-array '(4) :initial-element "lisp is good"))
4 #("lisp is good" "lisp is good" "lisp is good" "lisp is good")
5 * (setq v3 #(1 2 3 4 "cat" '(99 100)))
6 #(1 2 3 4 "cat" '(99 100))

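In line 1, we are defining a one-dimensional array, or vector, with three elements. Since we do not supply initial values, the printed contents are implementation-dependent; current SBCL versions, for example, typically print #(0 0 0) instead. In line 3 we specify the initial value assigned to each element of the array v2. In line 5 I use the form for specifying array literals using the special character #. Note that the elements of an array literal are not evaluated, so the last element of v3 is the unevaluated form '(99 100), that is, (quote (99 100)). The function aref can be used to access any element in an array: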

* (aref v3 3)
4
* (aref v3 5)
'(99 100)
*

Notice how indexing of arrays is zero-based; that is, indices start at zero for the first element of a sequence. Also notice that array elements can be any Lisp data type. So far, we have used the special operator setq to set the value of a variable. Common Lisp has a generalized version of setq, the macro setf, that can set a value stored in a variable, list, array, hash table, etc. You can use setf instead of setq in all cases, but not vice-versa. Here is a simple example:

* v1
#(NIL NIL NIL)
* (setf (aref v1 1) "this is a test")
"this is a test"
* v1
#(NIL "this is a test" NIL)
*
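
To show how general setf is, here is a minimal sketch that uses it on three different kinds of places: a list element, a vector element, and a hash table entry (hash tables are covered later in this chapter). The function name setf-examples and the local variable names are assumptions for this illustration:

```lisp
(defun setf-examples ()
  "Use setf on a list element, a vector element, and a hash table entry."
  (let ((lst (list 1 2 3))
        (vec (vector 'a 'b 'c))
        (table (make-hash-table)))
    (setf (first lst) 100)                ; replace the first list element
    (setf (aref vec 2) 'z)                ; replace the third vector element
    (setf (gethash 'key table) "value")   ; add a hash table entry
    (list lst vec (gethash 'key table))))

(setf-examples)   ; ((100 2 3) #(A B Z) "value")
```

In each case the first argument to setf is the same expression that we would use to read the value.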

When writing new code or doing quick programming experiments, it is often easiest (i.e., quickest to program) to use lists to build interesting data structures. However, as programs mature, it is common to modify them to use more efficient (at runtime) data structures like arrays and hash tables.

Symbols

We will discuss symbols in more detail in the chapter on Common Lisp packages. For now, it is enough for you to understand that symbols can be names that refer to variables. For example:

* (defvar *cat* "bowser")
*CAT*
* *cat*
"bowser"
* (defvar *l* (list *cat*))
*L*
* *l*
("bowser")
*

Note that the first defvar returns the defined symbol as its value. Symbols are almost always converted to upper case. An exception to this “upper case rule” is when we define symbols that may contain white space using vertical bar characters:

* (defvar |a symbol with Space Characters| 3.14159)
|a symbol with Space Characters|
* |a symbol with Space Characters|
3.14159
*
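
The upper case rule can be checked directly with the standard function symbol-name, which returns the name of a symbol as a string. This sketch assumes the default reader settings, under which unescaped symbol names are converted to upper case:

```lisp
(symbol-name 'cat)          ; "CAT": the reader converted the name to upper case
(eq 'cat 'CAT)              ; T: both spellings read as the same symbol
(symbol-name '|Cat Food|)   ; "Cat Food": vertical bars preserve case and spaces
(eq 'cat '|cat|)            ; NIL: |cat| keeps its lower case name
```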

Operations on Lists

Lists are a fundamental data structure of Common Lisp. In this section, we will look at some of the more commonly used functions that operate on lists. All of the functions described in this section have something in common: they do not modify their arguments.

In Lisp, a cons cell is a data structure containing two pointers. Usually, the first pointer in a cons cell will point to the first element in a list and the second pointer will point to another cons representing the start of the rest of the original list.

The function cons takes two arguments that it stores in the two pointers of a new cons data structure. For example:

* (cons 1 2)
(1 . 2)
* (cons 1 '(2 3 4))
(1 2 3 4)
*

The first form evaluates to a cons data structure while the second evaluates to a cons data structure that is also a proper list. The difference is that in the second case the second pointer of the freshly created cons data structure points to another cons cell.
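
The relationship between cons cells and lists can be sketched as follows. The functions car, cdr, and list are standard and are described later in this section:

```lisp
;; A proper list is a chain of cons cells that ends with nil.
(cons 1 (cons 2 (cons 3 nil)))                        ; (1 2 3)
(equal (cons 1 (cons 2 (cons 3 nil))) (list 1 2 3))   ; T

;; When the second pointer is not a list, we get a dotted pair.
(car (cons 1 2))    ; 1, the first pointer
(cdr (cons 1 2))    ; 2, the second pointer
```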

First, we will declare two global variables l1 and l2 that we will use in our examples. The list l1 contains five elements and the list l2 contains four elements:

* (defvar l1 '(1 2 (3) 4 (5 6)))
L1
* (length l1)

5
* (defvar l2 '(the "dog" calculated 3.14159))
L2
* l1
(1 2 (3) 4 (5 6))
* l2
(THE "dog" CALCULATED 3.14159)
*

You can also use the function list to create a new list; the arguments passed to function list are the elements of the created list:

* (list 1 2 3 'cat "dog")
(1 2 3 CAT "dog")
*

The function car returns the first element of a list and the function cdr returns a list with its first element removed (but does not modify its argument):

* (car l1)
1
* (cdr l1)
(2 (3) 4 (5 6))
*

Combinations of car and cdr calls can be used to extract any element of a list:

* (car (cdr l1))
2
* (cadr l1)
2
*

Notice that we can combine calls to car and cdr into a single function call, in this case the function cadr. Common Lisp defines all functions of the form cXXr, cXXXr, and cXXXXr where X can be either a or d.

Suppose that we want to extract the value 5 from the nested list l1. Some experimentation with using combinations of car and cdr gets the job done:

* l1
(1 2 (3) 4 (5 6))
* (cadr l1)
2
* (caddr l1)
(3)
* (car (caddr l1))
3
* (caar (last l1))
5
* (caar (cddddr l1))

5
*

The function last returns the last cons cell of a list (i.e., a list containing only the last element):

* (last l1)
((5 6))
*

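Common Lisp supplies alternative functions to car and cdr that you might find more readable: first, second, third, fourth, and rest. Note that defvar does not change a variable that already has a value, so if *x* is still defined from the earlier examples in your repl session, either restart SBCL or use setq instead of defvar here. Here are some examples: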

* (defvar *x* '(1 2 3 4 5))

*X*
* (first *x*)

1
* (rest *x*)

(2 3 4 5)
* (second *x*)

2
* (third *x*)

3
* (fourth *x*)

4

The function nth takes two arguments: an index of a top-level list element and a list. The first index argument is zero based:

* l1
(1 2 (3) 4 (5 6))
* (nth 0 l1)
1
* (nth 1 l1)
2
* (nth 2 l1)
(3)
*

The function cons adds an element to the beginning of a list and returns as its value a new list (it does not modify its arguments). An element added to the beginning of a list can be any Lisp data type, including another list:

* (cons 'first l1)
(FIRST 1 2 (3) 4 (5 6))
* (cons '(1 2 3) '(11 22 33))
((1 2 3) 11 22 33)
*

The function append takes two lists as arguments and returns as its value the two lists appended together:

* l1
(1 2 (3) 4 (5 6))
* l2
(THE "dog" CALCULATED 3.14159)
* (append l1 l2)
(1 2 (3) 4 (5 6) THE "dog" CALCULATED 3.14159)
* (append '(first) l1)
(FIRST 1 2 (3) 4 (5 6))
*

A frequent error that beginning Lisp programmers make is not understanding shared structures in lists. Consider the following example where we generate a list y that refers to the same list x three times:

* (setq x '(0 0 0 0))
(0 0 0 0)
* (setq y (list x x x))
((0 0 0 0) (0 0 0 0) (0 0 0 0))
* (setf (nth 2 (nth 1 y)) 'x)
X
* x
(0 0 X 0)
* y
((0 0 X 0) (0 0 X 0) (0 0 X 0))
* (setq z '((0 0 0 0) (0 0 0 0) (0 0 0 0)))
((0 0 0 0) (0 0 0 0) (0 0 0 0))
* (setf (nth 2 (nth 1 z)) 'x)
X
* z
((0 0 0 0) (0 0 X 0) (0 0 0 0))
*

When we change the shared structure referenced by the variable x that change is reflected three times in the list y. When we create the list stored in the variable z we are not using a shared structure.
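
One way to avoid unintended sharing is to copy a list before reusing it. The following is a minimal sketch using the standard function copy-list; the function name shared-versus-copied and its local variable names are hypothetical:

```lisp
(defun shared-versus-copied ()
  "Contrast three references to one list with three independent copies."
  (let* ((x (list 0 0 0 0))
         (shared (list x x x))                      ; the same list three times
         (copied (list (copy-list x) (copy-list x) (copy-list x))))
    (setf (nth 2 (nth 1 shared)) 'x)   ; shows up in all three sub-lists
    (setf (nth 2 (nth 1 copied)) 'y)   ; changes only the middle sub-list
    (list shared copied)))

(shared-versus-copied)
;; (((0 0 X 0) (0 0 X 0) (0 0 X 0)) ((0 0 0 0) (0 0 Y 0) (0 0 0 0)))
```

Note that copy-list copies only the top level of a list, so nested sub-lists inside the copied list would still be shared.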

Using Arrays and Vectors

Using lists is easy, but the time spent accessing a list element is proportional to its position in the list. Arrays and vectors are more efficient at runtime than long lists because list elements are kept on a linked list that must be traversed from the front. Accessing any element of a short list is fast, but for sequences with thousands of elements, it is faster to use vectors and arrays.

By default, elements of arrays and vectors can be any Lisp data type. There are options when creating arrays to tell the Common Lisp compiler that a given array or vector will only contain a single data type (e.g., floating point numbers) but we will not use these options in this book.

Vectors are a specialization of arrays; vectors are arrays that only have one dimension. For efficiency, there are functions that only operate on vectors, but since array functions also work on vectors, we will concentrate on arrays. In the next section, we will look at character strings that are a specialization of vectors.

We could use the generalized make-sequence function to make a singly dimensioned array (i.e., a vector). Restart SBCL and try:

* (defvar x (make-sequence 'vector 5 :initial-element 0))
X
* x
#(0 0 0 0 0)
*

In this example, notice the print format for vectors, which looks like a list with a preceding # character. As seen in the last section, we use the function make-array to create arrays:

* (defvar y (make-array '(2 3) :initial-element 1))
Y
* y
#2A((1 1 1) (1 1 1))
*

Notice the print format of an array: it looks like a list preceded by a # character, the integer number of dimensions, and the letter A.

Instead of using make-sequence to create vectors, we can pass an integer as the first argument of make-array instead of a list of dimension values. We can also create a vector by using the function vector and providing the vector contents as arguments:

* (make-array 10)
#(NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL)
* (vector 1 2 3 'cat)
#(1 2 3 CAT)
*

The function aref is used to access array elements. The first argument is an array and the remaining argument(s) are array indices. For example:

* x
#(0 0 0 0 0)
* (aref x 2)
0
* (setf (aref x 2) "parrot")
"parrot"
* x
#(0 0 "parrot" 0 0)
* (aref x 2)
"parrot"
* y
#2A((1 1 1) (1 1 1))
* (setf (aref y 1 2) 3.14159)
3.14159
* y
#2A((1 1 1) (1 1 3.14159))
*
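
As a sketch of filling a two-dimensional array in a loop, the following hypothetical function make-multiplication-table uses the standard dotimes macro (not yet covered in this chapter) to visit every pair of indices, and array-dimensions to inspect the result:

```lisp
(defun make-multiplication-table (rows cols)
  "Return a ROWS x COLS array whose element (i, j) is (i+1) * (j+1)."
  (let ((table (make-array (list rows cols) :initial-element 0)))
    (dotimes (i rows)                  ; i runs from 0 to rows - 1
      (dotimes (j cols)                ; j runs from 0 to cols - 1
        (setf (aref table i j) (* (+ i 1) (+ j 1)))))
    table))

(make-multiplication-table 2 3)                      ; #2A((1 2 3) (2 4 6))
(array-dimensions (make-multiplication-table 2 3))   ; (2 3)
```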

Using Strings

It is likely that even your first Lisp programs will involve the use of character strings. In this section, we will cover the basics: creating strings, concatenating strings to create new strings, searching for substrings in a string, and extracting substrings from longer strings. The string functions that we will look at here do not modify their arguments; rather, they return new strings as values. For efficiency, Common Lisp does include destructive string functions that do modify their arguments but we will not discuss these destructive functions here.

We saw earlier that a string is a type of vector, which in turn is a type of array (which in turn is a type of sequence). A full coverage of the Common Lisp type system is outside the scope of this tutorial introduction to Common Lisp; a very good treatment of Common Lisp types is in Guy Steele’s “Common Lisp, The Language” which is available both in print and for free on the web. Many of the built in functions for handling strings are actually more general because they are defined for the type sequence. The Common Lisp Hyperspec is another great free resource that you can find on the web. I suggest that you download an HTML version of Guy Steele’s excellent reference book and the Common Lisp Hyperspec and keep both on your computer. If you continue using Common Lisp, eventually you will want to read all of Steele’s book and use the Hyperspec for reference.

The following text was captured from input and output from a Common Lisp repl. First, we will declare two global variables s1 and space that contain string values:

* (defvar s1 "the cat ran up the tree")
S1
* (defvar space " ")
SPACE
*

One of the most common operations on strings is to concatenate two or more strings into a new string:

* (concatenate 'string s1 space "up the tree")
"the cat ran up the tree up the tree"
*

Notice that the first argument of the function concatenate is the type of the sequence that the function should return; in this case, we want a string. Another common string operation is searching for a substring:

* (search "ran" s1)
8
* (search "zzzz" s1)
NIL
*

If the search string (first argument to function search) is not found, function search returns nil, otherwise search returns an index into the second argument string. Function search takes several optional keyword arguments (see the next chapter for a discussion of keyword arguments):

  (search search-string a-longer-string :from-end :test
                                        :test-not :key
                                        :start1 :start2
                                        :end1 :end2)

For our discussion, we will just use the keyword argument :start2 for specifying the starting search index in the second argument string and the :from-end flag to specify that search should start at the end of the second argument string and proceed backwards to the beginning of the string:

* (search " " s1)
3
* (search " " s1 :start2 5)
7
* (search " " s1 :from-end t)
18
*

The sequence function subseq can be used for strings to extract a substring from a longer string:

* (subseq s1 8)
"ran up the tree"
*

Here, the second argument specifies the starting index; the substring from the starting index to the end of the string is returned. An optional third index argument specifies one greater than the last character index that you want to extract:

* (subseq s1 8 11)
"ran"
*
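
The functions search and subseq are often used together. As a minimal sketch, the hypothetical function first-word below returns the text before the first space in a string, or the whole string if it contains no space:

```lisp
(defun first-word (text)
  "Return the part of TEXT before the first space, or TEXT if there is none."
  (let ((space-index (search " " text)))
    (if space-index
        (subseq text 0 space-index)   ; characters 0 up to, not including, the space
        text)))

(first-word "the cat ran up the tree")   ; "the"
(first-word "tree")                      ; "tree"
```

Testing the result of search before calling subseq matters here, because search returns nil when the substring is not found.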

It is frequently useful to remove white space (or other) characters from the beginning or end of a string:

* (string-trim '(#\space #\z #\a) " a boy said pez")
"boy said pe"
*

The character #\space is the space character. Other common characters that are trimmed are #\tab and #\newline. There are also utility functions for making strings upper or lower case:

* (string-upcase "The dog bit the cat.")
"THE DOG BIT THE CAT."
* (string-downcase "The boy said WOW!")
"the boy said wow!"
*

We have not yet discussed equality. The function eq returns true if its two arguments are the same object in memory. The function eql returns true if the arguments are the same object in memory, or if they are numbers of the same type with the same value, or the same character. The function equal is more lenient: roughly speaking, it returns true if its two arguments print the same. More formally, equal returns true for two conses if their cars are equal and their cdrs are equal, applied recursively. An example will make this clearer:

* (defvar x '(1 2 3))
X
* (defvar y '(1 2 3))
Y
* (eql x y)
NIL
* (equal x y)
T
* x
(1 2 3)
* y
(1 2 3)
*
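
A few more comparisons, given here as a sketch, may help to separate the three equality functions. The lists are built with the function list so that each call creates a fresh object:

```lisp
(eq 'cat 'cat)                    ; T: the same symbol
(eql 3 3)                         ; T: same type and same value
(eql 3 3.0)                       ; NIL: an integer and a float
(eql (list 1 2) (list 1 2))       ; NIL: two distinct lists
(equal (list 1 2) (list 1 2))     ; T: same structure and same elements
(equal "cat" "Cat")               ; NIL: equal is case sensitive for strings
```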

For strings, the function string= is slightly more efficient than using the function equal:

* (eql "cat" "cat")
NIL
* (equal "cat" "cat")
T
* (string= "cat" "cat")
T
*

Common Lisp strings are sequences of characters. The function char is used to extract individual characters from a string:

* s1
"the cat ran up the tree"
* (char s1 0)
#\t
* (char s1 1)
#\h
*

Using Hash Tables

Hash tables are an extremely useful data type. While it is true that you can get the same effect by using lists and the assoc function, hash tables are much more efficient than lists if the lists contain many elements. For example:

* (defvar x '((1 2) ("animal" "dog")))
X
* (assoc 1 x)
(1 2)
* (assoc "animal" x)
NIL
* (assoc "animal" x :test #'equal)
("animal" "dog")
*

The second argument to function assoc is a list of cons cells. Function assoc searches for a sub-list (in the second argument) that has its car (i.e., first element) equal to the first argument to function assoc. The perhaps surprising thing about this example is that assoc seems to work with an integer as the first argument but not with a string. The reason for this is that by default the test for equality is done with eql, which tests whether its two arguments are the same object in memory or are numbers of the same type and value. Two separately created strings with the same contents are not eql. In the last call to assoc we used :test #'equal to make assoc use the function equal to test for equality.

The problem with using lists and assoc is that lookups are very inefficient for large lists, since assoc may have to scan the whole list. We will see that it is no more difficult to code with hash tables.

A hash table stores associations between key and value pairs, much like our last example using the assoc function. By default, hash tables use eql to test for equality when looking for a key match. We will duplicate the previous example using hash tables:

* (defvar h (make-hash-table))
H
* (setf (gethash 1 h) 2)
2
* (setf (gethash "animal" h) "dog")
"dog"
* (gethash 1 h)
2 ;
T
* (gethash "animal" h)
NIL ;
NIL
*

Notice that gethash returns multiple values: the first value is the value matching the key passed as the first argument to function gethash, and the second returned value is true if the key was found and nil otherwise. The second returned value is useful when stored values can themselves be nil, because it distinguishes a key whose value is nil from a key that is not in the table.

Since we have not yet seen how to handle multiple returned values from a function, we will digress and do so here (there are many ways to handle multiple return values and we are just covering one of them):

* (multiple-value-setq (a b) (gethash 1 h))
2
* a
2
* b
T
*

Assuming that variables a and b are already declared, the variable a will be set to the first returned value from gethash and the variable b will be set to the second returned value.
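
Another common way to handle multiple return values is the standard macro multiple-value-bind, which creates local variables instead of assigning to existing ones. The following is a minimal sketch; the function name lookup-or-default and its parameter names are hypothetical:

```lisp
(defun lookup-or-default (key table default)
  "Return the value stored under KEY in TABLE, or DEFAULT if KEY is absent."
  (multiple-value-bind (value found-p) (gethash key table)
    ;; found-p is true only when KEY is present, even if its value is nil.
    (if found-p value default)))

(let ((table (make-hash-table)))
  (setf (gethash 'empty table) nil)
  (list (lookup-or-default 'empty table :missing)      ; NIL: the key exists
        (lookup-or-default 'unknown table :missing)))  ; :MISSING
;; returns (NIL :MISSING)
```

Here the second value from gethash is what lets us tell a stored nil apart from a missing key.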

If we use symbols as hash table keys, then using eql for testing for equality with hash table keys is fine:

    * (setf (gethash 'bb h) 'aa)
    AA
    * (gethash 'bb h)
    AA ;
    T
    *

However, we saw that eql will not match keys with character string values. The function make-hash-table has optional key arguments and one of them will allow us to use strings as hash key values:

    (make-hash-table &key :test :size :rehash-size :rehash-threshold)

Here, we are only interested in the first optional key argument :test that allows us to use the function equal to test for equality when matching hash table keys. For example:

    * (defvar h2 (make-hash-table :test #'equal))
    H2
    * (setf (gethash "animal" h2) "dog")
    "dog"
    * (setf (gethash "parrot" h2) "Brady")
    "Brady"
    * (gethash "parrot" h2)
    "Brady" ;
    T
    *

It is often useful to be able to enumerate all the key and value pairs in a hash table. Here is a simple example of doing this by first defining a function my-print that takes two arguments, a key and a value. We can then use the maphash function to call our new function my-print with every key and value pair in a hash table:

    * (defun my-print (a-key a-value)
        (format t "key: ~A value: ~A~%" a-key a-value))
    MY-PRINT
    * (maphash #'my-print h2)
    key: parrot value: Brady
    key: animal value: dog
    NIL
    *

The function my-print is applied to each key/value pair in the hash table. There are a few other useful hash table functions that we demonstrate here:

    * (hash-table-count h2)
    2
    * (remhash "animal" h2)
    T
    * (hash-table-count h2)
    1
    * (clrhash h2)
    #S(HASH-TABLE EQUAL)
    * (hash-table-count h2)
    0
    *

The function hash-table-count returns the number of key and value pairs in a hash table. The function remhash can be used to remove a single key and value pair from a hash table. The function clrhash clears out a hash table by removing all key and value pairs in a hash table.
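
These functions combine naturally with maphash. As a small sketch, the hypothetical helper collect-keys below gathers all keys of a table into a list; the table *h3* is also a made-up name. The keys are sorted before display because the order in which maphash visits entries is not specified:

```lisp
(defun collect-keys (table)
  "Return a list of all keys in TABLE, in no particular order."
  (let ((keys '()))
    ;; maphash calls the anonymous function once per key/value pair.
    (maphash (lambda (key value)
               (declare (ignore value))
               (push key keys))
             table)
    keys))

;; Hypothetical table used only for this illustration.
(defvar *h3* (make-hash-table :test #'equal))
(setf (gethash "animal" *h3*) "dog")
(setf (gethash "parrot" *h3*) "Brady")

(sort (collect-keys *h3*) #'string<)   ; => ("animal" "parrot")
(remhash "animal" *h3*)                ; => T
(remhash "animal" *h3*)                ; => NIL, the key is already gone
(hash-table-count *h3*)                ; => 1
```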

It is interesting to note that clrhash and remhash are the first Common Lisp functions that we have seen so far that modify one of their arguments, except for setq and setf, which are a special operator and a macro, respectively, and not functions.

Using Eval to Evaluate Lisp Forms

We have seen how we can type arbitrary Lisp expressions in the Lisp repl listener and then they are evaluated. We will see in the Chapter on Input and Output that the Lisp function read converts text into lists (or forms); it does not evaluate them. The Lisp repl uses function read to read each expression, evaluates the result, and prints the value.

In this section, we will use the function eval to evaluate arbitrary Lisp expressions inside a program. As a simple example:

    * (defvar x '(+ 1 2 3 4 5))
    X
    * x
    (+ 1 2 3 4 5)
    * (eval x)
    15
    *

Using the function eval, we can build lists containing Lisp code and evaluate generated code inside our own programs. We get the effect of “data is code”. A classic Lisp program, the OPS5 expert system tool, stored snippets of Lisp code in a network data structure and used the function eval to execute Lisp code stored in the network. A warning: the use of eval is likely to be inefficient in non-compiled code. For efficiency, the OPS5 program contained its own version of eval that only interpreted a subset of Lisp used in the network.
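
To illustrate building code as data, here is a minimal sketch in which a program constructs a form from a list of numbers and then evaluates it. The names make-sum-form and *form* are made up for this example:

```lisp
(defun make-sum-form (numbers)
  "Build the list (+ n1 n2 ...) as data; nothing is evaluated here."
  (cons '+ numbers))

(defvar *form* (make-sum-form '(1 2 3 4 5)))

*form*          ; => (+ 1 2 3 4 5)
(eval *form*)   ; => 15

;; Because the form is an ordinary list, we can transform it before
;; evaluating it, for example by replacing + with *.
(eval (cons '* (rest *form*)))   ; => 120
```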

Using a Text Editor to Edit Lisp Source Files

I usually use Emacs, but we will briefly discuss the editor vi also. If you use vi (e.g., enter “vi nested.lisp”) the first thing that you should do is to configure vi to indicate the matching opening parenthesis whenever a closing parenthesis is typed; you do this by typing “:set sm” after vi is running.

If you choose to learn Emacs, enter the following in your .emacs file (or your _emacs file in your home directory if you are running Windows):

    (set-default 'auto-mode-alist
                 (append '(("\\.lisp$" . lisp-mode)
                           ("\\.lsp$" . lisp-mode)
                           ("\\.cl$" . lisp-mode))
                         auto-mode-alist))

Now, whenever you open a file with the extension of “lisp”, “lsp”, or “cl” (for “Common Lisp”) then Emacs will automatically use a Lisp editing mode. I recommend searching the web using keywords “Emacs tutorial” to learn how to use the basic Emacs editing commands - we will not repeat this information here.

I do my professional Lisp programming using free software tools: Emacs, SBCL, Clozure Common Lisp, and Clojure. I will show you how to configure Emacs and Slime in the last section of the Chapter on Quicklisp.

Recovering from Errors

When you enter forms (or expressions) in a Lisp repl listener, you will occasionally make a mistake and an error will be signaled. Here is an example; I am not showing all of the output that the debugger prints in response to the help command:

    * (defun my-add-one (x) (+ x 1))

    MY-ADD-ONE
    * (my-add-one 10)

    11
    * (my-add-one 3.14159)

    4.14159
    * (my-add-one "cat")

    debugger invoked on a SIMPLE-TYPE-ERROR: Argument X is not a NUMBER: "cat"

    Type HELP for debugger help, or (SB-EXT:EXIT) to exit from SBCL.

    restarts (invokable by number or by possibly-abbreviated name):
      0: [ABORT] Exit debugger, returning to top level.

    (SB-KERNEL:TWO-ARG-+ "cat" 1)
    0] help

    The debug prompt is square brackets, with number(s) indicating the current
      control stack level and, if you've entered the debugger recursively, how
      deeply recursed you are.

     ...

    Getting in and out of the debugger:
      TOPLEVEL, TOP  exits debugger and returns to top level REPL
      RESTART        invokes restart numbered as shown (prompt if not given).
      ERROR          prints the error condition and restart cases.

     ...

    Inspecting frames:
      BACKTRACE [n]  shows n frames going down the stack.
      LIST-LOCALS, L lists locals in current frame.
      PRINT, P       displays function call for current frame.
      SOURCE [n]     displays frame's source form with n levels of enclosing forms.

    Stepping:
      START Selects the CONTINUE restart if one exists and starts
            single-stepping. Single stepping affects only code compiled with
            under high DEBUG optimization quality. See User Manual for details.
      STEP  Steps into the current form.
      NEXT  Steps over the current form.
      OUT   Stops stepping temporarily, but resumes it when the topmost frame that
            was stepped into returns.
      STOP  Stops single-stepping.

     ...

    0] list-locals
    SB-DEBUG::ARG-0  =  "cat"
    SB-DEBUG::ARG-1  =  1

    0] backtrace 2

    Backtrace for: #<SB-THREAD:THREAD "main thread" RUNNING {1002AC32F3}>
    0: (SB-KERNEL:TWO-ARG-+ "cat" 1)
    1: (MY-ADD-ONE "cat")
    0] :0

    *

Here, I first used the help command to list the debugger commands. I then used list-locals to print the arguments in the current stack frame and backtrace 2 to print the two most recent function calls that led to the error. If it is obvious where the error is in the code that I am working on then I do not bother using the backtrace command. Finally, I entered :0 to invoke restart 0 (ABORT), which returns to the top level Lisp listener (i.e., back to the * prompt). If you have entered the debugger recursively, you may need to abort more than once to fully recover to the top level prompt.
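
Interactive recovery in the debugger is convenient at the repl, but a program can also handle such an error itself. As a minimal sketch that goes slightly beyond what this section covers, the standard macro handler-case can catch the type error signaled in the session above. The wrapper safe-add-one is a hypothetical name, and returning nil on failure is just one possible design choice:

```lisp
(defun my-add-one (x) (+ x 1))

(defun safe-add-one (x)
  "Like my-add-one, but return NIL instead of entering the debugger
when X is not a number."
  (handler-case (my-add-one x)
    ;; SBCL signals a type-error when + receives a non-number.
    (type-error () nil)))

(safe-add-one 10)      ; => 11
(safe-add-one "cat")   ; => NIL
```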

Garbage Collection

Like Java and Python, Common Lisp provides garbage collection (GC), or automatic memory management.

In simple terms, GC frees memory in a Lisp environment that is no longer reachable from any global variable (or function closure, which we will cover in the next chapter). If a global variable *variable-1* is first set to a list and we later set *variable-1* to, for example, nil, and if the data in the original list is not referenced by any other accessible data, then this now unused data is subject to GC.

In practice, memory for Lisp data is allocated in time ordered batches and ephemeral or generational garbage collectors garbage collect recent memory allocations far more often than memory that has been allocated for a longer period of time.

Loading your Working Environment Quickly

When you start using Common Lisp for large projects, you will likely have many files to load into your Lisp environment when you start working. Most Common Lisp implementations include ASDF, which provides a macro called defsystem for describing the files in a project; it works somewhat like the Unix make utility. While I strongly recommend defsystem for large multi-person projects, I usually use a simpler scheme when working on my own: I place a file loadit.lisp in the top directory of each project that I work on. For any project, its loadit.lisp file loads all source files and initializes any global data for the project.
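
As a rough sketch of what such a loadit.lisp file might contain (this is an illustration, not the file from any project in this book), the code below loads a list of source files in order. The file names, the directory, and the names *source-files*, source-paths, and load-project are all placeholders:

```lisp
;; loadit.lisp (sketch): load every source file of a project in order.
;; The file names are placeholders, not files from this book.
(defparameter *source-files* '("utils.lisp" "data.lisp" "main.lisp"))

(defun source-paths (base-directory)
  "Return full pathnames for *source-files* inside BASE-DIRECTORY."
  (mapcar (lambda (name) (merge-pathnames name base-directory))
          *source-files*))

(defun load-project (base-directory)
  "Load each source file in order; global data could be initialized
after the loop."
  (dolist (path (source-paths base-directory))
    (load path)))

;; Typical use from the repl, with a placeholder directory:
;; (load-project #p"/path/to/project/")
```

Keeping the file list in one variable makes the load order explicit, which matters when later files depend on definitions in earlier ones.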

The last two chapters of this book provide example applications that are configured to work with Quicklisp, which we will study in the next chapter.

Another good technique is to create a Lisp image containing all the code and data for all your projects. There is an example of this in the first section of the Chapter on NLP. In this example, it takes a few minutes to load the code and data for my NLP (natural language processing) library, so when I am working with it I like to be able to quickly load an SBCL Lisp image.

Most Common Lisp implementations have a mechanism for dumping a working image containing code and data; in SBCL this is the function sb-ext:save-lisp-and-die.

Functional Programming Concepts

There are two main styles for doing Common Lisp development. Object oriented programming is well supported (see the Chapter on CLOS) as is functional programming. In a nutshell, functional programming means that we should write functions with no side effects. First let me give you a non-functional example with side effects:

    (defun non-functional-example (car)
      (set-color car "red"))

This example using CLOS is non-functional because we modify the value of an argument to the function. Some functional languages, like Clojure (a Lisp dialect) and Haskell, discourage you from modifying arguments to functions. With Common Lisp you should decide which approach you prefer to use.

Functional programming means that we avoid maintaining state inside of functions and treat data as immutable (i.e., once an object is created, it is never modified). We could modify the last example to be functional by creating a new car object inside the function, copying the attributes of the car passed as an argument, changing the color of the new car object to “red”, and returning the new car instance as the value of the function.
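
The two styles can be compared side by side in a minimal CLOS sketch. The class vehicle, its slots, and the function names are assumptions made for this illustration (the class is not named car because car is already a standard Common Lisp function name):

```lisp
;; Hypothetical class used only for this illustration.
(defclass vehicle ()
  ((color :initarg :color :accessor color)
   (model :initarg :model :accessor model)))

;; Non-functional style: modifies the object passed in.
(defun paint-red-in-place (v)
  (setf (color v) "red")
  v)

;; Functional style: leaves V untouched and returns a new instance
;; that copies the other attributes.
(defun painted-red (v)
  (make-instance 'vehicle :color "red" :model (model v)))

(defvar *old-vehicle* (make-instance 'vehicle :color "blue" :model "sedan"))
(defvar *new-vehicle* (painted-red *old-vehicle*))

(color *old-vehicle*)   ; => "blue", the original is unchanged
(color *new-vehicle*)   ; => "red"
(model *new-vehicle*)   ; => "sedan"
```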

Functional programming prevents many types of programming errors, makes unit testing simpler, and makes programming for modern multi-core CPUs easier because read-only objects are inherently thread safe. Modern best practices for the Java language also prefer immutable data objects and a functional approach.