Read-Eval-Print-λove

Read-Eval-Print-λove
Read-Eval-Print-λove
Buy on Leanpub

A note

Read-Eval-Print-λove is an N-monthly zine of original content and curation about the Lisp family of programming languages and little-languages in general.

Pity

One day a macrolyte approached the master macrologist Pi Mu and quoted to him a poem:

The voices of torrents are from one great LISP dialect, the lions of the hills are the pure body of McCarthy.

Looking expectantly at Pi Mu the macrolyte exclaimed:

Isn’t that right?

Pi Mu looked at the macrolyte and, in a low tone, told him:

It is … But it’s a pity to say so.

Dedication

June 13, 1965 – June 17, 2009

June 13, 1965 – June 17, 2009

Introduction - UrLISP

So you want to create a Lisp? In this installment I’ll talk a little about the creation of new Lisps.

The infinitely quotable Erik Naggum had this to say about the desire to reinvent Lisp:1

… yet another demonstration of the problem of too strong an ego that cannot live with what somebody else has made.

However, in the same message Erik Naggum also said this:

Please let each person determine what is the “greater use” of their own energy, as long as it is not harmful.

The phrase above has, for better or worse, become a mantra by which many people present as a justification for doing things that are detrimental to themselves and ultimately others. Indeed, Naggum spoke directly to this condition when he concluded his post with a very poignant diatribe on the rabid drive toward reinvention in IT:

Why is the IT world defined by people who are unable to deal with the concepts of continuity, longevity, and design for long-term stability? Why do they believe they are so much smarter than everything that has gone before them, when they clearly are not and repeat mistakes that take years to undo, if not replaced by another stupid “revolution” that itself starts from scratch?

In this light, how could anyone justify the creation of yet another Lisp? First of all, Naggum spoke from a position that could be summarized as: Common Lisp is the most powerful programming language in the world and allows a programmer absolute freedom in accomplishing any of their goals. While I tend to agree with the spirit of Naggum’s position, I tend to disagree that just because Common Lisp allows you to do anything2 doesn’t mean that you should use it for everything. Sometimes a focused language is a better choice than one that gives you absolute freedom. That said, this installment is not about replacing Common Lisp with any other language. Instead, in this issue I’ll discuss the importance of understanding Lisp, especially its origins and original implementation.

So I now posit the following question: How can we avoid the problem of reinvention?

My answer to this question is simple: through the act of reinvention can we avoid the problem of reinvention.

Let me rephrase the seemingly Discordian statement above. That is, in order to avoid the problem of reinvention we need to fully (or to the best of our abilities) understand what came before. In my opinion, the best way to understand what came before is to reinvent it.

Caveat emptor

The Lisp implementation described in this installment is a snapshot in time. Therefore, the code herein might not match the code at http://www.github.com/fogus/lithp exactly. Indeed, the codebase online is subject to my changing fancies and at times may be a WiP. That said, this essay is a living document, so it too is subject to change. However, this essay is designed to stand alone and any changes to it will work to foster that goal.

Lithp

In late 2008 I had an idea to write a database in Clojure. I had just discovered Clojure and wanted a project to cut my teeth on. At the time, my previous true exposure to Lisp was during my college years and that exposure was constrained to XLISP, Common Lisp and various Schemes. I had hacked away at a bit of ELisp after college, but even that was done in the spirit of immediacy. I knew that I wanted to try Clojure, but I wanted to learn more than just its API. What I wanted was to learn why it existed in the first place.

To get to an understanding of why Clojure was created I decided to start at the beginning. In hindsight that might not have been the most efficient way of coming to an answer. That said, what came out of that exploration3 was that my database turned into a tiny Lisp interpreter named Lithp,4 written in Python.5

The Lithp interpreter provides the absolute core functions of McCarthy’s original ideal, as outlined in his classical paper, Recursive functions of symbolic expressions and their computation by machine, part I. That is, there are only seven functions and two special forms.

The Magnificent Seven

  • atom
  • car
  • cdr
  • cond
  • cons
  • eq
  • quote

Two special forms

  • label
  • lambda

In addition to the Magnificent Seven and the two special forms I’ll also implement a variant of Lithp called Lithp’ using Lithp itself.

Implementation notes

If you’re not familiar with Python then I apologize. I’ve tried to write as clear code as possible, but much of the style is circa 2008-2010 Python. If you’re interested in helping to maintain and/or modernize the Lithp code-base then please do consider sending Github pull-requests to the repository at https://github.com/fogus/lithp.

The code herein is structured and described at the level of individual Python modules and Lisp source. The parts that I will cover are as follows:

  1. atom.py - Code implementing and dealing with symbols
  2. env.py - Code implementing dynamic binding and lookup
  3. fun.py - Code implementing lambda functions
  4. lisp.py - Code for the maginificent seven
  5. lithp.py - Code implementing the Lithp REPL and code processor
  6. reader.py - Code implementing a Lisp reader
  7. seq.py - Code implementing pseudo-Cons-cells via Python lists
  8. core.lisp - Lisp in Lithp

I will not go over the entirety of the implementation because the code is both freely available and there are some parts that have nothing to do with Lispiness. That is, the code discussed in this installment are the interesting bits.

With all of that said — let’s begin.

The best place to start learning about Lithp and its implementation of McCarthy’s original vision is probably the env.py source file, which implements Lithp’s dynamic binding environment.

env.py

The Environment class represents the dynamic environment of McCarthy’s original Lisp. The creation of this class is actually an interesting story. As many of you probably know, Paul Graham wrote a paper and code for McCarthy’s original Lisp 6 and it was my first exposure to the stark simplicity of the language. The simplicity is breath-taking – not to put too fine a point on it.

However, while playing around with the code I found that in using the core functions (i.e. null., not., etc.) I was not experiencing the full effect of the original. That is, the original Lisp was dynamically scoped, but the Common Lisp used to implement and run (CLisp in the latter case) Graham’s code was lexically scoped. Therefore, by attempting to write high-level functions using only the Magnificent Seven and Graham’s core functions in the Common Lisp I was taking advantage of lexical scope; something not available to McCarthy and company. Of course, the whole reason that Graham wrote eval. was to enforce dynamic scoping (he used a list of symbol-value pairs where the dynamic variables were added to its front when introduced; which I will adopt later):

1 (eval. 'a '((a 1) (a 2)))
2 ;=> 1

So I then implemented a simple REPL in Common Lisp that fed input into eval. and maintained the current environment list. That was fun, but I wasn’t sure that I was learning anything at all. Eventually I decided to try to implement my own core environment for the Magnificent Seven to try and get a feel for what it took to build a simple language up from scratch. I suppose if I were a purist then I would have found an IBM 704, but that would be taking this exercise to the extreme. (email me if you have one that you’d like to sell for cheap)

Anyway, minus a 704, I needed to start with creating an Environment that provided dynamic scoping, and the result is monstrosity 7 described herein.

class Environment: def init(self, par=None, bnd=None): if bnd: self.binds = bnd else: self.binds = {}

1       self.parent = par
2 
3       if par:
4           self.level = self.parent.level + 1
5       else:
6           self.level = 0

The code fragment above sets up the dynamic binding environment by establishing a contextual binding map named binds that is a simple lookup from a Symbol type (described later) to a Lithp value type (e.g. a Lambda or Symbol). In addition, the stack discipline of classical dynamic binding is emulated through the parent link. The parent link is simply an instance of another Environment or None if the constructed environment is considered a root binding context.

This dynamic stack discipline is on full display when one attempts to lookup a binding within an Environment:

1   def get(self, key):
2       if key in self.binds:
3           return self.binds[key]
4       elif self.parent:
5           return self.parent.get(key)
6       else:
7           raise ValueError("Invalid symbol " + key)

The first check in a dynamic lookup is always to see if a binding exists in the current context. However, ultimately retrieving a binding potentially requires the traversal of the parent link, as shown in the elif case of the lookup check. You can see that this lookup descends down a chain of Environment instances until a binding is found, or the search bottoms-out at a root context. This is the simplest possible binding scenario and naturally the one that McCarthy and his team settled on for the initial Lisp interpreter.

Now that you understand the lookup of a dynamic binding, the establishment of a new binding should not be a mystery:

1   def set(self, key, value):
2       if key in self.binds:
3           self.binds[key] = value
4       elif self.parent:
5           self.parent.set(key,value)
6       else:
7           self.binds[key] = value

That is, the process of creating a new binding is symmetric to the process of finding.

There are or course useful functions that make the process of dealing with Environments more straight-forward:

1   def definedp(self, key):
2       if key in self.binds.keys():
3           return True
4       return False

The definedp method just traverses the binding stack and returns True or False if the binding is found, or not, respectively.

Manipulating the dynamic environment stack

You may recall that the constructor of the Environment class took a preexisting binding and assigned it as the parent link. This, for all of its faults is a stack. Likewise, the logic of the #put and #get methods ensured that these links were traversed when creating and looking up a binding. Rather than manipulating the Environment#parent links directly via construction, I decided to create a convenience method #push that returns a new Environment instance given an existing instance. Perhaps this is unnecessary, but I personally find this approach a little cleaner.

1   def push(self, bnd=None):
2       return Environment(self, bnd)

Again, whenever a variable is created its binding is pushed onto a stack. In this case, the stack is simulated through a chain of parent links. So if you were to create the following:

1 (label a nil)
2 (label frobnicate (lambda () (cons a nil)))
3 
4 ((lambda (a)
5    (frobnicate))
6  (quote x))

Then the stack would look like the figure below within the body of frobnicate:

1  |         |
2  |         |
3  | a = 'x  |
4  | ------- |
5  | a = nil |
6  +---------+

Meaning that when accessing a, frobnicate will get the binding at the top of the stack, producing the result (x). This push/pop can become difficult, so people have to do all kinds of tricks to avoid confusion (i.e. pseudo-namespace via variable naming schemes — if you’ve ever written classic ELisp, then maybe you’ll remember this approach).

Of course, any time you see a push, a pop is soon to follow:

1     def pop(self):
2         return self.parent

And that’s it for the Environment class. Looking back on this code it astonishes me how simple the dynamic environment scheme is — elegant almost. In an attempt to maintain this train of thought, I’d like to explore how the dynamic environment interacts with Lisp functions, both anonymous and not.

fun.py

As you might have imagined, McCarthy’s Lisp derives much of its power from the function. The Function class is used exclusively for built-in functions (i.e. the Magnificent Seven). Each core function is implemented as a regular Python method, each taking an Environment and its arguments.

1 class Function(Eval):
2     def __init__(self, fn):
3         self.fn = fn
4 
5     def eval(self, env, args):
6         return self.fn(env, args)

The implementation of the default function class is as simple as can be. That is, I’m cheating a bit and delegating the functionality of execution out to a Python function or lambda. However, bear in mind that this is just the tip of the iceberg. Lisp really gets interesting.

The real power of McCarthy’s Lisp springs from the influence of Alonzo Church’s λ-calculus. While the original Lisp implementation was not entirely adherent to the dictates of the λ-calculus, the spirit of Church was in attendance. First, the initializer of the Lambda class is trivial:

1 class Lambda(Eval):
2     def __init__(self, n, b):
3         self.names = n
4         self.body =  b

Interestingly, every invocation of a lambda causes its bound variables to be pushed onto the dynamic bindings stack. McCarthy only touches briefly on the idea that combining functions built from lambda is problematic. In almost a throw-away sentence he states, “different bound variables may be represented by the same symbol. This is called collision of bound variables.” If you take the time to explore core.lisp (described later herein) then you will see what this means in practice. The Elisp-like named variables should provide a hint.

Of course, the reason for these difficulties is a direct result of dynamic scoping. McCarthy suggests that a way to avoid these issues is to use point-free combinators to eliminate the need for variables entirely. This approach is a book unto itself – which is likely the reason that McCarthy skips it. That said,

1     def push_bindings(self, context, values):
2         context.push()
3         self.set_bindings(context, values)

The bindings are set one by one corresponding to the input values.

1     def set_bindings(self, context, values):
2         for i in range(len(values)):
3             val = values[i].eval(context.environment)
4             context.environment.binds[self.names[i].data] = val

The evaluation of a Lambda instance is not much more complicated than a built-in function, except that it will establish bindings in the root context. Additionally, the root context will hold all bindings, so free variables will also be in play.

1     def eval(self, env, args):
2         values = [a for a in args]
3 
4         if len(values) != len(self.names):
5             raise ValueError("""Wrong number of args,
6                              expected {0}, got {1}"""
7                              .format(len(self.names), len(args)))

OK, so I throw an exception here… sue me. It’s arg-arity checking, not rocket science – but I digress. Dynamic scope requires that names be bound on the global environment stack:

1         LITHP = env.get("__lithp__")

First exceptions, now a big honking global variable. Not only that, but this is the global to usurp all other globals. That is, the dictionary stored at env.get("__lithp__") contains all of the top-level bindings for the entire Lithp system. While horrifying in its abruptness, the global dictionary is a fairly straight-forward way to represent the same global dynamic environment found in the original Lisp implementation. That being the case and since I have a handle to the global stack, I just go ahead and push some bindings on to it:

1         self.push_bindings(LITHP, values)

If you’re like me when reading this for the first time you might be thinking “OMG” or maybe “ZOMG.” That reaction is pure and wise and never lose that sense of danger. Having said that, it’s important to know that when Lisp was originally created no one knew about any of the SOLID principles (not even McCarthy) nor had Wulf and Shaw penned their perennial classic Global Variables Considered Harmful. Self-referential programming system had to be created by gum, and they weren’t going to sit around and write themselves (yet).

I’ve pushed all of the bindings onto the global dynamic stack, so that they’re accessible to the body of the lambda, so why not go ahead and use them? I will. In fact, each form in the body is evaluated one by one, and the last determines the return value:

1         ret = FALSE
2         for form in self.body:
3             ret = form.eval(LITHP.environment)

The evaluation context for the lambda body is of course the global, dynamic context. Therefore, all applications occur at the discretion of the ever changing dynamic bindings. However, it’s not enough to just willy-nilly push bindings onto the dynamic context; I also must remove exhausted bindings when I’m done with them. In the case of the lambda, the exhausted bindings are simply those that were set as the result of passing values as arguments:

1         LITHP.pop()
2         return ret

And that effectively cleans up the lambda execution context.

Closures in brief

McCarthy’s Lisp does not define closures and in fact the presence of closures in the context of a pervasive dynamic scope is problematic. However, this fact was academic to me and didn’t really map conceptually to anything that I had experienced in the normal course of my programming life. Therefore, I added closures to see what would happen. It turns out that if you thought that bound variables caused issues then your head will explode to find out what closures do.

As a thought-experiment I can visualize why this is the case. That is, while a closure captures the contextual binding of a variable, look-ups in dynamic scoping occur on the dynamic stack. This means that you may be able to close over a variable as long as it’s unique, but the moment someone else defines a variable of the same name and attempt to look up the closed variable will resolve to the top-most binding on the dynamic stack. This assumes the the lookup occurs before the variable of the same name is popped.

Function dispatch in the context of a dynamic environment swings from elegant to, frankly, weird. The true elegance of Lisp is yet to come. One such piece of elegance is next in the idea of a “cons cell” and what I like to call “The Magnificent Seven.”

lisp.py

As mentioned earlier, the original Lisp described by McCarthy in his 1960 paper describes the following function set:

  1. atom
  2. car
  3. cdr
  4. cond
  5. cons
  6. eq
  7. quote

Plus two special forms:

  1. label
  2. lambda

The Lisp class defines the Magnificent Seven in terms of the run-time environment built thus far (i.e. dynamic scope, lambda, environments, and symbols, which are not detailed herein) and aggregated in a Lisp class starting thus:

class Lisp:

The Magnificent Seven are thus implemented as methods on this class.

eq

Equality is delegated out to the objects being tested, so I will not discuss the mechanics here.

However, examples of usage are as follows:

1 (eq nil (quote ()))
2 ;=> t
3 
4 (eq (quote (a b)) (quote (a b)))
5 ;=> t
6 
7 (eq (quote a) (quote b))
8 ;=> ()

The most expedient way to achieve this kind of equality was to simply delegate it to Python:

1 def eq(self, env, args):
2     if args[0].eval(env) == args[1].eval(env):
3         return TRUE
4     return FALSE

Re-evaluating on each equality check is less than ideal, but I’m not writing this on an IBM 704, so what me worry? Sure, eq is cool and all and while implementing programming languages equality is non-trivial I’m not going to say much on the subject here. However, I do have more to say on the matter of Lisp’s original data abstraction – the cons cell.

The humble cons cell

The original Lisp implementation was written for the IBM 704 by Steve Russell (a genius of the highest order – also the creator/discoverer of Spacewar! and continuations). The somewhat obtuse name for a function that returns the first element of an s-expression derives from the idiosyncrasies of the IBM 704 on which Lisp was first implemented. The car function was thus a shortening of the term “Contents of the Address part of Register number” that in itself has a very interesting explanation. That is, car was used to refer to the first half of the word-size addressed by the IBM 704. In this particular machine (and many others at that time and since) the word-size could address more than twice of the actual physical memory. Taking this particular nuance of the IBM 704 into account, programmers were able to efficiently create stacks by using the address of the stack’s top in one half-word and the negative of the allocated size in the other (the “Contents of Decrement part of Register number”), like so:

 1  +----------+----------+
 2  |   top    |   -size  |
 3  +----------+----------+
 4       |           |        size goes toward zero
 5       |           |                  |
 6       |           |                  |
 7       |           |                  v
 8   |   |    |      |
 9 4 |   |    |      |
10   |   V    |      |
11 3 | elem3  |      |
12   |        |      |                  ^
13 2 | elem2  |      |                  |
14   |        |      |                  |
15 1 | elem1  |<-----+           stack grows up
16   |        |
17 0 | elem0  |
18   +--------+

Whenever something was pushed onto the stack the number 1 was added to both half-words. If the decrement part of the word became zero then that signaled a stack-overflow, that was checked on each push or pop instruction. However, the use of the car/cdr half-words was used quite differently (McCarthy 1962). That is, The contents part contained a pointer to the memory location of the actual cons cell (see the documentation for the next function cdr for more information) element, and the decrement part contained a pointer to the next cell:

1 +----------+----------+    +----------+----------+
2 |   car    |   cdr    |--->|   car    |   cdr    | ...
3 +----------+----------+    +----------+----------+

The Lisp garbage collector used this structure to facilitate garbage collection by marking referenced chains of cells as negative (sign bit), thus causing them to be ignored when performing memory reclamation.

car

That said, the car function works as follows:

1 (car (quote (a b c)))
2 ;=> a

In Lithp, the car of an empty list is an error. I actually have no idea if this was the case in McCarthy’s original, but I’m beyond corrupted by modern software development practices that I can’t even see straight anymore:

1 def car(self, env, args):
2     if(len(args) > 1):
3         raise ValueError("""Wrong number of arguments,
4                    expected {0}, got {1}"""
5               .format(1, len(args)))

Of course, I don’t use pointer arithmetic to implement cons cells…

1     cell = args[0].eval(env)

…instead I define it in terms of a “sequence abstraction” (not shown herein). This is a side-effect of a hope in going further with this implementation (e.g. into Linear Lisp), but as of now it’s a bit heavy-weight for what is actually needed.

Interestingly, there are real implications to my design decision over McCarthy’s. That is, the Lithp sequences delegate insertion to the Python List implementation which internally is built on arrays (CPython anyway). The arrays backing the Lists are geometrically re-sized when new elements are added and thanks to some simple pointer arithmetic elements can be added to the end of a list simply. Therefore, inserting new elements to the head of a Python List is an amortized O(1) operation. However, McCarthy’s cons cell design and implementation guaranteed that insertions at the head of a list were O(1) operations. That is, the very nature of a cons cell was such that it was composed of two parts: a data part (CAR) and a link part (CDR) which were just pointers. As I hinted at earlier, the registers of the IBM 704 were twice the word size, so both the CAR and CDR parts fit into one register, split into “address” and “decrement” parts as mentioned above.

The data part was the thing actually contained in the cell and the link part pointed to (potentially) another cons cell. Therefore, to add an element to the head of a list required a cons cell allocation and a link manipulation that made it point to the head element of the host list. However, in the Lithp implementation adding to the head of a List not only requires an occasional array size adjustment, but it also requires that elements in the existing array be shifted down the array one at a time to make room for the new element – an O(n) operation.

But I wouldn’t be a programmer if I didn’t needlessly abstract:

1     if not isinstance(cell, Seq):
2         raise ValueError("""Function not valid on
3                          non-sequence type.""")
4 
5     return cell.car()

The internal sequence type used by Lithp is a useful tool, but it loses much of the elegance found in the original cons cell abstraction.

cdr

In the previous function definition (car) I used the term cons-cell to describe the primitive structure underlying a Lisp list. If you allow me, let me spend a few moments describing this elegant structure, and why it’s such an important abstract data type (ADT).

Lisp from the beginning was built with the philosophy that lists should be a first-class citizen of the language; not only in the realm of execution, but also generation and manipulation. If you look at my implementation of List in seq.py you’ll notice that it’s pretty standard fare. That is, like most list implementations, Lithp’s is backed by a boring sequential store where one element conceptually points to the next and blah blah blah. Boring. Where the cons-cell shines is that it is a very general purpose ADT that can be used in a number of ways, but primary among them is the ability to represent the list.

Lists in the early Lisp was precisely a chain of cons cells and the operators car and cdr pointed to very specific implementation details that over time became generalized to mean “the first thing” and “the rest of the things” respectively. But the fact remains that the cons cell solves a problem that is often difficult to do properly. That is, how could Lisp represent a container that solved a number of requirements:

  • Represents a list - car is piece of data, cdr is another cons cell
  • Represents a pair - car is a piece of data, cdr is a piece of data
  • Represents a terminator - car is the cell itself, cdr is the cell itself
  • Implementation efficiency - O(1) adds to the front
  • Heterogeneous - datum is wrapped in a generic container type

It would be interesting to learn the precise genesis of the idea behind the cons cell, as I imagine that it must have provoked a eureka moment.

I’ve already discussed how the IBM 704 hardware was especially amenable to solving this problem efficiently, but the other points bear further consideration. Lisp popularly stands for “LISt Processing language” but as I explained, the basic unit of data was instead the cons cell structure. The fact of the matter is that the cons cell serves as both the implementation detail for lists and the abstraction of a pair, all named oddly as if the implementation mattered. If Lisp had originally gone whole-hog into the abstraction game, then car and cdr would have been first and rest and would have spared the world decades of whining.

Modern implementations of Lisp rarely implement lists as chains of cons cells anymore. Indeed, you can probably tell a lot about the level of knowledge for a Lisp programmer by the way that they construct and access lists. For example, a programmer like myself whose exposure to Common Lisp has been entirely academic, you will probably see a propensity toward the use of car and cdr instead of leveraging the more expressive sequence abstractions – but I digress.

The cdr function works as follows:

1 (cdr (quote (a b c)))

;⇒ (b c)

1 (cdr (quote ()))

;⇒ ()

The implementation of cdr is as follows:

 1 def cdr(self, env, args):
 2     if(len(args) > 1):
 3         raise ValueError("""Wrong number of arguments,
 4                         expected {0}, got {1}"""
 5                         .format(1, len(args)))
 6 
 7     cell = args[0].eval(env)
 8 
 9     if not isinstance(cell, Seq):
10         raise ValueError("""Function not valid on
11                          non-sequence type.""")
12 
13     return cell.cdr()

There’s a lot of checking in this implementation, but the gist is that I eval the cons cell and take the tail of it.

As I mentioned in the previous section, by using the sequence type as the base abstraction I lost much of the elegance in the original cons cell design. The cdr implementation makes this loss more clear. That is, while taking the cdr of a traditional cons cell is simply a matter of returning a new pointer to the link part of a cell, the Lithp implementation requires that a new List instance be created with a copy of the contents of the original, minus the first element.

cons

So if Common Lisp has a more general sequence abstraction, then why would we still want to keep the cons cell? The reason is that the cons cell is more flexible than a sequence and allows for a more intuitive way to build things like trees, pairs, and to represent code structure. Indeed, it’s the last point that allows the fulfillment of one of John McCarthy’s chief characteristics of what defines Lispiness:

One of the major characteristics of Lisp is that Lisp programs are Lisp data and one can compute with Lisp programs present in the machine without parsing.

– JMC, 1998.02.08 on comp.emacs and comp.emacs.xemacs

Indeed, later in this installment I’ll build Lithp (and its hosted language Lithp’) programs directly using simple Lithp data functions and structures.

For now, the cons function simply delegates the matter of consing to the target object.

The cons function works as follows:

1 (cons (quote a) nil)
2 ;=> (a)
3 
4 (cons (quote a) (quote (b c)))
5 ;=> (a b c)
6 
7 (cons (quote a) (quote b))
8 ;=> Error

I’ve agonized long and hard over whether or not to implement McCarthy Lisp as the language described in Recursive functions… as the anecdotal version only partially described in the LISP 1.5 Programmer’s Manual and in most cases the former was my choice. The creation of “dotted pairs” (I believe) was not an aspect of the original description and therefore is not represented in Lithp. Sadly, I think that in some cases these version are mixed because I originally went down the path of creating a version of Lithp compatible with Linear Lisp and Lisp 1.5, so this is a product of some pollution in the varying ideas. That said, the implementation of cons is as follows:

 1 def cons(self, env, args):
 2     if(len(args) > 2):
 3         raise ValueError("""Wrong number of arguments,
 4                          expected {0}, got {1}"""
 5                          .format(2, len(args)))
 6 
 7     first = args[0].eval(env)
 8     second = args[1].eval(env)
 9 
10     return second.cons(first)

The eager nature of cons is apparent in that both the added element and the target are evaluated before the operation is performed.

The cons cell is an extremely powerful ADT, but it’s not the only kind of data type that was available in the original Lisp. While its supported structures were humble, they were added to support a certain paradigm: symbolic programming.

atom

In addition to the aggregating cons cell, Lisp provided additional atomic types that were singularly representative; that is, they could not be taken apart. The bridge between these two data classes, aggregating and atomic, was the atom function that was used to identify atomic types.

The atom function checks if its argument is an atom and returns a truthy value if it is. One thing to note is that the empty list () is considered an atom because it cannot be deconstructed further.

The atom function works as follows:

1 (atom (quote a))
2 ;=> t
3 
4 (atom nil)
5 ;=> t
6 
7 (atom (quote (a b c)))
8 ;=> ()

Note that the null cons cell (i.e. empty list) is a falsey value in Lithp and the original Lisp.

 1 def atom(self, env, args):
 2     if(len(args) > 1):
 3         raise ValueError("""Wrong number of arguments,
 4                          expected {0}, got {1}"""
 5                          .format(1, len(args)))
 6 
 7     first = args[0].eval(env)
 8 
 9     if first == FALSE:
10         return TRUE
11     elif isinstance(first, Symbol):
12         return TRUE
13 
14     return FALSE

Again, atom is another eager function in that it must evaluate its argument to determine if it’s an atomic data structure. However, the last of the Magnificent Seven, label disables evaluation in a way that’s in the spirit of cond, but that operates towards a very different purpose. But first, I’ll cover a function called quote that provides a way to avoid evaluation altogether.

quote

The quote built-in does one thing – it returns exactly what was given to it without evaluation:

1  (quote a)
2  ;=> a
3 
4  (quote (car (quote (a b c))))
5  ;=> (car (quote (a b c)))

Of course, the result of quote can itself be evaluated explicitly:

1 (eval (quote (car (quote (a b c)))) (quote ()))
2 ;=> a

The implementation of quote couldn’t be more simple:

1 def quote(self, env, args):
2     return args[0]

The quote, function, while interesting in its own right, is mechanical to implement and directly supported by the Lithp infrastructure built so far. However, a more interesting … thing … I’ll not call it a function, is called label.

label

Notice that in the following implementation of label the first argument (hopefully a symbol) is not evaluated. This is the key difference between a Lisp function and a special form (and macro, but I will not talk about those here). That is, in all cases the arguments to a function are evaluated from left to right before being passed into the function. Conversely, special forms have special semantics for evaluation that cannot be directly emulated or implemented using functions.

1 def label(self, env, args):
2     if(len(args) != 2):
3         raise ValueError("""Wrong number of arguments,
4                          expected {0}, got {1}"""
5                          .format(2, len(args)))
6 
7     env.set(args[0].data, args[1].eval(env))
8     return env.get(args[0].data)

Finally, the last of the Magnificent Seven is another lazy operator named cond.8

cond

Did you know that McCarthy discovered conditionals? This is only partially true. That is, Stephen Kleene defined the notion of a primitive recursive function and McCarthy built on that by defining the conditional as a way to simplify the definition of recursive functions.

How would you define a recursive function without the use of a conditional in the terminating condition? It turns out that you can define recursive functions this way (see fixed point combinators), but the use of the conditional form cond vastly simplifies the matter.

We take conditionals for granted these days so it’s difficult to imagine writing programs that were not able to use them, or used a subset of their functionality.

The cond form is used as follows in the original Lisp:

1 (cond ((atom (quote (a b))) (quote foo))
2       ((atom (quote a))     (quote bar))
3       (t (quote baz)))
4 
5 ;=> bar

The implementation is as follows:

1     def cond(self, env, args):
2         for test in args:
3             result = test.car().eval(env)
4 
5         if result == TRUE:
6             return test.data[1].eval(env)
7 
8         return FALSE

Each head (car) of the cond clauses is tested for truthiness, and as soon as one is such the tail of the corresponding clause is evaluated. Note that the TRUE and FALSE values are set to Symbol("t") and List() within Lithp. As you might surmise then, the cond form is lazy in its evaluation. This kind of lazy evaluation is not found in the original λ-calculus, so this is a major departure by McCarthy. You see, even in the early days of computing there were thoughtful trade-offs between practicality and abstractness… though, now that I think about it, McCarthy was operating in a purely academic environment when developing Lisp, so it’s safe to say that the original was in no way appropriate for modern web apps.

In any case, you might have noticed that I skipped over one of the maginificent seven, specifically the lambda special form. While I talked about the underlying class Lambda, there’s a little more work to properly create instances of it, which I’ll talk about in the next section.

Lithpiness

In this the section of my Lisp implementation I’ll cover various bits and pieces that pull all of the core pieces together into an actual language run-time. That is, this section describes briefly the implementation of the REPL and the lambda special form.

First I’ll start with the Lithp class, which is the interpreter driver, performing the following:

  1. Initialize the global environment
  2. Parse the cl arguments and act on them as appropriate
  3. Initialize the base Lisp functions
  4. Read input
  5. Evaluate
  6. Print
  7. Loop back to #4

Numbers 4-6 above are the steps involved in the implementation of the Lithp REPL – Read, Eval, Print, Loop. First, allow me to talk about the act of reading.

Reader

The implementation of the Lithp Reader class is not exactly thrilling. You have to really like the art of lexing to find that kind of stuff interesting. I wrote the thing and I’m only mildly interested in it. What I’m trying to say is that it’s a mess. However, like most Lisp readers, the Lithp reader is much more interesting in its behavior. Within the Lithp interpreter, the reader is created as such:

1 rdr = reader.Reader()
2 environment = Environment()

The environment variable is a stand-in for the Lithp global environment and is merely used for illustrating evaluation principles herein. The first principle that I’d like to discuss is that the Reader class takes in strings and returns Lithp data structures:

1 sexpr = rdr.get_sexpr("a")
2 sexpr.__class__
3 #=> <class atom.Symbol at 0x10b7bd050>

The code above shows that the string "a" will be read as a Lithp Symbol instance. This is nothing terribly ground breaking in the world of Lisp, but there is a point that I’d like to make about it. Very often you’ll hear or read that this language or that language has a REPL; however, very few languages have REPLs where the R stands for a reader such as what you’ll find in Lisp. Lisp-like readers take in a textual representation of a piece of code and return a Lisp data structure. What’s so interesting about that. Indeed, if you run something like eval(":foo") in Ruby you’ll get back a symbol instance correct? The difference between Lisp and many other languages is that the act of converting strings to types is separate from the act of evaluating them.

In many languages there is no reading step at all. Instead, languages tend to have lexing and parsing phases that output special types used only by the machinery of the lexer and parser themselves. Very rarely will you see a programming language that allows access to the lexing and parsing machinery. In fact, even if they did it’s unlikely that the structures returned would be ones that you could manipulate within the context of the language themselves. Most Lisps on the other hand use core structures like symbols, cons cells, etc. as the input to the parsing and/or evaluation phases. This is a key point to understand because it means that if you wish to write programs that manipulate Lisp programs all that you need is Lisp. This fact enables the power of Lisp macro systems. That is, the macro processing system is not hidden from view nor does it provide manipulation via a special subset or super-set of the host language, it’s just Lisp.

Keeping on the path of discussing the REPL, I’ll move on to the Evaluation part. However, before I do that it’s worth noting that all evaluation, for any language really, takes place in terms of a context. In Lithp the evaluation context is the global, dynamic environment, simulated below with environment:

1 environment.set("a", atom.Symbol("value"))
2 environment.get(atom.Symbol("a"))
3 #=> value

The code above simply sets the value of the string "a" to a symbol instance corresponding to value. This establishes a primed context that will allow me to evaluate sexpr from above:

1 sexpr.eval(environment)
2 #=> value

As shown above, evaluating sexpr in the context of the environment returns the value that the symbol a (as read earlier) returns value. This is the base case in Lithp, but structures can be read just as easy:

1 sexpr = rdr.get_sexpr("(car (quote (one two three)))")

The data type returned from the reader above is thus:

1 sexpr.__class__
2 #=> <class seq.List at 0x10b75bef0>

The type returned corresponds to Lithp’s notion of the cons-cell, List, which if you take a look you’ll notice is not a cons-cell in the strictest sense. I did this not only to leverage the Python language capabilities, but I also did it to illustrate a point later on when I talk about the implementation of core.lisp. That said, it will act like a cons-cell for the purposes of implementation:

1 sexpr.car()
2 #=> car

The car of the s-expression above is the symbol car:

1 sexpr.car().__class__
2 #=> <class atom.Symbol at 0x10b7bd050>

The cdr of the same s-expression is a little more interesting:

1 sexpr.cdr()
2 #=> ((quote (one two three)))

The cdr is of course another List instance representing the “rest” of the original s-expression:

1 sexpr.cdr().__class__
2 #=> <class seq.List at 0x10b75bef0>

So what happens if I try to evaluate the read s-expression? Observe:

1 sexpr.eval(environment)
2 # ValueError: Invalid symbol car

As I mentioned, all evaluations take place within the context of an environment. The environment instance created above currently only has a single binding a->value which is insufficient for Lithp to make sense of the s-expression supplied. As shown, it’s complaining about the car symbol which means that there is no binding for it. In the absence of a populated environment, the core Lithp interpreter only knows a few things:

  • Evaluation semantics (i.e. bindings lookup and argument evaluation)
  • Cons-cell structure
  • …and that’s it

To get any power out of Lithp requires that the evaluation context be populated with important bindings:

1 environment.set("car", Function(lithp.car))

The lithp.car method is the base implementation of the Lispy car function. It’s wrapped in an instance of the Function class which takes care of the binding lookup and evaluation logic. The Lithp REPL, during start-up, will load the global environment with the Magnificent Seven function and the two special forms. Of course, simply loading lithp.car is not enough:

1 sexpr.eval(environment)
2 # ValueError: Invalid symbol quote

I also need the quote binding:

1 environment.set("quote", Function(lithp.quote))

Now, I have a minimal environment needed to evaluate the read s-expression:

1 sexpr.eval(environment)
2 #=> one

The Lithp interpreter, will fully populate the core environment as such:

 1 self.environment.set("eq",     Function(self.eq))
 2 self.environment.set("quote",  Function(self.quote))
 3 self.environment.set("car",    Function(self.car))
 4 self.environment.set("cdr",    Function(self.cdr))
 5 self.environment.set("cons",   Function(self.cons))
 6 self.environment.set("atom",   Function(self.atom))
 7 self.environment.set("cond",   Function(self.cond))
 8 
 9 self.environment.set("lambda", Function(self.lambda_))
10 self.environment.set("label",  Function(self.label))

The rest of the REPL implementation is less interesting and mostly has to do with loading the core.lisp file, looping until full s-expressions are read, delegating evaluation to data instances, and the like. Now that a fully operational Lithp core is in place, I can now start showing the implementation of the core.lisp file, which contains some other interesting functions, including a Lisp interpreter!

Now the code described herein is not a lot of code at all. Indeed, Python is a nice language to serve as a host for a language like Lithp because it provides the essentials for hosting functionality directly in its core. It was a pleasure to work in Python, a language that I used in anger at the time, to discover for myself the stark simplicity of Lisp, a language that I would later use in my day job… so to speak.

The following core.lisp file will, unlike Lithp on Python, be implemented on Lithp which is also a language, like Python, that provides the essentials for hosting functionality directly in its core. The interpreter will almost exact implement McCarthy’s original as described below:

 1 eval[e; a] = [
 2     atom [e]  assoc [e; a];
 3     atom [car [e]]  [
 4         eq [car [e]; QUOTE]  cadr [e];
 5         eq [car [e]; ATOM]  atom [eval [cadr [e]; a]];
 6         eq [car [e]; EQ]  [eval [cadr [e]; a] = eval [caddr [e]; a]];
 7         eq [car [e]; COND]  evcon [cdr [e]; a];
 8         eq [car [e]; CAR]  car [eval [cadr [e]; a]];
 9         eq [car [e]; CDR]  cdr [eval [cadr [e]; a]];
10         eq [car [e]; CONS]  cons [eval [cadr [e]; a];
11                                    eval [caddr [e]; a]];
12         T  eval [cons [assoc [car [e]; a]; evlis [cdr [e]; a]]; a]];
13 eq [caar [e]; LABEL]  eval [cons [caddar [e]; cdr [e]];
14                              cons [list [cadar [e]; car [e]; a]];
15 eq [caar [e]; LAMBDA]  eval [caddar [e];
16                               append [pair [cadar [e]; evlis [cdr [e]; a]; a]]]

So on with it then.

core.lisp

With only the Magnificent Seven and the two special forms, one can implement many useful functions and values. Two interesting values are that of logical true and false. Historically, these values were bound to the identity t and nil as the empty list () respectively, which I do at the top of core.lisp:

1 (label t (quote t))
2 (label nil (quote ()))

Checking these values you’ll see:

1 t
2 ;;=> t
3 
4 nil
5 ;;=> ()

Now that those symbols are bound, I can use them in functions as truthiness. One such function is null that takes something and return t or () if it’s an empty list:

1 (label null (lambda (null_x)
2               (eq null_x nil)))

The implementation of null is really where the lollipop that is Lithp starts licking itself. That is, not only is it built on the core capabilities label, lambda, and eq, but it also uses the previously defined nil. This is nothing truly magical, but it hints at something, dare I say, breathtaking – Lithp itself can implement itself – I’ll call it Lithp’. Surely something like that will take just as many lines as the Python implementation right? Well, let’s see.

The Lithp’ dynamic environment

The first thing that Lithp’ needs is an environment. I started by describing the Lithp environment, so I might as well start with the same in Lithp’. However, to implement environments in Lithp I was able to leverage the host language Python’s classes to do so. However, since Lithp’ is built on Lithp there are no such language constructs as classes and methods and the like. Instead, all that Lithp has at its disposal are cons cells. Thankfully, and as I mentioned earlier, the cons cell has incredibly robust abstractive properties. Indeed, using the cons cell idea as the basis for a tuple structure is fairly straight-forwardly implemented as such:

1 (label pair (lambda (x y)
2               (cons x (cons y nil))

That is, a tuple in Lithp’ is just a pair of cons operations with a tail corresponding to nil.9 These pairs can be used as the basis for an environment where its entries consist of symbols bound to values:

1 (pair (quote a) (quote value))
2 ;;=> (a value)

Just based on this, we can build up an associative structure that maintains the spirit of the Environment class:

1 (label env nil)
2 ;;=> ()
3 
4 (cons (pair (quote a) (quote value)) env)
5 ;;=> ((a value))

Do you see an environment in there? If not, maybe showing how consing to the return of cons will help to illustrate:

1 (cons (pair (quote b) (quote valueb))
2   (cons (pair (quote a) (quote value)) env))
3 
4 ;;=> ((b valueb) (a value))

The value above represents an environment where there are two bindings:

  • The symbol a is bound to value
  • The symbol b is bound to valueb

If I want to find the value of a binding all I would need to do is write a function that performs a lookup by walking the structure above and returns the first value bound to a given symbol. Returning the first found binding is critical because I want to emulate the same evaluation semantics as Lithp. That is, Lithp’s implementation is such that bindings can override older bindings via the Environment.parent chain. I can do the same by dictating that newer bindings appear closer to the front of the Lithp’ environment list:

1 (label env+
2   (cons (pair (quote a) (quote new-a-value))
3     (cons (pair (quote b) (quote valueb))
4       (cons (pair (quote a) (quote value)) env))))
5 
6 ;;=> ((a new-a-value) (b valueb) (a value))

This structure is in every way a stack! To operate on this stack would be as easy as using cons as a push operation and cdr as a popping operation, which I’ll do later. However, what about that “lookup” function that I mentioned earlier? Observe:

1 (label lookup (lambda (name context)
2                 (cond ((eq (car (car context)) name) (car (cdr (car context))))
3                        (t (lookup name (cdr context))))))

The first part of the cond in the lookup function merely says that if the binding name at the head of the first pair (i.e. the car of the car) equals the name then return the binding value in that first pair (i.e. the car of the cdr of the car). The default case of the cond simply says to pop (i.e. cdr) the first binding pair and retry the lookup with the new environment. You can see lookup in action below:

1 (lookup (quote b) env+)
2 ;;=> valueb
3 
4 (lookup (quote a) env+)
5 ;;=> new-a-value

Indeed, the stack nature of the dynamic lookup of Lithp is preserved by Lithp’.

In case you were wondering, all of these operations are non-destructive:

1 env
2 ;;=> ()

That is, the value of env was never changed neither as a result of consing new stuff, nor by taking the cdr. I only highlight this point because it will become important in the implementation of the Lithp’ evaluation function. That is, when you’re dealing with immutable values, you’ll need ways to manage state changes. One straight-forward way available to Lithp is the label special form, but a better way is via the call stack. If you’ll notice, in the implementation of lookup I used the call stack to manage the decreasingly smaller versions of the context as binding pairs were popped off in search of a name match.

However, for the moment I can use pair to build up a seed environment for quick use:

1 (label env' (pair (pair (quote t) (quote t))
2             (pair (quote nil) nil)))
3 
4 ;;=> ((t t) (nil ()))

That is, just like Lithp, Lithp’ can start with the true and false values. Now that the environment is out of the way, I can start implementing the Lithp’ versions of the Magnificent Seven.

The Magnificent Seven’

Just as I did with Lithp, so too shall I implement the Magnificent Seven in Lithp’; it seems like a good fit. Therefore, I’ll now perform a round-about and once again talk about how I implemented the core functions on top of the Lithp interpreter.

quote’

Perhaps the simplest way to illustrate the modus operandi of the Lithp’ implementation is to show how a function quote' works. The implementation is as follows:

1 (label quote' (lambda (qexpr)
2                (car (cdr qexpr))))

That is, quote' is just another function. Like Lithp, the implementation of Lithp’ will rely on functions that exist in the host language. However, while Lithp’s host was Python, the implementation of Lithp’ will rely on Lithp as its host. The function quote' above expects an s-expression of the form (quote foo) as its input and will return the second element of the expression outright, that is, without evaluation. Actually, the quote' function doesn’t care one bit if the first element of the s-expression is the symbol quote. Instead, I’ll leave it up to another piece of code to decide when a form should dispatch into quote'. In any case, quote' works as follows:

1 (quote' (pair (quote quote) (quote a)))
2 ;;=> a

I’m using the previously defined function pair to build up an s-expression of the form (quote a) as below:

1 (pair (quote quote) (quote a))
2 ;;=> (quote a)

So that’s pretty simple no? What about a more complicated function?

atom’

The quote' function, like Lithp’s quote function is the only one that never evaluates its argument. Because of this fact I decided to show its implementation first. However, every other Lithp’ function will need access to an evaluation function to operate correctly. The function will be named simply eval and will operate, in principle, as follows:

1 (eval s-expression bindings)
2 ;;=> result

However, I’m going to hold-off writing eval until the end. Instead, I’ll just assume that it’s defined and call it inside of the Lithp’ implementation functions as needed. Therefore, the implementation of atom' is as follows:

1 (label atom' (lambda (aexpr bindings)
2                (atom (eval (car (cdr aexpr)) bindings))))

Once eval is in place, atom' will work as follows:

1 (atom' (pair (quote atom) (pair (quote quote) (quote a))) env')
2 ;;=> t
3 
4 (atom' (pair (quote atom) (quote t)) env')
5 ;;=> t
6 
7 (atom' (pair (quote atom) (pair (quote quote) (quote (a b c)))) env')
8 ;;=> ()

The examples above are a bit more complicated than for quote', but the principle of building a valid Lithp’ data structure representing a program is the same:

1 (pair (quote atom) (pair (quote quote) (quote a)))
2 ;;=> (atom (quote a))

The implementation of the next few Lithp’ implementation functions will be similar.

eq’

The Lithp’ equality operator will delegate down to the Lithp eq function after evaluating its two arguments:

1 (label eq' (lambda (eexpr ebinds)
2             (eq (eval (car (cdr eexpr)) ebinds)
3                 (eval (car (cdr (cdr eexpr))) ebinds))))

The eq' takes the whole form, so I have to throw away the first operator. There is no particular reason for doing it this way except that I will eventually place the delegation logic into eval function, separate from the handlers. Therefore, the form that comes into the eq' function can be built up programmatically for the purpose of testing:

1 (cons (quote eq)
2       (pair
3         (pair (quote quote) (quote a))
4         (pair (quote quote) (quote a))))
5 
6 ;;=> (eq (quote a) (quote a))

That is, the value returned is a Lithp list representing the code form handled by eq'. I could have just used (quote (eq (quote a) (quote a))) instead, and will hereafter, but decided to do the first couple of examples by hand (so to speak) to demonstrate that the input to Lithp’ truly is a data structure. Running the above against eq' yields:

1 (eq' (cons (quote eq)
2            (pair
3              (pair (quote quote) (quote a))
4              (pair (quote quote) (quote a))))
5      env')
6 
7 ;;=> t

Clearly, the symbol a is equal to another symbol a. The results of a failed check is thus:

1 (eq' (cons (quote eq)
2            (pair
3              (pair (quote quote) (quote a))
4              (pair (quote quote) (quote zzzzz))))
5      env')
6 
7 ;;=> ()

Now bear in mind that the t and () returned from the calls to eq' are those that exist in the context of Lithp and not Lithp’. That is, even though the bindings created in env' are the same values as the Lithp t and (), the return values of eq' are not retrieved through those bindings. This will not be a problem for the sake of correctness. Instead, I only point it out because I’ve not yet changed the semantics between Lithp and Lithp’, but I will later.

car’ / cdr’

Like eq', but car' and cdr' delegate down to Lithp as I see no need in changing their semantics either:

1 (label car' (lambda (caexpr cabinds)
2               (car (eval (car (cdr caexpr)) cabinds))))
3 
4 (label cdr' (lambda (cdexpr cdbinds)
5               (cdr (eval (car (cdr cdexpr)) cdbinds))))

Like with eq', I can test these using lists, but this time I’ll use the Lithp quote function:

1 (car' (quote (car (quote (a b c)))) env')
2 ;;=> a
3 
4 (cdr' (quote (car (quote (a b c)))) env')
5 ;;=> (b c)

These are both straight-forward, as is cons', shown next.

cons’

Once again, since the semantics of Lithp and Lithp’ cons is the same, I just delegate down:

1 (label cons' (lambda (coexpr cobinds)
2              (cons   (eval (car (cdr coexpr)) cobinds)
3                        (eval (car (cdr (cdr coexpr))) cobinds))))

And test directly:

1 (cons' (quote (cons (quote a) (quote (b c)))) env')
2 
3 ;;=> (a b c)

So far, the implementation of Lithp’ has been weak-sauce, but with cond' shown next, it gets a little more interesting.

cond’

Up until this point, quote' was the only Lithp’ function that didn’t evaluate its arguments. However, cond' may or may not evaluate its arguments, depending on the results of the conditional checks. The base implementation of cond' is trivial:

1 (label cond' (lambda (cndexpr cndbinds)
2                (eval-cond (cdr cndexpr) cndbinds)))

That is, it simply delegates its behavior to the eval-cond function which is a cheap way of doing some piece-wise transformation, much like I’d do with a let. The eval-cond function is tasked with pulling apart the Lithp’ cond form and checking each condition in turn until it finds one that resolves to a truthy value:

1 (label eval-cond (lambda (condition condbinds)
2                    (cond ((eval (car (car condition)) condbinds)
3                             (eval (car (cdr (car condition))) condbinds))
4                           (t (eval-cond (cdr condition) condbinds)))))

The eval-bind function is the first recursive Lithp function that I’ve shown. It’s not terribly complicated, in fact it just walks through a list of lists where each inner list contains two elements, a condition part and a consequent part:

1 (label cond-ex (quote (((eq a (quote (x y z))) (quote first-branch))
2                        ((eq a nil)             (quote second-branch))
3                       (t                     (quote default-case)))))

Using the form above, I can exercise cond' by changing the binding of the symbol a, which serves as the basis for two of the three cond conditions:

1 (cons (pair (quote a) (quote nothing)) env')
2 ;;=> ((a nothing) (t t) (nil ()))

Binding a to the symbol nothing should cause the cond to fall through to the default case:

1 (eval-cond cond-ex (cons (pair (quote a) (quote nothing)) env'))
2 ;=> default-case

Likewise, I can trigger the first branch by binding a to something else:

1 (eval-cond cond-ex (cons (pair (quote a) (quote (x y z))) env'))
2 ;=> first-branch

Finally, the second branch is triggered via another binding:

1 (eval-cond cond-ex (cons (pair (quote a) (quote ())) env'))
2 ;=> second-branch

As you can see, even though the body of the cond-ex uses nil in the second branch check, Lithp’ was able to resolve the binding in the passed environment.

Now that cond is taken care of, I can move on to the pinnacle of the Lithp’ implementation, eval.

eval

The eval function is designed to take two arguments: a Lithp’ expression and a context in which to evaluate it:

1 (label eval (lambda (expr binds)

The first part of eval checks if an expression is an atomic value such as a symbol t or nil or some arbitrary name:

1 (cond
2   ((atom expr) (lookup expr binds))

If the expression is an atomic value then its evaluation is just whatever its binding happens to be in the supplied context. However, in the case where the expression itself is not an atom, it must be a list, in which case I just look at the first element:

1   ((atom (car expr))

In Lithp’, just as in Lithp (and most Lisps), the first element of a list is either a function or a special form. Since I’ve implemented the backing function for the Lithp’ special forms using Lithp, I can use what is in effect a dispatch table from the first-element symbol to the implementation:

1 (cond
2   ((eq (car expr) (quote quote)) (quote' expr))
3   ((eq (car expr) (quote atom))  (atom'  expr binds))
4   ((eq (car expr) (quote eq))    (eq'    expr binds))
5   ((eq (car expr) (quote car))   (car'   expr binds))
6   ((eq (car expr) (quote cdr))   (cdr'   expr binds))
7   ((eq (car expr) (quote cons))  (cons'  expr binds))
8   ((eq (car expr) (quote cond))  (cond'  expr binds))

As shown above, the cond above checks each first element against the known Lithp’ core functions and delegates to the Lithp handlers to evaluate them. However, what happens if a function is not directly resolvable via the lookup table above? In that case, the eval function handles it specially:

1 (t (eval (rewrite expr binds) binds))))

That is, eval rewrites the expression such that the original name at the first of the list is replace with whatever its binding happens to be:

1 (label rewrite (lambda (rexpr rbinds)
2                  (cons (lookup (car rexpr) rbinds)
3                        (cdr rexpr))))

This action can be shown below:

1 (rewrite (cons (quote a) (quote (b c)))
2          (cons (pair (quote a) (quote ______)) env'))
3 
4 ;;=> (______ b c)

This behavior is interesting because it signals the first real point of departure of Lithp’ from its host Lithp. That is, Lithp doesn’t rewrite function applications in this way, but I’ve decided to do so for Lithp’. What I haven’t shown is what happens when eval receives a rewritten expression of the form ((lambda (n) n) nil). I’ll defer the answer to that question until later and instead show one more core form label that’s also rewritten, but in a different way.

label

When examining how to implement the label form for Lithp’ there were a few ways that it could have been done. The obvious way is to view label as a binding creator and evaluate the bound value for the purpose of creating a new environment. In this case, the evaluation is as simple as just looking up the binding name in the augmented environment. This technique is less than thrilling. However, I found a more interesting technique in Paul Graham’s paper The Roots of Lisp. Indeed, Graham’s implementation of eval forms the primary inspiration for the implementation of Lithp’, even though I change it up a bit as the implementation of label below will show.

That said, observe the implementation of label below:

1 ((eq (car (car expr)) (quote label))
2  (eval (cons (car (cdr (cdr (car expr)))) (cdr expr))
3        (cons (pair (car (cdr (car expr)))
4              (car (cdr (cdr (car label-ex))))) binds)))

What’s happening here is a bit more subtle than what’s going on in the rewrite function shown above. I’ll pull this subtlety apart using the following example:

1 (label label-ex (quote ((label a (lambda (id) id)) t)))
2 
3 ;;=> ((label a (lambda (id) id)) t)

That is, to see what happens to a Lithp’ form of ((label a (lambda (id) id)) t) I can show what happens in the parts of the implementation:

1 (cons (car (cdr (cdr (car label-ex)))) (cdr label-ex))
2 ;;=> ((lambda (id) id) t)

The first part of the call to the recursive eval is very similar to the way that rewrite works. That is, it creates a call to the lambda form bound to a with the original argument t. You probably already know what happens when this form is encountered; the value t is returned. However, the second argument to the recursive eval call builds a bit of a weird context:

1 (cons (pair (car (cdr (car expr)))
2       (car (cdr (cdr (car label-ex))))) binds)
3 
4 ;;=> ((a (lambda (x) x)) (t t) (nil ()))

That is, the value a is bound to the lambda that was rewritten to replace the label body. So what’s the point? The answer is simply that should I wish to build a Lithp’ interpreter around eval I would like that the environment coming out of the label implementation be usable to evaluate further forms. Graham’s original rewrote the label form to be that of the lambda call as above, with the binding a bound to the label form itself. In both cases, it really doesn’t matter because the rewriting ensure that the evaluation doesn’t touch the a binding at all.

Observe what happens when it’s all put together:

1 (eval (cons (car (cdr (cdr (car label-ex)))) (cdr label-ex))
2       (cons (pair (car (cdr (car label-ex))) (car label-ex)) env'))
3 ;;=> t

This is the same as:

1 ((lambda (id) id) t)

And finally, I can show how eval handles the lambda form next.

lambda

The keystone of the Lithp’ implementation is the lambda evaluation logic. As I’ve shown, the eval expects something like ((lambda (id) id) t) as its input form. The following code looks for just such a form and again performs a different rewrite to provide an evaluation context for the lambda body:

1 ;; Lambda handling
2 ((eq (car (car expr)) (quote lambda))
3 (eval (car (cdr (cdr (car expr))))
4        (append (zip (car (cdr (car expr))) (eval-args (cdr expr) binds))
5                binds)))

There’s a lot going on here, so let me take it piece by piece. First, the append function probably does what you’d think – it takes two lists and tacks them together end to end:

1 (label append (lambda (append_x append_y)
2                 (cond ((null append_x) append_y)
3                        (t (cons (car append_x)
4                                 (append (cdr append_x) append_y))))))

Yet another recursive function is append, indeed, to do anything interesting in Lithp recursion is required. Regardless, append works as follows:

1 (append (quote (a b c)) (quote (d e f)))
2 
3 ;;=> (a b c d e f)

A slightly more interesting function is the zip function used above:

1 (label zip (lambda (x y)
2              (cond ((and (null x) (null y)) nil)
3                    ((and (not (atom x)) (not (atom y)))
4                     (cons (pair (car x) (car y))
5                           (zip (cdr x) (cdr y)))))))

The zip function takes two lists and builds an environment-like pairing of the elements of both, like so:

1 (zip (quote (a b c)) (quote (d e f)))
2 
3 ;;=> ((a d) (b e) (c f))

With these two functions in place, I can show how the parts of the lambda handling work in conjunction to rewrite. The example is as I’ve already shown:

1 (label lambda-ex (quote ((lambda (id) id) t)))

Now, to build up a context for use in resolving the body of the lambda I need a way of taking the argument bindings and evaluating them so that they take on values within the body of the lambda. The function eval-args does just this:

1 (label eval-args (lambda (args binds)
2                    (cond ((null args) nil)
3                           (t (cons (eval  (car args) binds)
4                                    (eval-args (cdr args) binds))))))

In the core environment env' the names (t nil) will evaluated as such:

1 (eval-args (quote (t nil)) env')
2 
3 ;;=> (t ())

These values will zip with the arguments to the lambda form to build a function-level binding context used to evaluate the lambda body, pulled out with:

1 (car (cdr (cdr (car lambda-ex))))
2 
3 ;;=> id

In other words:

1 (zip (car (cdr (car lambda-ex))) (eval-args (cdr lambda-ex) env'))
2 
3 ;;=> ((id t))

That is, the body id gets evaluated in the augmented context ((id t) (t t) (nil ())) to result in the symbol t.

And that’s all there is to it. This is the essence of the beauty discovered by John McCarthy.

Conclusion

Lithp is just a humble little Lisp created in the spirit of discovery and learning. It was incredibly instructive to create and I hope by walking through it in this installment you too might have discovered something new and interesting. There are many different ways to take this implementation. From Linear Lisp to macros to Lithp’’, the possibilities are endless. In future installments I’ll touch back with Lithp and Lithp’ and highlight other angles of discovery. But for now perhaps you could go off and implement the same using your favorite programming language?

Further reading

John McCarthy. 1960. Recursive functions of symbolic expressions and their computation by machine, Part I. Commun. ACM 3, 4 (April 1960), 184-195. DOI=10.1145/367177.367199 http://doi.acm.org/10.1145/367177.367199

Robert R. Fenichel , Jerome C. Yochelson, A LISP garbage-collector for virtual-memory computer systems, Communications of the ACM, v.12 n.11, p.611-612, Nov. 1969

Paul Graham. 2002. Roots of Lisp. http://www.paulgraham.com/rootsoflisp.html

Pascal Costanza , Richard P. Gabriel , Robert Hirschfeld , Guy L. Steele, Jr., Lisp50: The 50th birthday of lisp at OOPSLA 2008, Companion to the 23rd ACM SIGPLAN conference on Object-oriented programming systems languages and applications, October 19-23, 2008, Nashville, TN, USA

John McCarthy, A basis for a mathematical theory of computation, preliminary report, Papers presented at the May 9-11, 1961, western joint IRE-AIEE-ACM computer conference, May 09-11, 1961, Los Angeles, California

Guy Steele. Growing a Language. Keynote at the 1998 ACM OOPSLA conference. https://www.youtube.com/watch?v=_ahvzDzKdB0

Hubris in tech

hubris: n. Overbearing pride or presumption; arrogance.


calibris: n. The insistence that tech is a pure meritocracy.

chewbris: n. An outmoded management style based around intimidation, threats of firing “if you can’t get this done” and making veiled criticisms and rude comments immediately proceeded by “just kidding.”

dubris: n. The condition of being so high that re-implementing some framework sounds like a great idea.

ewbris: n. see Not-Invented-Here

foobris: n. Programming documentation filled with examples rife with meta-syntactic variables.

gnubris: n. An insistence that every open-source software developer adhere to your definition of what open-source should be.

hubris.js: n. Probably an MVC framework of some sort.

kubris: n. An insistence that every programming book should contain everything needed to understand said book.

lugubris: n. see concern-troll

n00bris: n. The pride of a newcomer to enter a new programming community and immediately start suggesting changes.

oopris: n. The idea that even though you don’t know what OOP means, you’ll know it when you see it.

prebris: n. The insistence that some concept is so important that a) everyone should already understand it and b) therefore it should not be explained.

protubris: n. That ugly utility class holding all of the methods that should have been functions, but weren’t allowed to be.

q-bris: n. Pride so deep that one feels the need to jump all over anyone skeptical of your views.

rubris: n. Excessive pride in one’s programming language or framework.

shoebris: n. The idea that something that is stunted and with chunky buttons and bright colors automatically makes it a good learning environment for children.

tumblis: n. An opinion blog without a single citation.

Viewbris: n. The excessive pride of the MVC framework author.

Wu-bris: n. The idea that all value measurements must occur through a commerce-colored lens.

Final thoughts

I hope you’ve enjoyed this installment - it was a long time coming. Please do not hesitate to reach out via the contact information below to discuss and provide feedback. All issues of Read-Eval-Print-λove are meant to be “living” in the sense that they can and will change as new ideas occur to me and feedback is provided. The essays herein can and will become all the better with your help.

Thank you.

Translations welcomed

If you’re interested in translating issues of Read-Eval-Print-λove then please contact me using the information below. I will happily share proceeds equally (as humble as they might be) with translators.

Michael Fogus — a core contributor to Clojure and ClojureScript. Creator of programming source codes.

A fervent blogger at Send More Paramedics and also co-author of the book, The Joy of Clojure and author of Functional JavaScript.

Author contact

Discussion and information

  1. Naggum was referring to Arc when he made this statement. The ego mentioned belonged to Paul Graham. The points made by Naggum in that post ran deeper than taking a shot at Graham.
  2. This is not exactly true since the Common Lisp spec intentionally leaves many key decisions up to the vendors. While I understand why the deferment was done and actually think it was the right move given the goals of the language, the choice of one vendor over another is a non-trivial decision. That said, if you think that Common Lisp is the right language then the complexities around vendor choice are necessary and should not dissuade anyone truly interested in building rock-solid systems.
  3. Frankly, the exploration continues and will continue for the foreseeable future.
  4. This code-base is at times in flux and may or may not always match the implementation detailed herein.
  5. The choice for Python was one of practicality, as at that time I was using it extensively on day-job activities.
  6. Back in the day when Graham actually wrote about Lisp. Sadly, his Lisp illumination has all but died off, but I suppose I can understand – there’s no money in it.
  7. As far as Ur-Lisps go, this is not too much of a monstrosity if I do say so myself – my implementation of Lisp in 25 lines of Ruby however…
  8. There is no particular reason that a primordial Lisp needs to implement cond, in fact something like if would work just as well.
  9. This is not to be confused with a dotted-pair which is a cons cell with a value at the head and something other than a cons cell as its tail. McCarthy’s original Lisp did not support dotted pairs and neither do I.