Leanpub: Publish Early, Publish Often

world hello

The Forth programming language was created by Charles (Chuck) Moore as the basis for a personal programming environment. After working on this system for some time in isolation, Moore decided to extract a Forth implementation for use in the National Radio Astronomy Observatory’s 11-meter radio telescope in 1971. Imagine this for a moment if you please. Chuck Moore, some time in 1968 found that the computing environments of the day were insufficient for his needs and decided to create his own from whole-cloth. Over the next 2-3 years Moore developed a computing environment suited to his philosophy of programmatic style. A crucial components of this system was the programming language forming its underpinnings: Forth.

By modern standards, Forth is an odd duck. First, the language is stack-centric, meaning that every operation in the language deals either implicitly or explicitly with a stack-based programming metaphor. This is decidedly different from a programming language like C that may use as an implementation details an underlying stack to pass function activations. In Forth, a program consisting of a single token can demonstrate how its stack is central in its programming model:

1 9

The Forth program above is interesting in that it illustrates the conceptual underpinnings of the language. In most modern programming languages, a program containing the number 9 would be quite pointless, but in Forth such a program would cause an effect in the Forth environment itself. That is, the number 9 in Forth isn’t just a constant integer, instead it’s a command to the Forth environment that states “push the integer 9 onto the data stack.” The data stack in Forth is a global scratchpad in Forth environments used by operations to retrieve and store the results of functions.

A larger Forth program is shown below:

1 36 9 /

In addition to the commands to push 36 and 9 onto the stack, there’s a new command called / that states: “pop something off of the stack and make it the denominator and then pop another thing off and make that the numerator and then push the resulting quotient onto the stack.” At the end of this program you’ll not be surprised to learn that the number 4 is the only element on the data stack.

Before I go any further I’d like to talk a little bit about the word “concatenative.” Very often when discussing Forth and languages of its ilk (the pedigree being Factor, Joy, Open Firmware, and PostScript) you’ll see this term bandied about, but what does it really mean? From a simplistic perspective, the term concatenative means “programs are built by laying code words side by side,” but again this is nearly meaningless. Instead, let me create a very simple little Clojure function that might help clarify the matter:

1 (defn postfix [& words]
2   (reduce #(if (fn? %2)
3              (let [[l r & m] %]
4 		       (cons (%2 r l) m))
5 		       (cons %2 %)) 
6 		  [] 
7 		  words))

The postfix function is fairly straight-forward. It takes a number of arguments (words) that represent either functions or non-functions. A sample call looks as follows:

1 (postfix 5 1 2 + 4 * + 3 -)

It then walks the words and when it finds a function it calls it with the top-2 elements of the “stack” (the words list) and plops the result back in place, otherwise it just plops the word. Eventually, what comes out of postfix is something that represents a stack, a la:

1 (postfix 5 1 2 + 4 * + 3 -)
2 ;;=>  (14)
3 
4 (postfix 3 2 1 + *)
5 ;;=> (9)
6 
7 (postfix 1 2 3 * +)
8 ;;=> (7)

Now imagine that the vector [5 4 *] represents a valid program:

1 (apply postfix [5 4 *])
2 ;;=> (20)

I could literally concatenate another valid program onto this vector in order to create a new valid program:

1 (apply postfix (concat [5 4 *] [30 10 +]))
2 ;;=> (40 20)

Likewise, I could concatenate an operator into that result to create yet another valid program:

1 (apply postfix
2   (-> [5 4 *]
3       (concat [30 10 +])
4 	  (concat [+])))
5 ;;=> (60)

Ad infinitum.

As shown, programs built with concatenative languages are composed via the act of “laying” operators, or even whole programs, side by side. Additionally, like in natural language, Forth words can directly replace longer phrases. Observe:

1 5 dup *

When the program above is run, the data stack will look as follows:

1 |  25  |
2 |      |
3 +------+

That is, the only element on the stack will be the number 25 because the program 5 dup * means “duplicate the thing on the top of the stack and push the duplicate on top of it, then take two things off of the stack (two 5s) and multiply them together, pushing the result back onto the stack.” The program fragment dup * is therefore equivalent to the act of squaring the number on the top of the data stack. Indeed, I could use those two words to define a new word, as shown below:

1 : sq dup * ;

I’ll talk more about the : operator later, but for now you only need to understand that it creates a new word named sq consisting of the two words dup *. Now, I can directly replace the original longer phrase dup * with sq to keep the original behavior:

1 5 sq

This is a common work-flow in Forth-like languages:

Write some code to perform a task
Identify some fragment that might be generally useful
Extract it and give it a name
Replace the relevant bits with the new word

There is nothing special about this work-flow except that it’s indicative of a philosophical underpinning of Forth-like languages driving code toward a highly-layered implementation. I’ll dive into this angle later.

In this installment of Read-Eval-Print-Love I’ll discuss Forth by first dissecting a small Ruby implementation called “Russ-Forth”¹, which is a simplified Forth (e.g. no return stack). Once the run-time for Russ-Forth is in place I’ll implement the bare bones of a core library and talk about stack-shufflers, combinators, and the like. I’ll then talk about the ideas in the amazing book “Thinking Forth” by Leo Brodie which describe an interesting bottom-up approach to system design.

While Forth is an interesting programming language in its own right, the language itself is a philosophical statement by Chuck Moore (Brodie 1984) about how programs should be constructed. Of course, I’ll have fun with a tiny Forth-like implementation, but the point of this installment is to establish a context for describing the Forth philosophy. I’m personally of the opinion that far more important than the programming language itself, the idea of Forth is key.

The name is an homage to my friend and colleague Russ Olsen, who posted the original code to me as a comment on one of my GoodReads reviews. Thanks Russ!↩

Up next

Russ Forth: a simple Forth in Ruby