Forth Thinking
In Forth-like languages, programs are often built from constituent parts such as those implemented so far. This may not seem unique, given that the same can be said of any programming language. However, Forth programmers often strive to write their programs bottom-up, building layers of fluency that inform every layer above. In other words, each layer in a Forth program is a special-purpose language geared specifically to support the layers above it. Nor are Forth programs one-off deals. Instead, it’s a point of pride amongst Forth programmers that code be aggressively reusable (Pountain 1987). These two concerns, I think, form a powerful approach to programming that I’ll discuss (all too) briefly in this section.
Thinking Forth
In August 2010 I stumbled on a book that deeply influenced my personal programming style and my views on how programs are constructed. The book in question was Thinking Forth by Leo Brodie (Brodie 1987) and upon reading it I immediately put it into my own “personal pantheon” of influential programming books (along with SICP, AMOP, Object-Oriented Software Construction, Smalltalk Best Practice Patterns, and Programmers Guide to the 1802). Up until that point most of the programming books that I had read focused on code, but Thinking Forth spent a significant portion of its pages discussing the thought processes behind program construction.
While I’ve read other books that touched on programming from an angle of thoughtfulness, Thinking Forth was the first that I read that drew an essential marriage between the language in use and the thought processes advocated by it. Bear in mind that this was a deeper relationship than one that you might find in a programming language’s idioms. Indeed, while idioms are often the result of discovery, they are not something that you might consider philosophy. Instead, the very idea of Forth necessarily informs the base tenet of “factoring” as discussed in the book.
Factoring
Factoring in Forth is akin to “refactoring” in common parlance. The main difference is that refactoring is applied to an existing code-base, while factoring is applied to an evolving one. That is, refactoring happens after the fact while factoring happens in situ. Thinking Forth therefore identifies and discusses a number of cues that arise while coding and signal the need for factoring, including, but not limited to:
- Factor when complexity tickles your conscious limits
- Factor when you’re able to elucidate a name for something
- Factor at the point when you feel that you need a comment
- Factor the moment you start repeating yourself
- Factor when you need to hide detail
- Factor when your command set (API) grows too large
- Don’t factor idioms
Using RussForth I’ll touch briefly on a few of these points below.
Factor when complexity tickles your conscious limits
The book puts stock in George Miller’s famous “Magical Number Seven…” paper (Miller, 1956) when discussing the notion of code’s cognitive load. The gist is that people can generally hold only 7 (± 2) pieces of information in their head, so it behooves the programmer to write code that stays within those bounds. Because Forth is a stack-based language, the ongoing stack manipulation from one word to another can be quite confusing. Granted, I’m very far from even a novice in Forth-like languages, so such a statement should be taken with a grain of salt. That said, let’s explore a simple operation that one could conceivably find useful: applying two separate quotations to a single value (Childers, 2016). Take a stack of the following form:
12 [ 3 * ] [ 4 * ]
An expanded description of this sequence could be stated as, “multiply 12 by 3 and push the result onto the stack, then multiply 12 by 4 and push that result onto the stack too.” Accomplishing this requires a series of stack manipulations of the form:
rot dup rot apply swap rot apply swap
Breaking this down, let’s see what’s happening. To start, the stack looks as follows:
| [ 4 * ] |
| [ 3 * ] |
|   12    |
+---------+
The application of rot causes the following:
| [ 4 * ] |    |   12    |
| [ 3 * ] | -> | [ 4 * ] |
|   12    |    | [ 3 * ] |
+---------+    +---------+
Next, the application of dup causes the following:
|         |    |   12    |
|   12    |    |   12    |
| [ 4 * ] | -> | [ 4 * ] |
| [ 3 * ] |    | [ 3 * ] |
+---------+    +---------+
Next, the application of rot again causes the following:
|   12    |    | [ 4 * ] |
|   12    |    |   12    |
| [ 4 * ] | -> |   12    |
| [ 3 * ] |    | [ 3 * ] |
+---------+    +---------+
Next, the application of apply causes the following:
| [ 4 * ] |    |         |
|   12    |    |   48    |
|   12    | -> |   12    |
| [ 3 * ] |    | [ 3 * ] |
+---------+    +---------+
Now I want to use swap rot to move that result out of the way and prep the next operation, causing the following:
|   48    |    | [ 3 * ] |
|   12    | -> |   12    |
| [ 3 * ] |    |   48    |
+---------+    +---------+
Then the next use of apply almost gets me there:
| [ 3 * ] |    |         |
|   12    | -> |   36    |
|   48    |    |   48    |
+---------+    +---------+
But since I wanted the stack to look a certain way a final swap is needed leaving:
|   48    |
|   36    |
+---------+
And that’s it! Unfortunately, between the beginning and the end of putting this sequence of words together, I’ve forgotten what’s happened with the stack. I’ve used 8 words to perform this task, but most of those are raw stack shufflers that on their own are somewhat opaque to the task at hand. The heart of the task lies in the apply words. That is, the shuffling exists entirely to set up the calls to apply, which are informative in their own right. That said, I can replace some of the words with another apply-like operator, dip, to trim a word and still maintain the focal point of information:
rot dup rot dip rot apply swap
I’ll avoid walking through the stack manipulations again, but I’ll talk just a moment about what this achieves. First, it trims a word and gets the whole sequence into that “7 things” range. Second, while trimming, it also keeps the sequence grouped into focused setup/process quanta of information:
rot dup rot dip
rot apply
swap
With more built-in features, this sequence could be further simplified, but I hope that my point is understood.
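To double-check the walkthrough above, here is a minimal sketch in Python that models the stack words used in this section. It is a toy model rather than RussForth itself: the `run` helper, the `WORDS` table, and the list-based encoding of quotations are all assumptions made for illustration, with each word's behavior taken from the stack pictures above.

```python
# Toy model of the stack words from this section (not RussForth itself).
# The stack is a Python list whose LAST element is the top of the stack.
# A quotation such as [ 3 * ] is modelled as the list [3, "*"].

def run(tokens, stack, trace=False):
    """Run tokens left to right: strings name words, anything else is pushed."""
    for tok in tokens:
        if isinstance(tok, str):
            WORDS[tok](stack)
        else:
            stack.append(tok)
        if trace:
            print(f"{str(tok):<6} {stack}")
    return stack

def rot(s):    # a b c -- b c a : bring the third item to the top
    s.append(s.pop(-3))

def dup(s):    # a -- a a
    s.append(s[-1])

def swap(s):   # a b -- b a
    s[-1], s[-2] = s[-2], s[-1]

def mul(s):    # a b -- a*b
    s.append(s.pop() * s.pop())

def apply(s):  # quot -- ... : run the quotation on top of the stack
    run(s.pop(), s)

def dip(s):    # x quot -- ... x : run quot underneath x, then restore x
    quot, x = s.pop(), s.pop()
    run(quot, s)
    s.append(x)

WORDS = {"rot": rot, "dup": dup, "swap": swap, "*": mul,
         "apply": apply, "dip": dip}

start = [12, [3, "*"], [4, "*"]]
eight_words = "rot dup rot apply swap rot apply swap".split()
seven_words = "rot dup rot dip rot apply swap".split()

result_eight = run(eight_words, list(start), trace=True)  # prints each step
result_seven = run(seven_words, list(start))
print(result_eight, result_seven)  # [36, 48] [36, 48]
```

Both sequences end with 48 on top of 36, so under this model swapping `apply swap` for `dip` trims a word without changing the result.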
Factor when you’re able to elucidate a name for something
It seems that a process for applying two quotations to a single value might be generally useful. Once it becomes clear that a fragment of code is generally useful, it follows that it should have a name. [1] The name [2] that I might choose comes straight from the description given earlier, “apply two separate quotations to a single value”:
: apply2 rot dup rot dip rot apply swap ;
And now the original fragment becomes:
12 [ 3 * ] [ 4 * ] apply2
.S
\ => [36, 48]
Which leaves the stack in the condition that we saw earlier.
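The same idea can be sketched as a tiny interpreter in Python, where defining a word just adds a name to a lookup table. This is a rough model under stated assumptions, not RussForth's implementation: `parse`, `run`, `evaluate`, and `WORDS` are hypothetical names, a definition must sit on one line of the form `: name body ;`, and the stack is a Python list with its top at the end.

```python
# Toy interpreter showing that ": name ... ;" simply names a word sequence.
# Hypothetical model only; the stack is a list with its top at the end.

def parse(tokens):
    """Nest the tokens between [ and ] into Python lists (quotations)."""
    current, outer = [], []
    for tok in tokens:
        if tok == "[":
            outer.append(current)
            current = []
        elif tok == "]":
            quot, current = current, outer.pop()
            current.append(quot)
        else:
            current.append(int(tok) if tok.isdigit() else tok)
    return current

def run(body, stack):
    """Strings are looked up as words; numbers and quotations are pushed."""
    for item in body:
        if isinstance(item, str):
            WORDS[item](stack)
        else:
            stack.append(item)
    return stack

def dip(s):  # x quot -- ... x : run quot underneath x, then restore x
    quot, x = s.pop(), s.pop()
    run(quot, s)
    s.append(x)

WORDS = {
    "rot":   lambda s: s.append(s.pop(-3)),            # a b c -- b c a
    "dup":   lambda s: s.append(s[-1]),                # a -- a a
    "swap":  lambda s: s.extend([s.pop(), s.pop()]),   # a b -- b a
    "*":     lambda s: s.append(s.pop() * s.pop()),    # a b -- a*b
    "apply": lambda s: run(s.pop(), s),                # run the top quotation
    "dip":   dip,
}

def evaluate(source, stack):
    """Evaluate line by line; a line starting with ':' defines a new word."""
    for line in source.strip().splitlines():
        tokens = line.split()
        if tokens and tokens[0] == ":":
            # Drop the leading ':' and name, and the trailing ';'.
            name, body = tokens[1], parse(tokens[2:-1])
            WORDS[name] = lambda s, body=body: run(body, s)
        else:
            run(parse(tokens), stack)
    return stack

program = """
: apply2 rot dup rot dip rot apply swap ;
12 [ 3 * ] [ 4 * ] apply2
"""
result = evaluate(program, [])
print(result)  # [36, 48]
```

Once `apply2` is in the table it is indistinguishable from the built-in words, which is the sense in which each named fragment extends the language available to the layers above it.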
Factor at the point when you feel that you need a comment
Imagine that I wanted a version of dip that used its stored value in the quotation rather than extracting it for later pushing. I would need to use stack shufflers to duplicate the stored value so that it could be used in the quotation:
5 10 2 [ * ] over swap \ use the stored value
dip
Graphically (omitting the 5 at the bottom of the stack, which is never touched) this would look like the following:
| [ * ] |
|   2   |
|  10   |
+-------+
With the application of over swap the following would occur:
|       |    | [ * ] |
| [ * ] |    |   2   |
|   2   |    |   2   |
|  10   | -> |  10   |
+-------+    +-------+
After dip the final stack would be:
|   2   |
|  20   |
+-------+
As shown, it’s tempting to put a comment into the original to explain how the use of over swap works to weave the stored value back into the quotation application, but there’s a better way. That is, it’s better to factor out a new combinator instead:
: sip over swap dip ;
The comment is now manifested as a reusable (and testable) word. [3]
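One payoff of turning the comment into a word is that the word can be tested directly. As a hedged illustration, the Python sketch below models `over`, `swap`, and `dip` as functions on a list-based stack (top at the end) and builds `sip` from them in the same order as the definition above. Modelling a quotation as a plain Python function is an assumption made for brevity, and `mul` and `nop` are hypothetical stand-ins for `[ * ]` and `[ ]`.

```python
# Toy model: the stack is a list (top at the end) and a quotation is a
# Python function that takes the stack. Not RussForth itself.

def over(s):   # a b -- a b a : copy the second item to the top
    s.append(s[-2])

def swap(s):   # a b -- b a
    s[-1], s[-2] = s[-2], s[-1]

def dup(s):    # a -- a a
    s.append(s[-1])

def dip(s):    # x quot -- ... x : run quot underneath x, then restore x
    quot, x = s.pop(), s.pop()
    quot(s)
    s.append(x)

def sip(s):    # : sip over swap dip ;
    over(s)
    swap(s)
    dip(s)

def mul(s):    # stands in for the quotation [ * ]
    s.append(s.pop() * s.pop())

def nop(s):    # stands in for the empty quotation [ ]
    pass

stack = [10, 2, mul]
sip(stack)
print(stack)   # [20, 2]

with_sip, with_dup = [7, nop], [7]
sip(with_sip)
dup(with_dup)
print(with_sip == with_dup)  # True
```

The first check reproduces the final stack pictured above (2 on top of 20), and the second mirrors the observation that `[ ] sip` behaves like `dup`.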
Don’t factor idioms
I alluded earlier to the fact that idioms do not operate at the philosophical level of a given programming language. Instead, idioms are natural growths occurring on a programming language. That said, very often the form of an idiom is directly influenced by the philosophical underpinnings of a language. Like idioms in natural languages, programming idioms should be viewed as atomic units with meaning quite independent of their constituent parts. Therefore, it’s important to leave idioms intact and resist the urge to factor them out in whole. Hardcore Forth implementations like arrayForth (based on colorForth), Gforth, and Open Firmware have their own rich sets of idioms, but underneath those idioms lies the Forth philosophy, some more than others.
Conclusion
Forth is an astonishing programming language. The very design and (most) implementations revolve around the idea that there’s a conceptual distance between the ideas in your head and the code on the screen and that distance should be as short as possible. Sadly, I simply cannot adequately cover the whole of the beauty of Forth thinking in this small space. There is so much going on in Thinking Forth that it literally made me dizzy while reading it. With thoughts on factoring, code organization, testing, DSLs, encapsulation, data-hiding, variable naming, word length, decomposition, and design, it is, ideas-per-page, unmatched in the realm of programming books.
[1] If I ever get the urge to write a “Programming for Buddhists” book then Forth (or some other concatenative language) is my choice for the language. There is a nice parallel between the ideas of rupa, Maya, and nama-rupa that would be a blast to write about.
[2] This word is typically called bi.
[3] Interestingly, [ ] sip is equivalent to dup (Kirby, 2002).