Notes

A Crash Course in Lisp

1RUTILS provides an abbreviation of the standard funcall to simply call. It was surely fun to be able to call a function from a variable back in the 60’s, but now it has become so much more common that there’s no need for the prefix ;)

Essential Data Structures

Data Structures

1Moreover, Python has special syntax for destructuring such tuples: dec, rem = truncate(3.14). However, this is not the optimal way to handle returning the primary and one or more secondary values from a function. Lisp provides a more elegant solution called multiple values: all the necessary values are returned via the values form: (values dec rem) and can be retrieved with (multiple-value-bind (dec rem) (truncate 3.14) ...) or (rtl:with ((dec rem (truncate 3.14))) ...). It is more elegant because secondary values may be discarded at will by calling the function in a usual way: (+ 1 (truncate 3.14)) => 4 — not possible in Python, because you can’t sum a tuple with something.

2Actually, the complexity here is O(n^2) due to the use of the function member that performs set membership test in O(n), but it’s not essential to the general idea. If uf-disjoint is expected to be called with tens or hundreds of points the roots structure has to be changed to a hash-set that has a O(1) membership operation.

Arrays

1or void* in C, or some other type that allows any element in your language of choice

2Such incompatibility errors are not a cheap thing: for instance, it is reported that the crash of the first Arian V rocket happened due to interoperation of two programs that used the metric and the imperial measurement systems without explicit conversion of the data. There’s an elegant solution to such problem: “dimensional numbers”, which a custom reader macro to encode the measure alongside the number. Here is a formula expressed with such numbers:

The output is, of course, in metric units. Unfortunately, this approach will not be useful for arrays encoded by different languages as they are obtained not by reading the input but by referencing external memory. Instead, a wrapper struct/class is, usually, used to specify the elements order.

Linked Lists

1While, in the Lisp machines, cons cells even had special hardware support, and such change would have made it useless.

2Although, for structs, it is implementation-dependent if this will work. In all the current implementations, it will.

3To further learn about this topic I would recommend reading the relevant chapters from the book Practical Common Lisp: Generic Functions and Classes.

4Symbol plists represent one of the unpleasant legacy features of the language, in that the most obvious accessor name, namely get, is reserved for working with symbols. Therefore, this name can not be used for accessing other kinds of data. Historically, symbol plists where the first and only variant of key-values available in the language (at that time, the other languages didn’t have the slightest idea of such a high-level concept).

Derivative Data Structures

Trees

1This statement is strictly true for balanced trees, but, even for imbalanced trees, such estimation is usually correct.

2Although it was shown that this value may also be reduced to a single bit if the tree is implemented as a rank balanced tree with delta ranks allowed of 1 or 2 meaning: “when going upward there is an additional increment in height of one or two”

Strings

1A proper suffix is a suffix that is at least one character shorter than the string itself. For example, in the string abc the proper suffices are bc and c.

2Perl is only the most conspicuous example of a large number of popular programs that use the same algorithm; the same applies to Python, or PHP, or Ruby, or many other languages.

Selected Algorithms

Dynamic Programming

1If you wonder, s is a word that is usually present in English programmatic dictionaries because when it's and friends are tokenized they’re split into two tokens, and the apostrophe may be missing sometimes. Also, our dictionary is case-insensitive.

2The intuition for it is the following: in the worst case, every character has two choices: either to be the last letter of the previous word or the first one of the next word, hence the branching factor is 2.

3Actually, the condition of complete string coverage may be lifted, which will allow to use almost the same algorithm but skip over “undictionary” words like misspellings.

4See, for example, Stuart Dreyfus “Richard Bellman on the Birth of Dynamic Programming”.

5A space at the end of the line is discarded.

6Provided all the length calculations are implemented efficiently. For simplicity, I have used plain lists here with a linear length complexity, but a separate variable may be added to avoid the extra cost.

7However, if we think of it, we could reuse the already proven linked representation just putting the incoming edges into the node structure instead of the outgoing ones.

8Here, a separate backpointers array isn’t necessary as we can infer the direction by reversing the distance formula.

Approximation

1It uses the popular drakma HTTP client and cl-ppcre regex library.

2The standard Fourier Transform uses both the sine and cosine functions that are the components of the complex exponent.

Compression

1To make full use of this feature and be able to profile SBCL internal functions, you’ll need to compile SBCL with --with-cons-profiling flag. Many thanks to Douglas Katzman for developing this feature and guiding me through its usage.

2It was verified by taking the average of multiple test runs.

3You can study the details in the relevant article.

4Some implementations (for instance, SBCL) have “smart enough” compilers to perform constant folding of such expressions. However, read-eval may be used to help the compiler if it is not smart enough.

Synchronization

1We will further use the term “thread” to denote a separate computation running as part of our application, as it is less ambiguous than “process” and also much more widespread than all the other terms.

2This internal “threading”, usually, also relies on the OS threading API behind the scenes.

3In this context, they tend to be called “processes”, but we’ll still stick to the term “thread”.

4The other apparent limitation of supporting only two threads can be lifted by a modification to the algorithm, which requires some hardware support.

5If we forbid the destructive rplaca/rplacd operations and their derivatives.

6and a quite high algorithm base — usually 32 — that means very shallow trees resulting in just a handful of hops even for quite large structures

7Except for the length of the leftmost range that depends on the number of bits in a hash. For instance, for a 32-bit hash, it may be 7, and the depth of the whole HAMT would be 5.

8His thesis with the same title is freely available, but the book covers more and is more accessible.

9The reason for that might be relative immaturity of this space, as well as its complexity, so that our knowledge of it hasn’t been developed enough to reach the stage when optimization becomes the main focus.