Getting Started

part description. yadayada. etc.

‘For’ Considered Harmful

In 1968 Dijkstra pointed this out in his famous letter “Go To Statement Considered Harmful”. He advocated structured programming instead: a programming paradigm aimed at improving the clarity, quality, and development time of a computer program by making extensive use of subroutines, block structures, and for and while loops, in contrast to using simple tests and jumps.

In this book I’ll be advising you to move to an even more structured style of programming. I’ll try to convince you to write code without for and while loops. We will replace those coding structures with even higher-level constructs.

article with comments to process http://david.tribble.com/text/goto.html

Diving in with an Example

Imagine you have a collection of references to stores. Each store is an object that has a name, a distance from your current location, and the list price of a thing you want to buy.

This list of stores might have been loaded from a database, or a file, or received as JSON. It doesn’t really matter here. We also use plain hashes to represent the objects.

The challenge is to find the price of the cheapest provider that is within cycling distance. You are a sporty person, so let’s say that is 5 miles. There’s also one extra thing to note: you get a 10% discount on the list price at the Shell shops.

1 stores = []
2 stores << { name: "total", d: 2.0, price: 32.0 }
3 stores << { name: "shell", d: 2.6, price: 28.5 }
4 stores << { name: "esso", d: 3.2, price: 41.0 }
5 stores << { name: "q8", d: 3.5, price: 22.0 }
6 stores << { name: "shell", d: 4.5, price: 19.0 }
7 stores << { name: "q8", d: 5.5, price: 18.0 }

Comparing Styles

Do you prefer giving commands and instructions? Or are you the type that prefers explaining, empowering your people or colleagues? It may sound unexpected, but this is exactly the difference between the classic imperative programming style and a more functional programming style.

  • Imperative (programming) = emphasizes how something is computed.
  • Imperative (dictionary) = expressing a command, commanding.
  • Declarative (programming) = emphasizes what is to be computed, not how.
  • Declarative (dictionary) = making a statement, declaring.

It might sound abstract. How can an algorithm focus more on the ‘what’ than on the ‘how’? To make things clear, we’ll compare an imperative solution with a functional solution below.

Imperative solution

Below you’ll find the classical imperative solution to the problem. It combines some patterns that many people acquired while learning to program.

It iterates over all the stores, and if a store is within the maximum distance it enters a block of code with two statements. First we find the business rule that describes our discount.

Next we recognize a pattern to calculate a minimum value. The pattern consists of two parts: (1) at the top, min is defined as a utility variable to hold the last known minimum; (2) within the loop, min is updated with a new value whenever that value is lower than the currently known minimum.

 1 def find_cheapest_nearby (stores)
 2   min = 0
 3   stores.each do |store|
 4     if store[:d] < 5.0 then
 5       store[:price] = store[:price] * 0.9 if store[:name] == "shell"
 6       min = store[:price] if store[:price] < min
 7     end
 8   end
 9   min
10 end

Was that hard to analyze? Not really. But that’s mainly because we already knew what the algorithm is expected to do. If the name of the method were less clear and you were looking at this code snippet without much context, it would already be a bit harder. But still, as the algorithm is short and simple, a trained developer would be able to recognize the patterns and derive its function from them. Worst case we would resort to playing the computer ourselves in our head, iterating over the different instructions.

In the meantime, many readers might have been feeling uncomfortable. If you read sharply, you’ve probably noted at least one ‘bug’ in the above code.

First of all, the pattern for calculating the minimum value wasn’t applied correctly. As min was initialized to 0, the outcome of the previous algorithm will always be 0, unless there are negative prices.

Second, and this one is much scarier: the routine will function correctly the first time it is run, but on repeated execution it will start returning wrong results. Why? Because the algorithm changes the list prices on each run.

Fixes are easy. So let’s get rid of those mistakes. We’ll initialize min with the first value and introduce another variable to hold our price.

 1 def best_price_nearby (stores)
 2   min = stores[0][:price]
 3   stores.each do |store|
 4     if store[:d] < 5.0 then
 5       price = store[:price]
 6       price = price * 0.9 if store[:name] == "shell"
 7       min = price if price < min
 8     end
 9   end
10   min
11 end

Functional Solution

Let’s now take a look at the more functional variant. First of all, don’t be frightened if you see some constructs that look new to you. They will become clear soon enough. And chances are that your best guess at what they stand for will simply be correct.

 1 def best_price_nearby (stores)
 2 
 3   near = ->(x) { x[:d] < 5.0 }
 4   myPrice = ->(x) { x[:name] == "shell" ? x[:price]*0.9 : x[:price] } 
 5   
 6   stores.
 7     find_all(&near).
 8     collect(&myPrice).
 9     min
10 end

Let’s break it down step by step. The first expression we meet is a definition of a function that decides if the shop is within our reach.

1   near = ->(x) { x[:d] < 5.0 }

This is a so-called predicate function: a function that returns true or false.
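Calling the predicate directly, with a couple of throwaway hashes (not our stores list), shows the idea:

```ruby
# a predicate: answers true or false for a single element
near = ->(x) { x[:d] < 5.0 }

near.({ d: 2.0 })  # => true
near.({ d: 5.5 })  # => false
```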

Next we find an expression that defines a function that given a shop returns our price.

1   myPrice = ->(x) { x[:name] == "shell" ? x[:price]*0.9 : x[:price] } 

And now let’s break down the final expression. This is where we see list processing in action for the first time. The first step is the following.

1   stores.find_all(&near)

What does it do? Well, we all expect it to return the stores that are near. How does it do that? Later we’ll look at find_all in more detail, but in short: find_all is a method that, given a predicate function - a function that returns true or false - returns all elements of the list for which that predicate returns true.
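As a tiny standalone illustration, with a throwaway predicate rather than our stores:

```ruby
# find_all keeps exactly the elements for which the predicate returns true
is_even = ->(n) { n.even? }

[1, 2, 3, 4, 5, 6].find_all(&is_even)  # => [2, 4, 6]
```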

But then it goes on. For all the shops that are near, it collects my price.

1   stores.find_all(&near).collect(&myPrice)

How does it do that? Later we’ll look at the collect function in a lot more detail. In short, it converts our list of nearby shops into a list of prices by applying the myPrice function to each of the elements.
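A minimal standalone illustration of collect, again on a throwaway list:

```ruby
# collect (an alias of map) applies a function to each element,
# returning a new list of the results
double = ->(n) { n * 2 }

[1, 2, 3].collect(&double)  # => [2, 4, 6]
```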

We are almost there. To get the best price from the list of prices, we simply use the min function, a standard function that is available on all lists of comparable elements.

1   stores.find_all(&near).collect(&myPrice).min
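On its own, min works on any list of comparable elements:

```ruby
# min picks the smallest element, using the elements' own ordering
[4, 2, 7].min        # => 2
["b", "a", "c"].min  # => "a"
```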

Conclusions

Let’s repeat the functional variant, to look at it again. What do we see?

 1 def best_price_nearby (stores)
 2 
 3   near = ->(x) { x[:d] < 5.0 }
 4   myPrice = ->(x) { x[:name] == "shell" ? x[:price]*0.9 : x[:price] } 
 5   
 6   stores.
 7     find_all(&near).
 8     collect(&myPrice).
 9     min
10 end

First of all, there are actually some things we don’t see.

  • We no longer see a classical for loop.
  • We don’t see utility variables.
  • We don’t see a pattern to calculate the minimum value.
  • We don’t see a lot of code that we would need to ‘interpret’ to know what’s going on.

We also note the following:

  • the definition of what it means for a shop to be near is nicely separated from the rest of the code.
  • the business rule that calculates my price from the list price is also nicely separated from the code.
  • the ‘program’ that finds all near shops and selects from those the minimum price can almost be read as natural English.

Declarative programming is counterintuitive when you’re used to imperative programming, at least when you are asked to write down an algorithm. This all goes back to the very early days of computer science, when scientists were giving the machine sequences of instructions. Because deep down computers are still dumb, just very fast at executing basic instructions, we tend to think, and to train people, that software is about chaining basic instructions into programs.

But from a reader’s and maintainer’s point of view, functional code is more readable, and hence less error prone and easier to maintain.

Software, Complexity, and its Cost

At some point in a programming curriculum, one gets introduced to complexity, typically algorithmic or computational complexity. Computational complexity theory studies the amounts of resources required for the execution of algorithms; the most popular measures are the time complexity and the space complexity of a problem. Computational complexity is an important concept to understand well, and we will talk about it a lot more in the last part, where we discuss performance.

However, what is often not addressed is that there are other, possibly even more relevant, forms of complexity in the domain of software engineering. System complexity is a measure of the interactions between the various elements of the software. This differs from computational complexity in that it is a measure of the design of the software.

System complexity: the number of possible pairwise interactions in a system of n components grows as n(n−1)/2.

Many things can put a project off course: bureaucracy, unclear objectives, lack of resources, to name a few. But it is the approach to design that largely determines how complex software can become. When complexity gets out of hand, the software can no longer be understood well enough to be easily changed or extended. By contrast, a good design can turn complex features into opportunities. Associated with, and dependent on, the complexity of an existing program is the complexity of changing that program.

Domain Complexity: Some of these design factors are technological, and a great deal of effort has gone into the design of networks, databases, and other technical dimension of software. Books have been written about how to solve these problems. Developers have cultivated their skills. Yet the most significant complexity of many applications is not technical. It is in the domain itself, the activity or business of the user. When this domain complexity is not dealt with in the design, it won’t matter that the infrastructural technology is well-conceived. A successful design must systematically deal with this central aspect of the software.

To the compiler, the software should express its precise function correctly and formally. But remember that the code you write will be read many, many times over the lifetime of the software, by yourself or by human colleagues. The real quality of software lies in how expressively your code communicates its goal to humans, not compilers, because in the end that strongly influences the development and maintenance cost of your software.

Putting functional programming in context. Explain that functional programming is only one way to manage complexity.

Software engineering as a discipline is largely about managing different kinds of complexity, through techniques and processes. If we forget about the processes for a moment (things like agile, design patterns and BDD), a modern developer has a huge set of tools available when designing software:

  • Service Orientation
  • Layering
  • Modules/Libraries
  • Functions/Procedures
  • Domain Specific Languages
  • Object Orientation
  • Frameworks
  • Aspect Orientation
  • Domain Driven Design

We will zoom in on only a selection of the above techniques. But as a software professional you should try to master all of them; they are all equally important, depending on the problem you are faced with. Always keep in mind what goal they should be serving. As long as applying them reduces your system complexity, they are likely applied correctly. If you ever see one of these techniques applied in a way that increases system complexity (the number of possible interactions within the system), then, whatever the technique is, it is probably being applied wrongly in that specific context.

Short History of Functional Programming

  • Lambda calculus: 1930’s (Alonzo Church)
  • Lisp: 1950’s, multi-paradigm language strongly inspired by lambda-calculus (symbolic manipulation, rewriting)
  • ML-family: 1970’s, general purpose functional programming language with type inference
  • Haskell: 1987, 1998, 2010, open standard for functional programming research

The world of functional programming languages

Clean, F#, Scheme, Scala, Clojure, Erlang, …

SQL

SQL is not imperative: HOW queries and relationships are resolved is not defined by the programmer, but by the compiler/optimizer/interpreter. SQL is a declarative language - in SQL, you declare relationships. This builds up a data structure (which again is not physically defined by the language but by its implementation) using inserts, updates, and deletes.

Use of the relations is then done using queries (SELECT statements), which are functional in that they do not have side effects.

Functional programming centers around a few core ideas:

  • functions are first-class citizens (that is, they can be used as values, as inputs to other functions, and as output from other functions)
  • higher-order functions (functions that operate on functions, or functions that return functions)
  • purity (a pure function is one that has no side effects; a pure function cannot do any I/O, it cannot read nor modify any global state, and it cannot take non-const reference arguments. Pure functions are especially interesting because they will always produce the same output given the same inputs)
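To make the purity idea concrete, here is a minimal Ruby sketch (the function names are mine, purely for illustration): the first function depends only on its arguments; the second reads and mutates hidden state, so its answers drift.

```ruby
# pure: the result depends only on the arguments, and nothing else changes
add = ->(a, b) { a + b }

# impure: reads and mutates global state, so repeated calls differ
$counter = 0
def impure_add(a, b)
  $counter += 1     # side effect: mutates global state
  a + b + $counter  # result depends on that hidden state
end

add.(2, 3)        # => 5, every time
impure_add(2, 3)  # different result on every call
```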

SQL certainly doesn’t revolve around functions as the main tool for modeling things, but it does somewhat embrace the purity idea: the same query run on the same database will yield the same result, every time (except for ordering). Still, calling SQL a ‘functional’ language is a bit of a stretch.

XSLT

The XSLT language is declarative: rather than listing an imperative sequence of actions to perform in a stateful environment, template rules only define how to handle a node matching a particular XPath-like pattern, if the processor should happen to encounter one, and the contents of the templates effectively comprise functional expressions that directly represent their evaluated form: the result tree, which is the basis of the processor’s output.

Although XSLT is based on functional programming ideas, it is not a full functional programming language: it lacks the ability to treat functions as a first-class data type. It does have functional elements, such as lazy evaluation to avoid unneeded computation, and the absence of explicit loops.

The Power of Recursion

The main ideas behind this chapter are:

  • Learning to think with recursion (divide and conquer with recursion)
  • Often forgotten, but prior to thinking functionally one should be fluent in thinking with recursion. Most juniors and many seniors struggle with recursion
  • The goal is to build a foundation

non-recursive version of Fibonacci

1 def fibs(n)
2   ff = []
3   a = b = 1
4   n.times { ff << a; a,b = b,a+b }  
5   ff
6 end
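For contrast, here is the directly recursive definition (a sketch; it recomputes subresults and is exponentially slower, which we ignore for now). The name fib_rec is mine, to keep it distinct from the versions below:

```ruby
# naive recursive Fibonacci, following the mathematical definition
def fib_rec(n)
  return 1 if n <= 2  # fib(1) = fib(2) = 1
  fib_rec(n - 1) + fib_rec(n - 2)
end

(1..6).map { |n| fib_rec(n) }  # => [1, 1, 2, 3, 5, 8]
```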

Basic Algorithms

  • Factorial, etc
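For instance, factorial reduces naturally to recursion:

```ruby
# recursive factorial: n! = n * (n-1)!, with 0! = 1 as the stop condition
def factorial(n)
  n == 0 ? 1 : n * factorial(n - 1)
end

factorial(5)  # => 120
```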

Sometimes you need assistance

Alternative title: sometimes you need a helper function.

  • Fibonacci is a good example to learn about accumulators
  • Learning to think with accumulators

examples

1 def fib_acc(count, a, b, r)
2     count == 0  ?  r  :  fib_acc(count-1, b, a+b, r << a)
3 end
4 
5 def fib(count)
6     fib_acc(count, 1, 1, [])
7 end 
8 fib(10) # => [1,1,2,3,5,...

Avoiding utility functions with default arguments

1 def fib(count, a = 1, b = 1 , r = [])
2     count == 0  ?  r  :  fib(count-1, b, a+b, r << a)
3 end 
4 fib(10) # => [1,1,2,3,5,...
  • r is the accumulator
  • count is the terminator
  • a and b are the ‘variables’

Exercises

Sorting Algorithms

Sorting algorithms add a level of complexity, and they remain very relevant from an educational point of view.

Key elements to address:

  • Talk about sort stability
  • Introduce Time and Space complexity

Merge sort

first explain the algorithm

some prelude stuff we need

1 class Array
2   # prepend a to the array
3   def >> a 
4     self.unshift a
5   end
6 end     
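After this patch, >> prepends an element (the class reopening is repeated here so the snippet runs standalone):

```ruby
class Array
  # prepend a to the array, as in the prelude above
  def >>(a)
    unshift(a)
  end
end

[2, 3] >> 1  # => [1, 2, 3]
```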

the identity function

1 id = ->(x) { x }

splitting an array in parts

1 split_in = ->(n, xs) { xs.each_slice( (xs.length+1)/n ).to_a }.curry
2 split_in_half = split_in.(2)
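A quick check of what the splitter produces (definitions repeated so the snippet runs standalone); note that with an odd length the first half gets the extra element:

```ruby
split_in = ->(n, xs) { xs.each_slice( (xs.length+1)/n ).to_a }.curry
split_in_half = split_in.(2)

split_in_half.([1, 2, 3, 4])     # => [[1, 2], [3, 4]]
split_in_half.([1, 2, 3, 4, 5])  # => [[1, 2, 3], [4, 5]]
```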

merge as a lambda, generalized

 1 merge_by = ->(f, xs, ys) do
 2 
 3   return xs if ys.empty?
 4   return ys if xs.empty?
 5 
 6   x, *xt = xs
 7   y, *yt = ys
 8 
 9   return merge_by.(f, xt, ys) >> x if f.(x) <= f.(y)
10   return merge_by.(f, xs, yt) >> y
11 
12 end.curry   
13 merge = merge_by.(id)  
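To see merge at work on two already-sorted lists (prelude and definitions repeated so the snippet runs standalone):

```ruby
class Array
  # prepend, as in the prelude above
  def >>(a)
    unshift(a)
  end
end

id = ->(x) { x }

# merge two sorted lists, comparing elements through f
merge_by = ->(f, xs, ys) do
  return xs if ys.empty?
  return ys if xs.empty?
  x, *xt = xs
  y, *yt = ys
  return merge_by.(f, xt, ys) >> x if f.(x) <= f.(y)
  merge_by.(f, xs, yt) >> y
end.curry
merge = merge_by.(id)

merge.([1, 3, 5], [2, 4])  # => [1, 2, 3, 4, 5]
```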

once we have merge, merge sort is trivial

1 msort = ->(xs) do    
2 
3   return xs if xs.length <= 1 # stopcondition    
4 
5   left, right = split_in_half.(xs)
6   merge.( msort.(left) , msort.(right) )    
7 
8 end

generalized version with derived specific version for default sort order

1 msort_by = ->(f, xs) do
2 
3   return xs if xs.length <= 1 # stopcondition
4 
5   left, right = split_in_half.(xs)
6   merge_by.( f, msort_by.(f, left) , msort_by.(f, right) )
7 
8 end.curry
9 msort = msort_by.(id)

tests

1 msort.([])
2 msort.([4,3,2,1])
3 msort.([3,2,1])
4 msort.([9,4,2,3,1,3,5,9])
5 msort.([9,4,2,3,1,3,5,9,9,4,2,3,1,3,5,9])

Quick sort

Imperative Implementation (1)

# from https://gist.github.com/aspyct/3433278
def quicksort(array, from=0, to=nil)
  if to == nil
    # Sort the whole array, by default
    to = array.count - 1
  end

  if from >= to
    # Done sorting
    return
  end

  # Take a pivot value, at the far left
  pivot = array[from]

  # Min and Max pointers
  min = from
  max = to

  # Current free slot
  free = min

  while min < max
    if free == min # Evaluate array[max]
      if array[max] <= pivot # Smaller than pivot, must move
        array[free] = array[max]
        min += 1
        free = max
      else
        max -= 1
      end
    elsif free == max # Evaluate array[min]
      if array[min] >= pivot # Bigger than pivot, must move
        array[free] = array[min]
        max -= 1
        free = min
      else
        min += 1
      end
    else
      raise "Inconsistent state"
    end
  end

  array[free] = pivot

  quicksort array, from, free - 1
  quicksort array, free + 1, to
end

Imperative Implementation (2)

# from http://opennfo.wordpress.com/2009/02/15/quicksort-in-ruby/
def quicksort(list, p, r)
  if p < r then
    q = partition(list, p, r)
    quicksort(list, p, q-1)
    quicksort(list, q+1, r)
  end
end

def partition(list, p, r)
  pivot = list[r]
  i = p - 1
  p.upto(r-1) do |j|
    if list[j] <= pivot
      i = i+1
      list[i], list[j] = list[j], list[i]
    end
  end
  list[i+1], list[r] = list[r], list[i+1]
  return i + 1
end

Functional Variant

# functional version
def qsort(list)
  return [] if list.size == 0
  pivot, *xs = *list
  less, more = xs.partition { |y| y < pivot }
  qsort(less) + [pivot] + qsort(more)
end

Exercises

Backtracking Algorithms

  • 8 Queens

classic solution with basic recursion

module RecurseQueens
  def self.unsafe (i,j, a,b)
    j == b || i == a || i+j == a+b || i-j == a-b
  end
  def self.safe (partial, n)
    partial.each_with_index.map { |r,i| unsafe(n, partial.length, r, i) }.none?
  end
  def self.solve (m, partial = [], solutions = [])
    if m == 0
      solutions << partial
      return solutions
    else
      (0..7).each { |n| solve(m-1, (partial + [n]), solutions) if safe(partial, n) }
      return solutions
    end
  end
end

classic solution with a pure ruby list comprehension

module Queens
  def self.unsafe (i,j, a,b)
    j == b || i == a || i+j == a+b || i-j == a-b
  end
  def self.safe (partial, n)
    partial.each_with_index.map { |r,i| unsafe(n, partial.length, r, i) }.none?
  end
  def self.solve (m = 8)
    return [[]] if m == 0
    solve(m-1).map { |partial| (0..7).map { |r| partial + [r] if safe(partial, r) }.compact }.flatten(1)
  end
end

pp Queens::solve.length

classic solution using ‘comprise’ list comprehensions

module Qns
  def self.unsafe (i,j, a,b)
    j == b || i == a || i+j == a+b || i-j == a-b
  end
  def self.safe (partial, n)
    partial.each_with_index.map { |r,i| unsafe(n, partial.length, r, i) }.none?
  end
  def self.solve (m = 8)
    return [[]] if m == 0
    listcomp(
        partial: solve(m-1),
        n: -> { (0..7).select { |i| Qns::safe(partial,i) } }
    ) { partial + [n] }
  end
end

pp Qns::solve.length

with ‘verstehen’ list comprehensions

module QnsV
  def self.attacks (i,j, a,b)
    j == b || i == a || i+j == a+b || i-j == a-b
  end
  def self.safe (partial, n)
    partial.each_with_index.map { |r,i| attacks(n, partial.length, r, i) }.none?
  end
  def self.solve (m = 8)
    return [[]] if m == 0 # stop condition
    list { partial + [n] }.
       for(:partial).in { QnsV::solve(m-1) }.
       for(:n).in { 0..7 }.
       if { QnsV::safe(partial, n) }.
       comprehend
  end
end

with lambdas, defining some basics

1 unsafe = ->(i,j,a,b) { j == b || i == a || i+j == a+b || i-j == a-b }.curry
2 # first check is not really needed and can be removed
3     
4 safe = ->(queens, n) do
5   unsafe_to = unsafe.(n,queens.length)
6   queens.each_with_index.map { |r,i| unsafe_to.(r,i) }.none?
7 end

pure ruby list comprehension

queens = ->(m) do
  return [[]] if m == 0
  queens.(m-1).map { |p| (0..7).map { |r| p + [r] if safe.(p, r) }.compact }.flatten(1)
end

with list comprehension

qs = ->(m) do
  return [[]] if m == 0
  listcomp( p: qs.(m-1), r: -> { (0..7).select { |r| safe.(p,r) } } ) { p + [r] }
end

with list comprehension (generalized version)

1 require 'comprise'
2 
3 qns = ->(size, m = size) do
4   return [[]] if m == 0
5   listcomp(
6       partial: qns.(size, m-1),
7       n: -> { (0..(size-1)).select { |i| safe.(partial,i) } }
8   ) { partial + [n] }
9 end

with lazylist

require 'lazylist'

# lazylist requires functions to be defined as constants
Lqns = ->(m) do
  return [[]] if m == 0
  list { partial + [n] }.where( partial: Lqns.(m-1), n: 0..7 ) { safe.(partial, n) }
end
  • Knight’s Tour
  • Sudoku Solver

Exercises