Table of Contents
- 1 Introduction
- 2 Kestrels
- 3 The Thrush
- 4 Songs of the Cardinal
- 5 Quirky Birds and Meta-Syntactic Programming
- 6 Aspect-Oriented Programming in Ruby using Combinator Birds
- 7 Mockingbirds
- 8 Refactoring Methods with Recursive Combinators
- 9 You can’t be serious!?
- 10 The Hopelessly Egocentric Book Chapter
- 11 Bonus Chapter: Separating Concerns in Coffeescript using Aspect-Oriented Programming
- 12 Appendix: Finding Joy in Combinators
- 13 Appendix: Source Code
- 14 About The Author
0.1 The MIT License
All contents Copyright (c) 2004-2011 Reg Braithwaite except as otherwise noted.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
http://www.opensource.org/licenses/mit-license.php
Cover photo © 2009 Jack Wolf
http://www.flickr.com/photos/wolfraven/3294145307
0.2 Preface
The chapters of this book originally appeared as blog posts. You can still read them online, for free, at http://github.com/raganwald/homoiconic. The original posts were released under the MIT license, you you can pass them around or incorporate them into your own works as you see fit. I decided to publish these essays as an e-book as well as online. This format doesn’t replace the original online essays, it’s a way to present these essays in a more coherent whole that’s easier to read consecutively. I hope you like it.
–Reginald “Raganwald” Braithwaite, Toronto, November 2011
1 Introduction
Like the Lambda Calculus, Combinatory Logic is a mathematical notation that is powerful enough to handle set theory and issues in computability.
Combinatory logic is a notation introduced by Moses Schรถnfinkel and Haskell Curry to eliminate the need for variables in mathematical logic. It has more recently been used in computer science as a theoretical model of computation and also as a basis for the design of functional programming languages. It is based on combinators. A combinator is a higher-order function that uses only function application and earlier defined combinators to define a result from its arguments.
In this book, we’re going to meet some of the standard combinators, and for each one we’ll explore some of its ramifications when writing programs using the Ruby programming language. In Combinatory logic, combinators combine and alter each other, and our Ruby examples will focus on combining and altering Ruby code. From simple examples like the K Combinator and Ruby’s .tap
method, we’ll work our way up to meta-programming with aspects and recursive combinators.
about the bird names
When Combinatory Logic was first invented by Haskell Curry, the standard combinators were given upper-case letters. For example, the two combinators needed to express everything in the Lambda Calculus and in Set Theory are the S
and K
combinators. In 1985, Raymond Smullyan published To Mock a Mockingbird, an exploration of combinatory logic for the recreational layman. Smullyan used a forest full of songbirds as a metaphor, with each of the combinators given the name of a songbird rather than a single letter. For example, the S
and K
combinators became the Starling and Kestrel, the I
combinator became the Idiot bird, and so forth.
These ornithological nicknames have become part of the standard lexicon for combinatory logic.
thanks
There are too many people to name,but amongst the crowd, Alan Smith stands out.
2 Kestrels
In Combinatory Logic, a Kestrel (or “K Combinator”) is a function that returns a constant function, normally written Kxy = x
. In Ruby, it might look like this:
# for *any* x,
kestrel
.
call
(
:
foo
).
call
(
x
)
=>
:
foo
Kestrels are to be found in Ruby. You may be familiar with their Ruby 1.9 name, #tap
. Let’s say you have a line like address = Person.find(...).address
and you wish to log the person instance. With tap
, you can inject some logging into the expression without messy temporary variables:
address
=
Person
.
find
(...).
tap
{
|
p
|
logger
.
log
"person #{p} found"
}.
address
tap
is a method in all objects that passes self
to a block and returns self, ignoring whatever the last item of the block happens to be. Ruby on Rails programmers will recognize the Kestrel in slightly different form:
address
=
returning
Person
.
find
(...)
do
|
p
|
logger
.
log
"person #{p} found"
end
.
address
Again, the result of the block is discarded, it is only there for side effects. This behaviour is the same as a Kestrel. Remember kestrel.call(:foo).call(x)
? If I rewrite it like this, you can see the similarity:
Kestrel
.
call
(
:
foo
)
do
x
end
=>
:
foo
Both returning
and tap
are handy for grouping side effects together. Methods that look like this:
def
registered_person
(
params
=
{})
person
=
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
Registry
.
register
(
person
)
person
.
send_email_notification
person
end
Can be rewritten using returning
:
def
registered_person
(
params
=
{})
returning
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
|
person
|
Registry
.
register
(
person
)
person
.
send_email_notification
end
end
It is obvious from the first line what will be returned and it eliminates an annoying error when the programmer neglects to make person
the last line of the method.
2.1 Object initializer blocks
The Kestrel has also been sighted in the form of object initializer blocks. Consider this example using Struct:
Contact
=
Struct
.
new
(
:
first
,
:
last
,
:
email
)
do
def
to_hash
Hash
[
*
members
.
zip
(
values
).
flatten
]
end
end
The method Struct#new
creates a new class. It also accepts an optional block, evaluating the block for side effects only. It returns the new class regardless of what happens to be in the block (it happens to evaluate the block in class scope, a small refinement).
You can use this technique when writing your own classes:
class
Bird
<
Creature
def
initialize
(
*
params
)
#
do
something
with
the
params
yield
self
if
block_given
?
end
end
Forest
.
add
(
Bird
.
new
(
:
name
=>
'
Kestrel
)
{
|
k
|
combinators
<<
k
}
)
The pattern of wanting a Kestrel/returning/tap when you create a new object is so common that building it into object initialization is useful. And in fact, it’s built into ActiveRecord
. Methods like new
and create
take optional blocks, so you can write:
class
Person
<
ActiveRecord
::
Base
#
...
end
def
registered_person
(
params
=
{})
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
|
person
|
Registry
.
register
(
person
)
person
.
send_email_notification
end
end
In Rails, returning
is not necessary when creating instances of your model classes, thanks to ActiveRecord’s built-in object initializer blocks.
2.2 Inside, an idiomatic Ruby Kestrel
When we discussed Struct
above, we noted that its initializer block has a slightly different behaviour than tap
or returning
. It takes an initializer block, but it doesn’t pass the new class to the block as a parameter, it evaluates the block in the context of the new class.
Putting this into implementation terms, it evaluates the block with self
set to the new class. This is not the same as returning
or tap
, both of which leave self
untouched. We can write our own version of returning
with the same semantics. We will call it inside
:
module
Kernel
def
inside
(
value
,
&
block
)
value
.
instance_eval
(
&
block
)
value
end
end
You can use this variation on a Kestrel just like returning
, only you do not need to specify a parameter:
inside
[
1
,
2
,
3
]
do
uniq
!
end
=>
[
1
,
2
,
3
]
This isn’t particularly noteworthy. Of more interest is your access to private methods and instance variables:
sna
=
Struct
.
new
(
'
Fubar
'
)
do
attr_reader
:
fu
end
.
new
inside
(
sna
)
do
@
fu
=
'
bar
'
end
=>
<
struct
Struct
::
Fubar
>
sna
.
fu
=>
'
bar
'
inside
is a Kestrel just like returning
. No matter what value its block generates, it returns its primary argument. The only difference between the two is the evaluation environment of the block.
2.3 The Enchaining Kestrel
In Kestrels, we looked at #tap
from Ruby 1.9 and returning
from Ruby on Rails. No we’ll going to look at another use for tap
. As already explained, Ruby 1.9 includes the new method Object#tap
. It passes the receiver to a block, then returns the receiver no matter what the block contains. The canonical example inserts some logging in the middle of a chain of method invocations:
address
=
Person
.
find
(...).
tap
{
|
p
|
logger
.
log
"person #{p} found"
}.
address
Object#tap
is also useful when you want to execute several method on the same object without having to create a lot of temporary variables, a practice Martin Fowler calls [Method Chaining](http://martinfowler.com/dslwip/MethodChaining.html “”). Typically, you design such an object so that it returns itself in response to every modifier message. This allows you to write things like:
HardDrive
.
new
.
capacity
(
150
).
external
.
speed
(
7200
)
Instead of:
hd
=
HardDrive
.
new
hd
.
capacity
=
150
hd
.
external
=
true
hd
.
speed
=
7200
And if you are a real fan of the Kestrel, you would design your class with an object initializer block so you could write:
hd
=
HardDrive
.
new
do
@
capacity
=
150
@
external
=
true
@
speed
=
7200
end
But what do you do when handed a class that was not designed with method chaining in mind? For example, Array#pop
returns the object being popped, not the array. Before you validate every criticism levelled against Ruby for allowing programmers to rewrite methods in core classes, consider using #tap
with Symbol#to_proc
or String#to_proc
to chain methods without rewriting them.
So instead of
def
fizz
(
arr
)
arr
.
pop
arr
.
map
!
{
|
n
|
n
*
2
}
end
We can write:
def
fizz
(
arr
)
arr
.
tap
(
&:
pop
).
map
!
{
|
n
|
n
*
2
}
end
I often use #tap
to enchain methods for those pesky array methods that sometimes do what you expect and sometimes don’t. My most hated example is Array#uniq!
:
arr
=
[
1
,
2
,
3
,
3
,
4
,
5
]
arr
.
uniq
,
arr
=>
[
1
,
2
,
3
,
4
,
5
],
[
1
,
2
,
3
,
3
,
4
,
5
]
arr
=
[
1
,
2
,
3
,
3
,
4
,
5
]
arr
.
uniq
!
,
arr
=>
[
1
,
2
,
3
,
4
,
5
],
[
1
,
2
,
3
,
4
,
5
]
arr
=
[
1
,
2
,
3
,
4
,
5
]
arr
.
uniq
,
arr
=>
[
1
,
2
,
3
,
4
,
5
],
[
1
,
2
,
3
,
4
,
5
]
arr
=
[
1
,
2
,
3
,
4
,
5
]
arr
.
uniq
!
,
arr
=>
nil
,
[
1
,
2
,
3
,
4
,
5
]
Let’s replay that last one in slow motion:
[
1
,
2
,
3
,
4
,
5
].
uniq
!
=>
nil
That might be a problem. For example:
[
1
,
2
,
3
,
4
,
5
].
uniq
!
.
sort
!
=>
NoMethodError
:
undefined
method
`
sort
!
'
for
nil
:
NilClass
Object#tap
to the rescue: When using a method like #uniq!
that modifies the array in place and sometimes returns the modified array but sometimes helpfully returns nil
, I can use #tap
to make sure I always get the array, which allows me to enchain methods:
[
1
,
2
,
3
,
4
,
5
].
tap
(
&:
uniq
!
).
sort
!
=>
[
1
,
2
,
3
,
4
,
5
]
So there’s another use for #tap
(along with Symbol#to_proc
for simple cases): We can use it when we want to enchain methods, but the methods do not return the receiver.
In Ruby 1.9,
#tap
works exactly as described above. Ruby 1.8 does not have#tap
, but you can obtain it by installing the andand gem. This version of#tap
also works like a Quirky Bird, so you can write things likeHardDrive.new.tap.capacity(150)
for enchaining methods that take parameters and/or blocks. To get andand,sudo gem install andand
. Rails users can also dropandand.rb
inconfig/initializers
.
2.4 The Obdurate Kestrel
The andand gem includes Object#tap
for Ruby 1.8. It also includes another kestrel called #dont
. Which does what it says, or rather doesn’t do what it says.
:
foo
.
tap
{
p
'
bar
'
}
bar
=>
:
foo
#
printed
'
bar
'
before
returning
a
value
!
:
foo
.
dont
{
p
'
bar
'
}
=>
:
foo
#
without
printing
'
bar
'
!
Object#dont
simply ignores the block passed to it. So what is it good for? Well, remember our logging example for #tap
?
address
=
Person
.
find
(...).
tap
{
|
p
|
logger
.
log
"person #{p} found"
}.
address
Let’s turn the logging off for a moment:
address
=
Person
.
find
(...).
dont
{
|
p
|
logger
.
log
"person #{p} found"
}.
address
And back on:
address
=
Person
.
find
(...).
tap
{
|
p
|
logger
.
log
"person #{p} found"
}.
address
I typically use it when doing certain kinds of primitive debugging. And it has another trick up its sleeve:
arr
.
dont
.
sort
!
Look at that, it works with method calls like a quirky bird! So you can use it to NOOP
methods. Now, you could have done that with Symbol#to_proc
:
arr
.
dont
(
&:
sort
!
)
But what about methods that take parameters and blocks?
JoinBetweenTwoModels
.
dont
.
create
!
(...)
do
|
new_join
|
#
...
end
Object#dont
is the Ruby-semantic equivalent of commenting out a method call, only it can be inserted inside of an existing expression. That’s why it’s called the obdurate kestrel. It refuses to do anything!
If you want to try Object#dont
, or want to use Object#tap
with Ruby 1.8, sudo gem install andand
. Rails users can also drop andand.rb
in config/initializers
as mentioned above. Enjoy!
2.5 Kestrels on Rails
As mentioned, Ruby on Rails provides #returning, a method with K Combinator semantics:
returning
(
expression
)
do
|
name
|
#
name
is
bound
to
the
result
of
evaluating
expression
#
this
block
is
evaluated
and
the
result
is
discarded
end
=>
#
the
result
of
evaluating
the
expression
is
now
returned
Rails also provides object initializer blocks for ActiveRecord models. Here’s an example from one of my unit tests:
@
board
=
Board
.
create
(
:
dimension
=>
9
)
do
|
b
|
b
[
'
aa
'
]
=
'
black
'
b
[
'
bb
'
]
=
'
black
'
b
[
'
cb
'
]
=
'
black
'
b
[
'
da
'
]
=
'
black
'
b
[
'
ba
'
]
=
'
white
'
b
[
'
ca
'
]
=
'
white
'
end
So, it looks like in Rails you can choose between an object initializer block and #returning:
@
board
=
returning
(
Board
.
create
(
:
dimension
=>
9
))
do
|
b
|
b
[
'
aa
'
]
=
'
black
'
b
[
'
bb
'
]
=
'
black
'
b
[
'
cb
'
]
=
'
black
'
b
[
'
da
'
]
=
'
black
'
b
[
'
ba
'
]
=
'
white
'
b
[
'
ca
'
]
=
'
white
'
end
In both cases the created object is returned regardless of what the block would otherwise return. But beyond that, the two Kestrels have very different semantics. “Returning” fully evaluates the expression, in this case creating the model instance in its entirety, including all of its callbacks. The object initializer block, on the other hand, is called as part of initializing the object before starting the lifecycle of the object including its callbacks.
“Returning” is what you want when you want to do stuff involving the fully created object and you are trying to logically group the other statements with the creation. In my case, that’s what I want, I am trying to say that @board is a board with black stones on certain intersections and white stones on other intersections.
Object initialization is what you want when you want to initialize certain fields by hand and perform some calculations or logic before kicking off the object creation lifecycle. That wasn’t what I wanted in this case because my []=
method depended on the object being initialized. So my code had a bug that was fixed when I changed from object initializers to #returning.
Summary: In Rails, object initializers are evaluated before the object’s life cycle is started, #returning’s block is evaluated afterwards. And that is today’s lingua obscura.
2.6 Rewriting “Returning” in Rails
One of the most useful tools provided by Ruby on Rails is the #returning method, a simple but very useful implementation of the K Combinator or Kestrel. For example, this:
def
registered_person
(
params
=
{})
person
=
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
Registry
.
register
(
person
)
person
.
send_email_notification
person
end
Can and should be expressed using #returning as this:
def
registered_person
(
params
=
{})
returning
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
|
person
|
Registry
.
register
(
person
)
person
.
send_email_notification
end
end
Why? Firstly, you avoid the common bug of forgetting to return the object you are creating:
def
broken_registered_person
(
params
=
{})
person
=
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
Registry
.
register
(
person
)
person
.
send_email_notification
end
This creates the person object and does the initialization you want, but doesn’t actually return it from the method, it returns whatever #send_email_notification happens to return. If you’ve worked hard to create fluent interfaces you might be correct by accident, but #send_email_notification could just as easily return the email it creates. Who knows?
Second, in methods like this as you read from top to bottom you are declaring what the method returns right up front:
def
registered_person
(
params
=
{})
returning
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
#
...
#
...
end
end
It takes some optional params and returns a new person. Very clear. And the third reason I like #returning is that it logically clusters the related statements together:
returning
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
|
person
|
Registry
.
register
(
person
)
person
.
send_email_notification
end
It is very clear that these statements are all part of one logical block. As a bonus, my IDE respects that and it’s easy to fold them or drag them around as a single unit. All in all, I think #returning is a big win and I even look for opportunities to refactor existing code to use it whenever I’m making changes.
DWIM
All that being said, I have observed a certain bug or misapplication of #returning from time to time. It’s usually pretty subtle in production code, but I’ll make it obvious with a trivial example. What does this snippet evaluate to?
returning
[
1
]
do
|
numbers
|
numbers
<<
2
numbers
+=
[
3
]
end
This is the kind of thing that sadistic interviewers use in coding quizzes. The answer is [1, 2], not [1, 2, 3]. The <<
operator mutates the value assigned to the numbers variable, but the +=
statement overwrites the reference assigned to the numbers variable without changing the original value. #returning remembers the value originally assigned to numbers and returns it. If you have some side-effects on that value, those count. But assignment does nothing to the value.
This may seem obvious, but in my experience it is a subtle point that causes difficulty. Languages with referential transparency escape the confusion entirely, but OO languages like Ruby have this weird thing where we have to keep track of references and labels on references in our head.
Here’s something contrived to look a lot more like production code. First, without #returning:
def
working_registered_person
(
params
=
{})
person
=
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
if
Registry
.
register
(
person
)
person
.
send_email_notification
else
person
=
Person
.
new
(
:
default
=>
true
)
end
person
end
And here we’ve refactored it to use #returning:
def
broken_registered_person
(
params
=
{})
returning
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
|
person
|
if
Registry
.
register
(
person
)
person
.
send_email_notification
else
person
=
Person
.
new
(
:
default
=>
true
)
end
end
end
Oops! This no longer works as we intended. Overwriting the person
variable is irrelevant, #returning returns the unregistered new person no matter what. So what’s going on here?
One answer is to “blame the victim.” Ruby has a certain well-documented behaviour around variables and references. #returning has a certain well-documented behaviour. Any programmer who makes the above mistake is–well–mistaken. Fix the code and set the bug ticket status to Problem Between Keyboard And Chair (“PBKAC”).
Another answer is to suggest that the implementation of #returning is at fault. If you write:
returning
...
do
|
var
|
#
...
var
=
something_else
#
...
end
You intended to change what you are returning from #returning. So #returning should be changed to do what you meant. I’m on the fence about this. When folks argue that designs should cater to programmers who do not understand the ramifactions of the programming language or of the framework, I usually retort that you cannot have progress and innovation while clinging to familiarity, an argument I first heard from Jef Raskin. The real meaning of “The Principle of Least Surprise” is that a design should be internally consistent, which is not the same thing as familiar.
Ruby’s existing use of variables and references is certainly consistent. And once you know what #returning does, it remains consistent. However, this design decision isn’t really about being consistent with Ruby’s implementation, we are debating how an idiom should be designed. I think we have a blank canvas and it’s reasonable to at least consider a version of #returning that handles assignment to the parameter.
Rewriting #returning
The RewriteRails plug-in adds syntactic abstractions like Andand to Rails projects without monkey-patching. RewriteRails now includes its own version of #returning that overrides the #returning shipping with Rails.
When RewriteRails is processing source code, it turns code like this:
def
registered_person
(
params
=
{})
returning
Person
.
new
(
params
.
merge
(
:
registered
=>
true
))
do
|
person
|
if
Registry
.
register
(
person
)
person
.
send_email_notification
else
person
=
Person
.
new
(
:
default
=>
true
)
end
end
end
Into this:
def
registered_person
(
params
=
{})
lambda
do
|
person
|
if
Registry
.
register
(
person
)
person
.
send_email_notification
else
person
=
Person
.
new
(
:
default
=>
true
)
end
person
end
.
call
(
Person
.
new
(
params
.
merge
(
:
registered
=>
true
)))
end
Note that in addition to turning the #returning “call” into a lambda that is invoked immediately, it also makes sure the new lambda returns the person
variable’s contents. So assignment to the variable does change what #returning appears to return.
Like all processors in RewriteRails, #returning is only rewritten in .rr
files that you write in your project. Existing .rb
files are not affected, including all code in the Rails framework: RewriteRails will never monkey with other people’s expectations. RewriteRails doesn’t physically modify the .rr files you write: The rewritten code is put in another file that the Ruby interpreter sees. So you see the code you write and RewriteRails figures out what to show the interpreter. This is a little like a Lisp macro.
3 The Thrush
In Combinatory Logic, the thrush is an extremely simple permuting combinator; it reverses the normal order of evaluation. The thrush is written Txy = yx
. It reverses evaluation. In Ruby terms,
thrush
.
call
(
a_value
).
call
(
a_proc
)
=>
a_proc
.
call
(
a_value
)
In No Detail Too Small, I defined Object#into
, an implementation of the thrush as a Ruby method:
class
Object
def
into
expr
=
nil
expr
.
nil
?
?
yield
(
self
)
:
expr
.
to_proc
.
call
(
self
)
end
end
If you are in the habit of violating the Law of Demeter, you can use #into
to make an expression read consistently from left to right. For example, this code:
lambda
{
|
x
|
x
*
x
}.
call
((
1..100
).
select
(
&:
odd
?
).
inject
(
&:+
))
Reads “Square (take the numbers from 1 to 100, select the odd ones, and take the sum of those).” Confusing. Whereas with #into
, you can write:
(
1..100
).
select
(
&:
odd
?
).
inject
(
&:+
).
into
{
|
x
|
x
*
x
}
Which reads “Take the numbers from 1 to 100, keep the odd ones, take the sum of those, and then answer the square of that number.”
A permuting combinator like #into
is not strictly necessary when you have parentheses or local variables. Which is kind of interesting, because it shows that if you have permuting combinators, you can model parentheses and local variables.
But we are not interested in theory. #into
may be equivalent to what we can accomplish with other means, but it is useful to us if we feel it makes the code clearer and easier to understand. Sometimes a longer expression should be broken into multiple small expressions to make it easier to understand. Sometimes it can be reordered using tools like #into
.
3.1 Let
Object#into
defines the thrush as a method that takes a block, lambda, or anything that can become a block or lambda as its argument. There is another way to formulate a Thrush:
module
Kernel
def
let
it
yield
it
end
end
It’s remarkably simple, so simple that it appears to be less useful than #into
. The example above would look like this if we used let
:
let
(
1..100
).
select
(
&:
odd
?
).
inject
(
&:+
)
do
|
x
|
x
*
x
end
How does that help? I’ll let you in on a secret: Ruby 1.9 changes the game. In Ruby 1.8, x
is local to the surrounding method, so it doesn’t help. But in Ruby 1.9, x
is a block local variable, meaning that it does not clobber an existing variable. So in Ruby 1.8:
def
say_the_square_of_the_sum_of_the_odd_numbers
(
x
)
sotos
=
let
(
1.
.
x
).
select
(
&:
odd
?
).
inject
(
&:+
)
do
|
x
|
x
*
x
end
"The square of the sum of the odd numbers from 1..#{x} is #{sotos}"
end
say_the_square_of_the_sum_of_the_odd_numbers
(
10
)
=>
"The square of the sum of the odd numbers from 1..25 is 625"
1..25
!? What happened here is that the x
inside the block clobbered the value of the x
parameter. Not good. In Ruby 1.9:
say_the_square_of_the_sum_of_the_odd_numbers
(
10
)
=>
"The square of the sum of the odd numbers from 1..10 is 625"
Much better, Ruby 1.9 creates a new scope inside the block and x
is local to that block, shadowing the x
parameter. Now we see a use for let
:
let
(
some_expression
)
do
|
my_block_local_variable
|
#
...
end
let
creates a new scope and defines your block local variable inside the block. This signals that the block local variable is not used elsewhere. Imperative methods can be easier to understand when they are composed of smaller blocks with well-defined dependencies between them. A variable local to the entire method creates a dependency across the entire method. A variable local to a block only creates dependencies within that block.
Although Ruby 1.8 does not enforce this behaviour, it can be useful to write code in this style as a signal to make the code easier to read.
Summary
We have seen two formulations of the thrush combinator, #into
and let
. One is useful for making expressions more consistent and easier to read, the other for signaling the scope of block-local variables.
4 Songs of the Cardinal
In Combinatory Logic, the cardinal is one of the most basic permuting combinators; it reverses and parenthesizes the normal order of evaluation. The cardinal is written Cxyz = xzy
. In Ruby:
cardinal
.
call
(
proc_over_proc
).
call
(
a_value
).
call
(
a_proc
)
=>
proc_over_proc
.
call
(
a_proc
).
call
(
a_value
)
What does this mean? Let’s compare it to the Thrush. The thrush is written Txy = yx
. In Ruby terms,
thrush
.
call
(
a_value
).
call
(
a_proc
)
=>
a_proc
.
call
(
a_value
)
The salient difference is that a cardinal doesn’t just pass a_value
to a_proc
. What it does is first passes a_proc
to proc_over_proc
and then passes a_value
to the result. This implies that proc_over_proc
is a function that takes a function as its argument and returns a function.
Or in plainer terms, you want a cardinal when you would like to modify what a function or a block does. Now you can see why we can derive a thrush from a cardinal. If we write:
identity
=
lambda
{
|
f
|
f
}
Then we can write:
thrush
=
cardinal
.
call
(
identity
)
What we have done is say a thrush is what you get when you use a cardinal and a function that doesn’t modify its function but answers it right back.
Note to ornithologists and ontologists:
This is not object orientation: a thrush is not a kind of cardinal. The correct relationship between them in Ruby is that a cardinal creates a thrush. Or in Smullyan’s songbird metaphor, if you call out the name of an identity bird to a cardinal, it will call out the name of a thrush back to you.
Now, this bizarre syntactic convention of writing foo.call(bar).call(bash)
is not very helpful for actually writing software. It is great for explaining what’s going on, but if we are going to use Ruby for the examples, we need to lift our game up a level and make some idiomatic Ruby.
4.1 Building a Cardinal in Ruby
The next chunk of code works around the fact that Ruby 1.8 can’t define a proc that takes a block and also doesn’t allow define_method
to define a method that takes a block. So for Ruby 1.8, we will start by making a utility method that defines methods that can take a block, based on an idea from coderr. For Ruby 1.9 this is not necessary: you can use define_method
to define methods that take blocks as arguments.
def define_method_taking_block(
name,
method_body_proc)
self.class.send :
define_method,
"__cardinal_helper_#{name}__"
,
&
method_body_proc
eval <<-
EOM
def #{name}(a_value, &a_proc)
__cardinal_helper_#{name}__(a_value, a_proc)
end
EOM
end
Now we can see what the expression “accidental complexity” means. Do you see how we need a long paragraph and a chunk of code to explain how we are working around a limitation in our tool? And how the digression to explain the workaround is longer than the actual code we want to write? Ugh!
With that out of the way, we can write our cardinal:
def
cardinal_define
(
name
,
&
proc_over_proc
)
define_method_taking_block
(
name
)
do
|
a_value
,
a_proc
|
proc_over_proc
.
call
(
a_proc
).
call
(
a_value
)
end
end
Ready to try it? Here’s a familiar example. We’ll need a proc_over_proc
, our proc that modifies another proc. Because we’re trying to be Ruby-ish, we’ll write it out as a block:
do
|
a_proc
|
a_proc
.
call
(
a_value
)
unless
a_value
.
nil
?
}
end
This takes a a_proc
and returns a brand new proc that only calls a_proc
if the value you pass it is not nil. Now let’s use our cardinal to define a new method:
cardinal_define
(
:
maybe
)
do
|
a_proc
|
lambda
{
|
a_value
|
a_proc
.
call
(
a_value
)
unless
a_value
.
nil
?
}
end
Let’s try it out:
maybe
(
1
)
{
|
x
|
x
+
1
}
=>
2
maybe
(
nil
)
{
|
x
|
x
+
1
}
=>
nil
If we’re using Rails, we can make a slightly different version of maybe
:
cardinal_define
(
:
unless_blank
)
do
|
a_proc
|
a_proc
.
call
(
a_value
)
unless
a_value
.
blank
?
}
end
unless_blank
(
Person
.
find
(...).
name
)
do
|
name
|
register_name_on_title
(
name
)
end
Remember we said the cardinal can be used to define a thrush? Let’s try our Ruby cardinal out to do the same thing. Recall that expressing the identity bird as a block is:
do
|
a_proc
|
a_proc
end
Therefore we can define a thrush with:
cardinal_define
(
:
let
)
do
|
a_proc
|
a_proc
end
let
((
1..10
).
select
{
|
n
|
n
%
2
==
1
}.
inject
{
|
mem
,
var
|
mem
+
var
})
do
|
x
|
x
*
x
end
=>
625
As you can see, once you have a defined a cardinal, you can create an infinite variety of methods that have thrush-like syntax–a method that applies a value to a block–but you can modify or augment the semantics of the block in any way you want.
In Ruby terms, you are meta-programming. In Smullyan’s terms, you are Listening to the Songs of the Cardinal.
5 Quirky Birds and Meta-Syntactic Programming
In Combinatory Logic, the Queer Birds are a family of combinators which both parenthesize and permute. One member of the family, the Quirky Bird, has interesting implications for Ruby. The quirky bird is written Q
3
xyz = z(xy)
. In Ruby:
quirky
.
call
(
value_proc
).
call
(
a_value
).
call
(
a_proc
)
=>
a_proc
.
call
(
value_proc
.
call
(
a_value
))
Like the Cardinal, the quirky bird reverses the order of application. But where the cardinal modifies the function that is applied to a value, the quirky bird modifies the value itself. Let’s compare how cardinals and quirky birds work.
a cardinals refresher
The cardinal is defined in its simplest Ruby form as:
cardinal
.
call
(
proc_over_proc
).
call
(
a_value
).
call
(
a_proc
)
=>
proc_over_proc
.
call
(
a_proc
).
call
(
a_value
)
From that definition, we wrote a method called cardinal_define
that writes methods in idiomatic Ruby. For example, here’s how we used cardinal_define
to generate the maybe
method:
cardinal_define
(
:
maybe
)
do
|
a_proc
|
lambda
{
|
a_value
|
a_proc
.
call
(
a_value
)
unless
a_value
.
nil
?
}
end
maybe
(
1
)
{
|
x
|
x
+
1
}
=>
2
maybe
(
nil
)
{
|
x
|
x
+
1
}
=>
nil
Now we are not looking at the source code for maybe
, but from the definition of a cardinal above we know that any method defined by cardinal_define
will look roughly like:
def
defined_by_a_cardinal
(
a_value
,
&
a_proc
)
proc_over_proc
.
call
(
a_proc
).
call
(
a_value
)
end
Or in our case:
def
maybe
(
a_value
,
&
a_proc
)
lambda
do
|
a_proc
|
lambda
{
|
a_value
|
a_proc
.
call
(
a_value
)
unless
a_value
.
nil
?
}
end
.
call
(
a_proc
).
call
(
a_value
)
end
and now to the quirky bird
From the definition for the quirky bird, we expect that if we write quirky_bird_define
, the methods it generates will look roughly like:
def
defined_by_a_quirky_bird
(
a_value
,
&
a_proc
)
a_proc
.
call
(
value_proc
.
call
(
a_value
))
end
So, are we ready to write quirky_bird_define
? This seems too easy. Just copy the cardinal_define
code, make a few changes, and we’re done:
def quirky_bird_define(
name,
&
value_proc)
define_method_taking_block(
name)
do |
a_value,
a_proc|
a_proc.call(
value_proc.call(
a_value))
end
end
# method_body_proc should expect (a_value, a_proc)
# see http://coderrr.wordpress.com/2008/10/29/using-define_method-with-blocks-in-\
ruby-18
/
def define_method_taking_block(
name,
&
method_body_proc)
self.class.send :
define_method,
"__quirky_bird_helper_#{name}__"
,
method_body_p\
roc
eval <<-
EOM
def #{name}(a_value, &a_proc)
__quirky_bird_helper_#{name}__(a_value, a_proc)
end
EOM
end
Ok, let’s try it out on something really trivial:
quirky_bird_define
(
:
square_first
)
do
|
a_value
|
a
value
*
a_value
end
square_first
(
1
)
{
|
n
|
n
+
1
}
=>
2
square_first
(
2
)
{
|
n
|
n
+
1
}
=>
5
It works, good. Now let’s define maybe
using the quirky bird we just wrote. Just so we’re clear, I want to write:
quirky_bird_define
(
:
maybe
)
do
|
a_value
|
#
...
something
goes
here
...
end
maybe
(
1
)
{
|
n
|
n
+
1
}
=>
2
maybe
(
nil
)
{
|
n
|
n
+
1
}
=>
nil
Scheisse! Figuring out what to put in the block to make maybe
work is indeed queer and quirky!!
Now, the simple truth is, I know of no way to use a quirky bird to cover all of the possible blocks you could use with maybe
so that it works exactly like the version of maybe
we built with a cardinal. However, I have found that sometimes it is interesting to push an incomplete idea along if it is incomplete in interesting ways. “Maybe” we can learn something in the process.
5.1 A limited interpretation of the Quirky Bird in Ruby
Let’s solve maybe
any-which-way-we-can and see how it goes. When we used a cardinal, we wanted a proc that would modify another proc to such that if it was passed nil
, it would answer nil
without evaluating its contents.
Now we want to modify a value such that if it is nil
, it responds nil
to the method +
. This is doable, with the help of the BlankSlate
class, also called a BasicObject
. You’ll find BlankSlate
and BasicObject
classes in various frameworks and Ruby 1.9, and there’s one at blank_slate.rb you can use.
BlankSlate
is a class with no methods, which is very different from the base class Object
. That’s because Object
in Ruby is heavyweight, it has lots of useful stuff. But we don’t want useful stuff, because our mission is to answer a value that responds nil
to any method you send it.
The Ruby way to handle any method is with method_missing
. Here’s a really simple expression that answers an object that responds nil
to any method:
returning
(
BlankSlate
.
new
)
do
|
it
|
def
it
.
method_missing
(
*
args
)
nil
end
end
Hmmm. What about:
quirky_bird_define
(
:
maybe
)
do
|
value
|
if
value
.
nil
?
returning
(
BlankSlate
.
new
)
do
|
it
|
def
it
.
method_missing
(
*
args
)
nil
end
end
else
value
end
end
This is saying, “Let’s define a quirky bird method based on a value_proc
as usual. Our value_proc
will take a value, and if the value is nil
we will return an object that responds with nil
to any method. But if the value is not nil, our value_proc
will respond with the object.”
Let’s try it:
maybe
(
1
)
{
|
n
|
n
+
1
}
=>
2
maybe
(
nil
)
{
|
n
|
n
+
1
}
=>
nil
Now, I admit this is very flawed:
maybe
(
nil
)
{
|
n
|
n
+
1
+
1
}
maybe
(
nil
)
{
|
n
|
1
+
n
}
The basic problem here is that we only control the value we pass in. We can’t modify how other objects respond to it, nor can we control what happens to any objects we return from methods called on it. So, the quirky bird turns out to be useful in the case where (a) the value is the receiver of a method, and (b) there is only one method being called, not a chain of methods.
Hmmm again.
5.2 Embracing the Quirky Bird
Maybe we shouldn’t be generating methods that deal with arbitrary blocks and procedures. One way to scale this down is to deal only with single method invocations. For example, what if instead of designing our new version of maybe
so that we invoke it by writing maybe(nil) { |n| n + 1 }
or maybe(1) { |n| n + 1 }
, we design it so that we write nil.maybe + 1
or 1.maybe + 1
instead?
In that case, maybe
becomes a method on the object class that applies value_proc
to its receiver rather than being a method that takes a value and a block. Getting down to business, we are going to open the core Object
class and add a new method to it. The body of that method will be our value_proc
:
def
quirky_bird_extend
(
name
,
&
value_proc
)
Object
.
send
(
:
define_method
,
name
)
do
value_proc
.
call
(
self
)
end
end
Just as we said, we are defining a new method in the Object
class.
We are using
define_method
and a block rather than thedef
keyword. The reason is that when we usedefine_method
and a block, the body of the method executes in the context of the block, not the context of the object itself. Blocks are closures in Ruby, which means that the block has access tovalue_proc
, the parameter from ourquirky_bird_extend
method.
Had we used
def
, Ruby would try to evaluatevalue_proc
in the context of the object itself. So our parameter would be lost forever. Performance wonks and compiler junkies will be interested in this behaviour, as it has very serious implications for garbage collection and memory leaks.
Now let’s use it with exactly the same block we used with quirky_bird_define
:
require
'
quirky_bird
'
require
'
blank_slate
'
require
'
returning
'
quirky_bird_extend
(
:
maybe
)
do
|
value
|
if
value
.
nil
?
returning
(
BlankSlate
.
new
)
do
|
it
|
def
it
.
method_missing
(
*
args
)
nil
end
end
else
value
end
end
nil
.
maybe
+
1
=>
nil
1.
maybe
+
1
=>
2
It works. And it looks familiar! We have defined our own version of andand, only this is much more interesting. Instead of a one-off handy-dandy, we have created a method that creates similar methods.
Let’s try it again, this time emulating Chris Wanstrath’s try:
quirky_bird_extend
(
:
try
)
do
|
value
|
returning
(
BlankSlate
.
new
)
do
|
it
|
def
it
.
__value__
=
(
arg
)
@
value
=
arg
end
def
it
.
method_missing
(
name
,
*
args
)
if
@
value
.
respond_to
?
(
name
)
@
value
.
send
(
name
,
*
args
)
end
end
it
.
__value__
=
value
end
end
nil
.
try
+
1
=>
nil
1.
try
+
1
=>
2
1.
try
.
ordinalize
=>
nil
As you can see, we can used the quirky bird to create a whole family of methods that modify the receiver in some way to produce new semantics. I can’t show you the source code, but here is something from a proprietary Rails application:
Account
.
without_requiring_authorization
.
create
!
(...)
In this case, without_requiring_authorization
follows the quirky bird pattern, only instead of taking an instance and producing a version that handles certain methods specially, this one takes a class and produces a version that doesn’t enforce authorization for use in test cases.
so what have we learned?
The quirky bird is superficially similar to the cardinal, however it can be used to generate syntax that is a little more method-oriented rather than function-oriented. And what’s better than a handy method like andand? A method for defining such methods, of course.
5.3 Andand even more
As we’ve discovered, “andand” is a Quirky Bird. Here’s a little tip for using it effectively: You already know that you can use it to conditionally invoke a method on an object:
"foo"
.
andand
+
"bar"
#
=>
"foobar"
nil
.
andand
+
"bar"
#
=>
nil
In other words, it’s a Quirky Bird. But did you know that you can also use it to conditionally invoke a block?
(
1..10
).
andand
do
|
numbers
|
doubles
=
numbers
.
map
{
|
n
|
n
*
2
}
double_doubles
=
doubles
.
map
{
|
n
|
n
*
2
}
end
#
=>
[
4
,
8
,
12
,
16
,
20
,
24
,
28
,
32
,
36
,
40
]
nil
.
andand
do
|
numbers
|
doubles
=
numbers
.
map
{
|
n
|
n
*
2
}
double_doubles
=
doubles
.
map
{
|
n
|
n
*
2
}
end
#
=>
nil
It’s not just a Quirky Bird, it’s also a Cardinal!
Consider this conditional code:
if
my_var
=
something_or_other
()
3.
times
do
yada
(
my_var
)
end
end
I’m not a big fan. The obvious sin is the pathetic 90s cultural reference. But I’m even more annoyed by having side-effects in the predicate of an if
clause, in this case assigning something to the variable my_var
. Although I’m not switching to a purely functional language any time soon, I strongly prefer that when you write if something()
, then “something()” should not cause any side effects, ever.
Another problem is that we are obviously creating my_var
just to use inside the block, but we’re declaring it in top-level scope. We could fool around with a Thrush like let
, but instead let’s use Object#andand
:
something_or_other
().
andand
do
|
my_var
|
3.
times
do
yada
(
my_var
)
end
end
Now we are making it clear that we wish to execute this block only if something_or_other()
is not nil. Furthermore, we are assigning the result of something_or_other()
to my_var
and using it within the block. Crisp and clean, no caffeine.
Note that if we don’t actually need my_var
in the block, we don’t really need andand
either:
something_or_other
()
and
begin
3.
times
do
yada
()
end
end
Like anything else, andand do ... end
is a tool to be used in specialized situations. Use it whenever you want to do something more complicated than a simple message send, but only when the subject is not nil
.
6 Aspect-Oriented Programming in Ruby using Combinator Birds
In Combinatory Logic, the bluebird is one of the most important and fundamental combinators, because the bluebird composes two other combinators. Although this is usually discussed as part of functional programming style, it is just as valuable when writing object-oriented programs. In this post, we will develop an aspect-oriented programming (or “AOP”) module that adds before methods and after methods to Ruby programs, with the implementation inspired by the bluebird. The bluebird is written Bxyz = x(yz)
. In Ruby, we can express the bluebird like this:
bluebird
.
call
(
proc1
).
call
(
proc2
).
call
(
value
)
=>
proc1
.
call
(
proc2
.
call
(
value
))
If this seems a little arcane, consider a simple Ruby expression (x * 2) + 1
: This expression composes multiplication and addition. Composition is so pervasive in programming languages that it becomes part of the syntax, something we take for granted. We don’t have to think about it until someone like Oliver Steele writes a library like functional javascript that introduces a compose
function, then we have to ask what it does.
Before we start using bluebirds, let’s be clear about something. We wrote that bluebird.call(proc1).call(proc2).call(value)
is equivalent to proc1.call(proc2.call(value))
. We want to be very careful that we understand what is special about proc1.call(proc2.call(value))
. How is it different from proc1.call(proc2).call(value)
?
The answer is:
proc1
.
call
(
proc2
.
call
(
value
))
=>
puts
value
into
proc2
,
then
puts
the
result
of
that
into
proc1
proc1
.
call
(
proc2
).
call
(
value
)
=>
puts
proc2
into
proc1
,
getting
a
function
out
,
then
puts
value
into
the
new
f
\
unction
So with a bluebird you can chain functions together in series, while if you didn’t have a bluebird all you could do is write functions that transform other functions. Not that there’s anything wrong with that, we used that to great effect with cardinals and quirky birds.
6.1 Giving methods advice
We’re not actually going to Greenspun an entire aspect-oriented layer on top of Ruby, but we will add a simple feature, we are going to add before and after methods. You already know what a normal method is. A before method simply specifies some behaviour you want executed before the method is called, while an after method specifies some behaviour you want executed after the method is called. In AOP, before and after methods are called “advice.”
There is an unwritten rule that says every Ruby programmer must, at some point, write his or her own AOP implementation –Avdi Grimm
Ruby on Rails programmers are familiar with method advice. If you have ever written any of the following, you were using Rails’ built-in aspect-oriented programming support:
after_save
validates_each
alias_method_chain
before_filter
These and other features of Rails implement method advice, albeit in a very specific way tuned to portions of the Rails framework. We’re going to implement method advice in a module that you can use in any of your classes, on any method or methods you choose. We’ll start with before methods. Here’s the syntax we want:
def
something
(
parameter
)
#
do
stuff
...
end
before
:
something
do
|
parameter
|
#
stuff
to
do
BEFORE
we
do
stuff
...
end
before
:
something
do
|
parameter
|
#
stuff
to
do
BEFORE
stuff
to
do
BEFORE
we
do
stuff
...
end
As we can see, the before methods get chained together before the method. To keep this nice and clean, we are going to make them work just like composable functions: whatever our before method’s block returns will be passed as a parameter up the chain. We also won’t fool around with altering the order of before methods, we’ll just take them as they come.
This is really simple, we are composing methods. To compare to the bluebird above, we are writing before
, then the name of a method, then a function. I’ll rewrite it like this:
bluebird
.
call
(
something
).
call
(
stuff_to_do_before_we_do_stuff
).
call
(
value
)
=>
something
.
call
(
stuff_to_do_before_we_do_stuff
.
call
(
value
))
Now we can see that this newfangled aspect-oriented programming stuff was figured out nearly a century ago by people like Alonzo Church.
Okay, enough history, let’s get started. First, we are not going to write any C, so there is no way to actually force the Ruby VM to call our before methods. So instead, we are going to have to rewrite our method. We’ll use a trick I found on Jay Fields’ blog:
module
NaiveBeforeMethods
module
ClassMethods
def
before
(
method_sym
,
&
block
)
old_method
=
self
.
instance_method
(
method_sym
)
if
old_method
.
arity
==
0
define_method
(
method_sym
)
do
block
.
call
old_method
.
bind
(
self
).
call
end
else
define_method
(
method_sym
)
do
|*
params
|
old_method
.
bind
(
self
).
call
(
*
block
.
call
(
*
params
))
end
end
end
end
def
self
.
included
(
receiver
)
receiver
.
extend
ClassMethods
end
end
As you can see, we have a special case for methods with no parameters, and when we have a method with multiple parameters, our before method must answer an array of parameters. And the implementation relies on a “flock of bluebirds:” Our before methods and the underlying base method are composed with each other to define the method that is actually executed at run time.
Using it is very easy:
class
SuperFoo
def
one_parameter
(
x
)
x
+
1
end
def
two_parameters
(
x
,
y
)
x
*
y
end
end
class
Foo
<
SuperFoo
include
NaiveBeforeMethods
before
:
one_parameter
do
|
x
|
x
*
2
end
before
:
two_parameters
do
|
x
,
y
|
[
x
+
y
,
x
-
y
]
end
end
Foo
.
new
.
one_parameter
(
5
)
=>
11
Foo
.
new
.
two_parameters
(
3
,
1
)
=>
8
This could be even more useful if it supported methods with blocks. Adventurous readers may want to combine this code with the tricks in
cardinal.rb
and see if they can build a version ofbefore
that supports methods that take blocks.
6.2 The super keyword, perhaps you’ve heard of it?
Of course, Ruby provides a means of ‘decorating’ methods like this by overriding a method and calling super
within it. So we might have written:
class
Foo
<
SuperFoo
def
one_parameter
(
x
)
super
(
x
*
2
)
end
def
two_parameters
(
x
,
y
)
super
(
x
+
y
,
x
-
y
)
end
end
On a trivial example, the two techniques seem equivalent, so why bother with the extra baggage? The answer is that using super
is a little low level. When you see a method definition in a language like Ruby, you don’t know whether you are defining a new method, overriding an existing method with entirely new functionality, or “decorating” a method with before advice. Using advice can be useful when you want to signal exactly what you are trying to accomplish.
Another reason to prefer method advice is when you want to share some functionality:
class
LoggingFoo
<
SuperFoo
def
one_parameter
(
x
)
log_entry
returning
(
super
)
do
log_exit
end
end
def
two_parameters
(
x
,
y
)
log_entry
returning
(
super
)
do
log_exit
end
end
end
This could be written as:
class
LoggingFoo
<
SuperFoo
include
NaiveBeforeMethods
before
:
one_parameter
,
:
two_parameters
do
#
see
below
log_entry
end
after
:
one_parameter
,
:
two_parameters
do
log_exit
end
end
This cleanly separates the concern of logging from the mechanism of what the methods actually do
Although this is not the main benefit, method advice also works with methods defined in modules and the current class, not just superclasses. So in some ways it is even more flexible than Ruby’s
super
keyword.
6.3 The Queer Bird
That looks handy. But we also want an after method, a way to compose methods in the other order. Good news, the Queer Bird combinator is exactly what we want. Written Qxyz = y(xz)
, the Ruby equivalent is:
queer_bird
.
call
(
something
).
call
(
stuff_to_do_after_we_do_stuff
).
call
(
value
)
=>
stuff_to_do_after_we_do_stuff
.
call
(
something
.
call
(
value
))
Which is, of course:
def
something
(
parameter
)
#
do
stuff
...
end
after
:
something
do
|
return_value
|
#
stuff
to
do
AFTER
we
do
stuff
...
end
The difference between before and after advice is that after advice is consumes and transforms whatever the method returns, while before advice consumes and transforms the parameters to the method.
We could copy, paste and modify our bluebird code for the before methods to create after methods. But before you rush off to implement that, you might want to think about a few interesting “real world” requirements:
- If you define before and after methods in any order, the final result should be that all of the before methods are run before the main method, then all of the after methods. This is not part of combinatory logic, but it’s the standard behaviour people expect from before and after methods.
- You should be able to apply the same advice to more than one method, for example by writing
after :foo, :bar do ... end
- If you declare parameters for before advice, whatever it returns will be used by the next method, just like the example above. If you do not declare parameters for before advice, whatever it returns should be ignored. The same goes for after advice.
- If you override the main method, the before and after methods should still work.
- The blocks provided should execute in the receiver’s scope, like method bodies.
One implementation meeting these requirements is in the appendix. Embedded in a lot of extra moving parts, the basic pattern of composing methods is still evident:
naive_before_advice.rb
module
NaiveBeforeAdvice
module
ClassMethods
def
before
(
method_sym
,
&
block
)
old_method
=
self
.
instance_method
(
method_sym
)
if
old_method
.
arity
==
0
define_method
(
method_sym
)
do
block
.
call
old_method
.
bind
(
self
)
.
call
end
else
define_method
(
method_sym
)
do
|*
params
|
old_method
.
bind
(
self
)
.
call
(
*
block
.
call
(
*
params
))
end
end
end
end
def
self
.
included
(
receiver
)
receiver
.
extend
ClassMethods
end
end
That is why we looked at supporting just before methods first. If you are comfortable with the naïve implementation of before advice discussed above, the mechanism is easy to understand. The complete version is considerably more powerful. As mentioned, it supports before and after advice. It also uses instance_exec
to evaluate the blocks in the receiver’s scope, providing access to private methods and instance variables. And it works properly even when you override the method being advised.
7 Mockingbirds
In this chapter, we will meet a combinator that duplicates its arguments, and see how to use it to achieve recursion. Such combinators are called recursive combinators, and are an important foundation for separating the concrete implementation of an algorithm from its definition.
7.1 Duplicative Combinators
Almost all of the combinators we’ve seen in previous essays about combinators “conserve” their arguments. For example, if you pass xyz
to a Bluebird, you get one x
, one y
, and one z
back, exactly what you passed in. You get x(yz)
back, so they have been grouped for you. But nothing has been added and nothing has been taken away. Likewise the Thrush reverses its arguments, but again it answers back the same number arguments you passed to it. The Kestrel, on the other hand, does not conserve its arguments. It erases one. If you pass xy
to a Kestrel, you only get x
back. The y
is erased. Kestrels do not conserve their arguments.
Today we are going to meet another combinator that does not conserve its arguments, the Mockingbird. Where a Kestrel erases one of its arguments, the Mockingbird duplicates its argument. In logic notation, Mx = xx
. Or in Ruby:
mockingbird
.
call
(
x
)
#
=>
x
.
call
(
x
)
The Mockingbird is not the only combinator that duplicates one or more of its arguments. Logicians have also found important uses for many other duplicating combinators like the Starling (Sxyz = xz(yz)
), which is one half of the SK combinator calculus, and the Turing Bird (Uxy = y(xxy)
), which is named after its discoverer.
The great benefit of duplicative combinators from a theoretical perspective is that combinators that duplicate an argument can be used to introduce recursion without names, scopes, bindings, and other things that clutter things up. Being able to introduce anonymous recursion is very elegant, and there are times when it is useful in its own right.
7.2 Recursive Lambdas in Ruby
Let’s write a simple recursive combinator in Ruby from first principles. To start with, let’s pick a recursive algorithm to implement: We’ll sum the numbers of a nested list. In other words, we’re going to traverse a tree of numbers and generate the sum of the leaves, a recursive problem.
This is a trivial problem in Ruby,
[1, [[2,3], [[[4]]]]].flatten.inject(&:+)
will do the trick. Of course, it does so by calling.flatten
, a built-in method that is itself recursive. However, by picking a really simple example, it’s easy to focus on the recursion rather than by the domain-specific parts of our problem. That will make things look a little over-engineered here, but when you’re interested in the engineering, that’s a good thing.
So what is our algorithm?
- If we are given a number, return it.
- If we are given a list, call ourself for each item of the list and sum the numbers that are returned.
In Ruby:
sum_of_nested_list
=
lambda
do
|
arg
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
sum_of_nested_list
.
call
(
item
)
}.
\
inject
(
&:+
)
end
One reason we don’t like this is that it breaks badly if we ever modify the variable sum_of_nested_list
. Although you may think that’s unlikely, it can happen when writing the method combinators you’ve seen in previous chapters. For example, imagine you wanted to write to the log when calling this function, but only once, you don’t want to write to the log when it calls itself.
old_sum
=
sum_of_nested_list
sum_of_nested_list
=
lambda
do
|
arg
|
puts
"sum_of_nested_list(#{arg.inspect})"
old_sum
.
call
(
arg
)
end
sum_of_nested_list
.
call
([[[[[
6
]]]]])
sum_of_nested_list
([[[[[
6
]]]]])
sum_of_nested_list
([[[[
6
]]]])
sum_of_nested_list
([[[
6
]]])
sum_of_nested_list
([[
6
]])
sum_of_nested_list
([
6
])
sum_of_nested_list
(
6
)
#
=>
6
This doesn’t work because inside our original sum_of_nested_list
, we call sum_of_nested_list
by name. If that gets redefined by a method combinator or anything else, we’re calling the new thing and not the old one.
Another reason to eschew having lambdas call themselves by name is that we won’t be able to create anonymous recursive lambdas. Although naming things is an important part of writing readable software, being able to make anonymous things like object literals opens up a world where everything is truly first class and can be created on the fly or passed around like parameters. So by figuring out how to have lambdas call themselves without using their names, we’re figuring out how to make all kinds of lambdas anonymous and flexible, not just the non-recursive ones.
7.3 Recursive Combinatorics
The combinator way around this is to find a way to pass a function to itself as a parameter. If a lambda only ever calls its own parameters, it doesn’t depend on anything being bound to a name in its environment. Let’s start by rewriting our function to take itself as an argument:
sum_of_nested_list = lambda do |myself, arg| arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
myself
.
call
(
myself
,
item
)
}.
inje
\
ct
(
&:+
)
One little problem: How are we going to pass our function to itself? Let’s start by currying it into a function that takes one argument, itself, and returns a function that takes an item:
sum_of_nested_list
=
lambda
do
|
myself
|
lambda
do
|
arg
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
myself
.
call
(
myself
).
call
(
item
)
\
}.
inject
(
&:+
)
end
end
Notice that we now have myself
call itself and have the result call an item. To use it, we have to have it call itself:
This works, but is annoying. Writing our function to take itself as an argument and return a function is one thing, we can fix that, but having our function call itself by name defeats the very purpose of the exercise. Let’s fix it. First thing we’ll do, let’s get rid of myself.call(myself).call(item)
. We’ll use a new parameter, recurse
(it’s the last parameter in an homage to callback-oriented programming style). We’ll pass it myself.call(myself)
, thus removing myself.call(myself)
from our inner lambda:
sum_of_nested_list
=
lambda
do
|
myself
|
lambda
do
|
arg
|
lambda
do
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
\
&:+
)
end
.
call
(
arg
,
myself
.
call
(
myself
))
end
end
sum_of_nested_list
.
call
(
sum_of_nested_list
).
call
([
1
,
[[
2
,
3
],
[[[
4
]]]]])
Next, we hoist our code out of the middle and make it a parameter. This allows us to get rid of the ` sum_of_nested_list.call(sum_of_nested_list)` by moving it into our lambda:
sum_of_nested_list
=
lambda
do
|
fn
|
lambda
{
|
x
|
x
.
call
(
x
)
}.
call
(
lambda
do
|
myself
|
lambda
do
|
arg
|
fn
.
call
(
arg
,
myself
.
call
(
myself
))
end
end
)
end
.
call
(
lambda
do
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
&:
\
+
)
end
)
sum_of_nested_list
.
call
([
1
,
[[
2
,
3
],
[[[
4
]]]]])
Lots of code there, but let’s check and see that it works as an anonymous lambda:
lambda
do
|
fn
|
lambda
{
|
x
|
x
.
call
(
x
)
}.
call
(
lambda
do
|
myself
|
lambda
do
|
arg
|
fn
.
call
(
arg
,
myself
.
call
(
myself
))
end
end
)
end
.
call
(
lambda
do
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
&:
\
+
)
end
).
call
([
1
,
[[
2
,
3
],
[[[
4
]]]]])
Looking at this final example, we can see it has two cleanly separated parts:
# The recursive combinator
lambda
do
|
fn
|
lambda
{
|
x
|
x
.
call
(
x
)
}.
call
(
lambda
do
|
myself
|
lambda
do
|
arg
|
fn
.
call
(
arg
,
myself
.
call
(
myself
))
end
end
)
end
.
call
(
#
The
lambda
we
wish
to
make
recursive
lambda
do
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
&:
\
+
)
end
)
7.4 Recursive Combinators in Idiomatic Ruby
We’ve now managed to separate the mechanism of recursing (the combinator) from what we want to do while recursing. Let’s formalize this and make it idiomatic Ruby. We’ll make it a method for creating recursive lambdas and call it with a block instead of a lambda:
def
lambda_with_recursive_callback
lambda
{
|
x
|
x
.
call
(
x
)
}.
call
(
lambda
do
|
myself
|
lambda
do
|
arg
|
yield
(
arg
,
myself
.
call
(
myself
))
end
end
)
end
sum_of_nested_list
=
lambda_with_recursive_callback
do
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
&:+
)
end
sum_of_nested_list
.
call
([
1
,
[[
2
,
3
],
[[[
4
]]]]])
Not bad. But hey, let’s DRY things up. Aren’t x.call(x)
and myself.call(myself)
the same thing?
7.5 The Mockingbird
Yes, x.call(x)
and myself.call(myself)
are the same thing:
def
mockingbird
&
x
x
.
call
(
x
)
end
def
lambda_with_recursive_callback
mockingbird
do
|
myself
|
lambda
do
|
arg
|
yield
(
arg
,
mockingbird
(
&
myself
))
end
end
end
sum_of_nested_list
=
lambda_with_recursive_callback
do
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
&:+
)
end
sum_of_nested_list
.
call
([
1
,
[[
2
,
3
],
[[[
4
]]]]])
But does it blend?
lambda_with_recursive_callback
{
|
arg
,
recurse
|
arg
.
kind_of
?
(
Numeric
)
?
arg
:
arg
.
map
{
|
item
|
recurse
.
call
(
item
)
}.
inject
(
&:+
)
}.
call
([
1
,
[[
2
,
3
],
[[[
4
]]]]])
Yes!
And now we have our finished recursive combinator. We are able to create recursive lambdas in Ruby without relying on environment variables, just on parameters passed to blocks or lambdas. Our recursive combinator is built on the simplest and most basic of duplicating combinators, the Mockingbird.
In the next chapter, we’ll build more sophisticated (and practical) recursive combinators. And while doing so, we’ll take an aggressive approach to separating interfaces from implementations in algorithms.
8 Refactoring Methods with Recursive Combinators
In previous chapters, we have met some of Combinatory Logic’s most interesting combinators like the Kestrel, Thrush, Cardinal, Quirky Bird, and Bluebird. Today we are going to learn how combinators can help us separate the general form of an algorithm like “divide and conquer” from its specific concrete steps. Consider the method #sum_squares
: It sums the squares of a tree of numbers, represented as a nested list.
def
sum_squares
(
value
)
if
value
.
kind_of
?
(
Enumerable
)
value
.
map
do
|
sub_value
|
sum_squares
(
sub_value
)
end
.
inject
()
{
|
x
,
y
|
x
+
y
}
else
value
**
2
end
end
p
sum_squares
([
1
,
2
,
3
,
[[
4
,
5
],
6
],
[[[
7
]]]])
=>
140
And the method #rotate
: It rotates a square matrix, provided the length of each side is a power of two:
def
rotate
(
square
)
if
square
.
kind_of
?
(
Enumerable
)
&&
square
.
size
>
1
half_sz
=
square
.
size
/
2
sub_square
=
lambda
do
|
row
,
col
|
square
.
slice
(
row
,
half_sz
).
map
{
|
a_row
|
a_row
.
slice
(
col
,
half_sz
)
}
end
upper_left
=
rotate
(
sub_square
.
call
(
0
,
0
))
lower_left
=
rotate
(
sub_square
.
call
(
half_sz
,
0
))
upper_right
=
rotate
(
sub_square
.
call
(
0
,
half_sz
))
lower_right
=
rotate
(
sub_square
.
call
(
half_sz
,
half_sz
))
upper_right
.
zip
(
lower_right
).
map
{
|
l
,
r
|
l
+
r
}
+
upper_left
.
zip
(
lower_left
).
map
{
|
l
,
r
|
l
+
r
}
else
square
end
end
p
rotate
([[
1
,
2
,
3
,
4
],
[
5
,
6
,
7
,
8
],
[
9
,
10
,
11
,
12
],
[
13
,
14
,
15
,
16
]])
=>
[[
4
,
8
,
12
,
16
],
[
3
,
7
,
11
,
15
],
[
2
,
6
,
10
,
14
],
[
1
,
5
,
9
,
13
]]
Our challenge is to refactor them. You could change sub_square
from a closure to a private method (and in languages like Java, you have to do that in the first place). What else? Is there any common behaviour we can extract from these two methods?
Looking at the two methods, there are no lines of code that are so obviously identical that we could mechanically extract them into a private helper. Automatic refactoring tools fall down given these two methods. And yet, there is a really, really important refactoring that should be performed here.
8.1 Divide and Conquer
Both of these methods use the Divide and Conquer strategy.
As described, there are two parts to each divide and conquer algorithm. We’ll start with conquer: you need a way to decide if the problem is simple enough to solve in a trivial manner, and a trivial solution. You’ll also need a way to divide the problem into sub-problems if it’s too complex for the trivial solution, and a way to recombine the pieces back into the solution. The entire process is carried our recursively.
For example, here’s how #rotate
rotated the square. We started with a square matrix of size 4:
[
[
1
,
2
,
3
,
4
]
,
[
5
,
6
,
7
,
8
]
,
[
9
,
10
,
11
,
12
]
,
[
13
,
14
,
15
,
16
]
]
That cannot be rotated trivially, so we divided it into four smaller sub-squares:
[
[
[
1
,
2
]
,
[
3
,
4
]
,
[
5
,
6
]
[
7
,
8
]
]
]
[
[
[
9
,
10
]
,
[
11
,
12
]
,
[
13
,
14
]
[
15
,
16
]
]
]
Those couldn’t be rotated trivially either, so our algorithm divide each of them into four smaller squares again, giving us sixteen squares of one number each. Those are small enough to rotate trivially (they do not change), so the algorithm could stop subdividing.
We said there was a recombination step. For #rotate
, four sub-squares are recombined into one square by moving them counter-clockwise 90 degrees. The sixteen smallest squares were recombined into four sub-squares like this:
[
[
[
2
,
6
]
,
[
4
,
8
]
,
[
1
,
5
]
[
3
,
7
]
]
]
[
[
[
10
,
14
]
,
[
12
,
16
]
,
[
9
,
13
]
[
11
,
15
]
]
]
Then those four squares were recombined into the final result like this:
[
[
[
4
,
8
]
,
[
12
,
16
]
,
[
3
,
7
]
[
11
,
15
]
]
]
[
[
[
2
,
6
]
,
[
10
,
14
]
,
[
1
,
5
]
[
9
,
13
]
]
And smooshed (that is the technical term) back together:
[
[
4
,
8
,
12
,
16
]
,
[
3
,
7
,
11
,
15
]
,
[
2
,
6
,
10
,
14
]
,
[
1
,
5
,
9
,
13
]
]
And Voila! There is your rotated square matrix.
Both rotation and summing the squares of a tree combine the four steps of a divide and conquer strategy:
- Deciding whether the problem is divisible into smaller pieces or can be solved trivially,
- A solution fro the trivial case,
- A way to divide a non-trivial problem up,
- And a way to piece it back together.
Here are the two methods re-written to highlight the common strategy. First, #sum_squares_2
:
public
def
sum_squares_2
(
value
)
if
sum_squares_divisible
?
(
value
)
sum_squares_recombine
(
sum_squares_divide
(
value
).
map
{
|
sub_value
|
sum_squares_2
(
sub_value
)
}
)
else
sum_squares_conquer
(
value
)
end
end
private
def
sum_squares_divisible
?
(
value
)
value
.
kind_of
?
(
Enumerable
)
end
def
sum_squares_conquer
(
value
)
value
**
2
end
def
sum_squares_divide
(
value
)
value
end
def
sum_squares_recombine
(
values
)
values
.
inject
()
{
|
x
,
y
|
x
+
y
}
end
And #rotate_2
:
public
def
rotate_2
(
value
)
if
rotate_divisible
?
(
value
)
rotate_recombine
(
rotate_divide
(
value
).
map
{
|
sub_value
|
rotate_2
(
sub_value
)
}
)
else
rotate_conquer
(
value
)
end
end
private
def
rotate_divisible
?
(
value
)
value
.
kind_of
?
(
Enumerable
)
&&
value
.
size
>
1
end
def
rotate_conquer
(
value
)
value
end
def
rotate_divide
(
value
)
half_sz
=
value
.
size
/
2
sub_square
=
lambda
do
|
row
,
col
|
value
.
slice
(
row
,
half_sz
).
map
{
|
a_row
|
a_row
.
slice
(
col
,
half_sz
)
}
end
upper_left
=
sub_square
.
call
(
0
,
0
)
lower_left
=
sub_square
.
call
(
half_sz
,
0
)
upper_right
=
sub_square
.
call
(
0
,
half_sz
)
lower_right
=
sub_square
.
call
(
half_sz
,
half_sz
)
[
upper_left
,
lower_left
,
upper_right
,
lower_right
]
end
def
rotate_recombine
(
values
)
upper_left
,
lower_left
,
upper_right
,
lower_right
=
values
upper_right
.
zip
(
lower_right
).
map
{
|
l
,
r
|
l
+
r
}
+
upper_left
.
zip
(
lower_left
).
map
{
|
l
,
r
|
l
+
r
}
end
Now the common code is glaringly obvious. The main challenge in factoring it into a helper is deciding whether you want to represent methods like #rotate_divide
as lambdas or want to fool around specifying method names as symbols. Let’s go with lambdas for the sake of writing a clear example:
public
def
sum_squares_3
(
list
)
divide_and_conquer
(
list
,
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
def
rotate_3
(
square
)
divide_and_conquer
(
square
,
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
&&
value
.
size
>
1
\
},
:
conquer
=>
lambda
{
|
value
|
value
},
:
divide
=>
lambda
do
|
square
|
half_sz
=
square
.
size
/
2
sub_square
=
lambda
do
|
row
,
col
|
square
.
slice
(
row
,
half_sz
).
map
{
|
a_row
|
a_row
.
slice
(
col
,
half_sz
)
}
end
upper_left
=
sub_square
.
call
(
0
,
0
)
lower_left
=
sub_square
.
call
(
half_sz
,
0
)
upper_right
=
sub_square
.
call
(
0
,
half_sz
)
lower_right
=
sub_square
.
call
(
half_sz
,
half_sz
)
[
upper_left
,
lower_left
,
upper_right
,
lower_right
]
end
,
:
recombine
=>
lambda
do
|
list
|
upper_left
,
lower_left
,
upper_right
,
lower_right
=
list
upper_right
.
zip
(
lower_right
).
map
{
|
l
,
r
|
l
+
r
}
+
upper_left
.
zip
(
lower_left
).
map
{
|
l
,
r
|
l
+
r
}
end
)
end
private
def
divide_and_conquer
(
value
,
steps
)
if
steps
[
:
divisible
?
].
call
(
value
)
steps
[
:
recombine
].
call
(
steps
[
:
divide
].
call
(
value
).
map
{
|
sub_value
|
divide_and_conquer
(
sub_value
,
\
steps
)
}
)
else
steps
[
:
conquer
].
call
(
value
)
end
end
Now we have refactored the common algorithm out. Typically, something like divide and conquer is treated as a “pattern,” a recipe for writing methods. We have changed it into an abstraction by writing a #divide_and_conquer
method and passing it our own functions which it combines to form the final algorithm. That ought to sound familiar: #divide_and_conquer
is a combinator that creates recursive methods for us.
You can also find recursive combinators in other languages like Joy, Factor, and even Javascript (the recursive combinator presented here as #divide_and_conquer
is normally called multirec
). Eugene Lazutkin’s article on [Using recursion combinators in JavaScript](http://lazutkin.com/blog/2008/jun/30/using-recursion-combinators-javascript/ “”) shows how to use combinators to build divide and conquer algorithms in Javascript with the Dojo libraries. This example uses binrec
, a recursive combinator for algorithms that always divide their problems in two:
var
fib0
=
function
(
n
){
return
n
<=
1
?
1
:
arguments
.
callee
.
call
(
this
,
n
-
1
)
+
arguments
.
callee
.
call
(
this
,
n
-
2
);
};
var
fib1
=
binrec
(
"<= 1"
,
"1"
,
"
[
[
n
-
1
]
,
[
n
-
2
]
]"
,
"+"
);
8.2 The Merge Sort
Let’s look at another example, implementing a merge sort. This algorithm has a distinguished pedigree: It was invented by John Von Neumann in 1945.
Von Neumann was a brilliant and fascinating individual. he is most famous amongst Computer Scientists for formalizing the computer architecture which now bears his name. he also worked on game theory, and it was no game to him: He hoped to use math to advise the United States whether an when to launch a thermonuclear war on the USSR. If you are interested in reading more, Prisoner’s Dilemma is a very fine book about both game theory and one of the great minds of modern times.
Conceptually, a merge sort works as follows:
- If the list is of length 0 or 1, then it is already sorted.
- Otherwise:
- Divide the unsorted list into two sublists of about half the size.
- Sort each sublist recursively by re-applying merge sort.
- Merge the two sublists back into one sorted list.
The merge sort part will be old hat given our #divide_and_conquer
helper:
def
merge_sort
(
list
)
divide_and_conquer
(
list
,
:
divisible
?
=>
lambda
{
|
list
|
list
.
length
>
1
},
:
conquer
=>
lambda
{
|
list
|
list
},
:
divide
=>
lambda
do
|
list
|
half_index
=
(
list
.
length
/
2
)
-
1
[
list
[
0.
.
half_index
],
list
[(
half_index
+
1
)..
-
1
]
]
end
,
:
recombine
=>
lambda
{
|
pair
|
merge_two_sorted_lists
(
pair
.
first
,
pair
.
last
)
}
)
end
The interesting part is our #merge_two_sorted_lists
method. Given two sorted lists, our merge algorithm works like this:
- If either list is of length zero, return the other list.
- Otherwise:
- Compare the first item of each list using
<=>
. Let’s call the list which has the “preceding” first item the preceding list and the list which has the “following” first item the following list. - Create a pair of lists consisting of the preceding item and an empty list, and another pair of lists consisting of the remainder of the preceding list and the entire following list.
- Merge each pair of lists recursively by applying merge two sorted lists.
- Catenate the results together.
- Compare the first item of each list using
As you can tell from the description, this is another divide and conquer algorithm:
def
merge_two_sorted_lists
(*
pair
)
divide_and_conquer
(
pair
,
:divisible
?
=>
lambda
{
|
pair
|
!
pair
.
first
.
empty
?
&&
!
pair
.
last
.
empty
?
}
,
:conquer
=>
lambda
do
|
pair
|
if
pair
.first.empty
?
&&
pair
.last.empty
?
[]
elsif
pair
.first.empty
?
pair
.last
else
pair
.first
end
end
,
:divide
=>
lambda
do
|
pair
|
preceding
,
following
=
case
pair
.first.first
<=>
pair
.last.first
when
-1
:
[
pair
.
first
,
pair
.
last
]
when
0
:
[
pair
.
first
,
pair
.
last
]
when
1
:
[
pair
.
last
,
pair
.
first
]
end
[
[[
preceding.first
]
,
[]
],
[
preceding
[
1
..
-
1
]
,
following
]
]
end
,
:recombine
=>
lambda
{
|
pair
|
pair
.
first
+
pair
.
last
}
)
end
That’s great. Well, that’s barely ok, actually. The problem is that when doing our merge sort, when we decide which item is the preceding item (least most, front most, whatever you want to call it), we already know that it is a trivial item and that it doesn’t need any further merging. The only reason we bundle it up in [[preceding.first], []]
is because our #divide_and_conquer
method expects to recursively attempt to solve all of the sub-problems we generate.
In this case, #merge_two_sorted_lists
does not really divide a problem into a list of one or more sub-problems, some of which may or may not be trivially solvable. Instead, it divides a problem into a part of the solution and a single sub-problem which may or may not be trivially solvable. This common strategy also has a name, linear recursion.
Let’s write another version of #merge_two_sorted_lists
, but his time instead of using #divide_and_conquer
, we’ll write a linear recursion combinator:
def
merge_two_sorted_lists
(
*
pair
)
linear_recursion
(
pair
,
:
divisible
?
=>
lambda
{
|
pair
|
!
pair
.
first
.
empty
?
&&
!
pair
.
last
.
empty
?
},
:
conquer
=>
lambda
do
|
pair
|
if
pair
.
first
.
empty
?
&&
pair
.
last
.
empty
?
[]
elsif
pair
.
first
.
empty
?
pair
.
last
else
pair
.
first
end
end
,
:
divide
=>
lambda
do
|
pair
|
preceding
,
following
=
case
pair
.
first
.
first
<=>
pair
.
last
.
first
when
-
1
:
[
pair
.
first
,
pair
.
last
]
when
0
:
[
pair
.
first
,
pair
.
last
]
when
1
:
[
pair
.
last
,
pair
.
first
]
end
[
preceding
.
first
,
[
preceding
[
1.
.
-
1
],
following
]
]
end
,
:
recombine
=>
lambda
{
|
trivial_bit
,
divisible_bit
|
[
trivial_bit
]
+
divisible
\
_bit
}
)
end
def
linear_recursion
(
value
,
steps
)
if
steps
[
:
divisible
?
].
call
(
value
)
trivial_part
,
sub_problem
=
steps
[
:
divide
].
call
(
value
)
steps
[
:
recombine
].
call
(
trivial_part
,
linear_recursion
(
sub_problem
,
steps
)
)
else
steps
[
:
conquer
].
call
(
value
)
end
end
You may think this is even better, and it is.
8.3 Separating Declaration from Implementation
Using recursive combinators like #divide_and_conquer
and #linear_recursion
are abstraction wins. They make recursive code much easier to read, because you know the general form of the algorithm and don’t need to pick through it to discover the individual steps. But there’s another benefit we should consider: Recursive combinators separate declaration from implementation.
Consider #linear_recursion
again. This is not the fastest possible implementation. There is a long and tedious argument that arises when one programmer argues it should be implemented with iteration for performance, and the other argues it should be implemented with recursion for clarity, and a third programmer who never uses recursion claims the iterative solution is easier to understand…
Imagine a huge code base full of #linear_recursion
and #divide_and_conquer
calls. What happens if you decide that each one of these algorithms should be implemented with iteration? Hmmm… How about we modify #linear_recursion
and #divide_and_conquer
, and all of the methods that call them switch from recursion to iteration for free?
Or perhaps we decide that we really should take advantage of multiple threads… Do you see where this is going? You can write a new implementation and again, all of the existing methods are upgraded.
Even if you do not plan to change the implementation, let’s face a simple fact: when writing a brand new recursive or iterative method, you really have two possible sources of bugs: you may not have declared the solution correctly, and you may not implement it correctly.
Using combinators like #divide_and_conquer
simplifies things: You only need to get your declaration of the solution correct, the implementation is taken care of for you. This is a tremendous win when writing recursive functions.
For these reasons, I strongly encourage the use of recursion combinators, either those supplied here or ones you write for yourself.
8.4 Practical Recursive Combinators
We’ve seen how recursive combinators like #divide_and_conquer
and #linear_recursion
are abstraction wins. They make recursive code much easier to read, because you know the general form of the algorithm and don’t need to pick through it to discover the individual steps.
We also saw that by separating the recursion implementation from the declaration of how to perform the steps of an algorithm like #rotate
, we leave ourselves the opportunity to improve the performance of our implementation without the risk of adding bugs to our declaration. And today we’re going to do just that, along with a few tweaks for usability.
In this section, we’re going to optimize our combinators’ performance and make them a little easier to use with goodies like string_to_proc
. To do that, we’re going to work with closures, defining methods with define_method
, and implement functional programming’s partial application. We’ll wrap up by converting linrec
from a recursive to an iterative implementation.
First, a little organization. Here are the original examples. I’ve placed them in a module and named the combinators multirec
and linrec
in conformance with common practice:
module
RecursiveCombinators
def
multirec
(
value
,
steps
)
if
steps
[
:
divisible
?
]
.
call
(
value
)
steps
[
:
recombine
]
.
call
(
steps
[
:
divide
]
.
call
(
value
).
map
{
|
sub_value
|
multirec
(
sub_value
,
steps
)
}
)
else
steps
[
:
conquer
]
.
call
(
value
)
end
end
def
linrec
(
value
,
steps
)
if
steps
[
:
divisible
?
]
.
call
(
value
)
trivial_part
,
sub_problem
=
steps
[
:
divide
]
.
call
(
value
)
steps
[
:
recombine
]
.
call
(
trivial_part
,
linrec
(
sub_problem
,
steps
)
)
else
steps
[
:
conquer
]
.
call
(
value
)
end
end
module_function
:
multirec
,
:
linrec
end
Since they are also module functions, call them by sending a message to the module:
def
merge_sort
(
list
)
RecursiveCombinators
.
multirec
(
list
,
:
divisible
?
=>
lambda
{
|
list
|
list
.
length
>
1
},
:
conquer
=>
lambda
{
|
list
|
list
},
:
divide
=>
lambda
do
|
list
|
half_index
=
(
list
.
length
/
2
)
-
1
[
list
[
0.
.
half_index
],
list
[(
half_index
+
1
)..
-
1
]
]
end
,
:
recombine
=>
lambda
{
|
pair
|
merge_two_sorted_lists
(
pair
.
first
,
pair
.
last
)
}
)
end
Or you can include the RecursiveCombinators
module and call either method directly:
include
RecursiveCombinators
def
merge_two_sorted_lists
(
*
pair
)
linrec
(
pair
,
:
divisible
?
=>
lambda
{
|
pair
|
!
pair
.
first
.
empty
?
&&
!
pair
.
last
.
empty
?
},
:
conquer
=>
lambda
do
|
pair
|
if
pair
.
first
.
empty
?
&&
pair
.
last
.
empty
?
[]
elsif
pair
.
first
.
empty
?
pair
.
last
else
pair
.
first
end
end
,
:
divide
=>
lambda
do
|
pair
|
preceding
,
following
=
case
pair
.
first
.
first
<=>
pair
.
last
.
first
when
-
1
:
[
pair
.
first
,
pair
.
last
]
when
0
:
[
pair
.
first
,
pair
.
last
]
when
1
:
[
pair
.
last
,
pair
.
first
]
end
[
preceding
.
first
,
[
preceding
[
1.
.
-
1
],
following
]
]
end
,
:
recombine
=>
lambda
{
|
trivial_bit
,
divisible_bit
|
[
trivial_bit
]
+
divisible
\
_bit
}
)
end
merge_sort
([
8
,
3
,
10
,
1
,
9
,
5
,
7
,
4
,
6
,
2
])
=>
[
1
,
2
,
3
,
4
,
5
,
6
,
7
,
8
,
9
,
10
]
Ok, we’re ready for some slightly more substantial work. These methods were fine for illustration, but I have a few questions for the author(!)
8.5 Spicing things up
First, note that every single time we call a method like merge_sort
, we create four new lambdas from scratch. This seems wasteful, especially since the lambdas never change. Why create some objects just to throw them away?
On the other hand, it’s nice to be able to use create algorithms without having to define a method by name. Although I probably wouldn’t do a merge sort anonymously, when I need a one-off quickie, I might like to write something like:
RecursiveCombinators
.
multirec
(
[
1
,
2
,
3
,
[[
4
,
5
],
6
],
[[[
7
]]]],
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
=>
140
But when I want a permanent sum of the squares method, I don’t want to write:
def
sum_squares
(
list
)
RecursiveCombinators
.
multirec
(
list
,
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
…because that would create four lambdas every time I call the function. There are a couple of ways around this problem. First, our “recipe” for summing squares is a simple hash. We could extract that from the method into a constant:
SUM_SQUARES_RECIPE
=
{
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
}
def
sum_squares
(
list
)
RecursiveCombinators
.
multirec
(
list
,
SUM_SQUARES_RECIPE
)
end
That (and the isomorphic solution where the constant SUM_SQUARES_RECIPE
is instead a private helper method #sum_squares_recipe
) is nice if you have some reason you wish to re-use the recipe elsewhere. But we don’t, so this merely clutters our class up and separates the method definition from its logic.
I have something in mind. To see what it is, let’s start by transforming our method definition from using the def
keyword to using the define_method
private class method. This obviously needs a module or class to work:
class
Practicum
include
RecursiveCombinators
define_method
:
sum_squares
do
|
list
|
multirec
(
list
,
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
end
Practicum
.
new
.
sum_squares
([
1
,
2
,
3
,
[[
4
,
5
],
6
],
[[[
7
]]]])
As you probably know, any method taking a block can take a lambda using the &
operator, so:
define_method
:
sum_squares
,
&(
lambda
do
|
list
|
multirec
(
list
,
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
)
This is useful, because now we can express what we want: a lambda taking one argument that in turn calls multirec
with the other arguments filled in. Functional programmers call this Partial Application. The idea is that if you have a function or method taking two arguments, if you only give it one argument you get a function back that takes the other. So:
multirec
(
x
).
call
(
y
)
=>
multirec
(
x
,
y
)
Now the drawback with this “standard” implementation of partial application is that we would pass a list to multirec
and get back a function taking a hash of declarations. That isn’t what we want. We could partially apply things backwards so that multirec(x).call(y) => multirec(y,x)
(if Ruby was a concatenative language, we would be concatenating the multirec combinator with a thrush). The trouble with that is it is the reverse of how partial application works in every other programming language and functional programming library.
Instead, we will switch the arguments to multirec
ourselves, so it now works like this:
multirec
(
{
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
},
list
)
The drawback with this approach is that we lose a little of Ruby’s syntactic sugar, the ability to fake named parameters by passing hash arguments without {}
if they are the last parameter. And now, let’s give it the ability to partially apply itself. You can do some stuff with allowing multiple arguments and counting the number of arguments, but we’re going to make the wild assumption that you never attempt a recursive combinator on nil
. Here’s multirec
, you can infer the implementation for linrec
:
def
multirec
(
steps
,
optional_value
=
nil
)
worker_proc
=
lambda
do
|
value
|
if
steps
[
:
divisible
?
].
call
(
value
)
steps
[
:
recombine
].
call
(
steps
[
:
divide
].
call
(
value
).
map
{
|
sub_value
|
worker_proc
.
call
(
sub_value
)
}
)
else
steps
[
:
conquer
].
call
(
value
)
end
end
if
optional_value
.
nil
?
worker_proc
else
worker_proc
.
call
(
optional_value
)
end
end
Notice that you get the same correct result whether you write:
RecursiveCombinators
.
multirec
(
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
).
call
([
1
,
2
,
3
,
[[
4
,
5
],
6
],
[[[
7
]]]])
=>
140
Or:
RecursiveCombinators
.
multirec
(
{
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
},
[
1
,
2
,
3
,
[[
4
,
5
],
6
],
[[[
7
]]]]
)
=>
140
Let’s go back to what we were trying to do with &
:
define_method
:
sum_squares
,
&(
lambda
do
|
list
|
multirec
(
list
,
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
)
Now we know how to build our lambda:
require
'
partial_application_recursive_combinators
'
class
Practicum
extend
PartialApplicationRecursiveCombinators
#
so
we
can
call
multirec
in
cl
\
ass
scope
define_method
:
sum_squares
,
&
multirec
(
:
divisible
?
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Enumerable
)
},
:
conquer
=>
lambda
{
|
value
|
value
**
2
},
:
divide
=>
lambda
{
|
value
|
value
},
:
recombine
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
Practicum
.
new
.
sum_squares
([
1
,
2
,
3
,
[[
4
,
5
],
6
],
[[[
7
]]]])
=>
140
You can verify for yourself that no matter how many times you call sum_squares
, you do not build those lambdas again. What we have just done is added partial application to multirec
and linrec
, which in turn allows us to ensure that he cost of constructing lambdas for our methods is only done when the method is defined, not every time it is called.
8.6 Building on a legacy
We have already renamed divide_and_conquer
and linear_recursion
to bring them into line with standard practice and other programming languages. Now it’s time for us to bring the parameters–the declarative lambdas–into line with standard practice.
The four arguments to both methods are normally called cond
, then
, before
, and after
:
-
cond
is the logical inverse ofdivisible?
So ifcond(value)
evaluates to true, then we do not need to subdivide the problem. -
then
is exactly the same asconquer
, if cond then then. That’s the way I think of it. -
before
is the same asdivide
. -
after
is the same asrecombine
.
Things look very similar with the new scheme for now:
require
'
legacy_recursive_combinators
'
class
Practicum
extend
LegacyRecursiveCombinators
#
so
we
can
call
multirec
in
class
scope
define_method
:
sum_squares
,
&
multirec
(
:
cond
=>
lambda
{
|
value
|
value
.
kind_of
?
(
Numeric
)
},
#
the
only
change
right
\
now
:
then
=>
lambda
{
|
value
|
value
**
2
},
:
before
=>
lambda
{
|
value
|
value
},
:
after
=>
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
end
All right, now our combinators will look familiar to functional programmers, and even better when we look at functional programs using recursive combinators we will understand them at a glance. Okay, let’s get serious and work on making our combinators easy to use and our code easy to read.
8.7 Seriously
As long as you’re writing these lambdas out, writing :cond =>
isn’t a hardship. And in an explanatory article like this, it can help at first. However, what if you find a way to abbreviate things? For example, you might alias lambda
to L
. Or you might want to use string\_to\_proc
:
string_to_proc.rb
class
String
unless
''
.
respond_to?
(
:to_proc
)
def
to_proc
&
block
params
=
[]
expr
=
self
sections
=
expr
.
split
(
/\s*->\s*/m
)
if
sections
.
length
>
1
then
eval
sections
.
reverse!
.
inject
{
|
e
,
p
|
"(Proc.new { |
#{
p
.
split
(
/\s/
)
.
jo
\
in
(
', '
)
}
|
#{
e
}
})"
},
block
&&
block
.
binding
elsif
expr
.
match
(
/\b_\b/
)
eval
"Proc.new { |_|
#{
expr
}
}"
,
block
&&
block
.
binding
else
leftSection
=
expr
.
match
(
/^\s*(?:[+*\/%&|\^\.=<>\[]|!=)/m
)
rightSection
=
expr
.
match
(
/[+\-*\/%&|\^\.=<>!]\s*$/m
)
if
leftSection
||
rightSection
then
if
(
leftSection
)
then
params
.
push
(
'$left'
)
expr
=
'$left'
+
expr
end
if
(
rightSection
)
then
params
.
push
(
'$right'
)
expr
=
expr
+
'$right'
end
else
self
.
gsub
(
/(?:\b[A-Z]|\.[a-zA-Z_$])[a-zA-Z_$\d]*|[a-zA-Z_$][a-zA-Z_$\d]*:\
|self|arguments|'(?:[^'\\]|\\.)*'|"(?:[^"\\]|\\.)*"/
,
''
)
.
scan
(
/([a-z_$][a-z_$\d]*)/i
)
do
|
v
|
params
.
push
(
v
)
unless
params
.
include?
(
v
)
end
end
eval
"Proc.new { |
#{
params
.
join
(
', '
)
}
|
#{
expr
}
}"
,
block
&&
block
.
bind
\
ing
end
end
end
end
So we should support passing the declarative arguments by position as well as by ‘name.’ And with a final twist, if any of the declarative arguments aren’t already lambdas, we’ll try to create lambdas by sending them the message to_proc
. So our goal is to write what we wrote above or either of the following and have it “just work:”
define_method
:
sum_squares
,
&
multirec
(
lambda
{
|
value
|
value
.
kind_of
?(
Numeric
)
},
#
the
only
change
right
now
lambda
{
|
value
|
value
**
2
},
lambda
{
|
value
|
value
},
lambda
{
|
list
|
list
.
inject
()
{
|
x
,
y
|
x
+
y
}
}
)
include
'string-to_proc'
define_method
:
sum_squares
,
&
multirec
(
"value.kind_of?(Numeric)"
,
"value ** 2"
,
"va\
lue"
,
"value.inject(&'+')"
)
And here is the code that makes it work:
recursive_combinators.rb
module
RecursiveCombinators
separate_args
=
lambda
do
|
args
|
if
![
1
,
2
,
4
,
5
].
include?
(
args
.
length
)
raise
ArgumentError
elsif
args
.
length
<=
2
steps
=
[
:cond
,
:then
,
:before
,
:after
].
map
{
|
k
|
args
.
first
[
k
].
to_proc
}
steps
.
push
(
args
[
1
]
)
unless
args
[
1
].
nil?
steps
else
steps
=
args
[
0
.
.
3
].
map
{
|
arg
|
arg
.
to_proc
}
steps
.
push
(
args
[
4
]
)
unless
args
[
4
].
nil?
steps
end
end
define_method
:multirec
do
|*
args
|
cond_proc
,
then_proc
,
before_proc
,
after_proc
,
optional_value
=
separate_args
\
.
call
(
args
)
worker_proc
=
lambda
do
|
value
|
if
cond_proc
.
call
(
value
)
then_proc
.
call
(
value
)
else
after_proc
.
call
(
before_proc
.
call
(
value
)
.
map
{
|
sub_value
|
worker_proc
.
call
(
sub_value
)
}
)
end
end
if
optional_value
.
nil?
worker_proc
else
worker_proc
.
call
(
optional_value
)
end
end
define_method
:linrec
do
|*
args
|
cond_proc
,
then_proc
,
before_proc
,
after_proc
,
optional_value
=
separate_args
\
.
call
(
args
)
worker_proc
=
lambda
do
|
value
|
trivial_parts
,
sub_problem
=
[]
,
value
while
!
cond_proc
.
call
(
sub_problem
)
trivial_part
,
sub_problem
=
before_proc
.
call
(
sub_problem
)
trivial_parts
.
unshift
(
trivial_part
)
end
trivial_parts
.
unshift
(
then_proc
.
call
(
sub_problem
))
trivial_parts
.
inject
do
|
recombined
,
trivial_part
|
after_proc
.
call
(
trivial_part
,
recombined
)
end
end
if
optional_value
.
nil?
worker_proc
else
worker_proc
.
call
(
optional_value
)
end
end
module_function
:multirec
,
:linrec
end
Now when we have trivial lambdas, we can use nice syntactic sugar to express them. string_to_proc
is not part of our recursive combinators, but making recursive combinators flexible, we make it “play well with others,” which is a win for our code.
8.8 Separating Implementation from Declaration
In Refactoring Methods with Recursive Combinators, we read the claim that by separating the recursion implementation from the declaration of how to perform the steps of an algorithm like #rotate
, we leave ourselves the opportunity to improve the performance of our implementation without the risk of adding bugs to our declaration.
In other words, we can optimize linrec
if we want to. Well, we want to. So what we’re going to do is optimize its performance by trading time for space. Let’s have a quick look at the worker_proc
lambda inside of linrec
:
worker_proc
=
lambda
do
|
value
|
if
cond_proc
.
call
(
value
)
then_proc
.
call
(
value
)
else
trivial_part
,
sub_problem
=
before_proc
.
call
(
value
)
after_proc
.
call
(
trivial_part
,
worker_proc
.
call
(
sub_problem
)
)
end
end
As you can see, it is recursive, it calls itself to solve each sub-problem. And here is an iterative replacement:
worker_proc
=
lambda
do
|
value
|
trivial_parts
,
sub_problem
=
[],
value
while
!
cond_proc
.
call
(
sub_problem
)
trivial_part
,
sub_problem
=
before_proc
.
call
(
sub_problem
)
trivial_parts
.
unshift
(
trivial_part
)
end
trivial_parts
.
unshift
(
then_proc
.
call
(
sub_problem
))
trivial_parts
.
inject
do
|
recombined
,
trivial_part
|
after_proc
.
call
(
trivial_part
,
recombined
)
end
end
This version doesn’t call itself. Instead, it uses an old-fashioned loop, accumulating the results in an array. In a certain sense, this uses more explicit memory than the recursive implementation. However, we both know that the recursive version uses memory for its stack, so that’s a bit of a wash. However, the Ruby stack is limited while arrays can be much larger, so this version can handle much larger data sets.
If you drop the new version of worker_proc
into the linrec
definition, each and every method you define using linrec
gets the new implementation, for free. This works because we separated the implementation of recursive divide and conquer algorithms from the declaration of the steps each particular algorithm. Here’s our new version of linrec
:
define_method
:
linrec
do
|*
args
|
cond_proc
,
then_proc
,
before_proc
,
after_proc
,
optional_value
=
separate_args
.
c
\
all
(
args
)
worker_proc
=
lambda
do
|
value
|
trivial_parts
,
sub_problem
=
[],
value
while
!
cond_proc
.
call
(
sub_problem
)
trivial_part
,
sub_problem
=
before_proc
.
call
(
sub_problem
)
trivial_parts
.
unshift
(
trivial_part
)
end
trivial_parts
.
unshift
(
then_proc
.
call
(
sub_problem
))
trivial_parts
.
inject
do
|
recombined
,
trivial_part
|
after_proc
.
call
(
trivial_part
,
recombined
)
end
end
if
optional_value
.
nil
?
worker_proc
else
worker_proc
.
call
(
optional_value
)
end
end
8.9 A Really Simple Recursive Combinator
In [Recursive Lambdas in Ruby using Object#tap](http://ciaranm.wordpress.com/2008/11/30/recursive-lambdas-in-ruby-using-objecttap/ “”), Ciaran McCreesh explained how he used #tap
to write a recursive function without cluttering the scope up with an unneeded variable. (If you would like a refresher, Object#tap
is explained in Kestrels).
Ciaran’s final solution was:
lambda
do
|
recurse
,
spec
|
case
spec
when
AllDepSpec
,
ConditionalDepSpec
spec
.
each
{
|
child
|
recurse
.
call
(
recurse
,
child
)
}
when
SimpleURIDepSpec
puts
spec
end
end
.
tap
{
|
r
|
r
.
call
(
r
,
id
.
homepage_key
.
value
)
}
if
id
.
homepage_key
There are two great things about this solution. First, Ciaran doesn’t need to calculate a result, he is just performing this computation for its side-effect, puts
. Therefore, using a kestrel like #tap
signals that he is not interested in the result. Second, he is using an off-the-shelf component and not writing a “horrid untyped lambda calculus construct” to get the job done. Fewer moving parts is a laudable goal.
That being said, when solving other problems, this solution may not meet our needs:
- Since it doesn’t return a result, we cannot use it for functions that compute values and not just generate side effects;
- Within the lambda, our
recurse
function must be called with itself as a parameter. This mixes the mechanics of our recursive implementation up with the semantics of what we’re trying to accomplish.
If we find ourselves needing to work around these limitations, we’ll need to go a bit further. Let’s use a brutally trivial example, factorial. (The naive implementation of factorial is a terrible piece of programming, but it’s simple enough that we can focus on how we’re implementing recursion and not what we are computing).
We could use one of our existing recursive combinators like linrec
:
include
'
string
-
to_proc
'
linrec
(
'
<
2
'
,
'1'
,
'
n
->
[
n
,
n
-
1
]
'
,
'*'
).
call
(
5
)
=>
120
# or perhaps you prefer...
linrec
(
lambda
{
|
n
|
n
<
2
},
lambda
{
|
n
|
1
},
lambda
{
|
n
|
[
n
,
n
-
1
]
},
lambda
{
|
n
,
m
|
n
*
m
}
).
call
(
5
)
=>
120
That gets us what we want without using a untyped lambda calculus construct, because it uses a combinatorial logic construct instead. But let’s work something out that is closer to the spirit of Ciaran’s approach. For starters, we can’t use #tap
because we need the result of the computation, so we’ll imagine we have a new method, #rcall
. Our first cut will look like this:
class
Proc
def
rcall
(
*
args
)
call
(
self
,
*
args
)
end
end
lambda
{
|
r
,
n
|
n
<
2
?
1
:
n
*
r
.
call
(
r
,
n
-
1
)
}.
rcall
(
5
)
That solves our first problem very nicely: we can call a lambda with a value and it knows to pass itself to itself. Now what about our second problem? We are still cluttering up the inside of our function with passing itself to itself. Instead of calling r.call(r, n-1)
, can we just call r.call(n-1)
?
That would make our function look a lot simpler.
Well, we start with lambda { |r, *args| ... }
. But if we are to call r.call(n)
, we need to pass in a function like lambda { |*args| ... }
. What does that function do? Send the message #rcall
to our original function, of course. So we can write:
class
Proc
def
rcall
(
*
args
)
call
(
lambda
{
|*
args
|
self
.
rcall
(
*
args
)
},
*
args
)
end
end
lambda
{
|
r
,
n
|
n
<
2
?
1
:
n
*
r
.
call
(
n
-
1
)
}.
rcall
(
5
)
=>
120
And that’s it, we’ve accomplished recursion without using any untyped lambda calculus constructs. It may look at first glance like we’re using an anonymous recursive combinator like Y, but we aren’t. We’re actually taking advantage of Ruby’s self
variable, so #rcall
does not really implement anonymous recursion, it just lets us write recursive lambdas without explicitly binding them to a variable.
And our new method, #rcall
, returns a value from our recursion and doesn’t force us to remember to pass our lambda to itself when making a recursive call.
Cheers!
class
Proc
def
rcall
(
*
args
)
call
(
lambda
{
|*
args
|
self
.
rcall
(
*
args
)
},
*
args
)
end
end
9 You can’t be serious!?
In Practical Recursive Combinators, we enhanced multirec
(a/k/a “Divide and Conquer”) and linrec
(“Linear Recursion”) to accept as arguments any object that supports the #to_proc
method. Today we’re going demonstrate why: We will look at how removing the ceremony around lambdas makes using combinators like multirec
more valuable for code we share with others.
Using recursive_combinators.rb to define how to sum the squares of a nested list of numbers, we can write:
require
'
recursive_combinators
'
include
RecursiveCombinators
multirec
(
lambda
{
|
x
|
x
.
kind_of
?
(
Numeric
)
},
lambda
{
|
x
|
x
**
2
},
lambda
{
|
x
|
x
},
lambda
{
|
x
|
x
.
inject
{
|
sum
,
n
|
sum
+
n
}
}
)
The trouble with this–to quote a seasonally appropriate character–is the noise, noise, Noise, NOISE! All those lambdas and parameter declarations outweigh the actual logic we are declaring, so much so that declaring this function using our abstraction is longer and may seem more obscure than declaring it without the abstraction.
This whole thing reminds me of languages where the keywords must be in UPPER CASE. Reading code in such languages is like listening to a poetry reading where the author shouts the punctuation:
Two roads diverged in a yellow wood COMMA!
And sorry I could not travel both
And be one traveler COMMA! long I stood
And looked down one as far as I could
To where it bent in the undergrowth SEMI-COMMA!!
Finding ways to abbreviate our declaration is more than just a little “syntactic sugar:” It’s a way of emphasizing what is important, our algorithms, and de-emphasizing what is not important, the scaffolding and ceremony of instantiating Proc
objects in Ruby. One of those ways is to use String#to_proc
.
9.1 String to Proc
String#to_proc
adds the #to_proc
method to the String
class in Ruby. This allows you to write certain simple lambdas as strings instead of using the lambda
keyword, the proc
keyword, or Proc.new
. The reason why you’d bother is that String#to_proc
provides some shortcuts that get rid of the noise.
gives
String#to_proc
provides several key abbreviations: First, ->
syntax for lambdas in Ruby 1.8. So instead of lambda { |x,y| x + y }
, you can write 'x,y -> x + y'
. I read this out loud as “x and y gives x plus y.”
This syntax gets rid of the noisy lambda
keyword and is much closer to Ruby 1.9 syntax. And frankly, reading it out loud makes much more sense than reading lambdas aloud. Our example above could be written:
require
'
string_to_proc
'
multirec
(
'
x
->
x
.
kind_of
?
(
Numeric
)
'
,
'
x
->
x
**
2
'
,
'
x
->
x
'
,
'
x
->
x
.
inject
{
|
sum
,
n
|
sum
+
n
}
'
)
This is a lot better than the version with lambdas, and if the ->
seems foreign, it is only because ->
is in keeping with modern functional languages and mathematical notation, while lambda
is in keeping with Lisp and lambda calculus notation without the ability to use a single lambda character unicode.
inferred parameters
Second, String#to_proc
adds inferred parameters: If you do not use ->
, String#to_proc
attempts to infer the parameters. So if you write 'x + y'
, String#to_proc
treats it as x,y -> x + y
. There are certain expressions where this doesn’t work, and you have to use ->
, but for really simple cases it works just fine. And frankly, for really simple cases you don’t need the extra scaffolding. Here’s our example with the first three lambdas using inferred parameters:
multirec
(
'
x
.
kind_of
?
(
Numeric
)
'
,
'
x
**
2
'
,
'x'
,
'
z
->
z
.
inject
{
|
sum
,
n
|
sum
+
n
}
'
)
I have good news and bad news about inferred parameters and
String#to_proc
in general. It uses regular expressions to do its thing, which means that complicated things often don’t work. For example, nesting->
only works when writing functions that return functions. So'x -> y -> x + y'
is a function that takes anx
and returns a function that takes ay
and returnsx + y
. That works. But'z -> z.inject(&"sum, n -> sum + n")'
does NOT work.
I considered fixing this with more sophisticated parsing, however the simple truth is this:
String#to_proc
is not a replacement forlambda
, it’s a tool to be used when what you’re doing is so simple thatlambda
is overkill. IfString#to_proc
doesn’t work for something, it probably isn’t ridiculously simple any more.
it
The third abbreviation is a special case. If there is only one parameter, you can use _
(the underscore) without naming it. This is often called the “hole” or pronounced “it.” If you use “it,” then String#to_proc
doesn’t try to infer any more parameters, so this can help you write things like:
multirec
(
'
_
.
kind_of
?
(
Numeric
)
'
,
'
_
**
2
'
,
'_'
,
'
_
.
inject
{
|
sum
,
n
|
sum
+
n
}
'
)
Admittedly, use of “it”/the hole is very much a matter of taste.
point-free
String#to_proc
has a fourth and even more extreme abbreviation up its sleeve, point-free style: “Function points” are what functional programmers usually call parameters. Point-free style consists of describing how functions are composed together rather than describing what happens with their arguments. So, let’s say that I want a function that combines .inject
with +
. One way to say that is to say that I want a new function that takes its argument and applies an inject
to it, and the inject takes another function with two arguments and applies a +
to them:
lambda
{
|
z
|
z
.
inject
{
|
sum
,
n
|
sum
+
n
}
}
The other way is to say that I want to compose .inject
and +
together. Without getting into a compose
function like Haskell’s .
operator, String#to_proc
has enough magic to let us write the above as:
".inject(&'+')"
Meaning “I want a new lambda that does an inject using plus.” Point-free style does require a new way of thinking about some things, but it is a clear win for simple cases. Proof positive of this is the fact that Ruby on Rails and Ruby 1.9 have both embraced point-free style with Symbol#to_proc
. That’s exactly how (1..100).inject(&:+)
works!
String#to_proc
supports fairly simple cases where you are sending a message or using a binary operator. So if we wanted to go all out, we could write our example as:
multirec
(
'
.
kind_of
?
(
Numeric
)
'
,
'
**
2
'
,
'x'
,
".inject(&'+')"
)
There’s no point-free magic for the identity function, although this example tempts me to special case the empty string!
When should we use all these tricks?
String#to_proc
provides these options so that you as a programmer can choose your level of ceremony around writing functions. But of course, you have to use the tool wisely. My personal rules of thumb are:
- Embrace inferred parameters for well-known mathematical or logical operations. For these operations, descriptive parameter names are usually superfluous. Follow the well-known standard and use
x
,y
,z
, andw
; ora
,b
andc
; orn
,i
,j
, andk
for the parameters. If whatever it is makes no sense using those variable names, don’t used inferred parameters. - Embrace the hole for extremely simple one-parameter lambdas that aren’t intrinsically mathematical or logical such as methods that use
.method_name
and for the identity function. - Embrace point-free style for methods that look like operators.
- Embrace
->
notation for extremely simple cases where I want to give the parameters a descriptive name. - Use lambdas for everything else.
So I would write:
multirec
(
'
_
.
kind_of
?
(
Numeric
)
'
,
'
**
2
'
,
'_'
,
"_.inject(&'+')"
)
I read the parameters out loud as:
- it kind_of? Numeric;
- raise to the power of two;
- it;
- it inject plus.
And yes, I consider multirec( '_.kind_of?(Numeric)', '** 2', '_', "_.inject(&'+')")
more succinct and easier to read than:
def
sum_squares
(
value
)
if
value
.
kind_of
?
(
Numeric
)
value
**
2
else
value
.
map
do
|
sub_value
|
sum_squares
(
sub_value
)
end
.
inject
{
|
x
,
y
|
x
+
y
}
end
end
If all this is new too you, String#to_proc
may seem like gibberish and def sum_squares
may seem reassuringly sensible. But try to remember that combinators like multirec
are built to disentangle the question of what we are doing from how we are doing it. This is the third straight post about recursive combinators using one of three different examples. So of course we know what sum_squares
does and how it does it.
But try to imagine you are looking at a piece of code that isn’t so simple, that isn’t so obvious. Maybe it was written by someone else, maybe you wrote it a while ago. If you see:
def
rotate
(
square
)
return
square
unless
square
.
kind_of
?
(
Enumerable
)
&&
square
.
size
>
1
half_sz
=
square
.
size
/
2
sub_square
=
lambda
do
|
row
,
col
|
square
.
slice
(
row
,
half_sz
).
map
{
|
a_row
|
a_row
.
slice
(
col
,
half_sz
)
}
end
upper_left
=
rotate
(
sub_square
.
call
(
0
,
0
))
lower_left
=
rotate
(
sub_square
.
call
(
half_sz
,
0
))
upper_right
=
rotate
(
sub_square
.
call
(
0
,
half_sz
))
lower_right
=
rotate
(
sub_square
.
call
(
half_sz
,
half_sz
))
upper_right
.
zip
(
lower_right
).
map
{
|
l
,
r
|
l
+
r
}
+
upper_left
.
zip
(
lower_left
).
map
{
|
l
,
r
|
l
+
r
}
end
Do you see at once how it works? Do you see at a glance whether the recursive strategy was implemented properly? Can you tell whether there’s something buggy about it? For example, this code only works rotating square matrices that have sides which are powers of two. What needs to be changed to fix that? Are you sure you can fix it without breaking the divide and conquer strategy?
For a method like this, I would write:
multirec
(
:
cond
=>
"!(_.kind_of?(Enumerable) && _.size > 1)"
,
:
then
=>
"_"
,
:
before
=>
lambda
do
|
square
|
half_sz
=
square
.
size
/
2
sub_square
=
lambda
do
|
row
,
col
|
square
.
slice
(
row
,
half_sz
).
map
{
|
a_row
|
a_row
.
slice
(
col
,
half_sz
)
}
end
upper_left
=
sub_square
.
call
(
0
,
0
)
lower_left
=
sub_square
.
call
(
half_sz
,
0
)
upper_right
=
sub_square
.
call
(
0
,
half_sz
)
lower_right
=
sub_square
.
call
(
half_sz
,
half_sz
)
[
upper_left
,
lower_left
,
upper_right
,
lower_right
]
end
,
:
after
=>
lambda
do
|
list
|
upper_left
,
lower_left
,
upper_right
,
lower_right
=
list
upper_right
.
zip
(
lower_right
).
map
(
&
'+'
)
+
upper_left
.
zip
(
lower_left
).
map
(
&
'+'
)
end
end
And be assured that months from now if I wanted to support rotating rectangular matrices of arbitrary size, I could modify :cond
, :before
, and :after
with confidence that the basic method was not being broken.
9.2 The Message
The message here is that taken by themselves, tools like recursive combinators or String#to_proc
just look strange. But when we use them together, they reinforce each other and the sum becomes much greater than the sum of the parts. In the case of String#to_proc
, it looks like frivolity to most Ruby programmers, because they don’t use that many lambdas: Why should they when the existing syntax makes writing combinators hard to use? But when we have combinators in our hand, we see how String#to_proc
can make them a win. So two things that look weird on their own are a useful tool when used in conjunction.
Our final example ended up being slightly longer than a naive version, however it is longer in ways that matter rather than longer in a mindless ceremonial way like some languages.
And that’s the point of languages like Ruby: You have the tools to decide which portions of you code matter more than others, and to make the parts that matter stand out and the parts that don’t go away. You may disagree with my choice of what matters for a recursive divide and conquer algorithm, but I hope we can agree that it’s valuable to be able to make that choice for yourself or your team.
Seriously.
10 The Hopelessly Egocentric Book Chapter
In Raymond Smullyan’s delightful book on Combinatory logic, To Mock a Mockingbird, Smullyan explains combinatory logic and derives a number of important results by presenting the various combinators as songbirds in a forest. One of his concepts is the Hopelessly Egocentric Bird:
We call a bird B hopelessly egocentric if for every bird
x
,Bx = B
. This means that whatever birdx
you call out toB
is irrelevant; it only callsB
back to you! Imagine that the bird’s name is Bertrand. When you call out “Arthur,” you get the response “Bertrand”; when you call out “Raymond,” you get the response “Bertrand”; when you call out “Ann,” you get the response “Bertrand.” All this bird can ever think about is itself!
Some folks have proposed that by making Ruby’s nil
hopelessly egocentric, we can avoid the need for monadic idioms like #andand
. Let’s examine the idea and see what consequences this has.
10.1 Object-oriented egocentricity
One of the tenets of OO programming is that programs consist of objects that respond to messages they send each other. A hopelessly egocentric object is easy to imagine: No matter what message you send it, the hopelessly egocentric object responds with itself:
class
HopelesslyEgocentric
<
BlankSlate
def
method_missing
(
*
arguments
);
self
;
end
end
Now you can create a hopelessly egocentric object with HopelesslyEgocentric.new
and no matter what message you send it, you will get it back in response. And? What good is this? What can it do? Why should we put it in our Zoo?
In Objective C, nil is hopelessly egocentric. As Learn Objective-C puts it, You usually don’t need to check for nil before calling a method on an object. If you call a method on nil that returns an object, you will get nil as a return value. The idea here is that instead of getting a NoMethodError
when we send a message to nil, we get nil back.
Some people like this so much they’ve composed the same semantics for Ruby:
class
NilClass
def
method_missing
(
*
args
);
nil
;
end
end
Now instead of writing person && person.name && person.name.upcase
or person.andand.name.andand.upcase
, you write person.name.upcase
and either get the person’s name in upper case or nil. Wonderful! Or is it? Let’s take a look at what we’re trying to accomplish and the limitations of this approach.
queries
Hopelessly egocentric nil works reasonably for querying properties, in other words sub-entities when an entity is constructed by composition, things like .name
. I’m quite happy if person.name
returns nil whether we don’t have a person or if the person doesn’t have a name. And we can extend this to what I would call purely functional transformations like .upcase
. Just as ''.upcase
is ''
, it is reasonable to think of nil.upcase
as nil.
Now let’s look at some things that aren’t properties and aren’t purely functional transformations. What do we do with methods that are intended to update their receiver? Consider a bank account object. Do we really want to write things like:
person
.
account
=
nil
person
.
account
.
increment_balance
(
100
)
=>
nil
This makes no sense. If we want to give them a hundred dollars, we had better have their actual account on hand! Clearly there is a huge difference between methods that are queries and methods that are updates. (Note that andand
doesn’t save us either, except by virtue of being explicit rather than magical so we can eschew it for update methods like #increment_balance
.)
updates
Now that we are talking about methods with side-effects, let’s be more specific. Our hopelessly egocentric nil does return nil to any method. But it has another property, it has no side-effects. This is sometimes what we want! Let’s look at our nil account again. What about this code:
person
.
account
.
update_attribute
(
:
primary_email
,
'
reg
@
braythwayt
.
com
'
)
To decide what we think of this, we need to be specific about the meaning of nil. Generally, nil means one of two things:
- NONE, meaning “There isn’t one of these,” or;
- UNKNOWN, meaning “There is one of these, but we don’t know what it is.”
person.account.update_attribute(:primary_email, 'reg@braythwayt.com')
is an example of why this difference matters. If person.account
is an account, we want to update its primary email address, of course. And if person.account
is NONE, we might be very happy not updating its primary email address. Perhaps our code looks like this:
class
Person
<
ActiveRecord
::
Base
belongs_to
:
account
def
update_email
(
new_email
)
self
.
class
.
transaction
do
update_attribute
(
:
primary_email
,
new_email
)
account
.
update_attribute
(
:
primary_email
,
new_email
)
end
end
#
...
end
Person
.
find
(
:
first
,
:
conditions
=>
{...}).
update_email
(
'
reg
@
braythwayt
.
com
'
)
Meaning, update our person’s primary email address, and if they have an account, update it too. If nil means NONE, this works. But what if nil really means UNKNOWN rather than NONE? Now it is wrong to silently fail. Let me give you a very specific way this can happen. When performing a database query, we can specify the exact columns we want returned. In Active Record, we might write something like this:
person
=
Person
.
find
(
:
first
,
:
conditions
=>
{...},
:
select
=>
'
id
,
name
'
)
What this means is that there is an account_id
column in the people
table, however we are deliberately not loading it into person
. ActiveRecord will still supply us with a #account
method, however it will return nil. This absolutely, positively means that person.account
is UNKNOWN, not NONE. There could well be an account in our database for this person, and now if we write:
person
.
update_email
(
'
reg
@
braythwayt
.
com
'
)
We do not want it to silently ignore the account email update, because we haven’t loaded the account
associated model. So for UNKNOWN, our two rules are:
- Querying UNKNOWN returns UNKNOWN;
- All attempts to update UNKNOWN are errors.
What about NONE? We gave two examples of updates, one of which was a really bad idea,#increment_balance
, and the other of which was fine update_attribute(:primary_email, new_email)
. Thus we have three rules for NONE:
- Querying NONE returns NONE;
- Some updates to NONE may return NONE and have no side effects;
- Some updates to NONE may be errors.
With a little forethought and design, you may be able to construct one or more classes if your application for which all updates to NONE return NONE and have no side effects. But for all others, methods like #increment_balance
represent a semantic problem with using a hopelessly egocentric nil to represent NONE. We also see a problem with writing a hopelessly egocentric nil to handle UNKNOWN: How does it know which methods are queries and which methods are updates?
If we work really hard and eliminate all possibility of an update to NONE being an error, are there any other issues with using a hopelessly egocentric nil? Let’s return to our initial case:
person
.
name
=>
nil
person
.
name
.
upcase
=>
nil
Makes sense. And then we write:
person
.
name
+
", esq."
=>
nil
Dubious, but let’s go with it. If this makes sense, we ought to be able to write this as well:
"Mister "
+
person
.
name
=>
TypeError
:
can
'
t
convert
nil
into
String
Why is this an error? Things don’t get any better using a hopelessly egocentric nil to handle UNKNOWN. Even if we can get past the issue of update methods, we have another problem that is much more difficult to resolve. UNKNOWN introduces tri-value logic:
UNKNOWN
==
Object
.
new
=>
UNKNOWN
UNKNOWN
!=
Object
.
new
=>
UNKNOWN
UNKNOWN
==
UNKNOWN
=>
UNKNOWN
UNKNOWN
!=
UNKNOWN
=>
UNKNOWN
Object
.
new
==
UNKNOWN
=>
UNKNOWN
Object
.
new
!=
UNKNOWN
=>
UNKNOWN
When you don’t know something’s value, it is neither equal to nor not equal to any other value, including another unknown value. And our fifth and sixth examples suffer from the same problem as nil + ", esq."
vs. "Mister " + nil
. We would need to patch all sorts of other objects to make equality testing many many other methods work. (What is 42 < UNKNOWN
?) But things get worse:
How does truthiness work? In Ruby, you cannot override the way and
, or
, if
, unless
, &&
, and ||
work. What are the semantics of if UNKNOWN
? What do true && UNKNOWN
or UNKNOWN or true
return? Before implementing a true UNKNOWN in any language, I would want those questions answered.
Finally, there is actually a fifth and sixth rule that we are ignoring because these examples are in Ruby rather than a language with an expressive type system. Consider:
'
Reg
Braithwaite
'
.
wealthy
?
=>
NoMethodError
:
undefined
method
`
wealthy
?
'
for
"Reg Braithwaite"
:
String
And now we write:
person
.
name
.
wealthy
?
#
or
...
person
.
name
.
andand
.
wealthy
?
What happens if person.name
is NONE? What happens if person.name
is UNKNOWN? Our problem here is that #wealthy?
is never a valid message to send to something returned by person.name
. Our behaviour ought to be:
- Sending an invalid message to NONE raises a
NoMethodError
; - Sending an invalid message to UNKNOWN raises a
NoMethodError
.
There is no easy way to do this in Ruby, of course. Not only do we have trouble disambiguating queries from updates, we have trouble disambiguating valid from invalid messages.
For all of these reasons, I am loathe to implement a hopelessly egocentric nil and prefer to use an explicit idiom like #andand
or #try
. With explicit idioms, I can deal with the ambiguity between nil meaning NONE and nil meaning UNKNOWN and make sure my code does not violate the rules given here. But what I like about the idea of a hopelessly egocentric nil is that thinking the consequences provokes me to really think about the semantics of my data schemas.
Representing NONE and UNKNOWN values is a subtle problem requiring a deep and pervasive approach to typing similar to C++’s const
keyword and/or writing custom null objects that understand which methods are safe to respond egocentrically and which are errors.
11 Bonus Chapter: Separating Concerns in Coffeescript using Aspect-Oriented Programming
This chapter isn’t strictly about combinatory logic and it especially isn’t about Ruby programming. However, once you grasp the underlying fundamental principles, you can apply them in other environments using other programming languages.
You shouldn’t find it too difficult to relate the content to previous chapters, the title alone provides a massive hint.
Modern object-oriented software design favours composition over inheritance and celebrates code that is DRY. The idea is to separate each object’s concerns and responsibility into separate units of code, each of which have a single responsibility. When two different types of objects share the same functionality, they do not repeat their implementation, instead they share their implementation.
When composing functionality at the method level of granularity, techniques such as mixins and delegation are effective design tools. But at a finer level of granularity, we sometimes wish to share functionality within methods. In a traditional design, we have to extract the shared functionality into a separate method that is called by other methods.
decomposing methods
You might think of extracting smaller methods from bigger methods as decomposing methods. You break them into smaller pieces, and thus you can share functionality or rearrange the pieces so that your code is organized by responsibility.
For example, let’s say that we are writing a game for the nostalgia market, and we wish to use partially constructed objects to save resources. When we go to actually use the object, we hydrate it, loading the complete object from persistent storage. This is a coarse kind of lazy evaluation.
Here’s some bogus code:
class
Wumpus
roar:
->
#
code
that
hydrates
a
Wumpus
#
...
#
code
that
roars
#
...
run:
->
#
code
that
hydrates
a
Wumpus
#
...
#
code
that
runs
#
...
class
Hunter
draw:
(
bow
)
->
#
code
that
hydrates
a
Hunter
#
...
#
code
that
draws
a
bow
#
...
run:
->
#
code
that
hydrates
a
Hunter
#
...
#
code
that
runs
#
...
We can decompose it into this:
class
Wumpus
roar:
->
hydrate
(
this
)
#
code
that
roars
#
...
run:
->
hydrate
(
this
)
#
code
that
runs
#
...
class
Hunter
draw:
(
bow
)
->
hydrate
(
this
)
#
code
that
draws
a
bow
#
...
run:
->
hydrate
(
this
)
#
code
that
runs
#
...
hydrate
=
(
object
)
->
#
code
that
hydrates
the
object
from
storage
composing methods
On an ad hoc basis, decomposing methods is fine. But there is a subtle problem. Implementation tricks like hydrating objects, memoizing return values, or other performance tweaks are orthogonal to the mechanics of what methods like roar
or run
are supposed to do. So why is hydrate(this)
in every method?
Now the obvious answer is, “Ok, it might be orthogonal to the main business of each method, but it’s just one line.” The trouble with this answer is that method decomposition doesn’t scale. We need a line for hydration, a line or two for logging, a few lines for error handling, another for wrapping certain things in a transaction…
Even when each orthogonal concern is boiled down to just one line, you can end up having the orthogonal concerns take up more space than the main business. And that makes the code hard to read in practice. You don’t believe me? take a look at just about every programming tutorial ever written. They almost always say “Hand waving over error handling and this and that” in their code examples, because they want to make the main business of the code clearer and easier to read.
We ought to do the same thing, move hydration, error handling, logging, transactions, and anything else orthogonal to the main business of a method out of the method. And we can.
method combinations
Here’s our code again, this time using the YouAreDaChef library to provide before combinations:
YouAreDaChef
=
require
(
'
YouAreDaChef
.
coffee
'
).
YouAreDaChef
class
Wumpus
roar:
->
#
...
run:
->
#
...
class
Hunter
draw:
(
bow
)
->
#
...
run:
->
#
...
hydrate
=
(
object
)
->
#
code
that
hydrates
the
object
from
storage
YouAreDaChef
(
Wumpus
,
Hunter
)
.
before
'
roar
'
,
'
draw
'
,
'
run
'
,
()
->
hydrate
(
this
)
Whenever the roar
, draw
, or run
methods are called, YouAreDaChef calls hydrate(this)
first. And the two concerns–How a Wumpus works and when it ought to be hydrated–are totally separated. This isn’t a new idea, it’s called aspect-oriented programming, and practitioners will describe what we’re doing in terms of method advice and point cuts.
Ruby on Rails programmers are familiar with this idea. If you have ever written any of the following, you were using Rails’ built-in aspect-oriented programming support:
after_save
validates_each
alias_method_chain
before_filter
These and other features of Rails implement method advice, albeit in a very specific way tuned to portions of the Rails framework.
the unwritten rule
There is an unwritten rule that says every Ruby programmer must, at some point, write his or her own AOP implementation –Avdi Grimm
Let’s look at how YouAreTheChef works. Here’s a simplified version of the code for the before
combination:
YouAreDaChef:
(
clazzes
...)
->
before:
(
method_names
...,
advice
)
->
_
.
each
method_names
,
(
name
)
->
_
.
each
clazzes
,
(
clazz
)
->
if
_
.
isFunction
(
clazz
.
prototype
[
name
])
pointcut
=
clazz
.
prototype
[
name
]
clazz
.
prototype
[
name
]
=
(
args
...)
->
advice
.
apply
(
this
,
args
)
pointcut
.
call
(
this
,
args
)
This is really simple, we are composing a method with a function. The method already defined in the class is called the pointcut, and the function we are supplying is called the advice. Unlike a purely functional combinator, we are only executing the advice for side-effects, not for its result. But in object-oriented imperative programming, that’s usually what we want.
other method combinations
That looks handy. But we also want an after method, a way to compose methods in the other order. Good news, the after combination is exactly what we want. After combinations are very handy for things like logging method calls or cleaning things up.
But there’s another great use for after combinators, triggering events. Event triggering code is often very decoupled from method logic: The whole point of events is to invert control so that an object like a Wumpus
doesn’t need to know which objects want to do something after it moves. For example, a Backbone.js view might be observing the Wumpus and wish to update itself when the Wumpus moves:
YouAreDaChef
(
Wumpus
,
Hunter
)
.
after
'
run
'
,
()
->
this
.
trigger
'
move
'
,
this
CaveView
=
Backbone
.
View
.
extend
initialize:
->
#
...
@
model
.
bind
'
move
'
,
@
wumpusMoved
wumpusMoved:
(
wumpus
)
->
#
...
The code coupling the view to the model has now been separated from the code defining the model itself.
YouAreDaChef also provides other mechanisms for separating concerns. Around combinations (also called around advice) are a very general-purpose combinator. With an around combination, the original method (the pointcut) is passed to the advice function as a parameter, allowing it to be called at any time.
Around advice is useful for wrapping methods. Using an around combinator, you could bake error handling and transactions into methods without encumbering their code with implementation details. In this example, we define the methods to be matched using a regular expression, and YouAreDaChef passes the result of the match to the advice function, which wraps them in a transaction and adds some logging:
class
EnterpriseyLegume
setId:
(
@
id
)
->
setName:
(
@
name
)
->
setDepartment:
(
@
department
)
->
setCostCentre:
(
@
costCentre
)
->
YouAreDaChef
(
EnterpriseyLegume
)
.
around
/
set
(.
*
)
/
,
(
pointcut
,
match
,
value
)
->
performTransaction
()
->
writeToLog
"#{match[1]}: #{value}"
pointcut
(
value
)
summary
Method combinations are a technique for separating concerns when the level of granularity is smaller than a method. This makes the code DRY and removes the clutter of orthogonal responsibilities.
12 Appendix: Finding Joy in Combinators
In this book, we have looked at a few interesting combinators and some Ruby code inspired by them. Today we’ll review the definition of a combinator, and from there we’ll learn something intriguing about an entire family of programming languages, the Concatenative Languages.
Let’s start at the beginning: What is a combinator?
One definition of a combinator is a function with no free variables. Another way to put it is that a combinator is a function that takes one or more arguments and produces a result without introducing anything new. In Ruby terms, we are talking about blocks, lambdas or methods that do not call anything except what has been passed in.
So if I tell you that:
finch
.
call
(
a
).
call
(
b
).
call
(
c
)
=>
c
.
call
(
b
).
call
(
a
)
Then you know that finch
is a combinator because the effect it produces is made up solely of combining the effects of the things it takes as parameters. That’s easy, but yet… Where is our vaunted simplicity? Working with Ruby’s lambdas and braces and calls gets in our way. We can learn a lot from combinatorial logic to help our Ruby programming, but Ruby is a terrible language for actually learning about combinatorial logic.
12.1 Languages for combinatorial logic
Combinatorial logicians use a much simpler, direct syntax for writing expressions:
Fabc
=>
cba
Whenever a logician writes abc
, he means the same thing as when a Rubyist writes a.call(b).call(c)
. Note that like Ruby, the precedence in combinatorial logic is to the left, so abc
is equivalent to (ab)c
just as in Ruby a.call(b).call(c)
is equivalent to (a.call(b)).call(c)
.
I think you’ll agree that abc
is much simpler than a.call(b).call(c)
. Here’s another look at the combinators we’ve met in this series, using the simpler syntax:
Kxy
=>
x
Txy
=>
yx
Cxyz
=>
xzy
Q3xyz
=>
z
(
xy
)
#
Q3
is
shorthand
for
the
Quirky
bird
Bxyz
=
x
(
yz
)
Qxyz
=
y
(
xz
)
#
Q
is
shorthand
for
the
Queer
bird
There are many, many more combinators, of course. Infinitely more, in fact. We only have names for some of the most useful. For example, the Warbler Twice Removed, or W**
is written:
W
**
xyzw
=>
xyzww
(Warblers are actually in a whole ‘nother family of birds that introduce duplication. Other members of that family include the Mockingbird and Starling. They’re incredibly useful for introducing ideas like iteration and recursion.)
You could say that combinators take a string of symbols (like x, y, z, w, and so forth), then they introduce some erasing, some duplication, some permutation, and add some parentheses. That they work to rearrange our string of symbols.
We have seen that parentheses are allowed, and that some combinators introduce parentheses. Before you say that the combinators introduce new symbols, remember that parentheses are punctuation. If you think of the symbols as words and the parentheses as punctuation, you see that the combinators simply rearrange the words and change the punctuation without introducing new words.
Now I said that combinators work with strings of symbols. This was a terrible analogy, because it made us talk about punctuation and why parentheses are not symbols. Another thing you could say is that combinator work with lists of symbols, then they re-arrange the symbols, including removing symbols, introducing sub-lists, and duplicating symbols.
This is more interesting! Now we can see that in our notation, adding parentheses is a way of introducing a sub list. Let’s revisit the bluebird:
Bxyz
=
x
(
yz
)
Now what we can say is this: The bluebird takes a list of three symbols and answers a list of one symbol and a sublist of two symbols. In Ruby:
bluebird
=
lambda
{
|*
args
|
x
,
y
,
z
=
args
[
x
,
[
y
,
z
]]
}
bluebird
.
call
(
:
x
,
:
y
,
:
z
)
=>
[
:
x
,
[
:
y
,
:
z
]]
This is easy. What about the Thrush?
thrush
=
lambda
{
|*
args
|
x
,
y
=
args
[
y
,
x
]
}
thrush
.
call
(
:
x
,
:
y
)
=>
[
:
y
,
:
x
]
Now let’s pause for a moment. Imagine we had an entire programming language devoted to this style of programming. The primary thing it does is define combinators that take a list of symbols and recombine them. Since it works with lists and we are thinking about combinatory logic, we will represent our expressions as lists:
idiot
:
x
=>
:
x
mockingbird
:
x
=>
:
x
:
x
bluebird
:
x
:
y
:
z
=>
:
x
[:
y
:
z
]
thrush
:
x
:
y
=>
:
y
:
x
Wait! Do not shout Lisp! Just because we have lists of things does not mean we are programming in Lisp!! Let’s keep going, and you will see in the next example that I do not mean Lisp:
bluebird
thrush
:
x
:
y
:
z
=>
thrush
[
:
x
:
y
]
:
z
=>
:
z
[
:
x
:
y
]
And therefore in our fictitious language we can write:
quirky
=
bluebird
thrush
And thus:
quirky
:
x
:
y
:
z
=>
:
z
[:
x
:
y
]
This looks familiar. Have you ever written a program in Postscript? Or [Forth](http://en.wikipedia.org/wiki/Forth_(programming_language)? What if instead of using a thrush we used a word called swap
? Or instead of a mockingbird we used a word called dup
?
12.2 Concatenative languages
Concatenative (or stack-based) programming languages–like Postscript, Forth, Factor, and Joy–are almost direct representations of combinatorial logic. There is a list of things, words or combinators permute the list of things, and the things can be anything: data, other combinators, or even programs. These languages are called concatenative languages because the primary way to compose programs and combinators with each other is to concatenate them together, like we did with the bluebird and thrush above.
For me the purpose of life is partly to have joy. Programmers often feel joy when they can concentrate on the creative side of programming, So Ruby is designed to make programmers happy. –Yukihiro Matsumoto
You have probably heard that it is a good idea to learn a new programming language every year. Is a concatenative language on your list of languages to learn? No? Well, here is the reason to learn a concatenative language: You will learn to think using combinatorial logic. For example, the Y Combinator is expressed in Joy as:
[
dup
cons
]
swap
concat
dup
cons
i
Where dup
is a mockingbird, swap
is a thrush, i
is an idiot bird, and cons
and concat
are likewise two other combinators. Writing in Joy is writing directly in combinators.
In other programming languages, combinatorial logic is an underpinning. It helps us explain and prove certain things, It inspires us to invent certain things. It is behind everything we do. That’s good. But in a concatenative language, it is not an underpinning or behind a curtain. It is right out there in front of you. And learning to program in a concatenative language means learning to think in combinators.
The combinators we’ve discussed in depth so far are all fascinating, however as a basis for writing programs they are incomplete. You cannot represent every possible program using kestrels, thrushes, cardinals, quirky birds, bluebirds, and queer birds. To represent all possible programs, we need to have at least one combinator that duplicates symbols, like a mockingbird or another from its family.
13 Appendix: Source Code
All source code is published under the following license:
# The MIT License
#
# All contents Copyright (c) 2004-2008 Reginald Braithwaite
# <http://braythwayt.com> except as otherwise noted.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# http://www.opensource.org/licenses/mit-license.php
13.1 kestrels
returning.rb
# The MIT License
#
# All contents Copyright (c) 2004-2008 Reginald Braithwaite
# <http://braythwayt.com> except as otherwise noted.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# http://www.opensource.org/licenses/mit-license.php
unless
respond_to?
(
:returning
)
def
returning
(
it
)
yield
it
it
end
end
13.2 thrushes
into.rb
class
Object
def
into
expr
=
nil
expr
.
nil?
?
yield
(
self
)
:
expr
.
to_proc
.
call
(
self
)
end
end
let.rb
module
Kernel
def
let
it
yield
it
end
end
13.3 the cardinal
cardinal.rb
def
cardinal_define
(
name
,
&
proc_over_proc
)
define_method_taking_block
(
name
)
do
|
a_value
,
a_proc
|
proc_over_proc
.
call
(
a_proc
)
.
call
(
a_value
)
end
end
# method_body_proc should expect (a_value, a_proc)
# see http://coderrr.wordpress.com/2008/10/29/using-define_method-with-blocks-in-\
ruby
-
18
/
def
define_method_taking_block
(
name
,
&
method_body_proc
)
self
.
class
.
send
:define_method
,
"__cardinal_helper_
#{
name
}
__"
,
method_body_proc
eval
<<-
EOM
def #{name}(a_value, &a_proc)
__cardinal_helper_#{name}__(a_value, a_proc)
end
EOM
end
13.4 quirky birds
andand.rb
module
AndAnd
# This module is included in Object, so each of these methods are added
# to Object when you require 'andand'. Each method is an *adverb*: they are
# intended to be enchained with another method, such as receiver.adverb.method
#
# The purpose of an adverb is to modify what the primary method returns.
#
# Adverbs also take blocks or procs, passing the receiver as an argument to the
# block or proc. They retain the same semantics with a block or proc as they
# do with a method. This behaviour weakly resembles a monad.
module
ObjectGoodies
# Returns nil if its receiver is nil, regardless of whether nil actually hand\
les
the
# actual method ot what it might return.
#
# 'foo'.andand.size => 3
# nil.andand.size => nil
# 'foo'.andand { |s| s << 'bar' } => 'foobar'
# nil.andand { |s| s << 'bar' } => nil
def
andand
(
p
=
nil
)
if
self
if
block_given?
yield
(
self
)
elsif
p
p
.
to_proc
.
call
(
self
)
else
self
end
else
if
block_given?
or
p
self
else
MockReturningMe
.
new
(
self
)
end
end
end
# Invokes the method and returns the receiver if nothing is raised. Therefore,
# the purpose of calling the method is strictly for side effects. In the block
# form, it resembles #tap from Ruby 1.9, and is useful for debugging. It also
# resembles #returning from Rails, with slightly different syntax.
#
# Object.new.me do |o|
# def o.foo
# 'foo'
# end
# end
# => your new object
#
# In the method form, it is handy for chaining methods that don't ordinarily
# return the receiver:
#
# [1, 2, 3, 4, 5].me.pop.reverse
# => [4, 3, 2, 1]
def
me
(
p
=
nil
)
if
block_given?
yield
(
self
)
self
elsif
p
p
.
to_proc
.
call
(
self
)
self
else
ProxyReturningMe
.
new
(
self
)
end
end
unless
Object
.
instance_methods
.
include?
(
'tap'
)
alias
:tap
:me
end
# Does not invoke the method or block and returns the receiver.
# Useful for comemnting stuff out, especially if you are using #me for
# debugging purposes: change the .me to .dont and the semantics of your
# program are unchanged.
#
# [1, 2, 3, 4, 5].me { |x| p x }
# => prints and returns the array
# [1, 2, 3, 4, 5].dont { |x| p x }
# => returns the array without printing it
def
dont
(
p
=
nil
)
if
block_given?
self
elsif
p
self
else
MockReturningMe
.
new
(
self
)
end
end
end
end
class
Object
include
AndAnd
:
:ObjectGoodies
end
unless
Module
.
constants
.
include?
(
'BlankSlate'
)
if
Module
.
constants
.
include?
(
'BasicObject'
)
module
AndAnd
class
BlankSlate
<
BasicObject
end
end
else
module
AndAnd
class
BlankSlate
def
self
.
wipe
instance_methods
.
reject
{
|
m
|
m
=~
/^__/
}
.
each
{
|
m
|
undef_method
m
}
end
def
initialize
BlankSlate
.
wipe
end
end
end
end
end
module
AndAnd
# A proxy that returns its target without invoking the method you
# invoke. Useful for nil.andand and #dont
class
MockReturningMe
<
BlankSlate
def
initialize
(
me
)
super
()
@me
=
me
end
def
method_missing
(
*
args
)
@me
end
end
# A proxy that returns its target after invoking the method you
# invoke. Useful for #me
class
ProxyReturningMe
<
BlankSlate
def
initialize
(
me
)
super
()
@me
=
me
end
def
method_missing
(
sym
,
*
args
,
&
block
)
@me
.
__send__
(
sym
,
*
args
,
&
block
)
@me
end
end
end
blank_slate.rb
# The MIT License
#
# All contents Copyright (c) 2004-2008 Reginald Braithwaite
# <http://braythwayt.com> except as otherwise noted.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# http://www.opensource.org/licenses/mit-license.php
unless
Module
.
constants
.
include?
(
'BlankSlate'
)
if
Module
.
constants
.
include?
(
'BasicObject'
)
class
BlankSlate
<
BasicObject
end
else
class
BlankSlate
def
self
.
wipe
instance_methods
.
reject
{
|
m
|
m
=~
/^__/
}
.
each
{
|
m
|
undef_method
m
}
end
def
initialize
BlankSlate
.
wipe
end
end
end
end
quirky_bird.rb
# The MIT License
#
# All contents Copyright (c) 2004-2008 Reginald Braithwaite
# <http://braythwayt.com> except as otherwise noted.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# http://www.opensource.org/licenses/mit-license.php
def
quirky_bird_define
(
name
,
&
value_proc
)
define_method_taking_block
(
name
)
do
|
a_value
,
a_proc
|
a_proc
.
call
(
value_proc
.
call
(
a_value
))
end
end
# method_body_proc should expect (a_value, a_proc)
# see http://coderrr.wordpress.com/2008/10/29/using-define_method-with-blocks-in-\
ruby
-
18
/
def
define_method_taking_block
(
name
,
&
method_body_proc
)
self
.
class
.
send
:define_method
,
"__quirky_bird_helper_
#{
name
}
__"
,
method_body_p
\
roc
eval
<<-
EOM
def #{name}(a_value, &a_proc)
__quirky_bird_helper_#{name}__(a_value, a_proc)
end
EOM
end
def
quirky_bird_extend
(
name
,
&
value_proc
)
Object
.
send
(
:define_method
,
name
)
do
value_proc
.
call
(
self
)
end
end
quirky_songs.rb
# The MIT License
#
# All contents Copyright (c) 2004-2008 Reginald Braithwaite
# <http://braythwayt.com> except as otherwise noted.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#
# http://www.opensource.org/licenses/mit-license.php
require
'quirky_bird'
require
'blank_slate'
require
'returning'
quirky_bird_extend
(
:maybe
)
do
|
value
|
if
value
.
nil?
returning
(
BlankSlate
.
new
)
do
|
it
|
def
it
.
method_missing
(
*
args
)
nil
end
end
else
value
end
end
quirky_bird_extend
(
:try
)
do
|
value
|
returning
(
BlankSlate
.
new
)
do
|
it
|
def
it
.
__value__
=
(
arg
)
@value
=
arg
end
def
it
.
method_missing
(
name
,
*
args
)
if
@value
.
respond_to?
(
name
)
@value
.
send
(
name
,
*
args
)
end
end
it
.
__value__
=
value
end
end
13.5 bluebirds
before_and_after_advice.rb
# This code contains ideas snarfed from:
#
# http://github.com/up_the_irons/immutable/tree/master
# http://blog.jayfields.com/2006/12/ruby-alias-method-alternative.html
# http://eigenclass.org/hiki.rb?bounded+space+instance_exec
#
# And a heaping side of http://blog.grayproductions.net/articles/all_about_struct
module
BeforeAndAfterAdvice
# Random ID changed at each interpreter load
UNIQ
=
"_
#{
object_id
}
"
Compositions
=
Struct
.
new
(
:before
,
:between
,
:after
)
module
MethodAdvice
;
end
module
ClassMethods
# Example:
#
# before :foo, :bar do
# # ...
# end
#
# This executes the body of the block before the #foo and #bar instance metho\
ds
# for side effects without modifying the parameters (if any) passed to #foo
# and #bar
#
# before :fizz, :buzz do |p1, p2|
# # ...
# [p1, p2]
# end
#
# This executes the body of the block before the #foo and #bar instance metho\
ds
# for side efects, AND determines what is passed along as parameters. If the \
block
# takes parameters, it acts as a filter, transforming the parameters.
#
# It is possible to chain #before advice, and you can add more advice in subc\
lasses
:
#
# class Foo
# def foo(bar); end
# end
#
# class Bar < Foo
# include MethodAdvice
#
# before :foo do
# # ...
# end
#
# end
#
# class Blitz < Bar
# include MethodAdvice
#
# before :foo do |bar|
# # ...
# bar
# end
#
# end
#
def
before
(
*
method_symbols
,
&
block
)
options
=
method_symbols
[-
1
].
kind_of?
(
Hash
)
?
method_symbols
.
pop
:
{}
method_symbols
.
each
do
|
method_sym
|
__composed_methods__
[
method_sym
].
before
.
unshift
(
__unbound_method__
(
block
,\
options
[
:name
]
))
__rebuild_method__
(
method_sym
)
end
end
# Example:
#
# after :foo, :bar do
# # ...
# end
#
# This executes the body of the block after the #foo and #bar instance methods
# for side effects without modifying the return values of the #foo and #bar m\
ethods
#
# after :fizz, :buzz do |r|
# # ...
# r
# end
#
# This executes the body of the block after the #foo and #bar instance methods
# for side efects, AND determines what is returned from the call. If the block
# takes parameters, it acts as a filter, transforming the return value.
#
# It is possible to chain #after advice, and you can add more advice in subcl\
asses
:
#
# class Foo
# def foo(bar); end
# end
#
# class Bar < Foo
# include MethodAdvice
#
# after :foo do
# # ...
# end
#
# end
#
# class Blitz < Bar
# include MethodAdvice
#
# after :foo do |r|
# # ...
# r
# end
#
# end
#
def
after
(
*
method_symbols
,
&
block
)
options
=
method_symbols
[-
1
].
kind_of?
(
Hash
)
?
method_symbols
.
pop
:
{}
method_symbols
.
each
do
|
method_sym
|
__composed_methods__
[
method_sym
].
after
.
push
(
__unbound_method__
(
block
,
opt
\
ions
[
:name
]
))
__rebuild_method__
(
method_sym
)
end
end
# Removes all advice from the named methods. Intended for testing.
#
def
reset_befores_and_afters
(
*
method_symbols
)
method_symbols
.
each
do
|
method_sym
|
__composed_methods__
[
method_sym
].
before
=
[]
__composed_methods__
[
method_sym
].
after
=
[]
__rebuild_method__
(
method_sym
)
end
end
# Modified to re-apply advice when a method is overridden. So:
#
# class Foo
# def foo(bar); end
# end
#
# class Bar < Foo
# include MethodAdvice
#
# after :foo do
# # ...
# end
#
# end
#
# class Blitz < Bar
# include MethodAdvice
#
# def foo(bar)
# # ...
# end
#
# end
#
# In this case the class Blitz overrides the method #foo, but the advice in
# class Bar is still applied, the override happens ONLY on the inner method,
# not the advice.
#
# Note well that super has undefined behaviour in this situation.
#
def
method_added
(
method_sym
)
unless
instance_variable_get
(
"@
#{
UNIQ
}
_in_method_added"
)
__safely__
do
__composed_methods__
[
method_sym
].
between
=
self
.
instance_method
(
method_
\
sym
)
@old_method_added
and
@old_method_added
.
call
(
method_sym
)
__rebuild_method__
(
method_sym
)
end
end
end
def
__composed_methods__
ancestral_composer
=
ancestors
.
detect
{
|
ancestor
|
ancestor
.
instance_variab
\
le_defined?
(
:@__composed_methods__
)
}
if
ancestral_composer
ancestral_composer
.
instance_variable_get
(
:@__composed_methods__
)
else
@__composed_methods__
||=
Hash
.
new
{
|
hash
,
method_sym
|
hash
[
method_sym
]
\
=
BeforeAndAfterAdvice
:
:Compositions
.
new
(
[]
,
self
.
instance_method
(
method_sym
),
[]
\
)
}
end
end
def
__rebuild_without_advice__
(
method_sym
,
old_method
)
if
old_method
.
arity
==
0
define_method
(
method_sym
)
{
old_method
.
bind
(
self
)
.
call
}
else
define_method
(
method_sym
)
{
|*
params
|
old_method
.
bind
(
self
)
.
call
(
*
params
)\
}
end
end
def
__rebuild_advising_no_parameters__
(
method_sym
,
old_method
,
befores
,
after
\
s
)
define_method
(
method_sym
)
do
befores
.
each
do
|
before_advice_method
|
before_advice_method
.
bind
(
self
)
.
call
end
afters
.
inject
(
old_method
.
bind
(
self
)
.
call
)
do
|
ret_val
,
after_advice_metho
\
d
|
after_advice_method
.
bind
(
self
)
.
call
end
end
end
def
__rebuild_advising_with_parameters__
(
method_sym
,
old_method
,
befores
,
aft
\
ers
)
define_method
(
method_sym
)
do
|*
params
|
afters
.
inject
(
old_method
.
bind
(
self
)
.
call
(
*
befores
.
inject
(
params
)
do
|
acc_params
,
before_advice_method
|
if
before_advice_method
.
arity
==
0
before_advice_method
.
bind
(
self
)
.
call
acc_params
else
before_advice_method
.
bind
(
self
)
.
call
(
*
acc_params
)
end
end
)
)
do
|
ret_val
,
after_advice_method
|
if
after_advice_method
.
arity
==
0
after_advice_method
.
bind
(
self
)
.
call
ret_val
else
after_advice_method
.
bind
(
self
)
.
call
(
ret_val
)
end
end
end
end
def
__rebuild_method__
(
method_sym
)
__safely__
do
composition
=
__composed_methods__
[
method_sym
]
old_method
=
composition
.
between
if
composition
.
before
.
empty?
and
composition
.
after
.
empty?
__rebuild_without_advice__
(
method_sym
,
old_method
)
else
arity
=
old_method
.
arity
if
old_method
.
arity
==
0
__rebuild_advising_no_parameters__
(
method_sym
,
old_method
,
compositio
\
n
.
before
,
composition
.
after
)
else
__rebuild_advising_with_parameters__
(
method_sym
,
old_method
,
composit
\
ion
.
before
,
composition
.
after
)
end
end
end
end
def
__safely__
was
=
instance_variable_get
(
"@
#{
UNIQ
}
_in_method_added"
)
begin
instance_variable_set
(
"@
#{
UNIQ
}
_in_method_added"
,
true
)
yield
ensure
instance_variable_set
(
"@
#{
UNIQ
}
_in_method_added"
,
was
)
end
end
def
__unbound_method__
(
a_proc
,
name_prefx
=
nil
)
begin
old_critical
,
Thread
.
critical
=
Thread
.
critical
,
true
n
=
0
n
+=
1
while
respond_to?
(
mname
=
"
#{
name_prefx
||
'__method_advice'
}
_
#{
n
}
"
)
MethodAdvice
.
module_eval
{
define_method
(
mname
,
&
a_proc
)
}
ensure
Thread
.
critical
=
old_critical
end
begin
MethodAdvice
.
instance_method
(
mname
)
ensure
MethodAdvice
.
module_eval
{
remove_method
(
mname
)
}
unless
name_prefx
rescue
\
nil
end
end
end
def
self
.
included
(
receiver
)
receiver
.
extend
ClassMethods
receiver
.
send
:include
,
MethodAdvice
receiver
.
instance_variable_set
(
"@
#{
UNIQ
}
_in_method_added"
,
false
)
receiver
.
instance_variable_set
(
:@old_method_added
,
receiver
.
public_method_def
\
ined?
(
:method_added
)
&&
receiver
.
instance_method
(
:method_added
))
end
end
14 About The Author
When he’s not shipping Ruby, Javascript and Java applications scaling out to millions of users, Reg “Raganwald” Braithwaite has authored libraries for Javascript and Ruby programming such as Katy, JQuery Combinators, YouAreDaChef, andand, and others.
He writes about programming on his “Homoiconic” un-blog as well as general-purpose ruminations on his posterous space. He is also known for authoring the popular raganwald programming blog from 2005-2008.
14.1 contact
Twitter: @raganwald
Email: raganwald@gmail.com
(Author’s Photograph (c) 2008 Joseph Hurtado, All Rights Reserved. http://www.flickr.com/photos/trumpetca/. Cover Photograph (c) 2011 Biker Jun. Some rights reserved.)