Appendix A. A Short Introduction to Ruby
The aim of this section is to gently introduce you to Ruby in enough detail that you can read the recipes without getting completely bogged down in the syntax. You can safely skip this if you have any Ruby experience.
I’m not going to attempt to cover Ruby in great detail. If you’re interested in learning more, there are many great books out there on Ruby. One particularly good one is The Ruby Programming Language by David Flanagan and Yukihiro Matsumoto.
An Example Program
Here is an example program that really doesn’t do anything but illustrate some syntax:
Example A.1. example.rb
#!/usr/bin/env ruby

# Comments begin with the pound sign. They can also go at the end of lines
puts "the example program" # puts sends output to STDOUT and appends a newline

# assignment
number = 2
s = "this is a string"
number = "two"

# arrays
an_array = [1,2,3,4,5]
# A more complicated array, containing integers, a string and another array
an_array = [1,2, 'three', [4,5,6]]

# hashes
a_hash = {1 => 'one', 2 => 'two', 3 => 'three', 4 => 'four'}
# A more complicated hash, using symbols as keys
a_hash = {:one => 'one', :two => 2, :three_and_four => [3,4]}

# A simple iterator
an_array.each do |element|
  puts "#{element}"
end
Depending on your coding background, this code might look pretty normal, or just plain crazy. I’ve been coding in dynamic languages for way too long to intuitively know what areas might make you go “whoa!”, but here are some pointers that might help:
Variables are not declared, and can change type
Notice the two lines
number = 2
and
number = "two"
This, while bad programming practice, won’t raise any errors. At the end of the program, number is a string with the value “two”. Note that trying to read the value of an undeclared variable will raise an error:
$> irb
>> puts undeclared_variable
NameError: undefined local variable or method `undeclared_variable' for main:Object
Everything is an object
In Ruby, everything is an object. This has many consequences, most of them good. One consequence is illustrated in the example code here:
an_array = [1,2, 'three', [4,5,6]]
An array is an object, and contains any number of other objects. There’s no need to jump through any hoops if you want to make an array of arrays or have different object types in a single array.
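One quick way to see this is to ask a few values which class they belong to. The sketch below uses only the core class and name methods; the variable names are made up for the example, and the class names shown assume Ruby 2.4 or later (older versions report Fixnum instead of Integer).

```ruby
# Literals, nested arrays, nil and symbols are all objects with a class.
mixed = [1, 2, 'three', [4, 5, 6], nil, :four]

# Ask each element for the name of its class.
class_names = mixed.collect { |element| element.class.name }

puts class_names.inspect
# ["Integer", "Integer", "String", "Array", "NilClass", "Symbol"]
```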
Data types
The code in example.rb shows the instantiation of String, Integer, Array and Hash objects. Symbols are also flirted with briefly. These and a few other data types are discussed further below.
Iterators
Notice the way the array is looped through in the last bit of the example code:
an_array.each do |element|
  puts "#{element}"
end
This is an example of a Ruby iterator. Ruby does have things like for and while loops, but they’re rarely used. I think I used a while loop once in the code for this book, and only because I couldn’t figure out a way not to. Instead, blocks and iterators are used. They’re explained in more detail below.
Ruby Data Types
Numbers
Numbers can be integers or floating point numbers. One fun thing about numbers in Ruby is that they have methods too:
12.333.round # 12
-13.abs # 13
9.lcm(12) # The Lowest Common Multiple of 9 and 12 is 36
Strings
Strings are just like in every other language: an ordered list of characters. One difference from a few other languages is that strings are mutable:
name = 'Scott Patton'
name[10] = 'e' # name is now 'Scott Patten'
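Mutability matters most when two variables refer to the same string, because a change made through one name is visible through the other. Here is a minimal sketch of that; the variable names are illustrative, and String.new is used only to guarantee an unfrozen string regardless of how your Ruby is configured.

```ruby
greeting = String.new('hello')   # an ordinary, modifiable string
same_object = greeting           # a second name for the very same object

greeting << ', world'            # << appends in place
greeting[0] = 'H'                # index assignment also changes the string in place

puts same_object                 # Hello, world
```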
Arrays
An array is an ordered collection of objects. The objects in an array are called elements. An array is instantiated like this:
some_array = [object1, object2, object3, ...]
some_array = [1,2,3,7,9]
You access an element of an array by giving its index (counting from zero) between square brackets:
ordinal_numbers = ['zeroth', 'first', 'second', 'third', 'fourth', 'fifth']
ordinal_numbers[3] # 'third'
For the case above, where you are instantiating an array of strings, you will often see the following syntax, which is equivalent but more concise and readable:
ordinal_numbers = %w(zeroth first second third fourth fifth)
Hashes
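A hash is a collection of key-value pairs. Each entry in the hash has a key (used to access the value) and a value, and you look entries up by key rather than by position. (Ruby 1.8 hashes have no defined order; Ruby 1.9 and later remember the order in which entries were inserted.) Hashes are instantiated like this:
some_hash = {key1 => value1, key2 => value2, key3 => value3}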
Keys can be any object, but it’s usually a bad idea to use something mutable like an array or hash as a key. It’s also more idiomatic (and memory efficient) to use a symbol rather than a string as a key. Values can be any object as well. You access a value by putting its key between square brackets, like this:
params = {:track_response => true, :default_response => 'Yes',
          :acceptable_responses => ['Yes', 'No', 'Maybe']}
params[:default_response] # 'Yes'
params[:acceptable_responses][0] # 'Yes'
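Two details of hash access are worth seeing in code: asking for a key that isn't there returns nil rather than raising an error, and assigning to a key either replaces the existing value or adds a new entry. A small sketch, with keys invented for the example:

```ruby
params = {:track_response => true, :default_response => 'Yes'}

default_answer = params[:default_response]   # 'Yes'
missing = params[:no_such_key]               # nil; a missing key is not an error
has_key = params.key?(:no_such_key)          # false

params[:default_response] = 'No'             # replaces the existing value
params[:max_length] = 140                    # adds a new entry
```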
Symbols
Symbols take a little getting used to (at least they did for me when I started using Ruby). Symbols look like this: :some_symbol, a colon followed by a string of characters. At first, you can think of them as immutable strings. Symbols of the same name are initialized and exist in memory only once (as opposed to strings, which will be different objects in memory even if their contents are the same). So, if you use a symbol multiple times, you are using the same object and save some memory. For a nice explanation of symbols, see http://glu.ttono.us/articles/2005/08/19/understanding-ruby-symbols. One of the comments on that article, by Ahmad Alhashemi, gives another useful way of thinking about symbols: symbols “…are literal values, just like the number 3 and the string ‘red’. you can say: v = :s, but you can’t say: :s = v” Nicely put, Ahmad.
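The "same object" claim is easy to check with equal?, which tests object identity rather than contents. This is only a sketch; the variable names are illustrative, and String.new is used so the result doesn't depend on any frozen-string settings.

```ruby
# Two symbols with the same name are one and the same object.
same_symbol = :status.equal?(:status)                             # true

# Two strings with the same contents are separate objects.
same_string = String.new('status').equal?(String.new('status'))   # false

# Their contents still compare equal, and you can convert between the two types.
equal_contents = String.new('status') == 'status'                 # true
round_trip = 'status'.to_sym == :status && :status.to_s == 'status'   # true
```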
Ranges
A range is instantiated like
(start_value .. end_value)
or
(start_value ... end_value)
They do what you think they should: give you a range of values, increasing by a single step for every value. The two-dot version includes the end value in the range; the three-dot version doesn’t. If you want to see what’s inside a range, use the to_a (to array) method on it:
>> (1 .. 5).to_a # [1, 2, 3, 4, 5]
>> (1 ... 5).to_a # [1, 2, 3, 4]
To use a range, you typically iterate over it:
(1 .. 5).each {|n| puts n*n}
1
4
9
16
25
Notice that I didn’t say that ranges increase by one on every step. I was intentionally vague about what the amount they increase is. Ranges can be built from any object that implements a succ method (to give the next value in the range) and the <=> method (to compare two values in the range). This is true for classes you create yourself as well.
>> ('a' .. 'g').to_a # ["a", "b", "c", "d", "e", "f", "g"]
>> require 'date'
>> start_date = Date.new(2010, 2, 12)
>> end_date = Date.new(2010, 2, 28)
>> vancouver_olympics_dates = (start_date .. end_date)
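To make the "classes you create yourself" point concrete, here is a minimal sketch of a class that can be used in a range. MinorVersion is a hypothetical class invented for this example; the only things the range needs from it are succ and <=>.

```ruby
# Hypothetical class: a 1.x version number that counts up one minor release at a time.
class MinorVersion
  attr_reader :minor

  def initialize(minor)
    @minor = minor
  end

  # succ gives the next value in the range
  def succ
    MinorVersion.new(minor + 1)
  end

  # <=> compares two values in the range
  def <=>(other)
    minor <=> other.minor
  end

  def to_s
    "1.#{minor}"
  end
end

versions = (MinorVersion.new(8) .. MinorVersion.new(11)).collect { |v| v.to_s }
puts versions.inspect   # ["1.8", "1.9", "1.10", "1.11"]
```

Note that the range steps from 1.9 to 1.10 correctly because succ and <=> work on the integer, not on the string form.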
Output
You print things to the standard output (typically the terminal you’re working in) using puts or print. puts appends a newline to the output, print doesn’t.
Strings enclosed in double quotes do variable interpolation. Anything inside of a #{} is evaluated as Ruby code, and the result is converted to a string and inserted in its place.
n = 32
"#{n} squared = #{n * n}" # "32 squared = 1024"
Strings in single quotes do not do variable interpolation:
'#{n} squared = #{n * n}' # "\#{n} squared = \#{n * n}"
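Every object implements two methods of converting itself to a string: to_s and inspect. to_s is meant for display, while inspect is meant for debugging and usually shows more of the object’s structure. For arrays and hashes the difference depends on your Ruby version: in Ruby 1.8, to_s simply joins the contents together, while in Ruby 1.9 and later it returns the same string as inspect.
a = [1,2,3,4]
a.to_s # "1234" in Ruby 1.8; "[1, 2, 3, 4]" in 1.9 and later
a.inspect # "[1, 2, 3, 4]"
h = {:one => 1, :two => 2, :three => 3}
h.to_s # "one1two2three3" in Ruby 1.8; the same as h.inspect in 1.9 and later
h.inspect # "{:one=>1, :two=>2, :three=>3}" (exact formatting varies slightly between Ruby versions)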
I use inspect a lot in the recipes to show results.
You can, of course, define to_s and inspect on any classes you create.
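As a minimal sketch of what that looks like, the hypothetical Temperature class below defines both methods. to_s is what puts and string interpolation use; inspect is what p and irb use when they show you an object.

```ruby
# Hypothetical class, used only to show to_s and inspect side by side.
class Temperature
  def initialize(degrees)
    @degrees = degrees
  end

  # Human-friendly form, used by puts and by "#{...}"
  def to_s
    "#{@degrees} degrees"
  end

  # Debugging form, used by p and by irb
  def inspect
    "#<Temperature degrees=#{@degrees}>"
  end
end

t = Temperature.new(21)
label = "It is #{t}"   # "It is 21 degrees"
debug = t.inspect      # "#<Temperature degrees=21>"
```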
Control Statements
Ruby has the standard control statements. An if statement looks like
if age > 30
  puts "don't trust him!"
end
You can also add elsif and else clauses to your if statements (note the spelling: it is elsif, not elseif):
if x > 30
  puts "Don't trust him!"
elsif x < 3
  puts "Likely innocent"
else
  puts "Still don't trust him!"
end
One thing that often trips up newcomers to Ruby is how true and false are defined. In many languages (Python and JavaScript, for example), the following code would result in no output:
Example A.2. truth_test.rb
x = 0
string = ""

if x
  puts "x is true"
end

if string
  puts "string is true"
end
In Ruby, the only things that are false are false itself, and nil. The code above will print both statements:
$> ruby truth_test.rb
x is true
string is true
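The same rule can be checked for several values at once. The sketch below sorts a handful of values by how an if statement treats them; the variable names are made up for the example.

```ruby
# Only nil and false count as false in a condition; everything else is true.
truthy_values = []
falsy_values = []

[0, '', [], 0.0, nil, false].each do |value|
  if value
    truthy_values << value
  else
    falsy_values << value
  end
end

puts truthy_values.inspect   # [0, "", [], 0.0]
puts falsy_values.inspect    # [nil, false]
```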
Blocks and iterators
One of the defining characteristics of Ruby is the use of blocks and, especially, using blocks to iterate. The most basic iterator is the each iterator. For an array, each goes through each element of the array, passing the element’s value to the block. So, the following code:
[1,2,3,4].each {|element| puts "#{element}:#{element * element}" }
will result in the following output:
1:1
2:4
3:9
4:16
The Enumerable mixin provides a host of other iterators. All iterators work the same way: you pass a block to the iterator, and it will execute that block for each element in the collection you are iterating over. The return value you get depends on which iterator you use. The iterators provided by Enumerable include:
collect and map
collect and map return an array containing the value returned by the block for each element in the collection:
>> [1,2,3,4].collect {|n| n*n} # [1, 4, 9, 16]
>> %w(monday tuesday wednesday thursday friday saturday).collect {|day| day.capitalize}
=> ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
select
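select returns all elements in the collection for which the block returns true. The prime? method used here comes from the standard library’s prime file, hence the require:
>> require 'prime'
>> (1 .. 30).select {|n| n.prime?}
=> [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]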
reject
reject is the opposite of select: it returns all elements in the collection for which the block returns false:
>> (1 .. 30).reject {|n| n.even?}
=> [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29]
detect or find
detect and find return the first value in the collection for which the block returns true.
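A short sketch of detect and find in action follows. The two names are interchangeable, and both return nil when nothing matches; the variable names are only for illustration.

```ruby
days = %w(monday tuesday wednesday thursday friday saturday sunday)

# The search stops at the first element for which the block returns true.
first_long = days.detect { |day| day.length > 6 }        # "tuesday"
first_s    = days.find { |day| day.start_with?('s') }    # "saturday"

# When no element matches, the result is nil.
no_match   = days.find { |day| day.start_with?('x') }    # nil
```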
all?
all? returns true if the block returns true for all elements in the collection:
>> %w(monday tuesday wednesday thursday friday saturday sunday).all? do |day|
     day =~ /y\Z/
   end
=> true
There are a few more iterators in Enumerable, but the list above shows the ones used in the book.
Methods
In Ruby, what you might think of as functions or subroutines are called methods. They are defined like this:
def method_name(arguments)
  ... body of method goes here ...
end
The body of the method can be anything you want. A method always returns a value, even if it’s just nil. If you don’t use the return keyword, the method returns the value of the last expression it evaluated.
def double(n)
  n * 2
end
double(2) # 4
double('two') # 'twotwo'
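A method always belongs to a class. If you define it outside of a class, it is added as a private instance method of the Object class. Since almost every object inherits from Object, you can then call it from anywhere without naming a receiver:
def test
  puts "hi, I'm a test method"
end
test
# hi, I'm a test method
Object.new.send(:test) # send bypasses the private check
# hi, I'm a test method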
Arguments are passed to methods as a comma delimited list of values. There are a bunch of different exceptions to this, but two that you’ll see throughout the book are default values for parameters and parameter hashes.
Default values for arguments are given like this:
def day_of_week(date, capitalize = true, short_form = false)
  ...
end
In this case, the capitalize argument has a default value of true and short_form has a default value of false. If you only pass one argument when you call day_of_week, then capitalize will be set to true and short_form to false.
There are a few rules regarding this. First, since arguments with default values are not required to be sent, all arguments with default values must be after all the arguments without default values.
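Second, if you want to set the second default argument to a non-default value, then you have to set the first default argument’s value as well. With positional arguments like these, you can’t get around this by giving the argument names like you can in, for example, Python. (Ruby 2.0 and later add keyword arguments for that purpose; the parameter hash described next is the older idiom for the same need.)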
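The second rule is easier to see with a working method. The body below is a hypothetical implementation of day_of_week, written only to show how the defaults are filled in; it relies on Date#strftime from the standard date library.

```ruby
require 'date'

# Hypothetical body for day_of_week; only the parameter list comes from the text.
def day_of_week(date, capitalize = true, short_form = false)
  name = date.strftime(short_form ? '%a' : '%A')   # 'Fri' or 'Friday'
  capitalize ? name : name.downcase
end

some_date = Date.new(2010, 2, 12)     # a Friday

day_of_week(some_date)                # "Friday"
day_of_week(some_date, false)         # "friday"
# To reach short_form you must also spell out capitalize:
day_of_week(some_date, true, true)    # "Fri"
```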
If the last argument to a method is a hash, then it can be passed in without the curly braces. This is often used to mimic the ability to pass in parameters without worrying about ordering:
def day_of_week(date, params = {})
  # fetch falls back to the default only when the key is missing,
  # so an explicit false passed by the caller is kept
  capitalize = params.fetch(:capitalize, true)
  short_form = params.fetch(:short_form, false)
  ....
end

day_of_week(some_date, :short_form => true) # capitalize = true, short_form = true
day_of_week(some_date) # capitalize = true, short_form = false
day_of_week(some_date, :capitalize => false) # capitalize = false, short_form = false
Another, more scalable, way of setting the default values is:
def day_of_week(date, params = {})
  defaults = {:capitalize => true, :short_form => false}
  params = defaults.merge(params)
  ....
end
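To see what merge does with the defaults, here is a sketch that isolates just that step. day_of_week_options is a hypothetical helper, not something the recipes define; it returns the resolved options so the effect is easy to inspect.

```ruby
# Hypothetical helper that only resolves the options, so the merge is easy to see.
def day_of_week_options(params = {})
  defaults = {:capitalize => true, :short_form => false}
  defaults.merge(params)   # values in params win; missing keys keep their default
end

no_options = day_of_week_options
# {:capitalize => true, :short_form => false}

lower_case = day_of_week_options(:capitalize => false)
# {:capitalize => false, :short_form => false}

short = day_of_week_options(:short_form => true)
# {:capitalize => true, :short_form => true}
```

Because merge copies whatever the caller passed, an explicit false survives, and adding a new option later only means adding one entry to defaults.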
Classes
A class in Ruby is defined like this:
class Bucket

  .... contents of class ....

end
If you want to sub-class another class, then:
class Bucket < S3SuperClass

  .... contents of class ....

end
This is read as “Bucket extends S3SuperClass”.
Class names must start with a capital letter. By convention, they are camel-case (CamelCaseMeansWordsBeginWithCapitalLetters).
To create a new instance of a class, you use ClassName.new. This creates the new instance, calls its initialize method with the arguments you passed to new, and returns the instance.
class Bucket

  def initialize(arg1, arg2, arg3, ...)
    .... some initialization code ....
  end

end

# Create a new bucket
b = Bucket.new(arg1, arg2, arg3, ...)
Classes have both class methods and instance methods. An instance method acts on an instance of a class. A class method, called a static method in some languages, doesn’t require an instance to act on. Instance methods are defined like
def some_instance_method(arg1, arg2, arg3, ...)
  ... contents of method ...
end
Class methods are defined as either
def self.class_method_name
  ... contents of method ...
end
or
def ClassName.class_method_name
  ... contents of method ...
end
and called as ClassName.class_method_name.
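Putting the two kinds of method side by side may help. The Bucket class below is a hypothetical stand-in written for this example (its methods are not taken from any library): valid_name? is a class method, and describe is an instance method.

```ruby
# Hypothetical Bucket class; the method names are for illustration only.
class Bucket

  def initialize(name)
    @name = name
  end

  # Class method: called on Bucket itself, no instance needed.
  def self.valid_name?(name)
    name.length >= 3
  end

  # Instance method: acts on one particular bucket.
  def describe
    "bucket named #{@name}"
  end

end

too_short = Bucket.valid_name?('ab')               # false
description = Bucket.new('photos').describe        # "bucket named photos"
```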
Variables and Scope
In the interest of brevity and readability, I’m going to be a bit sloppy here. The aim, again, is to give you enough information to read the recipes, not to give a thorough introduction to variables and scope in Ruby.
In Ruby, there are five kinds of variables: global variables, class variables, instance variables, local variables and constants. They are differentiated by their names, as shown below in Table A.1, “Ruby Variable Types”.
Table A.1. Ruby Variable Types
| Type | Naming convention | Examples | Scope | Notes |
|---|---|---|---|---|
| Global | Start with a dollar sign ($) | $global, $pi | Everywhere | Global variables are rarely used. I don’t think there is one in the book. |
| Class | Start with two at signs (@@) | @@class_variable, @@current_number_of_users | Within their class | Class variables are rarely used as well, but are more common than global variables. |
| Instance | Start with a single at sign (@) | @instance_variable, @price | Within a single instance of a class | Instance variables are used widely. They are in scope throughout a single instance. See the discussion below on creating getters and setters for instance variables. |
| Local | Start with a lowercase letter or underscore (_) | some_variable, n | Within the block they are defined in, and that block’s children | Easily the most common variable type. |
| Constants | Start with a capital letter | SOME_CONSTANT, PI | Everywhere | By convention, constants are all caps with underscores separating words. If you’re being pedantic, constants aren’t strictly variables: they will raise a warning if you try to change their value. |
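The following sketch uses all five kinds in one place. The Visitor class and every name in it are made up for this example; the point is only how the prefix determines where each variable can be seen.

```ruby
$request_count = 0            # global: visible everywhere

class Visitor
  MAX_VISITORS = 100          # constant
  @@count = 0                 # class variable: shared by every Visitor

  def initialize(name)
    @name = name              # instance variable: one per Visitor
    @@count += 1
    $request_count += 1
  end

  def greeting
    prefix = 'Hello'          # local variable: gone when the method returns
    "#{prefix}, #{@name}"
  end

  def self.count
    @@count
  end
end

first = Visitor.new('Ann')
second = Visitor.new('Bob')

first.greeting        # "Hello, Ann"
Visitor.count         # 2
```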
Getters and Setters for Instance Variables
Instance variables can’t be read or set directly from outside their instance. In many classes, you will want to create getter and setter methods for instance variables. You might be tempted to write something like this:
Example A.3. dog.rb
class Dog

  def initialize(name, age, breed)
    @name = name
    @age = age
    @breed = breed
  end

  def name
    @name
  end

  def name=(new_name)
    @name = new_name
  end

  def age
    @age
  end

  def age=(new_age)
    @age = new_age
  end

  def breed
    @breed
  end

  def breed=(new_breed)
    @breed = new_breed
  end

end
This will work. You can use it like this:
$> irb -r dog
>> d = Dog.new('frank', 13, 'Heinz 57')
=> #<Dog:0x35a110 @name="frank", @breed="Heinz 57", @age=13>
>> d.breed
=> "Heinz 57"
>> d.name
=> "frank"
>> d.name = "Frank the Great, Esquire"
=> "Frank the Great, Esquire"
>> d.name
=> "Frank the Great, Esquire"
Creating getters and setters for instance variables is such a common task, however, that there’s a much easier way to do it. The attr_accessor macro will create a getter and setter for each symbol listed after it. The following code is equivalent to the more verbose version of dog.rb listed above:
Example A.4. dog.rb
class Dog

  attr_accessor :name, :age, :breed

  def initialize(name, age, breed)
    @name = name
    @age = age
    @breed = breed
  end

end
If you only want to create a getter method, use the attr_reader macro. If you want only the setter method, use attr_writer.
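Here is a minimal sketch of the two one-sided macros. The Account class and its attributes are hypothetical; respond_to? is used to confirm which methods were and weren't created.

```ruby
# Hypothetical class showing a read-only and a write-only attribute.
class Account
  attr_reader :owner        # creates owner, but not owner=
  attr_writer :password     # creates password=, but not password

  def initialize(owner)
    @owner = owner
  end

  def password_set?
    !@password.nil?
  end
end

account = Account.new('frank')
account.owner                      # "frank"
account.password = 'secret'
account.password_set?              # true
account.respond_to?(:owner=)       # false: no setter was created
account.respond_to?(:password)     # false: no getter was created
```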