The refactoring phase

In earlier chapters we mentioned the laws of TDD. Originally these laws were two, in Kent Beck’s formulation:

  • Don’t write a line of new code unless you first have a failing automated test.
  • Eliminate duplication.

Essentially, what Kent Beck proposed, was to first define a small part of the specification through a test, implement a very small algorithm that satisfies it, and then, revise the code in search of duplication cases to refactor into a more general and flexible algorithm.

And this is, more or less, the way Martin Fowler defined the Red-Green-Refactor cycle:

  • Write a test for the next piece of functionality that you wish to add.
  • Write the production code necessary to make the test pass.
  • Refactor the code, both the new and the old, so that all’s well structured.

This statement seems to assume that the refactoring is, so to speak, the end of each stage of the process. But, paradoxically, if we interpret the cycle literally, we’ll fall into a bad practice.

The function of refactoring in TDD

In general, in Test Driven Development it’s favored that both tests and changes in production code are as small as possible. This minimalist approach is beneficial because it allows us to work with a light cognitive load in each cycle, while we learn and reach a more extensive and deeper comprehension of the problem, postponing decisions to a moment at which we’re knowledgeable enough to face them.

Usually, our small TDD steps let us make very simple code changes every time. Many times these changes are obvious and lead us to implementation that we could consider naive. However, as simple or rough they might seem, these implementations do pass the tests, and therefore meet the specifications. We could ship this code if we needed to, because the behavior has been developed.

And once we make the last test pass and all of them are green, we’re in good condition to refactor. This green state gives us freedom to change the shape of the implementation, assuring that we’re not accidentally changing the achieved functionality.

The refactoring phase is there, precisely, to evolve those naive implementations and turn them into better designs, making use of the safety net provided by the passing tests.

Which refactorings to do

In each cycle there are many possible refactorings. Obviously, during the first phases they will be smaller, and we might even think that they’re unnecessary. However, it’s wise to take the opportunity when it presents itself.

We can perform many types of refactorings, such as:

  • Replace magic numbers with constants.
  • Change variable and parameter names to better reflect their intentions.
  • Extract private methods.
  • Extract conditionals to methods when they become complex.
  • Flatten nested conditional structures.
  • Extract conditional branches to private methods.
  • Extract functionality to collaborators.

Refactoring limits

Sometimes, an excess of refactoring can lead us to an implementation that’s too complicated and prevents us from advancing the TDD process. This happens when we introduce patterns prematurely without having finished the development first. It would be a premature refactoring similar to the premature optimization, generating code that’s hard to maintain.

We could say that there are two kinds of refactoring involved:

  • One kind with limited reach, applicable in each red-green-refactor cycle, whose function is to make the algorithm more legible, sustainable, and capable to evolve.
  • The other kind, which will take place once we’ve completed all of the functionality, and whose objective is to introduce a more evolved and pattern-oriented design.

Another interesting question in the introduction of language-exclusive features, which in principal we’d also like to leave until that final phase. Why leave them for that moment? Precisely, because they can limit our capability to refactor a code if we’re not yet sure about towards where it could evolve.

For example, this construction in Ruby:

1 def greet(name = nil)
2   if name.nil?
3     name = 'my friend'
4   end
5 
6   "Hello, #{name}"
7 end

It could be refactored -in fact it’s recommended to do so- in this way. I think it’s really beautiful:

1 def greet(name = nil)
2   name = 'my friend' if name.nil?
3 
4   "Hello, #{name}"
5 end

In this case, the structure represents the idea of assigning a default value to the variable, something that we could also achieve in this way, which is common in other languages:

1 def greet(name = 'my friend')
2   "Hello, #{name}"
3 end

The three variations make the tests pass, but each of them puts us in a slightly different spot regarding future requirements.

Por example, let’s assume that our next requirement is to be able to introduce several names. One possible solution would be to use splat parameters, that is, let the function admit an undefined number of parameters that will later be presented in the method as an array. In Ruby this is expressed like this:

1 def greet(*name)
2   #
3 end

This declaration, for example, is incompatible with the third variant, as the splat operator doesn’t admit a default value and we’d have to re-implement that step, which would lead us back to using one of the other variants.

In principle this doesn’t seem like that big of an inconvenience, but it means undoing all of the logic determined by that structure, and depending on the development stage that we’re in, it can even lead us to dead ends.

The other options are a little less inconvenient. Apart from changing the signature, the only thing we have to change is the question (nil? by empty?) and the default value, which instead of a string, becomes an array of strings. Of course, to finish we have to join the collection in order to show it in the greeting.

1 def greet(*name)
2   if name.empty?
3     name = ['my friend']
4   end
5 
6   names = name.join(', ')
7 
8   "Hello, #{names}"
9 end

Or the rubified version:

1 def greet(*name)
2   name = ['my friend'] if name.empty?
3 
4   names = name.join(', ')
5 
6   "Hello, #{names}"
7 end

Apart from that, at this point it would be necessary to refactor the name of the parameter so it more clearly reflects it new meaning:

1 def greet(*people)
2   people = ['my friend'] if people.empty?
3 
4   names = people.join(', ')
5 
6   "Hello, #{names}"
7 end

So, as a general recommendation, it’s convenient to seek equilibrium between the refactors that help us maintain the code clean and legible, and those that we could consider over-engineering. An implementation that is not the most refined could be easier to change in the long run as more and more tests get introduced than a very evolved one.

Don’t over-refactor ahead of time.

When is the right moment to refactor

To refactor, the sine qua non condition is that all of the existing tests are passing. At this moment we’re interested in analyzing the state of the implementation and applying the most appropriate refactorings.

If a test is red, it’s telling us that a part of the specification hasn’t been achieved yet, and therefore, we should be working on doing that instead of refactoring.

But there’s a special case: when we add a new failing test and we realize that we need to do some previous refactoring to be able to implement the most obvious or simple solution.

What do we do in this case? Well, we have to take a step back.

The step back in the Red-Green-Refactor cycle

Let’s assume a simple example. we’re going to start the Test double greeting kata. We begin with a test with which to define the interface:

1 require 'rspec'
2 
3 RSpec.describe 'A simple greeting' do
4   it 'should greet a person' do
5     expect(greet('Fran')).to eq('Hello, Fran')
6   end
7 end

Our next step is to create the simplest implementation that passes the test, which we could achieve like this:

1 def greet(name)
2   'Hello, Fran'
3 end

The next requirement is to handle the situation where a name is not provided, in which case it should offer some sort of anonymous formula such as the one we use as an example in this test:

 1 require 'rspec'
 2 
 3 def greet(name)
 4   'Hello, Fran'
 5 end
 6 
 7 RSpec.describe 'A simple greeting' do
 8   it 'should greet a person' do
 9     expect(greet('Fran')).to eq('Hello, Fran')
10   end
11 
12   it 'should greet even if no name provided' do
13     expect(greet).to eq('Hello, my friend')
14   end
15 end

In the first place, the test fails because the argument isn’t optional. But on top of that, it’s not even used in the current implementation, and we need to use it to fulfill this test’s most obvious requirement. We have to execute several preparatory steps before we’re able to carry out the implementation:

  • Make the name parameter optional
  • Use the parameter in the return value

The thing is that the new requirement provides us with new information that would be useful to refactor what’s already been developed with the first test. However, as we have a failing test, we shouldn’t be doing any refactoring. For this reason we delete or cancel the previous test, for example, by commenting it out:

 1 require 'rspec'
 2 
 3 def greet(name)
 4   'Hello, Fran'
 5 end
 6 
 7 RSpec.describe 'A simple greeting' do
 8   it 'should greet a person' do
 9     expect(greet('Fran')).to eq('Hello, Fran')
10   end
11   
12   # it 'should greet even if no name provided' do
13   #   expect(greet).to eq('Hello, my friend')
14   # end
15 end

By doing this, we’re back to having all tests in green and we can apply the necessary changes, which don’t alter the behavior that’s been implemented so far.

We make the name parameter optional.

1 def greet(name = nil)
2   'Hello, Fran'
3 end

And here, we start using the parameter:

1 def greet(name)
2   "Hello, #{name}"
3 end

This has allowed us to advance from our first rough implementation to another that’s flexible enough to still pass the first test, while setting up some better conditions to reintroduce the next one:

 1 require 'rspec'
 2 
 3 def greet(name = nil)
 4   "Hello, #{name}"
 5 end
 6 
 7 RSpec.describe 'A simple greeting' do
 8   it 'should greet a person' do
 9     expect(greet('Fran')).to eq('Hello, Fran')
10   end
11 
12   it 'should greet even if no name provided' do
13     expect(greet).to eq('Hello, my friend')
14   end
15 end

Obviously the test fails, but this time the reason for failure is, precisely, that we’re missing the code that solves the requirement. Now, the only thing we have to do is check if we’re receiving a name or not, and act consequently.

1 def greet(name = nil)
2   if name.nil?
3     name = 'my friend'
4   end
5   
6   "Hello, #{name}"
7 end

In a sense, it turns out that the information from the future, that is, the new test that we design to introduce the next piece of functionality affects the past, that is, the adequate state of the code that we need to continue. This forces us to consider the depth of the necessary refactor before facing the new cycle. In this situation, it’s best to go back to the last passing test, cancelling the new on, and work on the refactoring until we’re better prepared to keep going.