What is TDD and why should I care about it?

Test Driven Development is a software development methodology in which tests are written in order to guide the structure of production code.

The tests specify, in a formal, executable, and example-based manner, the behaviors that the software we’re working on should have. They define small objectives that, once achieved, allow us to build the software in a progressive, safe, and structured way.

Although we’re talking about tests, we’re not referring to Quality Assurance (from now on: QA), even though working with the TDD methodology has the side effect of producing a valid unit test suite with the maximum possible coverage. In fact, some of the tests created during TDD are typically unnecessary for a comprehensive battery of regression tests, and therefore end up being removed as new tests make them redundant.

That is to say: both TDD and QA are based on the use of tests as tools, but this use differs in several respects. Specifically, in TDD:

  • Tests are written before the software that they execute even exists.
  • The tests are very small and their objective is to force writing the minimum amount of production code needed to pass the test, which has the effect of implementing the behavior defined by the test.
  • The tests guide the development of the code, and the process contributes to the design of the system.

In TDD, the tests are defined as executable specifications of the behavior of a given unit of software, while in QA, tests are tools for verifying that same behavior.
Put in simpler words:

  • When we do QA, we try to verify that the software that we’ve written behaves according to the defined requirements.
  • When we do TDD, we write software to fulfill the defined requirements, one by one, so that we end up with a product that complies with them.

The Test Driven Development methodology

Although we will expand on this topic in depth throughout the book, we will briefly present the essentials of the methodology.

In TDD, tests are written in a way that we could think of as a dialogue with production code. This dialogue, the rules that regulate it, and the cycles that are generated by this way of interacting with code will be practiced in the first kata of the book: FizzBuzz.

Basically, it consists in:

  • Writing a failing test
  • Writing code that passes the test
  • Improving the code’s (and the test’s) structure

Writing a failing test

Once we are clear about the piece of software we’re going to work on and the functionality that we want to implement, the first thing to do is to define a very small first test, which will fail without remedy because the file containing the production code it needs to run doesn’t even exist. Although this is something we’ll deal with in all of the katas, in the NIF kata we will delve into strategies that help us decide on the first tests.

Here’s an example in Go:

// roman/roman_test.go
package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

Although we can predict that the test won’t even compile or be interpreted, we’ll try to run it nonetheless. In TDD it’s fundamental to see the test fail; assuming that it will fail isn’t enough. Our job is to make the test fail for the right reason, and then make it pass by writing production code.

# tddbook-go/roman [tddbook-go/roman.test]
./roman_test.go:6:11: undefined: decToRoman

Compilation finished with exit code 2

The error message will tell us what to do next. Our short-term goal is to make that error message disappear, as well as any that come after it, one by one.

package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

func decToRoman(decimal int) string {

}

For instance, after introducing the decToRoman function, the error changes: now it tells us that the function should return a value:

# tddbook-go/roman [tddbook-go/roman.test]
./roman_test.go:16:1: missing return at end of function

Compilation finished with exit code 2

It could even happen that we get an unexpected message, such as finding that we’ve tried to load the Book class only to discover that we had mistakenly created a file named brok. That’s why it’s so important to run the test and see whether it fails, and exactly how it fails.

package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

func decToroman(decimal int) string {

}

This code results in the following message:

# tddbook-go/roman [tddbook-go/roman.test]
./roman_test.go:6:11: undefined: decToRoman
./roman_test.go:16:1: missing return at end of function

Compilation finished with exit code 2

This error tells us that we have misspelled the name of the function, so we start by correcting it:

package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

func decToRoman(decimal int) string {

}

And we can continue. Since the test states that it expects the function to return “I” when we pass it 1 as input, the failing test should tell us that the actual result doesn’t match the expected one. However, at the moment, the test is telling us that the function doesn’t return anything. It’s still a compilation error, and still not the correct reason to fail.

# tddbook-go/roman [tddbook-go/roman.test]
./roman_test.go:16:1: missing return at end of function

Compilation finished with exit code 2

To make the test fail for the reason that we expect it to, we have to make the function return a string, even if it’s an empty one.

package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

func decToRoman(decimal int) string {
	return ""
}

So, this change turns the error into one related to the test definition, as the test isn’t obtaining the result that it expects. This is the correct reason for failure: the one that will force us to write the production code that makes the test pass.

=== RUN   TestRomanNumeralsConversion
--- FAIL: TestRomanNumeralsConversion (0.00s)
    roman_test.go:9: Decimal 1 should convert to I, but found 
FAIL

Process finished with exit code 1

And so we would be ready to take the next step:

Writing code that passes the test

As a response to the previous result, we write the production code that is needed for the test to pass, but nothing else. Continuing with our example:

package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

func decToRoman(decimal int) string {
	return "I"
}

Next, we create the file that’ll contain the unit under test. We could even rerun the test now, which would probably cause the compiler or the interpreter to throw a different error message. At this point everything depends a bit on circumstances, such as the conventions of the language we’re using, the IDE we’re working with, and so on.

In any case, it’s a matter of taking small steps until the compiler or interpreter is satisfied and can run the test. In principle, the test should run and fail indicating that the result received from the unit of software doesn’t match the expected one.

At this point there’s a caveat: depending on the language, the framework, and certain testing practices, the concrete way of writing this first test may vary. For example, some test frameworks only require that the test doesn’t throw any errors or exceptions in order to pass, so a test that simply instantiates an object or invokes one of its methods is enough. In other cases, the test must include an assertion, and if none is made it’s considered not to pass.

In any case, this phase’s objective is making the test run successfully.

With the Prime Factors kata we’ll study the way in which production code can change to implement new functionality.

Improve the structure of the code (and tests)

When every test passes, we should examine the work done so far and check whether it’s possible to refactor both the production and test code. Here we apply the usual principles: if we detect any smell, difficulty in understanding what’s happening, knowledge duplication, etc., we must refactor the code to make it better before continuing.

Ultimately, the questions at this point are:

  • Is there a better way to organize the code that I’ve just written?
  • Is there a better way to express what this code does and make it easier to understand?
  • Can I find any regularity and make the algorithm more general?

For this reason, we should keep every test that we’ve written and made pass. If any of them turns red, we’d have a regression on our hands and we’d have spoiled, so to speak, the already-implemented functionality.

It’s usual not to find many refactoring opportunities after the first cycle, but don’t get comfortable just yet: there’s always another way of seeing and doing things.

As a general rule, the earlier you spot opportunities to reorganize and clean up your code, and act on them, the easier development will be.

For instance, we’ve created the function under test in the same file as the test.

package roman

import "testing"

func TestRomanNumeralsConversion(t *testing.T) {
	roman := decToRoman(1)

	if roman != "I" {
		t.Errorf(
			"Decimal %d should convert to %s, but found %s",
			1,
			"I",
			roman,
		)
	}
}

func decToRoman(decimal int) string {
	return "I"
}

It turns out there’s a better way to organize this code: moving the function to a new file of its own. In fact, this is a recommended practice in almost every programming language. However, we may have skipped it at first.

// roman/roman.go

package roman

func decToRoman(decimal int) string {
	return "I"
}

And, in the case of Go, we can turn it into an exported function by capitalizing its name.

package roman

func DecToRoman(decimal int) string {
	return "I"
}

To delve further into everything that has to do with refactoring while working in TDD, we’ll have the Bowling Game kata.

Repeat the cycle until finishing

Once the production code passes the test and is as well organized as it can be at this stage, it’s time to choose another aspect of the functionality and create a new failing test to describe it.

This new test fails because the existing code doesn’t cover the desired functionality and introducing a change is necessary. Therefore, our mission now is to turn this new test green by making the necessary transformations in the code, which will be small if we’ve been able to size our previous tests properly.

After making this new test pass, we search for refactoring opportunities to achieve a better code design. As we advance in the development of the piece of software, we’ll see that the possible refactorings become more and more significant.

In the first cycles we’ll begin with name changes, constant and variable extraction, etc. Then we’ll move on to introducing private methods or extracting certain aspects as functions. At some point we’ll discover the need to extract functionality into helper classes, etc.

When we’re satisfied with the code’s state, we keep repeating the loop as long as we have remaining functionality to add.

When does development end in TDD?

The obvious answer to this question could be: when all the functionality is implemented.

But, how do we know this?

Kent Beck suggested making a list of all of the aspects that would have to be fulfilled to consider the functionality as complete. Every time any one of them is attained it’s crossed off the list. Sometimes, while advancing in the development, we realize that we need to add, remove or change some elements in the list. It’s good advice.

There is a more formal way of making sure that a piece of functionality is complete. Basically, it consists in not being able to create a new failing test. Indeed, if an algorithm is implemented completely, it will be impossible to create a new test that can fail.

What is not Test Driven Development

The result or outcome of Test Driven Development is not creating flawless software, free of any defect, although many defects are prevented; nor is it generating a suite of unit tests, although in practice one is indeed obtained, with coverage that can even reach 100% (with the tradeoff that it may contain redundancy). None of these are TDD’s objectives; they’re simply beneficial side effects.

TDD is not Quality Assurance

Even though we use the same tools (tests), we use them for different purposes. In TDD, testing guides development, setting concrete objectives that are reached by adding or changing code. The result of TDD is a suite of tests that can be used in QA as regression tests, although it’s common to have to retouch those tests in one way or another: in some cases to delete redundant tests, and in others to ensure that all cases are well covered.

In any case, TDD helps enormously in the QA process because it prevents many of the most common flaws and contributes to building well structured and loosely coupled code, aspects that increase software reliability, our ability to intervene in case of errors, and even the possibility of creating new tests in the future.

TDD doesn’t replace design

TDD is a tool to aid in software design, but it doesn’t replace it.

When we develop small units with some very well defined functionality, TDD helps us establish the algorithm design thanks to the safety net provided by our tests.

But when considering a larger unit, a prior analysis that leads us to a “sketch” of the main elements of the solution gives us a framework for development.

The outside-in approach tries to integrate the design process within the development one, using what Sandro Mancuso calls Just-in-time design: we start from a general idea of how the system will be structured and how it will work, and we design within the context of the iteration that we’re in.

How TDD helps us

What TDD provides us is a tool that:

  • Guides software development in a systematic and progressive way.
  • Allows us to make verifiable claims about whether the required functionality has been implemented or not.
  • Helps us avoid the need to design all of the implementation details in advance, since it’s a tool that contributes to the design of the software component itself.
  • Allows us to postpone decisions at various levels.
  • Allows us to focus on very concrete problems, advancing in small steps that are easy to reverse if we introduce errors.

Benefits

Several studies suggest that applying TDD benefits development teams. The evidence is not conclusive, but research tends to agree that with TDD:

  • More tests are written
  • The software has fewer flaws
  • Productivity is not diminished; it can even increase

It’s quite difficult to quantify the advantages of using TDD in terms of productivity or speed, but subjectively, many benefits can be experienced.

One of them is that the TDD methodology can lower the cognitive load of development. This is because it favors dividing the problem into small tasks with a very defined focus, which helps us conserve the limited capacity of our working memory.

Anecdotal evidence suggests that developers and teams introducing TDD reduce defects, diminish time spent on bugs, increase deployment confidence, and productivity is not adversely affected.

References