Motivators
There are many ways to succeed while writing tests; however, let’s start with an example of the more common path: failure.
Let’s imagine you read Unit Testing Tips: Write Maintainable Unit Tests That Will Save You Time And Tears and decide that Roy Osherove has shown you the light. You’re going to write all your tests with Roy’s suggestions in mind. You get the entire team to read Roy’s article and everyone adopts the patterns.
Things are going well until you start accidentally breaking tests that someone else wrote and you can’t figure out why. It turns out that an object created in the Setup method is causing unexpected failures due to a side-effect of your ‘minor’ change. You’re frustrated, having been burned by Setup, and you remember the blog entry by Jim Newkirk where he discussed Why you should not use SetUp and TearDown in NUnit. Now you’re stuck with a Setup-heavy test suite and a growing suspicion that you’ve gone down the wrong path.
You do more research on Setup and stumble upon Inline Setup. You can completely relate, and you go on a mission to switch all the tests to xUnit.net, which removes the concept of Setup entirely.
Everything looks good initially, but then a few constructors start needing more dependencies. Because you moved object creation out of Setup and into each individual test, every test that creates one of those objects now needs to be updated; it becomes painful every time you add an argument to a constructor. You’re once again left feeling burned by following “expert” advice.
The root problem: you never asked yourself ‘why?’ Why are you writing tests in the first place? What motivated you to adopt each testing practice you’ve chosen?
You won’t write better software by blindly following advice. This is especially true given that much of the advice around testing is inconsistent or outright conflicting. As I write this chapter, there’s a Twitter discussion underway among Martin Fowler, Michael Feathers, Bob Martin, Prag Dave, and David Heinemeier Hansson (all well-respected and successful software engineers) in which their opinions on how to test effectively drastically conflict. If there’s a universally right way, we haven’t found it yet.
It’s worth noting that the articles from Roy and Jim are quite old. Roy has since changed his mind on Setup (his current opinions can be found at artofunittesting.com), and I’m sure Jim has updated his opinions as well. The point of the example is to show how easy it is to blindly follow advice that sounds good, not to tie a good or bad idea to an individual.
Back to our painful journey above: your intentions were good. You wanted to write better software, so you followed some reasonable advice. Unfortunately, the advice you chose to follow left you with more pain than happiness. Your tests aren’t providing enough value to justify the effort, and if you keep going down this path you’ll inevitably conclude that testing is stupid and should be abandoned.
If you’ve traveled the path above, or if you aren’t regularly writing unit tests, you may find yourself wondering why other developers find them so essential. Ultimately, I believe the answer boils down to selecting testing patterns based on what’s motivating you to test.
The remainder of this chapter will focus on defining testing motivators. The list that follows is presented unordered and includes both helpful and harmful motivators; neither inclusion nor position in the list reflects the value of a motivator.
Validate the System
Common motivators that would be a subset of Validate the System:
- Immediate Feedback That Things Work as Expected
- Prevent Future Regressions
Static languages like Java provide a compiler that protects you from a certain class of errors; however, unit tests often prove their value when you need to verify not the type of a result, but the value of the result. For example, if your shopping cart cannot correctly calculate the total of each contained item, it won’t really matter that the result is an Integer.
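To make that concrete, here’s a minimal sketch of such a value-level test in JUnit. The Cart and Item classes are invented for the example, with just enough implementation to make the test runnable.

```java
import static org.junit.Assert.assertEquals;

import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

import org.junit.Test;

public class CartTest {

    // The compiler guarantees total() returns a BigDecimal; only a
    // test can verify that it returns the correct BigDecimal.
    @Test
    public void totalIsTheSumOfEachContainedItem() {
        Cart cart = new Cart();
        cart.add(new Item("milk", new BigDecimal("3.50")));
        cart.add(new Item("bread", new BigDecimal("2.25")));
        assertEquals(new BigDecimal("5.75"), cart.total());
    }
}

// Minimal supporting types, invented for the example.
class Item {
    private final BigDecimal price;

    Item(String name, BigDecimal price) {
        this.price = price;
    }

    BigDecimal price() {
        return price;
    }
}

class Cart {
    private final List<Item> items = new ArrayList<Item>();

    void add(Item item) {
        items.add(item);
    }

    BigDecimal total() {
        BigDecimal total = BigDecimal.ZERO;
        for (Item item : items) {
            total = total.add(item.price());
        }
        return total;
    }
}
```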
For this reason, every codebase would benefit from, if nothing else, wrapping a few unit tests around the features of the system that, if broken, would render the system unusable.
Theoretically, you could write a test to validate every feature of your system; however, I believe you would quickly find this task substantial and not necessarily worth your time - certain features of your system will be more important than others.
There’s a common term in finance: ROI.
Return on investment (ROI) is the concept of an investment of some resource yielding a benefit to the investor. A high ROI means the investment gains compare favorably to investment cost.
When I’m motivated to write a test to validate the system, I like to look at the test from an ROI point of view. My favorite example for demonstrating how I choose based on ROI is the following:
Given a system where customers are looked up exclusively by Social Security Number:
- I would unit test that a Social Security Number is valid at account creation time.
- I would not unit test that a user’s name is alphanumeric.
Losing a new account because of an invalid Social Security Number could be rather harmful to a business; however, storing an incorrect name for a limited amount of time should have no impact on successful use of the system.
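Here’s a minimal sketch of what the high-ROI test might look like. AccountService, InvalidSsnException, and the validation rule are all hypothetical, invented only to illustrate the ROI trade-off.

```java
import static org.junit.Assert.fail;

import org.junit.Test;

public class AccountCreationTest {

    // High ROI: losing a new account over a bad SSN hurts the
    // business, so this behavior is worth locking down.
    @Test
    public void accountCreationRejectsAnInvalidSocialSecurityNumber() {
        AccountService service = new AccountService();
        try {
            service.createAccount("not-a-valid-ssn");
            fail("expected account creation to reject an invalid SSN");
        } catch (InvalidSsnException expected) {
            // expected: no account should be created
        }
    }

    // Deliberately absent: a test asserting that the user's name is
    // alphanumeric. A temporarily incorrect name doesn't threaten the
    // business, so that test's ROI would be negative.
}

// Minimal hypothetical domain code, just enough to run the test.
class InvalidSsnException extends RuntimeException {
}

class AccountService {
    void createAccount(String ssn) {
        // A real implementation would validate far more carefully.
        if (!ssn.matches("\\d{3}-\\d{2}-\\d{4}")) {
            throw new InvalidSsnException();
        }
    }
}
```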
As long as everyone on the team understands the ROI of the various features, you can trust each team member to make the right call on when and when not to test. If your team cannot reasonably grant that responsibility and power to each team member, it will likely make sense either to pair program or to err on the side of over-testing and evaluate the ROI of each test during code review.
Tests written to validate the system are often used both to verify that the system currently works as expected and to prevent future regressions.
Code Coverage
Automated code coverage metrics can be a wonderful thing when used correctly. Upon joining a project, I often use code coverage to get a feel for the level of quality I can expect from the application code. A low coverage percentage suggests a probable lack of quality - though I would consider it more of a hint than a guarantee. A high coverage percentage makes me feel better about the likelihood of finding a well-designed codebase, but that’s also more of a hint than a guarantee.
I expect a high level of coverage. Sometimes managers require one. There’s a subtle difference. –Brian Marick
I tend to agree with Martin Fowler’s view on the subject: “If you are testing thoughtfully and well, I would expect a coverage percentage in the upper 80s or 90s. I would be suspicious of anything like 100% - it would smell of someone writing tests to make the coverage numbers happy, but not thinking about what they are doing.”
Once upon a time, a consultancy went as far as putting “100% code coverage” in their contracts. It was a noble goal; unfortunately, a few years later the same consultancy was presenting on the topic of How to fail with 100% test coverage. There are various problems with pushing towards 100%:
- You’ll have to test language features.
- You’ll have to test framework features.
- You’ll have to maintain a lot of tests with negative ROI.
- Et cetera.
I find that code coverage metrics over time may signal an interesting change that you may have otherwise missed, but as an absolute number, it’s not very useful. –John Hume
My favorite “100% code coverage” story involves a team that added a bunch of tests to get to 100%… but didn’t add any assertions. Code coverage and verification are not the same thing. –Kent Spillner
I suspect most projects will suffer from the opposite problem: not enough coverage. The good news is that it’s quite simple to run a coverage tool and determine which pieces of code are untested.
I’ve had success using EMMA and Clover, and John Hume recently pointed me to Cobertura. Code coverage tools are easy to work with; there’s no reason you couldn’t try a few and decide which you prefer.
Again, code coverage tools are great. I personally strive for around 80% coverage. If you’re pushing above 80%, it would not surprise me to find tests that have code coverage as their lone motivator.
Enable Refactoring
Getting test coverage on an untested codebase is always painful; however, it’s also essential if you’re planning to make any changes within the codebase. With the proper tests in place, you should be able to rewrite the internals of a codebase without breaking any of the existing contracts.
In addition to helping you prevent regressions, creating tests can also give you direction on where the application can be logically broken up. While writing tests for a codebase, you should keep track of dependencies that need to be instantiated, mocked, or stubbed but have nothing to do with the functionality you’re currently focusing on. In general, these are the pieces that should be extracted into components that are easily stubbed (ideally in one line, or zero).
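As a sketch of what “easily stubbed” can look like: if a tangential dependency sits behind a small interface and is passed in at construction time, a test can replace it in a single line. The AuditLog and OrderProcessor names are invented for the example, and the one-line stub assumes Java 8 lambdas.

```java
// A tangential dependency extracted behind a small interface.
interface AuditLog {
    void record(String event);
}

class OrderProcessor {
    private final AuditLog auditLog;

    // The dependency is injected, so each test controls it.
    OrderProcessor(AuditLog auditLog) {
        this.auditLog = auditLog;
    }

    void process(String orderId) {
        // ... the processing we actually want to test ...
        auditLog.record("processed " + orderId);
    }
}

class OrderProcessorExample {
    public static void main(String[] args) {
        // Stubbing the tangential dependency takes a single line.
        OrderProcessor processor = new OrderProcessor(event -> {});
        processor.process("order-42");
    }
}
```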
Document the Behavior of the System
When encountering a codebase for the first time, some developers go straight to the tests. These developers read the tests (test names as well as method bodies) to determine how and why the system works as it does. These same developers enjoy the benefits of automated tests, but they value the documentation aspect of tests as much as, or more than, the functional aspect.
It’s absolutely true that the code doesn’t lie, and both correct and incorrect comments (including test names) can often give a view into what a developer was thinking when a test was written. If developers use tests as documentation, it’s only natural that they create many tests, some of which exist solely to document the system and would otherwise be unnecessary.
Before you go deleting what appear to be superfluous tests, make sure no one on the team sees your worthless test as essential documentation.
Your Manager Told You To
If this were your only motivator for writing a test, I think you’d be in a very paradoxical position. If you write worthless tests, you’re sure to anger your manager. Given that you’re forced to write “meaningful” tests, I believe you’d want to write the most maintainable tests possible despite your lack of additional motivators. I imagine you’d want to spend as little time as possible reading and writing tests, and the only way I see to accomplish that is by focusing on maintainability.
Thus, even if you don’t particularly value testing, it will likely benefit you to seek out the most maintainable way to write tests in your context.
Test Driven Development
Common motivators that would be a subset of TDD:
- Breaking a Problem up into Smaller Pieces
- Defining the “Simplest Thing that Could Possibly Work”
- Improved Design
Unit testing and TDD are often incorrectly conflated and referred to by either name. Unit testing is an umbrella term for testing at a certain level of abstraction. TDD has a very specific definition:
Test-driven development (TDD) is a software development process that relies on the repetition of a very short development cycle: first the developer writes an (initially failing) automated test case that defines a desired improvement or new function, then produces the minimum amount of code to pass that test, and finally refactors the new code to acceptable standards. –Wikipedia
It’s not necessary to write unit tests to practice TDD, nor is it necessary to practice TDD to write unit tests.
That said, there’s a reason the terms are often conflated: if you’re practicing TDD, then you’re very likely also writing a substantial number of unit tests. A codebase written by developers dogmatically following TDD would theoretically contain no code that wasn’t written as the result of a failing test. Proponents of TDD often claim that the results give the existing team and future maintainers a greater level of confidence.
TDD’s development cycle is also very appealing to developers who find a large problem overwhelming but are able to quickly break it down into many smaller tests that, when combined, solve the larger problem. Rather than focusing on the single large problem and trying to write code that solves every known constraint at once, these developers write a test for each individual variable of the problem and grow the code in a way where every test keeps passing and each variable is dealt with individually.
Incredibly large and complicated problems don’t seem nearly as daunting when programmers are able to focus exclusively on the task at hand: making the individual test pass. In addition, all of the previously written tests provide a safety net, allowing you to safely ignore all prior constraints while you work.
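To make the cycle concrete, here’s a minimal sketch of a single red-green step; the Stack example is invented and deliberately trivial.

```java
import static org.junit.Assert.assertTrue;

import org.junit.Test;

public class StackTest {

    // Red: written first, this test fails (it won't even compile
    // until a Stack class exists).
    @Test
    public void newStackIsEmpty() {
        assertTrue(new Stack().isEmpty());
    }
}

// Green: the minimum amount of code that passes the test.
class Stack {
    boolean isEmpty() {
        return true;
    }
}

// Refactor: nothing to clean up yet. The next failing test (say,
// "push makes isEmpty return false") forces the implementation to
// grow one small step at a time.
```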
Proponents of TDD generally believe it promotes superior design as well. Two reasons are most often given when describing the design benefits of TDD:
- By focusing on the test cases first, a developer is forced to imagine how the functionality will be used by clients.
- TDD leads to more modularized, flexible, and extensible code by requiring that the developers think of the software in terms of small units that can be written and tested independently and integrated together later.
In my opinion, every developer should practice TDD at some point in their career. Utilizing TDD at the right moment will unquestionably make you more productive; that said, how frequently those moments occur often depends greatly on the individual. Only through experience can a developer judge whether the current moment would benefit or suffer from switching to a TDD cycle.
An anonymous comment once appeared on my blog:
The developers that know how to write tests correctly are very rare, and only those developers can really do TDD. The rest end up with a nest of poorly written brittle tests that don’t fully test the code.
It’s my hope that this book will help increase the number of developers who are productively unit testing. Still, it’s perfectly reasonable to delete a test that provided value as part of a TDD cycle but no longer has positive ROI.
Customer Acceptance
Unit testing to achieve customer acceptance would be an interesting choice. Rarely would a domain expert be willing to sift through all of the unit tests to determine whether they’re willing to sign off on the system. Thus, I imagine you’d need to devise some type of filtering that allows the domain expert to drill down to what they believe to be important.
My default choice is to enable the domain expert to write and maintain tests in a tool designed for high-level tests, removing developers and unit tests almost entirely from the acceptance process. However, if the developers must be responsible for writing the tests used for customer acceptance, I would devise a plan to annotate the appropriate unit tests and provide a well-formatted report based on the automated results.
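One way to implement that annotation plan in Java, sketched under the assumption that a (hypothetical) report generator collects results for every marked test; the CustomerAcceptance name and its feature attribute are invented for the example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Marks the unit tests whose results the domain expert needs to see.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
public @interface CustomerAcceptance {
    // Used by the report generator to group results for the expert.
    String feature();
}

// Usage on an ordinary unit test:
//   @CustomerAcceptance(feature = "Account Creation")
//   @Test
//   public void accountCreationRejectsAnInvalidSocialSecurityNumber() { ... }
```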
In my experience, developers are willing to support low-level customer acceptance tests that can be quickly debugged when they fail. Conversely, I’ve never seen a developer who was happy to maintain tests that are both strictly for the customer and high-level (and thus hard to debug).
Ping Pong Pair-Programming
From the c2.com Wiki:
Here’s how Pair Programming works on my team.
- A writes a new test and sees that it fails.
- B implements the code needed to pass the test.
- B writes the next test and sees that it fails.
- A implements the code needed to pass the test.
And so on.
While the most popular definition obviously describes a TDD approach, there’s no reason you couldn’t ping-pong while writing each test after the code. If you’re already pair programming, the rhythm created by practicing ping-pong may be the only motivator you need for writing a test. I’ve seen this approach utilized very successfully.
Once a feature is complete, it’s often worth your time to examine the associated tests. Many of the recently created tests will be valuable as is. Other tests may provide negative ROI as written but, with small tweaks, can be made to produce positive ROI. Finally, any tests that were motivated solely by the development process should be considered for deletion.
What Motivates You (or Your Team)
The primary driver for this chapter is to recognize that tests can be written for many different reasons, and that assuming a test is necessary simply because it exists is not always correct. It’s valuable to recognize which motivators are responsible for a test you’re creating or updating. If you come across a test with no motivators, do everyone a favor and delete it.
I often write speculative tests that help me get to feature completion, but are unnecessary in the long term. Those are the tests that I look to delete once a feature is complete. They’re valuable to me for brainstorming purposes, but aren’t well designed for documentation, regression detection, or any other motivator. Once they’ve served their purpose, I happily kill them off.
Deleting tests that no longer provide value is an important activity; however, it shouldn’t be taken lightly. Each test deletion likely requires at least a little collaboration to ensure that (as previously mentioned) your valueless test isn’t someone else’s documentation.