Reducing Duplication

We now have 3 assertions in our test. Notice that all 3 of them do the same thing. They all use the same assertEquals() assertion, supply an expected domain, and call the same method with an email address. The only difference is the email address and the expected domain. Everything else is duplication.

Extracting a Method

As programmers, we’ve been taught about the DRY principle and trained to eliminate duplication. While it’s a good practice to eliminate duplication in the “real” code, it’s not always a good idea to do that in tests. For instance, our first instinct might be to extract a method:

 1 <?php
 2 
 3 class EmailAddressPartsExtractorTest
 4     extends PHPUnit_Framework_TestCase
 5 {
 6     public function testExtractDomain()
 7     {
 8         $extractor = new EmailAddressPartsExtractor();
 9         $this->assertDomainExtraction(
10             $extractor,
11             'example.com',
12             'elnur@example.com'
13         );
14         $this->assertDomainExtraction(
15             $extractor,
16             'example.org',
17             'elnur@example.org'
18         );
19         $this->assertDomainExtraction(
20             $extractor,
21             'example.net',
22             'elnur@example.net'
23         );
24     }
25 
26     private function assertDomainExtraction(
27         EmailAddressPartsExtractor $extractor,
28         $domain,
29         $emailAddress
30     ) {
31         $this->assertEquals(
32             $domain,
33             $extractor->extractDomain($emailAddress)
34         );
35     }
36 }

Here, on lines 26-35, we define the assertDomainExtraction() method and move the assertion to it. We define it as a private method because it’s for the internal use of the class and is not meant to be called from the outside.

Then, in the testExtractDomain() method, we replace all 3 assertions with calls to that method.

While we have reduced some duplication by making our test call a single method for each assertion instead of two, the test itself has become less readable.

The problem is that just by reading the test method, it’s not really clear how the assertion is being done. We have to jump to another method to see that.

In contrast to “real” code, this sort of duplication elimination is actually a bad practice when it comes to tests because readability is more important here than perfect code factoring.

Extracting Data

Let’s solve the duplication problem differently. Instead of extracting a method, let’s extract the data our assertions use:

 1 <?php
 2 
 3 class EmailAddressPartsExtractorTest
 4     extends PHPUnit_Framework_TestCase
 5 {
 6     public function testExtractDomain()
 7     {
 8         $extractor = new EmailAddressPartsExtractor();
 9 
10         $emailsToDomains = [
11             'elnur@example.com' => 'example.com',
12             'elnur@example.org' => 'example.org',
13             'elnur@example.net' => 'example.net',
14         ];
15 
16         foreach ($emailsToDomains as $email => $domain) {
17             $this->assertEquals(
18                 $domain,
19                 $extractor->extractDomain($email)
20             );
21         }
22     }
23 }

On lines 10-14, we create an associative array with email addresses as keys and the expected domains as values.

Then we use the foreach loop on lines 16-21 to iterate over the array and assert, for each key/value pair, that the extracted domain matches the expected one.

This way, we reduced duplication but kept the assertion in the test method. We don’t have to jump to other methods to understand what’s going on now.

Conveniently, PHPUnit supports this pattern of data extraction natively. Let’s start using it.

Data Providers

PHPUnit calls this feature Data Providers. Let’s see how our test class looks when using it:

 1 <?php
 2 
 3 class EmailAddressPartsExtractorTest
 4     extends PHPUnit_Framework_TestCase
 5 {
 6     /**
 7      * @dataProvider emailAddressesToDomains
 8      */
 9     public function testExtractDomain($emailAddress, $domain)
10     {
11         $extractor = new EmailAddressPartsExtractor();
12 
13         $this->assertEquals(
14             $domain,
15             $extractor->extractDomain($emailAddress)
16         );
17     }
18 
19     public function emailAddressesToDomains()
20     {
21         return [
22             ['elnur@example.com', 'example.com'],
23             ['elnur@example.org', 'example.org'],
24             ['elnur@example.net', 'example.net'],
25         ];
26     }
27 }

On lines 19-26, we define the emailAddressesToDomains() method that returns an array of arrays. Notice that this time it’s not an associative array of key => value pairs, but a list in which each item is itself an array of values. PHPUnit requires this format because an item can hold more than two values. For instance, it could be in the form of:

['elnur@example.com', 'elnur', 'example.com']

Also notice that the method is public. That’s because it’s going to be called from the outside by PHPUnit.

On lines 6-8, we define a PHPDoc block with the @dataProvider annotation followed by the name of the data providing method emailAddressesToDomains. That’s how we link a test method to a data providing method.

On line 9, we add two parameters to our test method’s signature: $emailAddress and $domain. That’s how the data from the data provider gets into our test method. The order of the parameters must match the order of values in each item of the array returned from the data provider. Since we put the email address first and the domain second, that’s the order in which we have to define the test method’s parameters.

And finally, on lines 13-16, we have our assertEquals() assertion now using the $emailAddress and $domain parameters.
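The positional mapping between provider items and test parameters can be pictured with a rough plain-PHP sketch. This is not how PHPUnit is implemented internally: runWithProvider() is a hypothetical helper, and the stand-in testExtractDomain() function only records the arguments it receives instead of asserting anything.

```php
<?php

// Stand-in for a test method (hypothetical): it only reports
// which value arrived in which parameter.
function testExtractDomain($emailAddress, $domain)
{
    return "email=$emailAddress, domain=$domain";
}

// Same shape as a data provider: an array of items,
// each item being an array of values.
function emailAddressesToDomains()
{
    return [
        ['elnur@example.com', 'example.com'],
        ['elnur@example.org', 'example.org'],
    ];
}

// Hypothetical helper: calls the test once per item and passes
// the item's values positionally, first value to the first
// parameter, second value to the second parameter.
function runWithProvider($test, $provider)
{
    $results = [];
    foreach (call_user_func($provider) as $item) {
        $results[] = call_user_func_array($test, $item);
    }

    return $results;
}

$results = runWithProvider('testExtractDomain', 'emailAddressesToDomains');
```

Swapping the two values inside an item without swapping the parameters would hand the domain to $emailAddress, which is why the two orders have to agree.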

Let’s rerun the test now:

    $ bin/phpunit --color tests
    PHPUnit 4.8.19 by Sebastian Bergmann and contributors.

    ...

    Time: 48 ms, Memory: 3.75Mb

    OK (3 tests, 3 assertions)

Yay. The test is passing. Or tests?

Notice that this time we got three dots instead of one. That’s because PHPUnit basically treats each combination of a test and an item from a data provider as a separate test.
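Since every provider item is reported as its own test, it can help to give each item a name. The following is a sketch, assuming the same PHPUnit 4.8 setup and the EmailAddressPartsExtractor class used throughout. PHPUnit accepts string keys on the outer array of a data provider and uses them as data set names when it reports a failure. The labels below ('com domain' and so on) are made up for illustration.

```php
<?php

class EmailAddressPartsExtractorTest
    extends PHPUnit_Framework_TestCase
{
    /**
     * @dataProvider emailAddressesToDomains
     */
    public function testExtractDomain($emailAddress, $domain)
    {
        $extractor = new EmailAddressPartsExtractor();

        $this->assertEquals(
            $domain,
            $extractor->extractDomain($emailAddress)
        );
    }

    public function emailAddressesToDomains()
    {
        // The string keys are illustrative labels, not part of the
        // data passed to the test; each inner array is still
        // [email address, expected domain].
        return [
            'com domain' => ['elnur@example.com', 'example.com'],
            'org domain' => ['elnur@example.org', 'example.org'],
            'net domain' => ['elnur@example.net', 'example.net'],
        ];
    }
}
```

The test method itself is unchanged. Only the keys of the outer array differ, so the number of tests and assertions stays the same.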

Now that we’ve dealt with reducing duplication in our test, let’s make our domain extraction code more solid by properly handling situations when malformed email addresses get passed to it.