Solving the NIF kata

In this kata we’re going to follow an approach that tackles the sad paths first, that is, we’re going to handle the cases that would cause an error first. Thus, we’ll first develop the validation of the input structure, and then move on to the algorithm.

Katas usually ignore issues such as validation, but in this case we’ve decided to go for a more realistic example, in the sense that it’s a situation we have to deal with quite often. In production code, the validation of input data is essential, and it’s worth practicing with an exercise that focuses almost exclusively on it.

Besides, we’ll see a couple of interesting techniques to transform a public interface without breaking the tests.

Statement of the kata

Create a Nif class, which will be a Value Object to represent the Spanish Tax Identification Number. It’s a string of eight numeric characters, with a final letter that acts as a control character.

This control letter is obtained by calculating the remainder of dividing the numeric part of the NIF by 23 (mod 23). The result tells us in which row of the following table to look up the control letter. In this table I’ve also included some simple examples of valid NIFs so you can use them in the tests.

Numeric part   Remainder   Letter   Valid NIF example
00000023       0           T        00000023T
00000024       1           R        00000024R
00000025       2           W        00000025W
00000026       3           A        00000026A
00000027       4           G        00000027G
00000028       5           M        00000028M
00000029       6           Y        00000029Y
00000030       7           F        00000030F
00000031       8           P        00000031P
00000032       9           D        00000032D
00000033       10          X        00000033X
00000034       11          B        00000034B
00000035       12          N        00000035N
00000036       13          J        00000036J
00000037       14          Z        00000037Z
00000038       15          S        00000038S
00000039       16          Q        00000039Q
00000040       17          V        00000040V
00000041       18          H        00000041H
00000042       19          L        00000042L
00000043       20          C        00000043C
00000044       21          K        00000044K
00000045       22          E        00000045E
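
The lookup described by the table can be sketched in Go as follows. This is only an illustration of the rule, not the solution developed later: the name controlLetter and the letters constant are assumptions, and the letters are copied in remainder order from the table.

```go
package main

import "fmt"

// letters holds the control letters in remainder order (0 to 22),
// copied from the table in the kata statement.
const letters = "TRWAGMYFPDXBNJZSQVHLCKE"

// controlLetter returns the control letter for the numeric part of a NIF.
// The name is illustrative and the function assumes a non-negative number.
func controlLetter(number int) byte {
	return letters[number%23]
}

func main() {
	fmt.Printf("%c\n", controlLetter(23)) // remainder 0, prints T
	fmt.Printf("%c\n", controlLetter(45)) // remainder 22, prints E
}
```

Indexing a string constant keeps the table and the algorithm in one place, which is one possible design among several.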

You can create invalid NIFs simply by choosing a numeric part and adding a letter that doesn’t correspond to it.

Invalid example
00000000S
00000001M
00000002H
00000003Q
00000004E

There’s an exception: the NIF for foreigners (or NIE) may start with the letters X, Y or Z, which for the purposes of the calculation are replaced by the numbers 0, 1 and 2, respectively. In this case, X0000000T is equivalent to 00000000T.

To prevent confusion we’ve excluded the letters U, I, O and Ñ.

A string that starts with a letter other than X, Y, Z, or that contains alphabetic characters in the central positions is also invalid.
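
As a sketch of the NIE rule, the leading letter can be swapped for its digit before converting the numeric part to a number. The helper name numericPart is hypothetical, and the function assumes a nine-character candidate; it does not replace the structural checks that the kata develops.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// numericPart returns the number that takes part in the mod 23 calculation.
// The name is hypothetical, not part of the kata statement.
func numericPart(candidate string) (int, error) {
	if len(candidate) != 9 {
		return 0, errors.New("bad format")
	}

	// Replace a leading X, Y or Z with 0, 1 or 2, as the statement describes.
	first := strings.ToUpper(candidate[:1])
	prefix := strings.NewReplacer("X", "0", "Y", "1", "Z", "2").Replace(first)

	// Atoi fails if anything other than digits remains in the first eight positions.
	return strconv.Atoi(prefix + candidate[1:8])
}

func main() {
	for _, candidate := range []string{"X0000000T", "Y0000000Z", "00000023T"} {
		number, err := numericPart(candidate)
		fmt.Println(candidate, number, err)
	}
}
```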

Language and focus

We’re going to solve this kata using Go, so we need to clarify what the result will look like. In this example we’re going to create a data type Nif, which will basically be a string, and a factory function NewNif which will allow us to build validated NIFs starting from an input string.

On the other hand, testing in Go is also a bit peculiar. Even though the language includes support for testing as a standard feature, it doesn’t include common utilities such as asserts.
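
Because there are no built-in asserts, Go tests usually compare values with a plain if and report failures through t.Errorf. If the repetition becomes annoying, a small helper can be written by hand. The following is one possible sketch: the reporter interface, the assertEqual name and the recorder type are assumptions made so the example runs outside of go test; in a real test file you would pass *testing.T directly.

```go
package main

import "fmt"

// reporter is the part of *testing.T that the helper needs.
// It is declared here only to keep the sketch runnable on its own.
type reporter interface {
	Errorf(format string, args ...interface{})
}

// assertEqual reports a failure when got differs from expected.
func assertEqual(t reporter, expected, got string) {
	if got != expected {
		t.Errorf("Expected %s, got %s", expected, got)
	}
}

// recorder is a stand-in for *testing.T that stores failure messages.
type recorder struct {
	failures []string
}

func (r *recorder) Errorf(format string, args ...interface{}) {
	r.failures = append(r.failures, fmt.Sprintf(format, args...))
}

func main() {
	r := &recorder{}
	assertEqual(r, "too long", "too long")  // no failure recorded
	assertEqual(r, "too long", "too short") // one failure recorded
	fmt.Println(r.failures)
}
```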

Disclaimer

To solve this kata I’m going to take advantage of the way in which Go handles errors. These can be returned as one of the return values of a function, which pushes you to handle them explicitly.

Designing tests based on error messages is not a good practice, as they can easily change, making tests fail even when there hasn’t really been an alteration of the functionality. However, in this kata we’re going to use the error messages as a sort of temporary wildcard on which to rely, making them go from more specific to more general. By the end of the exercise, we’ll be handling only two possible errors.
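
For contrast, a common way to avoid coupling tests to message text in Go is to compare against sentinel error values. This is not the approach followed in this kata; it is shown only to illustrate the alternative. The names ErrBadFormat, ErrBadControlLetter and validate are hypothetical, and the logic inside validate is a placeholder.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors, one per failure the kata ends up with.
var (
	ErrBadFormat        = errors.New("bad format")
	ErrBadControlLetter = errors.New("bad control letter")
)

// validate is a placeholder: it only checks the length and then always
// reports a control letter problem, so that both sentinels can be returned.
func validate(candidate string) error {
	if len(candidate) != 9 {
		return ErrBadFormat
	}

	return ErrBadControlLetter
}

func main() {
	err := validate("01234")

	// The comparison survives any rewording of the message.
	fmt.Println(errors.Is(err, ErrBadFormat))
}
```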

Create the constructor function

In this kata, we want to start by focusing on the sad paths, the cases in which we won’t be able to use the argument that’s been passed to the constructor function. Of all the innumerable string combinations that the function could receive, let’s first give an answer to those that we know won’t be of use because they don’t meet the requirements. This answer will be an error.

We’ll start by rejecting the strings that are too long, those that have more than nine characters. We can describe this with the following test:

In the nif/nif_test.go file

package nif

import "testing"

func TestShouldFailWhenStringIsTooLong(t *testing.T) {
	NewNif()
}

For now we’ll ignore the function’s responses, just to force ourselves to implement the minimum amount of code.

As expected, the test will fail because it doesn’t compile. So we’ll implement the minimum necessary code, which can be as small as this:

nif/nif.go file

package nif

func NewNif() {

}

With this, we get a foundation on which to build.

Now we can go a step further. The function should accept a parameter:

package nif

import "testing"

func TestShouldFailWhenStringIsTooLong(t *testing.T) {
	NewNif("01234567891011")
}

We make the test pass again with:

package nif

func NewNif(candidate string) {

}

And finally return:

  • the NIF, when the one we’ve passed is valid.
  • an error in the case it’s not possible.

In Go, a function can return multiple values and, by convention, the error is returned as the last one.
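
As a minimal illustration of that convention, unrelated to the NIF itself, here is a made-up parseAge function that returns a value together with an error in the last position. Both the function and its inputs are assumptions for the example.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseAge is a made-up example: it returns the parsed value and,
// as the last return value, an error that is nil on success.
func parseAge(input string) (int, error) {
	age, err := strconv.Atoi(input)
	if err != nil {
		return 0, fmt.Errorf("invalid age %q: %w", input, err)
	}

	return age, nil
}

func main() {
	age, err := parseAge("42")
	fmt.Println(age, err)

	// The blank identifier discards the value when only the error matters.
	_, err = parseAge("forty-two")
	fmt.Println(err != nil)
}
```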

Returning multiple values provides a flexibility that is uncommon in other languages, and lets us play with some ideas that are at least curious. For example, for now we’re going to ignore the value returned by the function and focus exclusively on the errors. Our next test is going to ask the function to return only the error, without doing anything with it. The if is there, for now, to keep the compiler from complaining.

package nif

import "testing"

func TestShouldFailWhenStringIsTooLong(t *testing.T) {
	err := NewNif("01234567891011")

	if err != nil {}
}

This test tells us that we must return something, so for now we indicate that we’re going to return an error, which can be nil.

package nif

func NewNif(candidate string) error {
	return nil
}

Let’s go a step further by expecting a specific error when the condition defined by the test is met: the string is too long. With this, we’ll have a proper first test:

package nif

import "testing"

func TestShouldFailWhenStringIsTooLong(t *testing.T) {
	err := NewNif("01234567891011")

	if err.Error() != "too long" {
		t.Errorf("Expected too long, got %s", err.Error())
	}
}

Again, the test will fail, and to make it pass we return the error unconditionally:

package nif

import "errors"

func NewNif(candidate string) error {
	return errors.New("too long")
}

And with this, we’ve already completed our first test and made it pass. We could be a little more strict in the handling of the response to contemplate the case in which err is nil, but it’s something that doesn’t have to affect us for the time being.
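
A stricter check could look like the following sketch. Calling err.Error() on a nil error panics, so the helper tests for nil first. The name describeFailure and the errTooLong variable are hypothetical; in a test you would report the returned message through t.Errorf.

```go
package main

import (
	"errors"
	"fmt"
)

// errTooLong is a sample error used only to exercise the helper.
var errTooLong = errors.New("too long")

// describeFailure returns an empty string when err carries the expected
// message, and a description of the mismatch otherwise.
func describeFailure(err error, expected string) string {
	if err == nil {
		return fmt.Sprintf("Expected %s, got no error", expected)
	}

	if err.Error() != expected {
		return fmt.Sprintf("Expected %s, got %s", expected, err.Error())
	}

	return ""
}

func main() {
	fmt.Println(describeFailure(nil, "too long"))
	fmt.Println(describeFailure(errTooLong, "too long") == "")
}
```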

At this point, I’d like to draw your attention to the fact that we’re not solving anything yet: the error is returned unconditionally, so we’re postponing this validation for later.

Implement the first validation

Our second test has the goal of forcing the implementation of the validation we’ve just postponed. It may sound a little weird, but it showcases that one of the great benefits of TDD is the ability to postpone decisions. By doing so we’ll have a little more information, which is always an advantage.

This test is very similar to the previous one:

package nif

import "testing"

func TestShouldFailWhenStringIsTooLong(t *testing.T) {
	err := NewNif("01234567891011")

	if err.Error() != "too long" {
		t.Errorf("Expected too long, got %s", err.Error())
	}
}

func TestShouldFailWhenStringIsTooShort(t *testing.T) {
	err := NewNif("0123456")

	if err.Error() != "too short" {
		t.Errorf("Expected too short, got %s", err.Error())
	}
}

This test already forces us to act differently in each case, so we’re going to implement the validation that limits the strings that are too long:

package nif

import "errors"

func NewNif(candidate string) error {
	if len(candidate) > 9 {
		return errors.New("too long")
	}

	return errors.New("too short")
}

Again, I point out that at the moment we’re not implementing what the test says. We’ll do it in the next cycle, but the test is fulfilled by returning the expected error unconditionally.

There’s not much else we can do in the production code, but looking at the tests we can see that it would be possible to unify their structure a bit. After all, we’re going to make a series of them to which we pass a value and expect a specific error in response.

A test to rule them all

In Go there is a test structure similar to the one provided by Data Providers in other languages: table-driven tests.

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "too long"},
		{"should fail if too short", "01234", "too short"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

With this, it’s now very easy and fast to add tests, especially if they are from the same family, like in this case in which we pass invalid candidate strings and check for the error. Also, if we make changes to the constructor’s interface, we only have one place in which to apply them.

With this, we’d have everything ready to continue developing.

Complete the validation of the length and start examining the structure

With the two previous tests we verify that the string that we’re examining meets the specification of having exactly nine characters, although that’s not implemented yet. We’ll do it now.

However, you may be asking yourself why we don’t simply test that the function rejects the strings that don’t fulfill it, something that we could do in just one test.

The reason is that there are actually two possible ways in which the specification may not be met: the string has more than nine characters, or the string has fewer. If we do a single test, we’ll have to choose one of the two cases, so we cannot guarantee that the other is fulfilled.

In this specific example, in which we’re interested in just one value, we could raise the dichotomy between strings with length nine and strings with lengths other than nine. However, it’s common for us to have to work with intervals of values that, moreover, can be open or closed. In that situation, the strategy of having two or even more tests is far safer.
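
To illustrate the interval case, suppose a hypothetical rule accepted lengths from 8 to 10, both included. The function and the limits below are invented for the example and are not part of the NIF rules; the point is that each boundary gets a case on both of its sides.

```go
package main

import "fmt"

// lengthInRange reports whether the length of s lies in the closed
// interval [lower, upper]. Both limits are hypothetical.
func lengthInRange(s string, lower, upper int) bool {
	return len(s) >= lower && len(s) <= upper
}

func main() {
	cases := []struct {
		name    string
		example string
		want    bool
	}{
		{"just below the lower limit", "1234567", false},
		{"on the lower limit", "12345678", true},
		{"on the upper limit", "1234567890", true},
		{"just above the upper limit", "12345678901", false},
	}

	for _, c := range cases {
		got := lengthInRange(c.example, 8, 10)
		fmt.Printf("%s: got %v, want %v\n", c.name, got, c.want)
	}
}
```

With a single test, a mistake such as writing > where >= was intended on one of the limits could go unnoticed.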

In any case, in the point at which we are, we need to add another requirement in the form of a test in order to drive the development. The two existing tests define the string’s valid length. The next test asks about its structure.

And with the refactoring that we’ve just made, adding a test is extremely simple.

We’ll start at the beginning. Valid NIFs begin with a number, except for a subset of them that begin with one of the letters X, Y and Z. One way of defining the test is the following:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "too long"},
		{"should fail if too short", "01234", "too short"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

To pass the test, we first solve the pending problem of the previous one:

package nif

import "errors"

func NewNif(candidate string) error {
	if len(candidate) > 9 {
		return errors.New("too long")
	}

	if len(candidate) < 9 {
		return errors.New("too short")
	}

	return errors.New("bad start format")
}

Here we have a pretty clear refactoring opportunity that would consist in joining the conditionals that evaluate the lengths of the string. However, that will cause the test to fail since we would at least have to change an error message.

The not very clean way of changing the test and production code at the same time

One possibility is to temporarily “skip” our self-imposed condition of only doing refactorings with all tests in green, and making changes in both production and test code at the same time. Let’s see what happens.

The first thing would be to change the test so it expects a different error message, which will be more generic and the same for all of the cases that we want to consolidate in this step:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

This will cause the test to fail, an issue that can be solved by changing the production code in the same way:

package nif

import "errors"

func NewNif(candidate string) error {
	if len(candidate) > 9 {
		return errors.New("bad format")
	}

	if len(candidate) < 9 {
		return errors.New("bad format")
	}

	return errors.New("bad start format")
}

The test passes again and we are ready to refactor. But we’re not going to do that.

The safe way

Another option is to make a temporary refactoring in the test in order to make it more tolerant. We just make it accept a more generic error in addition to the specific one.

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "too long"},
		{"should fail if too short", "01234", "too short"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected && err.Error() != "bad format" {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

This change allows us to make the change in production without breaking anything:

package nif

import "errors"

func NewNif(candidate string) error {
	if len(candidate) > 9 {
		return errors.New("bad format")
	}

	if len(candidate) < 9 {
		return errors.New("bad format")
	}

	return errors.New("bad start format")
}

The test keeps passing, and now we can perform the refactoring.

Unify the string length validation

Unifying the conditionals is easy now. This is the first step, which I include here to have a reference of how to do this in case we were working with an interval of valid lengths.

package nif

import "errors"

func NewNif(candidate string) error {
	if len(candidate) > 9 || len(candidate) < 9 {
		return errors.New("bad format")
	}

	return errors.New("bad start format")
}

But it can be done better:

package nif

import "errors"

func NewNif(candidate string) error {
	if len(candidate) != 9 {
		return errors.New("bad format")
	}

	return errors.New("bad start format")
}

And a little more expressive:

package nif

import "errors"

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	return errors.New("bad start format")
}

Finally, a new refactoring of the test to contemplate these changes. We remove our temporary change, although it’s possible that we need it again in the future.

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

Note that we’ve been able to make all these changes without the tests failing.

Moving forward with the structure

The code is pretty compact, so we’re going to add a new test that lets us move forward with the validity of the structure. The central fragment of the NIF is composed only of numbers, exactly seven:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad middle format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

We run it to make sure that it fails for the right reason. To pass the test, we have to solve the previous test first, so we’ll add code to verify that the first symbol is either a number or a letter in the X, Y and Z set. We’ll do it with a regular expression:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	invalid := regexp.MustCompile(`(?i)^[^0-9XYZ].*`)

	if invalid.MatchString(candidate) {
		return errors.New("bad start format")
	}

	return errors.New("bad middle format")
}

This code is enough to pass the test, but we’re going to make a refactoring.

Invert the conditional

It makes sense that, instead of matching a regular expression that detects the non-valid strings, we match an expression that describes the valid ones. If we do that, we’ll have to invert the conditional. To be honest, the change is pretty small:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	valid := regexp.MustCompile(`(?i)^[0-9XYZ].*`)

	if !valid.MatchString(candidate) {
		return errors.New("bad start format")
	}

	return errors.New("bad middle format")
}

The end of the structure

We’re reaching the end of the structural validation of the NIF. We need a test that tells us which candidates to reject depending on their last symbol, which leads us to solving the pending problem from the previous test:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad middle format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

Of the four non-valid letters we take the U as an example, but it could be I, Ñ, or O.

However, to make the test pass, what we do is make sure that the previous one is fulfilled. It’s easier to implement that separately:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	valid := regexp.MustCompile(`(?i)^[0-9XYZ].*`)

	if !valid.MatchString(candidate) {
		return errors.New("bad start format")
	}

	valid = regexp.MustCompile(`(?i)^.\d{7}.*`)

	if !valid.MatchString(candidate) {
		return errors.New("bad middle format")
	}

	return errors.New("invalid end format")
}

Compacting the algorithm

This passes the test, and we are met by a familiar situation which we’ve already solved before: we have to make the errors more generic with the temporary help of some extra control in the test:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad start format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad middle format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected && err.Error() != "bad format" {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

We change the error messages in the production code:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	valid := regexp.MustCompile(`(?i)^[0-9XYZ].*`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	valid = regexp.MustCompile(`(?i)^.\d{7}.*`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("invalid end format")
}

Now we unify the regular expression and the conditionals:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}.*`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("invalid end format")
}

We can still make a small but important change. The last part of the regular expression, .*, is there to fulfill the requirement of matching the whole string. However, we don’t really need the quantifier as one character is enough:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	const maxlength = 9

	if len(candidate) != maxlength {
		return errors.New("bad format")
	}

	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}.$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("invalid end format")
}

And this reveals a detail: the regular expression only matches strings that have exactly nine characters, so the initial length validation is unnecessary:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}.$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("invalid end format")
}

We’ve walked so far… only to retrace our steps. However, we didn’t know this in the beginning, and that’s where the value of the process lies.
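
One detail that the listings leave aside is that regexp.MustCompile runs every time NewNif is called. A common Go idiom, shown here as an optional variation rather than as part of the kata, is to compile the pattern once in a package-level variable. The variable name validFormat is an assumption. The loop in main also lets you confirm that the anchored pattern rejects candidates of the wrong length by itself.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// validFormat is compiled once, when the package is initialized.
var validFormat = regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}.$`)

// NewNif mirrors the state of the kata at this point: a structurally
// valid candidate still gets the placeholder "invalid end format" error.
func NewNif(candidate string) error {
	if !validFormat.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("invalid end format")
}

func main() {
	candidates := []string{"01234567891011", "01234", "A12345678", "0123X567R", "01234567U"}

	for _, candidate := range candidates {
		fmt.Printf("%s: %v\n", candidate, NewNif(candidate))
	}
}
```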

Lastly, we change the test to reflect the changes and, again, remove our temporary support:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

Finishing the structural validation

We need a new test to finish the structural validation part. The existing tests guarantee that the strings are correct, but the following validation already involves the algorithm that calculates the control letter.

This test should ensure that we can’t use a structurally valid NIF with an incorrect control letter. When we presented the kata we gave some examples, such as 00000000S. This is the test:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
		{"should fail if doesn't end with the right control letter", "00000000S", "bad control letter"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

And here is the code that passes it:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}.$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	valid = regexp.MustCompile(`(?i).*[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return errors.New("invalid end format")
	}

	return errors.New("bad control letter")
}

And now, of course, it’s time to refactor.

Compacting the validation

This refactoring is pretty obvious, but we have to temporarily protect the test again:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
		{"should fail if doesn't end with the right control letter", "00000000S", "bad control letter"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			err := NewNif(test.example)

			if err.Error() != test.expected && err.Error() != "bad format" {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

We make the error more general to be able to unify the regular expressions and the conditionals:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}.$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	valid = regexp.MustCompile(`(?i).*[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("bad control letter")
}

And now we join them while the tests keep passing:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("bad control letter")
}

With this we finish the structure validation, and we’d have the implementation of the mod23 algorithm left. But to do that we need a little change of approach.

Let’s look on the bright side

The algorithm is, in fact, very simple: calculate the remainder (of dividing by 23), and use it as an index to look up the corresponding letter in a table. Implementing it in only one iteration would be easy. However, we’re going to do it more slowly.

Until now, our tests have been pessimistic: they expected invalid NIF examples in order to pass. But now, our new tests must be optimistic, that is, they’re going to expect that we pass them valid NIF examples.

At this point, we’ll introduce a change. You may remember that for now we’re only returning the error, but the final interface of the function will return the validated string as a Nif type that we shall create for the occasion.

That is, we must change the code so that it returns something, and that something has to be of a type that doesn’t exist yet.

To make this change without breaking the test, we’re going to make use of a somewhat contrived refactoring technique.

Changing the public interface

In the first place, we extract the body of NewNif to another function:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	return FullNewNif(candidate)
}

func FullNewNif(candidate string) error {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("bad control letter")
}

The tests keep on passing. Now, we introduce a variable:

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	err := FullNewNif(candidate)
	return err
}

func FullNewNif(candidate string) error {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return errors.New("bad format")
	}

	return errors.New("bad control letter")
}

Now we can make FullNewNif return the string as well without affecting the tests, because the change stays encapsulated within NewNif.

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) error {
	_, err := FullNewNif(candidate)
	return err
}

func FullNewNif(candidate string) (string, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	return candidate, errors.New("bad control letter")
}

The tests still pass, and we’re almost finished. In the test, we replace the call to NewNif with a call to FullNewNif.

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
		{"should fail if doesn't end with the right control letter", "00000000S", "bad control letter"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			_, err := FullNewNif(test.example)

			if err.Error() != test.expected && err.Error() != "bad format" {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

And they’re still passing. Now the function returns the two values that we wanted and the tests remain unbroken. We can proceed to remove the original NewNif function.

package nif

import (
	"errors"
	"regexp"
)

func FullNewNif(candidate string) (string, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	return candidate, errors.New("bad control letter")
}

And use the IDE tools to change the function name from FullNewNif to NewNif.

package nif

import (
	"errors"
	"regexp"
)

func NewNif(candidate string) (string, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	return candidate, errors.New("bad control letter")
}

NOW it’s time

Our goal now is to drive the implementation of the mod23 algorithm. This time, the tests expect the string to be valid. Also, we want to force the function to return Nif type objects instead of strings.

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000023T"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

As a first step, we change the production code to introduce and use the Nif type:

package nif

import (
	"errors"
	"regexp"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	return "", errors.New("bad control letter")
}

Now the test compiles but fails, because NewNif still rejects every well-formed candidate with the bad control letter error. To make it pass we add a conditional:

package nif

import (
	"errors"
	"regexp"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate == "00000023T" {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

A note about Go: a type defined on top of string, such as Nif, can’t be nil; its zero value is the empty string. For this reason, we return an empty string in the case of an error.
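As a minimal sketch of what this means for the caller, the following program uses a hypothetical stand-in called newNifStub (not part of the kata) that always fails. It only illustrates that the returned Nif is the empty string on the error path, so the error is the value to check.

```go
package main

import (
	"errors"
	"fmt"
)

// Nif mirrors the type from the kata: a defined type whose underlying type is string.
type Nif string

// newNifStub is a hypothetical stand-in for NewNif that always fails.
// It exists only to show what the caller receives on the error path.
func newNifStub(candidate string) (Nif, error) {
	return "", errors.New("bad format")
}

func main() {
	var zero Nif // the zero value of Nif is the empty string, never nil

	nif, err := newNifStub("01234")
	if err != nil {
		// On error, nif holds the zero value and should not be used.
		fmt.Println(nif == zero, err)
		return
	}
	fmt.Println(nif)
}
```

The empty Nif carries no information by itself, which is why callers should always look at the error first.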

Moving forward with the algorithm

For now we don’t have many reasons to refactor, so we’re going to introduce a test that should help us move forward a bit. In principle, we want it to drive us to separate the numeric part from the control letter.

One possibility would be to test another NIF that ends with the letter T, such as 00000046T.

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000023T"},
		{"should accept mod23 being 0 letter T", "00000046T"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

To pass the test, we could do this simple implementation:

package nif

import (
	"errors"
	"regexp"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate == "00000023T" {
		return Nif(candidate), nil
	}

	if candidate == "00000046T" {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

And now we start refactoring.

More refactoring

In the production code, we can take a look at what’s different and what’s common between the examples. Both of them have T as their control letter, and their numeric part is divisible by 23. Therefore, their mod23 will be 0.
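The arithmetic behind that observation can be checked with a tiny program. The helper name remainder23 is only an illustration and doesn’t appear in the kata.

```go
package main

import "fmt"

// remainder23 is a hypothetical helper returning the remainder of dividing n by 23.
func remainder23(n int) int {
	return n % 23
}

func main() {
	// Numeric parts of the two examples used so far.
	for _, n := range []int{23, 46} {
		fmt.Printf("%d %% 23 = %d\n", n, remainder23(n))
	}
}
```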

Now we can perform the refactoring. The first step:

package nif

import (
	"errors"
	"regexp"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	control := string(candidate[8])

	if control == "T" {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

And, after seeing the tests pass, the second step:

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	control := string(candidate[8])

	numeric, _ := strconv.Atoi(candidate[0:8])

	modulus := numeric % 23

	if control == "T" && modulus == 0 {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

With this change, the tests pass, and the function accepts all of the valid NIF that end with a T.

Validating more control letters

In this kind of algorithm there isn’t much of a point in trying to validate all of the control letters, but we can introduce another one to force ourselves to understand how the code should evolve. We’ll try a new one:

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
		{"should fail if doesn't end with the right control letter", "00000000S", "bad control letter"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			_, err := NewNif(test.example)

			if err.Error() != test.expected && err.Error() != "bad format" {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000000T"},
		{"should accept mod23 being 0 letter T", "00000023T"},
		{"should accept mod23 being 1 letter R", "00000024R"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

This test is already failing, so let’s make a very simple implementation:

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	control := string(candidate[8])

	numeric, _ := strconv.Atoi(candidate[0:8])

	modulus := numeric % 23

	if control == "T" && modulus == 0 {
		return Nif(candidate), nil
	}

	if control == "R" && modulus == 1 {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

This already gives us an idea about what we’re getting at: a map between letters and the remainder after dividing by 23. However, in many languages strings can work as arrays, so it would be sufficient to have a string with all the control letters properly sorted, and access the letter that’s in the position indicated by the modulus.
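In Go specifically, indexing a string yields a byte, which is why a comparison against a string-typed control letter has to change once we look letters up by position. A small sketch of the idea, with a hypothetical helper named letterAt:

```go
package main

import "fmt"

// letterAt is a hypothetical helper: it returns the byte found at the
// given position of an ordered string of control letters.
func letterAt(letters string, position int) byte {
	return letters[position]
}

func main() {
	const letters = "TR" // only the two letters covered by the tests so far

	fmt.Println(string(letterAt(letters, 0)))
	fmt.Println(string(letterAt(letters, 1)))

	// A byte can be compared directly with a character literal.
	fmt.Println(letterAt(letters, 1) == 'R')
}
```

Keep in mind that indexing past the end of a string panics at run time, so a two-letter string is only safe for remainders 0 and 1. The problem disappears once the string holds all 23 letters.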

A refactoring, for even more simplicity

First we implement a simple version of this idea:

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	controlMap := "TR"

	control := candidate[8]

	numeric, _ := strconv.Atoi(candidate[0:8])

	modulus := numeric % 23

	if control == controlMap[modulus] {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

We have our first version! We’ll add the full letter list later, but for now we can try to fix up the current code a little. First, we make controlMap constant:

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	const controlMap = "TR"

	control := candidate[8]

	numeric, _ := strconv.Atoi(candidate[0:8])

	modulus := numeric % 23

	if control == controlMap[modulus] {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

Actually, we could extract all of the modulus calculation part to another function. First we rearrange the code to better control the extraction:

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	control := candidate[8]

	const controlMap = "TR"
	numeric, _ := strconv.Atoi(candidate[0:8])
	modulus := numeric % 23
	shouldBe := controlMap[modulus]

	if control == shouldBe {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

Remember to verify that the tests keep passing. Now we extract the function:

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	control := candidate[8]

	if control == shouldHaveControl(candidate) {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

func shouldHaveControl(candidate string) uint8 {
	const controlMap = "TR"
	numeric, _ := strconv.Atoi(candidate[0:8])
	modulus := numeric % 23

	return controlMap[modulus]
}

And we can compact the code a little bit further while we add the rest of the control letters. At first sight it could look like “cheating”, but in the end it’s nothing more than generalizing an algorithm that could be stated as “take the letter that’s in the position given by the mod23 of the numeric part”.

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate[8] == shouldHaveControl(candidate) {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

func shouldHaveControl(candidate string) uint8 {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	numeric, _ := strconv.Atoi(candidate[0:8])
	modulus := numeric % 23

	return controlMap[modulus]
}

With this we can already validate all of the NIF except the NIE, which begin with the letters X, Y or Z.
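As a quick sanity check of the letter string against the table in the statement, the following sketch prints the NIF for every numeric part from 00000023 to 00000045. The function controlLetter is a hypothetical standalone version of the lookup, written only for this check, and it assumes an eight-digit numeric part without any leading letter.

```go
package main

import (
	"fmt"
	"strconv"
)

const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

// controlLetter is a hypothetical standalone version of the lookup:
// it returns the control letter for an eight-digit numeric part.
func controlLetter(numPart string) (byte, error) {
	numeric, err := strconv.Atoi(numPart)
	if err != nil {
		return 0, err
	}
	return controlMap[numeric%23], nil
}

func main() {
	// Numeric parts 00000023..00000045 cover every remainder from 0 to 22.
	for n := 23; n <= 45; n++ {
		numPart := fmt.Sprintf("%08d", n)
		letter, err := controlLetter(numPart)
		if err != nil {
			fmt.Println("unexpected error:", err)
			return
		}
		fmt.Printf("%s%c\n", numPart, letter)
	}
}
```

Each printed line should match the “Valid NIF example” column of the table, row by row.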

NIE support

Now that we’ve implemented the general algorithm, let’s handle its exceptions, which aren’t all that many. NIE begin with a letter that, for the purposes of the calculation, gets replaced by a number.

The test that seems to be the most obvious at this point is the following:

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000000T"},
		{"should accept mod23 being 0 letter T", "00000023T"},
		{"should accept mod23 being 1 letter R", "00000024R"},
		{"should accept NIE starting with X", "X0000023T"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

X0000023T is equivalent to 00000023T. Will this affect the result of the test?

We run the test and… surprise? The test passes. This happens because the conversion that we do in this line returns an error that we’re currently ignoring, along with a numeric value of 0. Since 0 and 23 have the same mod23 (0, which is paired with the letter T), the result is the same as if the numeric part were 23.

	numeric, _ := strconv.Atoi(candidate[0:8])

In some other languages the conversion doesn’t fail, but treats the X as 0.
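The Go behaviour can be observed in isolation with a short program. The helper parseIgnoringError is hypothetical and simply mimics what the production code does when it discards the error.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseIgnoringError is a hypothetical helper that mimics the production
// line: it discards the error returned by strconv.Atoi.
func parseIgnoringError(s string) int {
	numeric, _ := strconv.Atoi(s)
	return numeric
}

func main() {
	numeric, err := strconv.Atoi("X0000023")
	fmt.Println(numeric, err != nil) // the error is non-nil and numeric is 0

	// With the error ignored, the remainder is computed from 0.
	fmt.Println(parseIgnoringError("X0000023") % 23)
}
```

Any NIE would therefore be treated as if its numeric part were 0 and would be accepted only when its control letter is T.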

In any case, this opens up two possible paths:

  • remove this test, refactor the production code to treat the error, and see that it fails when we put it back
  • test another example that we know will fail (Y0000000Z) and make the change later

Possibly, in this case the second option would be more than enough, since our structural validations would ensure that the error couldn’t appear once the function was completely developed.

However, it could be interesting to introduce the handling of the error. Managing errors, including those that could never happen, is always good practice.

So, let’s disable the test and introduce a refactoring to handle the error:

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000000T"},
		{"should accept mod23 being 0 letter T", "00000023T"},
		{"should accept mod23 being 1 letter R", "00000024R"},
		//{"should accept NIE starting with X", "X0000023T"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

Here’s the refactor. In this case, I handle the error by causing a panic, which is not the best way of managing an error, but it allows us to make the test fail and forces us to implement the solution.

package nif

import (
	"errors"
	"regexp"
	"strconv"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate[8] == shouldHaveControl(candidate) {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

func shouldHaveControl(candidate string) uint8 {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	numeric, err := strconv.Atoi(candidate[0:8])

	if err != nil {
		panic("Numeric part contains letters")
	}

	modulus := numeric % 23

	return controlMap[modulus]
}

If we run the tests we can check that they’re still green. But, if we reactivate the last test, we can see it fail:

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000000T"},
		{"should accept mod23 being 0 letter T", "00000023T"},
		{"should accept mod23 being 1 letter R", "00000024R"},
		{"should accept NIE starting with X", "X0000023T"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

And this already forces us to introduce a special treatment for these cases. It basically consists of replacing the X with a 0:

package nif

import (
	"errors"
	"regexp"
	"strconv"
	"strings"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate[8] == shouldHaveControl(candidate) {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

func shouldHaveControl(candidate string) uint8 {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	var numPart = candidate[0:8]

	if string(candidate[0]) == "X" {
		numPart = strings.Replace(numPart, "X", "0", 1)
	}

	numeric, err := strconv.Atoi(numPart)

	if err != nil {
		panic("Numeric part contains letters")
	}

	modulus := numeric % 23

	return controlMap[modulus]
}

It can be refactored by using a Replacer:

package nif

import (
	"errors"
	"regexp"
	"strconv"
	"strings"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate[8] == shouldHaveControl(candidate) {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

func shouldHaveControl(candidate string) uint8 {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	var numPart = candidate[0:8]

	re := strings.NewReplacer("X", "0")

	numeric, err := strconv.Atoi(re.Replace(numPart))

	if err != nil {
		panic("Numeric part contains letters")
	}

	modulus := numeric % 23

	return controlMap[modulus]
}

At this point, we could make a test to force us to introduce the rest of the replacements. It’s cheap, although it’s ultimately not very necessary for the reason we discussed earlier: we could interpret this part of the algorithm as “replacing the initial letters X, Y and Z with the numbers 0, 1 and 2, respectively”.

package nif

import "testing"

func TestShouldFailIfCandidateIsInvalid(t *testing.T) {
	tests := []struct {
		name     string
		example  string
		expected string
	}{
		{"should fail if too long", "01234567891011", "bad format"},
		{"should fail if too short", "01234", "bad format"},
		{"should fail if starts with a letter other than X, Y, Z", "A12345678", "bad format"},
		{"should fail if doesn't have 7 digit in the middle", "0123X567R", "bad format"},
		{"should fail if doesn't end with a valid control letter", "01234567U", "invalid end format"},
		{"should fail if doesn't end with the right control letter", "00000000S", "bad control letter"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			_, err := NewNif(test.example)

			if err.Error() != test.expected && err.Error() != "bad format" {
				t.Errorf("Expected %s, got %s", test.expected, err.Error())
			}
		})
	}
}

func TestShouldCreateNifTypeWithValidCandidate(t *testing.T) {
	tests := []struct {
		name    string
		example string
	}{
		{"should accept mod23 being 0", "00000000T"},
		{"should accept mod23 being 0 letter T", "00000023T"},
		{"should accept mod23 being 1 letter R", "00000024R"},
		{"should accept NIE starting with X", "X0000000T"},
		{"should accept NIE starting with Y", "Y0000000Z"},
	}
	for _, test := range tests {
		t.Run(test.name, func(t *testing.T) {
			nif, err := NewNif(test.example)

			if nif != Nif(test.example) {
				t.Errorf("Expected Nif(%s), got %s", test.example, nif)
			}

			if err != nil {
				t.Errorf("Unexpected error %s", err.Error())
			}
		})
	}
}

We only need to add the corresponding pairs:

package nif

import (
	"errors"
	"regexp"
	"strconv"
	"strings"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate[8] == shouldHaveControl(candidate) {
		return Nif(candidate), nil
	}

	return "", errors.New("bad control letter")
}

func shouldHaveControl(candidate string) uint8 {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	var numPart = candidate[0:8]

	re := strings.NewReplacer(
		"X", "0",
		"Y", "1",
		"Z", "2",
	)

	numeric, err := strconv.Atoi(re.Replace(numPart))

	if err != nil {
		panic("Numeric part contains letters")
	}

	modulus := numeric % 23

	return controlMap[modulus]
}
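To see the replacements at work outside the validator, here is a small sketch that converts the first eight characters of some NIE examples into numbers. The function nieToDigits is a hypothetical helper that applies the same three pairs and is not part of the kata’s code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nieToDigits is a hypothetical helper that applies the X/Y/Z replacements
// and parses the result as an integer.
func nieToDigits(numPart string) (int, error) {
	re := strings.NewReplacer("X", "0", "Y", "1", "Z", "2")
	return strconv.Atoi(re.Replace(numPart))
}

func main() {
	for _, numPart := range []string{"X0000000", "Y0000000", "Z0000000"} {
		numeric, err := nieToDigits(numPart)
		if err != nil {
			fmt.Println("unexpected error:", err)
			return
		}
		// Print the original text, its numeric value and the remainder.
		fmt.Println(numPart, numeric, numeric%23)
	}
}
```

For Y0000000 the numeric value is 10000000, whose remainder is 14. That is the position of the letter Z, which is why Y0000000Z works as a valid example in the test.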

After a short while of refactoring, this would be a possible solution:

package nif

import (
	"errors"
	"regexp"
	"strconv"
	"strings"
)

type Nif string

func NewNif(candidate string) (Nif, error) {
	valid := regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	if candidate[8] != controlLetterFor(candidate) {
		return "", errors.New("bad control letter")
	}

	return Nif(candidate), nil
}

func controlLetterFor(candidate string) uint8 {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	position, err := controlPosition(candidate[0:8])

	if err != nil {
		panic("Numeric part contains letters")
	}

	return controlMap[position]
}

func controlPosition(numPart string) (int, error) {
	re := strings.NewReplacer("X", "0", "Y", "1", "Z", "2")

	numeric, err := strconv.Atoi(re.Replace(numPart))

	return numeric % 23, err
}
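Since we noted that a panic is not the best way of managing an error, here is one possible variant that returns an error instead. It’s only a sketch under some assumptions: the name NewNifNoPanic is made up, reporting the conversion failure as “bad format” is our own choice, and compiling the pattern once at package level is a convenience for the example. Note also that the pattern is case-insensitive while the Replacer only knows uppercase letters, so a candidate such as x0000023T may get past the format check and reach the conversion error.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

type Nif string

// Same pattern as in the chapter, compiled once for this sketch.
var valid = regexp.MustCompile(`(?i)^[0-9XYZ]\d{7}[^UIOÑ0-9]$`)

// NewNifNoPanic is a hypothetical variant of NewNif that returns an error
// where the chapter's version panics.
func NewNifNoPanic(candidate string) (Nif, error) {
	if !valid.MatchString(candidate) {
		return "", errors.New("bad format")
	}

	expected, err := controlLetterFor(candidate[0:8])
	if err != nil {
		// Assumption: a failed conversion is reported as a format problem.
		return "", errors.New("bad format")
	}

	if candidate[8] != expected {
		return "", errors.New("bad control letter")
	}

	return Nif(candidate), nil
}

// controlLetterFor returns the expected control letter for the first
// eight characters, or the conversion error if they can't be parsed.
func controlLetterFor(numPart string) (byte, error) {
	const controlMap = "TRWAGMYFPDXBNJZSQVHLCKE"

	re := strings.NewReplacer("X", "0", "Y", "1", "Z", "2")
	numeric, err := strconv.Atoi(re.Replace(numPart))
	if err != nil {
		return 0, err
	}

	return controlMap[numeric%23], nil
}

func main() {
	for _, candidate := range []string{"00000023T", "Y0000000Z", "x0000023T", "00000000S"} {
		nif, err := NewNifNoPanic(candidate)
		fmt.Printf("%q -> %q, %v\n", candidate, nif, err)
	}
}
```

The existing table tests should be able to drive this variant as well, since the error messages are the same ones they already expect.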

What have we learned in this kata

  • Use sad paths to drive development
  • Use table tests in Go to reduce the cost of adding new tests
  • A technique to replace the returned errors with a more general one without breaking the tests
  • A technique to change the public interface of the production code without breaking the tests

References