Strings

In Go, string literals are defined using double quotes, similar to other popular programming languages in the C family:

An example of a string literal

package main

import "fmt"

func ExampleString() {
	s := "I am a string - 你好"
	fmt.Println(s)
	// Output: I am a string - 你好
}

As the example shows, Go string literals and code may also contain non-English characters, like the Chinese 你好 ³.

Appending to Strings

Strings can be appended to with the addition (+) operator:

Appending to a string

package main

import "fmt"

func ExampleAppend() {
	greeting := "Hello, my name is "
	greeting += "Inigo Montoya"
	greeting += "."
	fmt.Println(greeting)
	// Output: Hello, my name is Inigo Montoya.
}

This method of string concatenation is easy to read, and great for simple cases. But while Go does allow us to concatenate strings with the + (or +=) operator, it is not the most efficient method. It is best used only when very few strings are being added, and not in a hot code path. For a discussion on the most efficient way to do string concatenation, see the later chapter on optimization.

In most cases, the fmt.Sprintf function from the standard library is a better choice for building a string. We can rewrite the previous example like this:

Using fmt.Sprintf to build a string

package main

import "fmt"

func ExampleFmtString() {
	name := "Inigo Montoya"
	sentence := fmt.Sprintf("Hello, my name is %s.", name)
	fmt.Println(sentence)
	// Output: Hello, my name is Inigo Montoya.
}

The %s sequence is a special placeholder that tells the Sprintf function to insert a string in that position. There are also other sequences for things that are not strings, like %d for integers, %f for floating point numbers, or %v to use the default format for whatever type is passed in. These sequences allow us to add numbers and other types to a string without converting them first, something the + operator would not allow due to type conflicts. For example:

Using fmt.Printf to combine different types of variables in a string

package main

import "fmt"

func ExampleFmtComplexString() {
	name := "Inigo Montoya"
	age := 32
	weight := 76.598
	t := "Hello, my name is %s, age %d, weight %.2fkg"
	fmt.Printf(t, name, age, weight)
	// Output: Hello, my name is Inigo Montoya, age 32, weight 76.60kg
}

Note that here we used fmt.Printf to print the formatted string directly. In previous examples, we used fmt.Sprintf to first create a string variable, then fmt.Println to print it to the screen (notice the S in Sprintf, short for string). In the above example, %s is a placeholder for a string, as before, %d is a placeholder for an integer, and %.2f is a placeholder for a floating point number that should be rounded to two decimal places. These codes are analogous to the ones used by the printf and scanf functions in C, and by old-style string formatting in Python. If you are not familiar with this syntax, have a look at the documentation for the fmt package. It is both expressive and efficient, and used liberally in Go code.
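As a small illustration of the %v placeholder mentioned earlier, the sketch below formats values of several types with the same format string. The helper name describe is hypothetical and exists only for this example; %T, which prints the Go type of a value, is another placeholder from the fmt package.

```go
package main

import "fmt"

// describe is a hypothetical helper for this example. It formats any
// value with %v (the default format) followed by %T (the value's type).
func describe(v interface{}) string {
	return fmt.Sprintf("%v (%T)", v, v)
}

func main() {
	fmt.Println(describe("Inigo")) // Inigo (string)
	fmt.Println(describe(32))      // 32 (int)
	fmt.Println(describe(76.598))  // 76.598 (float64)
	fmt.Println(describe(true))    // true (bool)
}
```

Using %v is convenient when the type is not known in advance, while the more specific placeholders such as %d and %.2f give finer control over the result.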

What would happen if we tried to append an integer to a string using the plus operator?

Breaking code that tries to append an integer to a string

package main

func main() {
	s := "I am" + 32 + "years old"
}

Running this with go run, Go returns an error message during the build phase:

$ go run bad_append.go
# command-line-arguments
./bad_append.go:4: cannot convert "I am" to type int
./bad_append.go:4: invalid operation: "I am" + 32 (mismatched types string and int)

As expected, Go’s type system catches our transgression, and complains that it cannot append an integer to a string. We should rather use fmt.Sprintf for building strings that mix different types.
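One way the broken example could be repaired is sketched below. The first function uses fmt.Sprintf with a %d placeholder. The second converts the integer to a string explicitly with strconv.Itoa from the standard library, after which the + operator is allowed. Both function names are hypothetical and chosen only for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
)

// ageWithSprintf builds the sentence using a %d placeholder.
func ageWithSprintf(age int) string {
	return fmt.Sprintf("I am %d years old", age)
}

// ageWithItoa converts the integer to a string first, so that all
// operands of + have the same type.
func ageWithItoa(age int) string {
	return "I am " + strconv.Itoa(age) + " years old"
}

func main() {
	fmt.Println(ageWithSprintf(32)) // I am 32 years old
	fmt.Println(ageWithItoa(32))    // I am 32 years old
}
```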

Next we will have a look at a very useful standard library package that allows us to perform many common string manipulation tasks: the built-in strings package.

Splitting strings

The strings package is imported by simply adding import "strings", and provides us with many string manipulation functions. One of these is a function that splits a string at each occurrence of a separator and returns a slice of strings:

Splitting a string

package main

import "fmt"
import "strings"

func ExampleSplit() {
	l := strings.Split("a,b,c", ",")
	fmt.Printf("%q", l)
	// Output: ["a" "b" "c"]
}

The strings.Split function takes a string and a separator as arguments. In this case, we passed in "a,b,c" and the separator "," and received a string slice containing the separate letters a, b, and c as strings.
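Two details of strings.Split are worth seeing in a short sketch: consecutive separators produce an empty string in the result, and a string that does not contain the separator comes back as a single-element slice. The helper name splitCSV is hypothetical; strings.Join, which performs the opposite operation, is part of the strings package.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCSV is a hypothetical helper that splits s at every comma.
func splitCSV(s string) []string {
	return strings.Split(s, ",")
}

func main() {
	// Two commas in a row yield an empty string between them.
	parts := splitCSV("a,b,,c")
	fmt.Printf("%q\n", parts) // ["a" "b" "" "c"]

	// Without any separator, the whole string is the only element.
	fmt.Printf("%q\n", splitCSV("abc")) // ["abc"]

	// strings.Join puts the pieces back together.
	fmt.Println(strings.Join(parts, ",")) // a,b,,c
}
```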

Counting and finding substrings

Using the strings package, we can also count the number of non-overlapping instances of a substring in a string with the aptly-named strings.Count. The following example uses strings.Count to count occurrences of both the single letter a, and the substring ana. In both cases we pass in a string⁴. Notice that we get only one occurrence of ana, even though one may have expected it to count ana both at positions 1 and 3. This is because strings.Count returns the count of non-overlapping occurrences.

Count occurrences in a string

package main

import (
	"fmt"
	"strings"
)

func ExampleCount() {
	s := "banana"
	c1 := strings.Count(s, "a")
	c2 := strings.Count(s, "ana")
	fmt.Println(c1, c2)
	// Output: 3 1
}
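If overlapping occurrences are what we need, we have to count them ourselves. The following is a minimal sketch of one way to do so: it checks every byte position for a match with strings.HasPrefix. The function countOverlapping is hypothetical, not part of the strings package, and it treats an empty substring as having zero occurrences, which is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// countOverlapping is a hypothetical helper that counts occurrences of
// substr in s, including occurrences that overlap each other.
func countOverlapping(s, substr string) int {
	if substr == "" {
		return 0 // assumption: an empty substring never matches
	}
	count := 0
	for i := 0; i+len(substr) <= len(s); i++ {
		if strings.HasPrefix(s[i:], substr) {
			count++
		}
	}
	return count
}

func main() {
	fmt.Println(strings.Count("banana", "ana"))    // 1
	fmt.Println(countOverlapping("banana", "ana")) // 2
}
```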

If we want to know whether a string contains, starts with, or ends with some substring, we can use the strings.Contains, strings.HasPrefix, and strings.HasSuffix functions, respectively. All of these functions return a boolean:

Checking whether a string contains, starts with, or ends with a substring

package main

import (
	"fmt"
	"strings"
)

func ExampleContains() {
	str := "two gophers on honeymoon"
	if strings.Contains(str, "moon") {
		fmt.Println("Contains moon")
	}
	if strings.HasPrefix(str, "moon") {
		fmt.Println("Starts with moon")
	}
	if strings.HasSuffix(str, "moon") {
		fmt.Println("Ends with moon")
	}
	// Output: Contains moon
	// Ends with moon
}

For finding the index of a substring in a string, we can use strings.Index. Index returns the index of the first instance of substr in s, or -1 if substr is not present in s:

Using strings.Index to find substrings in a string

package main

import "fmt"
import "strings"

func ExampleIndex() {
	an := strings.Index("banana", "an")
	am := strings.Index("banana", "am")
	fmt.Println(an, am)
	// Output: 1 -1
}

The strings package also contains a corresponding LastIndex function, which returns the index of the last (i.e. right-most) instance of a matching substring, or -1 if it is not found.
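A brief sketch comparing the two functions follows. It also shows a typical use of the returned index: slicing off everything after the last separator. The helper lastPart is hypothetical and only serves this example.

```go
package main

import (
	"fmt"
	"strings"
)

// lastPart is a hypothetical helper that returns the text after the
// last occurrence of sep, or all of s if sep does not occur.
func lastPart(s, sep string) string {
	i := strings.LastIndex(s, sep)
	if i == -1 {
		return s
	}
	return s[i+len(sep):]
}

func main() {
	fmt.Println(strings.Index("banana", "an"))     // 1
	fmt.Println(strings.LastIndex("banana", "an")) // 3
	fmt.Println(lastPart("archive.tar.gz", "."))   // gz
}
```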

The strings package contains many more useful functions. To name a few: ToLower, ToUpper, Trim and Join, all performing actions that match their names. For more information on these and other functions, refer to the strings package docs. As a final example, let’s see how we might combine some of the functions in the strings package in a real program, and discover some of its more surprising functions.

Advanced string functions

The program below repeatedly takes input from the user, and declares whether the typed sentence is palindromic. By palindromic, we mean that the sentence consists of the same words when read forwards and backwards. We wish to ignore punctuation and case, and we assume the sentence is in English, so there are spaces between words. Take a look and notice how we use two new functions from the strings package, FieldsFunc and EqualFold, to keep the code clear and concise.

A program that declares whether a sentence reads the same backward and forward, word for word

 1 package main
 2 
 3 import (
 4 	"bufio"
 5 	"fmt"
 6 	"os"
 7 	"strings"
 8 	"unicode"
 9 )
10 
11 // getInput prompts the user for some text, and then
12 // reads a line of input from standard input. This line
13 // of text is then returned.
14 func getInput() string {
15 	fmt.Print("Enter a sentence: ")
16 	scanner := bufio.NewScanner(os.Stdin)
17 	scanner.Scan()
18 	return scanner.Text()
19 }
20 
21 func isNotLetter(c rune) bool {
22 	return !unicode.IsLetter(c)
23 }
24 
25 // isPalindromicSentence returns whether or not the given sentence
26 // is palindromic. To calculate this, it splits the string into words,
27 // then creates a reversed copy of the word slice. It then checks
28 // whether the reverse is equal (ignoring case) to the original.
29 // It also ignores any non-alphabetic characters.
30 func isPalindromicSentence(s string) bool {
31 	// split into words and remove non-alphabetic characters
32 	// in one operation by using FieldsFunc and passing in
33 	// isNotLetter as the function to split on.
34 	w := strings.FieldsFunc(s, isNotLetter)
35 
36 	// iterate over the words from front and back
37 	// simultaneously. If we find a word that is not the same
38 	// as the word at the matching position from the back, the sentence
39 	// is not palindromic.
40 	l := len(w)
41 	for i := 0; i < l/2; i++ {
42 		fw := w[i]     // front word
43 		bw := w[l-i-1] // back word
44 		if !strings.EqualFold(fw, bw) {
45 			return false
46 		}
47 	}
48 
49 	// all the words matched, so the sentence must be
50 	// palindromic.
51 	return true
52 }
53 
54 func main() {
55 	// Go doesn't have while loops, but we can use for loop
56 	// syntax to read into a new variable, check that it's not
57 	// empty, and read new lines on subsequent iterations.
58 	for l := getInput(); l != ""; l = getInput() {
59 		if isPalindromicSentence(l) {
60 			fmt.Println("... is palindromic!")
61 		} else {
62 			fmt.Println("... is not palindromic.")
63 		}
64 	}
65 }

Save this code to palindromes.go, and we can then run it with go run palindromes.go.

An example run of the palindrome program

$ go run palindromes.go
Enter a sentence: This is magnificent!
... is not palindromic.
Enter a sentence: This is magnificent, is this!
... is palindromic!
Enter a sentence:

As expected, when we enter a sentence that reads the same backwards and forwards, ignoring punctuation and case, we get the output ... is palindromic!. Now, let’s break down what this code is doing.

The getInput function uses a bufio.Scanner from the bufio package to read one line from standard input. scanner.Scan() scans until the end of the line, and scanner.Text() returns a string containing the input line.
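To see Scan and Text in isolation, the sketch below reads from an in-memory string instead of standard input. It assumes that strings.NewReader is an acceptable stand-in for os.Stdin, which holds because bufio.NewScanner accepts any io.Reader. The helper readLines is hypothetical.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// readLines is a hypothetical helper that returns every line in text.
// It uses a bufio.Scanner in the same way getInput does, but calls
// Scan in a loop until the input is exhausted.
func readLines(text string) []string {
	var lines []string
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() { // Scan returns false at end of input or on error
		lines = append(lines, scanner.Text())
	}
	return lines
}

func main() {
	for _, l := range readLines("first line\nsecond line\n") {
		fmt.Printf("%q\n", l)
	}
	// prints "first line" and "second line", each on its own line
}
```

Note that Text returns the line without its trailing newline, which is why an empty line of input arrives in main as the empty string.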

The meat of this program is in the isPalindromicSentence function. This function takes a string as input, and returns a boolean indicating whether the sentence is palindromic, word-for-word. We also want to ignore punctuation and case in the comparison. First, on line 34, we use strings.FieldsFunc to split the string at each Unicode code point for which the isNotLetter function returns true. In Go, you can pass around functions like any other value. A function’s type signature describes the types of its arguments and return values. Our isNotLetter function satisfies the function signature specified by FieldsFunc, which is to take a rune as input, and return a boolean. Runes are a special character type in the Go language - for now, just think of them as more or less equivalent to a single character, like char in Java.
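The idea of passing functions as values can be sketched independently of FieldsFunc. In the example below, countRunes is a hypothetical helper that accepts any function with the signature func(rune) bool; we call it with two functions from the unicode package and with an anonymous function stored in a variable.

```go
package main

import (
	"fmt"
	"unicode"
)

// countRunes is a hypothetical helper that counts the runes in s for
// which pred returns true. Any func(rune) bool can be passed in.
func countRunes(s string, pred func(rune) bool) int {
	n := 0
	for _, r := range s {
		if pred(r) {
			n++
		}
	}
	return n
}

func main() {
	s := "Go 1.21!"
	fmt.Println(countRunes(s, unicode.IsLetter)) // 2
	fmt.Println(countRunes(s, unicode.IsDigit))  // 3

	// An anonymous function assigned to a variable works just as well.
	isBang := func(r rune) bool { return r == '!' }
	fmt.Println(countRunes(s, isBang)) // 1
}
```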

In isNotLetter, we return false if the passed in rune is a letter as defined by the Unicode standard, and true otherwise. We can achieve this in a single line by using unicode.IsLetter, a function provided by the unicode package in the standard library.

Putting it all together, strings.FieldsFunc(s, isNotLetter) will return a slice of strings, split by sequences of non-letters. In other words, it will return a slice of words.
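The sketch below runs this call on its own so that the resulting slice can be inspected. The isNotLetter function is copied from the program above; the wrapper name words is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

func isNotLetter(c rune) bool {
	return !unicode.IsLetter(c)
}

// words is a hypothetical wrapper around the FieldsFunc call used in
// isPalindromicSentence.
func words(s string) []string {
	return strings.FieldsFunc(s, isNotLetter)
}

func main() {
	fmt.Printf("%q\n", words("This is magnificent, is this!"))
	// ["This" "is" "magnificent" "is" "this"]

	// An apostrophe is not a letter, so it also splits words.
	fmt.Printf("%q\n", words("Don't stop"))
	// ["Don" "t" "stop"]
}
```

The second call shows a consequence of this simple approach: contractions are split into two words. For the palindrome check this is harmless as long as both halves of the sentence are split the same way.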

Next, starting on line 40, we iterate over the slice of words. We keep an index i, which we use to create both fw, the word at index i, and bw, the matching word at index l - i - 1. If we can walk all the way through the slice without finding two words that are not equal, we have a palindromic sentence. And we can stop halfway through, because then we have already done all the necessary comparisons. The next table shows how this process works for an example sentence as i increases. As we walk through the slice, words match, and so we continue walking until we reach the middle. If we were to find a non-matching pair, we could immediately return false, because the sentence is not palindromic.

The palindromic sentence algorithm by example
       “Fall”  “leaves”  “as”  “as”  “leaves”  “fall”  EqualFold
i=0    fw                                      bw      true
i=1            fw                    bw                true
i=2                      fw    bw                      true

The equality check of strings is performed on line 44 using strings.EqualFold - this function compares two strings for equality, ignoring case.
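A quick sketch of the difference between the == operator and strings.EqualFold, using words from the table above. The wrapper name sameWord is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sameWord is a hypothetical wrapper that reports whether a and b are
// equal when case is ignored.
func sameWord(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	fmt.Println("Fall" == "fall")         // false
	fmt.Println(sameWord("Fall", "fall")) // true
	fmt.Println(sameWord("Fall", "fell")) // false
}
```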

Finally, on line 58, we make use of the semantics of the Go for loop definition. The basic for loop has three components separated by semicolons:

the init statement: executed before the first iteration
the condition expression: evaluated before every iteration
the post statement: executed at the end of every iteration

We use these three components to instantiate a variable l and read into it from standard input, to break from the loop if it is empty, and to read the next line in the post statement on each subsequent iteration.
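The same loop shape can be tried without typing anything, by replacing getInput with a function that hands out prepared lines. This is a minimal sketch under that assumption; readUntilEmpty and next are hypothetical names.

```go
package main

import "fmt"

// readUntilEmpty is a hypothetical helper that calls next repeatedly
// and collects the results until next returns an empty string.
func readUntilEmpty(next func() string) []string {
	var got []string
	// init: read the first line; condition: stop on an empty line;
	// post: read the following line.
	for l := next(); l != ""; l = next() {
		got = append(got, l)
	}
	return got
}

func main() {
	inputs := []string{"first", "second", "", "never read"}
	pos := 0
	// next stands in for getInput and returns one prepared line per call.
	next := func() string {
		l := inputs[pos]
		pos++
		return l
	}
	fmt.Printf("%q\n", readUntilEmpty(next)) // ["first" "second"]
}
```

The loop stops as soon as the empty string is returned, so the final element of inputs is never requested.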

Ranging over a string

When the functions in the strings package don’t suffice, it is also possible to range over each character in a string:

Iterating over the characters in a string

package main

import "fmt"

func ExampleIteration() {
	s := "ABC你好"
	for i, r := range s {
		fmt.Printf("%q(%d) ", r, i)
	}
	// Output: 'A'(0) 'B'(1) 'C'(2) '你'(3) '好'(6)
}

You might have noticed something peculiar about the output above. The printed indexes run 0, 1, 2, 3 and then jump to 6. Why is that? This is the topic of the next chapter, Supporting Unicode.
