Text Blocks
Text blocks were introduced in Java 13 as a preview feature and became a standard feature in Java 15. The following table summarizes their history.
| Java Release | Status | JEP |
|---|---|---|
| Java 13 | Preview | JEP 355 |
| Java 14 | Second Preview | JEP 368 |
| Java 15 | Standard | JEP 378 |
It’s a common requirement to embed multi-line string literals in Java source code. A typical example is generating JSON, YAML, or XML content from string templates. Before text blocks were available, we had to use string concatenation to write multi-line strings in Java.
The following code shows how to use string concatenation to generate XML strings. Multi-line strings like this are hard to maintain. If you want to add double quotes in the string, these double quotes need to be escaped.
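As an illustration, the sketch below builds a small XML document with string concatenation. The class name `XmlConcatenation` and the sample data are made up for this example.

```java
class XmlConcatenation {
    // Every line needs its own quotes, a "+" operator, and an explicit "\n".
    // The double quotes around the attribute value must be escaped.
    static final String USER_XML =
        "<user id=\"001\">\n" +
        "  <name>alex</name>\n" +
        "  <email>alex@example.com</email>\n" +
        "</user>\n";

    public static void main(String[] args) {
        System.out.print(USER_XML);
    }
}
```

Changing the XML structure means editing quotes, operators, and line terminators on every affected line, which is where most mistakes creep in.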
Text Blocks
Text blocks are multi-line string literals delimited by three double quotes. The following code shows two examples of text blocks. Double quotes don’t need to be escaped. Line terminators are kept.
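A minimal sketch of two text blocks, assuming Java 15 or later. The class name `TextBlockExamples` and the sample data are illustrative only.

```java
class TextBlockExamples {
    // Closing delimiter on its own line: the string ends with a line terminator.
    static final String USER_XML = """
        <user id="001">
          <name>alex</name>
          <email>alex@example.com</email>
        </user>
        """;

    // Closing delimiter right after the content: no trailing line terminator.
    static final String USER_JSON = """
        {
          "id": "001",
          "name": "alex"
        }""";

    public static void main(String[] args) {
        System.out.print(USER_XML);
        System.out.println(USER_JSON);
    }
}
```

Note that the content always starts on the line after the opening delimiter, and none of the double quotes inside the blocks are escaped.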
Text blocks are processed by the Java compiler. At run time, a string created from a text block is indistinguishable from one created from an ordinary string literal. The Java compiler processes a text block in three steps:
- Translate line terminators in the content to LF.
- Remove incidental white space surrounding the content.
- Interpret escape sequences in the content.
The first step is straightforward. The Java compiler normalizes CR and CRLF to LF.
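To make the result of the three steps concrete, the sketch below compares a text block with an equivalent traditional string literal. It assumes Java 15 or later, and the names `TextBlockProcessing`, `BLOCK`, and `LITERAL` are hypothetical.

```java
class TextBlockProcessing {
    // After processing: LF line terminators, incidental indentation removed,
    // and the \t escape sequence interpreted as a tab character.
    static final String BLOCK = """
        first
          second
        third\ttabbed
        """;

    // The same content written as a traditional string literal.
    static final String LITERAL = "first\n  second\nthird\ttabbed\n";

    public static void main(String[] args) {
        System.out.println(BLOCK.equals(LITERAL));
    }
}
```

The comparison holds even if the source file itself uses CRLF line endings, because line terminators are normalized before the string is built.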
Re-indentation
White spaces may play an important role in multi-line strings. For XML and JSON content, white spaces can improve readability. For YAML content, white spaces are significant for the data structure. When text blocks are embedded in the source code, their indentation needs to match Java source code’s format.
In the code below, the first and fifth line of the text block have an indentation of 6 spaces, while other lines have an indentation of 8 spaces. For all lines, the indentation of 6 spaces is introduced by Java source code.
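A sketch of such code, assuming Java 15 or later and a source file formatted with two-space indentation so that the text block content starts at column 6. The class and method names are illustrative.

```java
class ReIndentation {
  static String userXml() {
    // The first and fifth content lines are indented by 6 spaces,
    // the three lines in between by 8 spaces.
    return """
      <user>
        <id>001</id>
        <name>alex</name>
        <email>alex@example.com</email>
      </user>""";
  }

  public static void main(String[] args) {
    System.out.println(userXml());
  }
}
```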
The actual content should be as shown below. This means the 6 spaces introduced by Java source code should be removed. This process is called re-indentation.
<user>
  <id>001</id>
  <name>alex</name>
  <email>alex@example.com</email>
</user>
A line in a text block may contain both leading and trailing white spaces. The Java compiler removes trailing white spaces automatically. Leading white spaces may be part of the intended indentation, so the compiler applies a re-indentation algorithm that removes only the extra ones.
To generate content with the correct indentation, the Java compiler removes the same number of leading white spaces from each line. That number is the smallest indentation found among all non-blank lines, with the line holding the closing delimiter also counted when the delimiter sits on its own line. In the code above, the number of white spaces to remove is 6. After removing 6 white spaces from each line, we get the desired output.
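The idea can be sketched in a few lines of Java. This is a simplified illustration, not the compiler's actual implementation: it assumes LF line terminators and treats anything `Character.isWhitespace` accepts as white space. The class and method names are hypothetical.

```java
class MinIndentSketch {
    // Counts the leading whitespace characters of a line.
    static int leadingWhitespace(String line) {
        int count = 0;
        while (count < line.length() && Character.isWhitespace(line.charAt(count))) {
            count++;
        }
        return count;
    }

    // Simplified re-indentation: find the smallest indentation, then remove it.
    static String reindent(String text) {
        String[] lines = text.split("\n", -1);
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < lines.length; i++) {
            boolean isLast = i == lines.length - 1;
            // Blank lines are ignored, except the last line,
            // which stands for the closing delimiter's line.
            if (!lines[i].isBlank() || isLast) {
                min = Math.min(min, leadingWhitespace(lines[i]));
            }
        }
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < lines.length; i++) {
            if (i > 0) {
                result.append('\n');
            }
            // Blank lines become empty; other lines lose the common
            // indentation and any trailing white space.
            if (!lines[i].isBlank()) {
                result.append(lines[i].substring(min).stripTrailing());
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String raw = "      <user>\n        <id>001</id>\n      </user>";
        System.out.println(reindent(raw));
    }
}
```

For the sample input in `main`, the smallest indentation is 6, so `<user>` ends up at column 0 and `<id>` keeps 2 spaces.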
Special attention should be paid to the trailing blank line. The position of the closing delimiter in the trailing blank line can affect the number of white spaces to remove.
In the code below, the trailing blank line has an indentation of 6 spaces, which is the minimal number of white spaces in all lines. So only 6 white spaces will be removed. In the result content, the first line will have an indentation of 4 spaces.
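A sketch of this situation, assuming Java 15 or later and two-space source indentation: the content lines start at column 10 or deeper, while the closing delimiter sits at column 6. The class and method names are illustrative.

```java
class ClosingDelimiter {
  static String indentedXml() {
    // The closing delimiter has the smallest indentation (6 spaces),
    // so only 6 spaces are removed from each content line.
    return """
          <user>
            <id>001</id>
          </user>
      """;
  }

  public static void main(String[] args) {
    System.out.print(indentedXml());
  }
}
```

Moving the closing delimiter further left is therefore a simple way to keep some indentation in the resulting string.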
Escape Sequences
Text blocks support all of the escape sequences supported in string literals. We can use " and "" freely in a text block. However, a run of three or more consecutive double quotes would be read as the closing delimiter, so some of the quotes in such a run need to be escaped. More specifically, for a run of n double quotes, at least Math.floorDiv(n, 3) of them need to be escaped. For example, to add 7 double quotes in a text block, at least 2 double quotes need to be escaped, which can be written as ""\"""\"".
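The sketch below shows both cases, assuming Java 15 or later. The class and constant names are made up for the example.

```java
class QuotesInTextBlocks {
    // One or two adjacent double quotes need no escaping.
    static final String PLAIN = """
        She said "hi" and left an empty "" pair.
        """;

    // A run of seven double quotes: escaping every third quote is enough,
    // so Math.floorDiv(7, 3) = 2 escapes are used.
    static final String SEVEN_QUOTES = """
        ""\"""\""
        """;

    public static void main(String[] args) {
        System.out.print(PLAIN);
        System.out.print(SEVEN_QUOTES);
    }
}
```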
Two new escape sequences are added.
- \<line-terminator> explicitly suppresses the insertion of a newline character. It can only be used in text blocks.
- \s translates to a single space. It can be used in text blocks, traditional string literals, and character literals.
Sometimes we may want to put very long strings in the code. Instead of using string concatenation, we can use text blocks. However, text blocks will add line terminators when the long string is broken into multiple lines. We can use \<line-terminator> to suppress this behavior.
In the code below, the result string LONG_STRING won’t have line terminators.
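A sketch of such a constant, assuming Java 15 or later. The class name `LongString` and the placeholder sentence are illustrative.

```java
class LongString {
    // A backslash at the end of a line suppresses the line terminator,
    // so the three source lines are joined into one logical line.
    static final String LONG_STRING = """
        Lorem ipsum dolor sit amet, \
        consectetur adipiscing elit, \
        sed do eiusmod tempor.\
        """;

    public static void main(String[] args) {
        System.out.println(LONG_STRING);
    }
}
```

The space before each backslash is kept, because it is no longer the last character on its line and so does not count as trailing white space.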
The \s escape sequence has a special use in text blocks. Trailing white spaces are removed by the Java compiler. If we want to keep some trailing white spaces for formatting purposes, we can add \s to the end of the line. All white spaces before \s will be kept, while any white spaces after \s will be removed.
In the code below, all lines will have exactly 11 characters.
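A sketch of fixed-width lines built with \s, assuming Java 15 or later. The class name and the sample words are made up. Each line is padded so that the word, the spaces, and the final \s add up to 11 characters.

```java
class FixedWidthLines {
    // Without \s the padding would be stripped as trailing white space.
    // The \s itself contributes the eleventh character, a single space.
    static final String FRUITS = """
        apple     \s
        banana    \s
        watermelon\s
        """;

    public static void main(String[] args) {
        // Print each line between brackets to make the padding visible.
        FRUITS.lines().forEach(line -> System.out.println("[" + line + "]"));
    }
}
```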
New String Methods
Three new methods related to text blocks are added to String.
- stripIndent() removes incidental white spaces from a string. This method implements the same re-indentation algorithm that the Java compiler applies to text blocks.
- translateEscapes() translates escape sequences in a string.
- formatted(Object... args) formats using the current string as the format string, with the provided arguments.
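The first two methods are mostly useful for strings that do not come from a text block, for example text read from a file. The sketch below assumes Java 15 or later. The helper name `normalize` and the sample input are hypothetical.

```java
class StringMethodsDemo {
    // Applies the same two processing steps the compiler performs on a
    // text block: remove incidental indentation, then interpret escapes.
    static String normalize(String raw) {
        return raw.stripIndent().translateEscapes();
    }

    public static void main(String[] args) {
        // The input contains a literal backslash followed by 't',
        // not a tab character.
        String raw = "    name:\\talex\n      role:\\tadmin";
        System.out.println(normalize(raw));
    }
}
```

Here the common indentation of 4 spaces is removed, and each backslash-t pair is turned into a real tab character.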
The formatted method is very convenient when using text blocks to format strings.
The code below shows an example of using text blocks to format strings.
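A minimal sketch, assuming Java 15 or later. The class name, the method name `userXml`, and its parameters are illustrative.

```java
class FormattedTextBlock {
    // The text block serves as the format string; each %s placeholder
    // is replaced by the corresponding argument.
    static String userXml(String id, String name) {
        return """
            <user>
              <id>%s</id>
              <name>%s</name>
            </user>
            """.formatted(id, name);
    }

    public static void main(String[] args) {
        System.out.print(userXml("001", "alex"));
    }
}
```

Calling `formatted` directly on the text block keeps the template and its arguments together, which reads more naturally than wrapping the block in `String.format`.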