1. Infrastructure Resources Integration
Applications are seldom found in a vacuum: they interact with resources such as file systems, databases, SMTP servers, FTP servers and the like. Infrastructure integration testing refers to how application interactions with those resources can be tested and validated. While mocks or fakes can be used in place of infrastructure resources, they will not help validate the behavior of those resources themselves.
This chapter covers strategies and tools for testing resource integration in the following domains:
- System Time
- Filesystem
- Databases
- Mail servers
- FTP servers
Similar strategies can be used for other resource types. However, Web Services integration will be the subject of the next chapter given its scope.
1.1 Common resource integration testing techniques
This section describes techniques that enable and ease Integration Testing of infrastructure resources.
1.1.1 No hard-coded resource reference
Most infrastructure resources have something in common: they can be located through a unique identifier and offer a set of commands to be called. The following table displays some resources with an example identifier:
| Resource | Identifier | Example |
|---|---|---|
| Relational DataBase Management System | Java Database Connectivity URL | jdbc:mysql://localhost:3306/myapp |
| SMTP server | Domain | smtp.gmail.com |
| FTP server | URL | ftp://ftp.ietf.cnri.reston.va.us/ |
To enable testing, those URLs and domains should neither be hard-coded nor packaged into the deployed application in any way, so as to be able to change them without redeploying.
In pure Java, not hard-coding resource references is typically achieved by using Properties (http://docs.oracle.com/javase/7/docs/api/java/util/Properties.html) file(s). Properties are plain key-value pairs, formatted as key=value. This is a basic example of reading such a file:
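A minimal sketch of such a reader, accepting any java.io.Reader so it can be pointed at a file outside the deployed artifact, might look like the following (the jdbc.url key is an illustrative assumption):

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Properties;

class ConfigurationReader {

    // Loads key=value pairs from the given source and returns the value
    // mapped to the requested key, or null if the key is absent
    static String readProperty(Reader source, String key) throws IOException {
        Properties properties = new Properties();
        properties.load(source);
        return properties.getProperty(key);
    }
}
```

In production code, the Reader would typically wrap a file located outside the packaged application, so the URL can be changed without redeploying.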
1.1.2 Setting up and cleaning data
When integration-tested resources have state (e.g. filesystems, databases, JMS queues), data sometimes has to be set up. There are two different cases:
- Repository data that is to be set up and never (or very rarely) removed. For example, a list of countries
- Contextual data that provides data to be manipulated. For example, emails to test the POP3 receiving feature
Most of the time, sample data is set up before running tests and all data is cleaned after the run. But if a test fails and the resource state has been wiped clean, there is no clue as to why it failed. Thus, it is much more convenient to first clear data and then load sample data during initialization, and to do nothing during cleanup (at least nothing regarding data), so the state is kept if needed.
1.2 System Time integration
Time-dependent components are among the hardest ones to test, since time is something that cannot easily be mocked/faked/stubbed for testing purposes. Fortunately, there are a couple of answers to this:
- Manually
- Manually changing the system time is a possible option when testing manually. However, this is completely out of the question for automated testing.
- Compatible design
- In Java, time usage is based on System.currentTimeMillis(), but there is no counterpart programmatic setter for this getter. Directly using this API couples the calling code to the system clock. In order to improve testability, the call should be embedded inside an abstraction, with a default implementation using System.currentTimeMillis(). This way, the Clock abstraction can be replaced by a mock in tests.
- Joda Time
- Joda Time is a library that aims to provide a better Date class, but offers a whole API for time management (for happy Java 8 users, it is integrated into the java.time package). It also offers a wrapper around System.currentTimeMillis() out-of-the-box: by default, DateTimeUtils.currentTimeMillis() delegates to the former. However, unlike the native API, Joda Time’s can be plugged with alternative implementations. Alternative implementations include:
  - A fixed time implementation with DateTimeUtils.setCurrentMillisFixed(long fixedMillis)
  - An offset time implementation, i.e. a fixed time difference with the system time, with DateTimeUtils.setCurrentMillisOffset(long offsetMillis)

  Getting back to using the system time is achieved by calling DateTimeUtils.setCurrentMillisSystem(). If the provided alternatives are not enough, a custom MillisProvider implementation can be developed.
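The clock abstraction could be sketched as follows; the Clock, SystemClock and TokenGenerator names are assumptions for illustration, not listings from this book:

```java
// Abstraction over the system clock, so time can be controlled in tests
interface Clock {
    long currentTimeMillis();
}

// Production implementation, delegating to the real system clock
class SystemClock implements Clock {
    @Override
    public long currentTimeMillis() {
        return System.currentTimeMillis();
    }
}

// Example time-dependent component: injecting Clock makes it testable
class TokenGenerator {
    private final Clock clock;

    TokenGenerator(Clock clock) {
        this.clock = clock;
    }

    boolean isExpired(long issuedAtMillis, long ttlMillis) {
        return clock.currentTimeMillis() - issuedAtMillis > ttlMillis;
    }
}
```

In a test, the Clock parameter can simply be a lambda (or a mock) returning a fixed value, while production code passes a SystemClock instance.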
1.3 Filesystem integration
Some applications expect file(s) as input(s) and produce file(s) as output(s); these are commonly known as Batch-type applications. The goal of this section is to describe how to provide the former and to assert the existence and content of the latter.
Before Java 7, files were handled through the java.io.File class. Here is a class dedicated to reading from and writing to files using this legacy class.
Given this code, testing requires the injection of File instances. Since File depends on the file system, referencing a real file is necessary. This is possible by using the java.io.tmpdir system property, which points to the system temporary directory (it should be writable on most, if not all, systems with no special privileges). Testing code would look something like the following:
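As an illustration, consider a hypothetical legacy component copying file contents, tested against real files created in java.io.tmpdir (the FileCopier name and behavior are assumptions):

```java
import java.io.*;

// Hypothetical legacy component reading from and writing to java.io.File
class FileCopier {

    void copy(File source, File target) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(source));
             PrintWriter writer = new PrintWriter(new FileWriter(target))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.println(line);
            }
        }
    }
}

// Test-style usage: real files are created in the system temporary directory
class FileCopierTest {

    static boolean run() throws IOException {
        File tmpDir = new File(System.getProperty("java.io.tmpdir"));
        File source = File.createTempFile("input", ".txt", tmpDir);
        File target = File.createTempFile("output", ".txt", tmpDir);
        try (PrintWriter writer = new PrintWriter(new FileWriter(source))) {
            writer.println("Hello, integration testing");
        }
        new FileCopier().copy(source, target);
        try (BufferedReader reader = new BufferedReader(new FileReader(target))) {
            return "Hello, integration testing".equals(reader.readLine());
        } finally {
            source.delete();
            target.delete();
        }
    }
}
```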
Things are easier since Java 7, as the API offers the new java.nio.file.Path interface to provide an abstraction over the file system. The previous component can be replaced with this one while still keeping the same test code: just replace the File parameter with a Path, which is easily achieved with file.toPath().
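A sketch of the same idea with the Java 7 API might look like this (the PathCopier name is an assumption); note that File.toPath() bridges any remaining legacy File instances:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical Java 7 counterpart of the legacy component, based on Path
class PathCopier {

    void copy(Path source, Path target) throws IOException {
        Files.write(target, Files.readAllBytes(source));
    }
}

class PathCopierDemo {

    static boolean run() throws IOException {
        // Files.createTempFile uses the system temporary directory by default
        Path source = Files.createTempFile("input", ".txt");
        Path target = Files.createTempFile("output", ".txt");
        try {
            Files.write(source, "Hello, NIO".getBytes(StandardCharsets.UTF_8));
            new PathCopier().copy(source, target);
            return "Hello, NIO".equals(
                    new String(Files.readAllBytes(target), StandardCharsets.UTF_8));
        } finally {
            Files.deleteIfExists(source);
            Files.deleteIfExists(target);
        }
    }
}
```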
1.4 Database integration
Different Java projects use various persistence strategies (SQL vs NoSQL), different database vendors and different ways to access their persistence store(s).
In the SQL realm, RDBMS are standardized enough to provide a common abstraction layer, so there are many ways to access them from Java. From the oldest to the most recent, these are:
- Plain old Java DataBase Connectivity, JDBC
- EJB 2.x Entity Beans, CMP & BMP
- Java Data Objects (http://db.apache.org/jdo/index.html), JDO
- Java Persistence API (http://docs.oracle.com/javaee/5/tutorial/doc/bnbpz.html), JPA
- Hibernate (http://hibernate.org/orm/)
- EclipseLink (http://projects.eclipse.org/projects/rt.eclipselink)
- MyBatis (http://mybatis.github.io/mybatis-3/)
- Spring Data (http://projects.spring.io/spring-data/)
- jOOQ (http://www.jooq.org/)
In general, for NoSQL solutions, the vendor offers both the store and the connector(s) to access it. For example, MongoDB (http://www.mongodb.org/) is fully under the vendor control and consists of the persistence store itself, a command-line interface and a Java driver. For the record, there is also a third-party Jongo (http://jongo.org/) driver.
1.4.1 SQL Database integration
Regarding Database Integration Testing, a benefit of SQL standardization is that the following section applies to any database product.
1.4.1.1 Datastore environment strategies
Integration Testing with a data store cannot use a single shared instance, as is the case for deployed environments (Development, QA, Production, etc.). There is probably more than one developer, and each needs to test on their own database, without stepping on their colleagues’ toes. This requires each developer to have a dedicated database. Basically, there are 3 different strategies to address this:
- In-memory database
- In-memory databases are dedicated data stores, running in memory on the developer machine. Available in-memory databases include Apache Derby (http://docs.oracle.com/javadb/10.8.3.0/getstart/index.html) (also released by Oracle under the name “Java DB”), HSQLDB (http://hsqldb.org/) and H2 (http://www.h2database.com/). The greatest advantage of in-memory DBs is their ability to be set up (and discarded) quickly and easily, making test execution completely independent from infrastructure.

  Using an in-memory database for testing means handling the mismatch between the capabilities of the testing DB and the deployed DB(s). At a minimum, the portion of SQL syntax used should stick to the syntax supported by both, which usually limits development options.

  All the above-referenced in-memory databases also support persistent storage on the file system. Then, in case of failure, the DB state can be checked after test execution. Since this is very easy to configure (the only requirement is a change in the JDBC URL), there is no reason not to use it. For example, H2 persistent file storage can be configured with a URL such as jdbc:h2:[<path>]/<databaseName>.
- Local database
- With a local database, each developer installs a copy of the deployed RDBMS on their local development machine. This strategy guarantees 100% compatibility with the deployed platform, at the cost of always keeping the local software version synchronized with the deployed one. Moreover, software installation becomes a prerequisite, as it cannot be automated (or only very painfully) during the application build.
- Single remote database per developer
- This strategy is about creating a dedicated database per application and per developer. It mandates that the Database Administrator(s) create one database per developer on the deployed platform (in general, the deployed development environment) and that each developer use it for their tests. It is the best strategy to ensure the exact same behavior during testing and in production. On the downside, it requires either that developers have admin rights on a database or a working process involving the DBA.
Choosing one strategy over the others really is a matter of context. If you enjoy a good relationship with the DBAs in the organization and plenty of disk space is available, you should probably go for a single remote database. If the project is Open Source and anyone can get the sources and execute the tests, you should use in-memory databases.
All the aforementioned strategies require the same flexibility: the ability to configure the mapping between a developer and a URL (or a schema) during test runs. There are different ways to achieve this, but the primary one should be a properties file, as with file system integration.
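For example, such a mapping could live in a properties file resolved per developer; the file name, keys and URLs below are illustrative assumptions:

```properties
# database.properties - resolved per developer, never packaged in the artifact
# In-memory strategy:
jdbc.url=jdbc:h2:mem:myapp
# Persistent file storage strategy:
# jdbc.url=jdbc:h2:/tmp/myapp/testdb
# Single remote database per developer strategy:
# jdbc.url=jdbc:mysql://dev-db.example.com:3306/myapp_jdoe
jdbc.user=myapp
jdbc.password=secret
```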
1.4.1.2 Data management with DBUnit
From a database integration point of view, requirements are twofold:
- Put the database in a specific state during initialization
- Assert the database is in an expected state at the end of the test
For example, when testing the whole order process, initialization has to set up customers and products, and after the test has run, we check that a new order line has been written in the database.
DBUnit (http://www.dbunit.org/) is a Java framework that dates back to 2002, has seen no code-related activity since January 2013 and is based on JUnit v3.x. However, it is the only one of its kind and offers both of the features required above.
- Creating datasets
- In DBUnit, datasets can be either created “by hand” or exported from an existing database; they both mimic the database structure. They are available in two different XML file formats, standard and flat.
As can be seen, standard format is much more verbose but its Document Type Definition grammar is the same across all databases:
On one hand, this makes it reusable from test suite to test suite; on the other hand, this also makes it almost useless, as it will not catch any table or attribute mistyping in the XML. To do that, it is necessary to use flat files and create the DTD. If by chance the database schema already exists, this snippet (http://www.dbunit.org/faq.html#generatedtd) connects to the database, reads the schema and writes it in a DTD file. It can then simply be referenced by the XML.
Whether using standard or flat XML files, exporting a sample dataset from an existing database can be achieved with the following snippet: http://www.dbunit.org/faq.html#extract. In essence, it shows how to connect to the database, execute a query and write the results (together with dependent rows) to an XML file.
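For illustration, a flat-format dataset for hypothetical CUSTOMER and PRODUCT tables could look like the following (table and column names are assumptions):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Flat format: one element per row, the table name as element name
     and the columns as attributes -->
<dataset>
    <CUSTOMER ID="1" NAME="John Doe" EMAIL="john.doe@example.com"/>
    <PRODUCT ID="1" LABEL="Keyboard" PRICE="29.90"/>
    <PRODUCT ID="2" LABEL="Mouse" PRICE="14.90"/>
</dataset>
```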
- Inserting datasets
- Handling of datasets requires a DBUnit DatabaseOperation instance. The DatabaseOperation interface has a single method, execute(IDatabaseConnection, IDataSet). Implementations are found as constants of the same class. Among them, some are of particular interest:
  - DELETE deletes the dataset (and only the dataset) from the database
  - DELETE_ALL deletes all data from the database, but only from tables referenced in the dataset
  - INSERT inserts the dataset into the database
  - CLEAN_INSERT is an ordered composite operation of DELETE_ALL then INSERT

  Given that test data should be kept after test execution, it is advised to use only CLEAN_INSERT. This way, data put in reference tables (such as zip codes, countries, etc.) will not be erased.
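A sketch of loading a flat XML dataset with CLEAN_INSERT could look like this (the dataset.xml resource name is an assumption):

```java
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;

class DatasetLoader {

    // Clears the tables referenced in the dataset, then inserts its rows
    static void load(IDatabaseConnection connection) throws Exception {
        IDataSet dataSet = new FlatXmlDataSetBuilder()
                .build(DatasetLoader.class.getResourceAsStream("/dataset.xml"));
        DatabaseOperation.CLEAN_INSERT.execute(connection, dataSet);
    }
}
```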
- Asserting datasets
- Finally, when a test has run its course, the database has to be in the expected state. Doing that manually would require connecting to the database, querying tables and checking data line by line.

  DBUnit is able to handle all of that for us by:
  - Caching the connection in the IConnection implementation
  - Creating the dataset from the connection through the createDataSet() method
  - Providing an Assertion class that can compare both dataset and table contents

Imagine a full-fledged example to test the former OrderService. This would require setting up reference customers and products, passing the order, then checking that a new order line has been written. Let’s use H2 as a persistent (file-based) database, as it shows more complex setup logic.
This part displays the setup done once before all the tests in this class. The steps are the following:
- Line 8
- As for any standard direct database connection, the H2 driver is loaded in memory, so JDBC knows which driver to use in order to communicate with H2 databases.
- Lines 9-11
- Existing database files are removed so test(s) can run from a clean state.
- Lines 12-13
- The JDBC connection to the database is created.
- Line 14
- The JDBC connection is wrapped in a DBUnit connection wrapper and the latter stored as an attribute for future use.
- Lines 15-16
- DBUnit’s default compatibility is with Apache Derby. DBUnit is explicitly configured to use H2 in order to prevent potential data type mismatch(es) between Derby and H2. Other data type factories for major database vendors are available in the library (see all packages (http://dbunit.sourceforge.net/apidocs/) starting with org.dbunit.ext).
- Lines 17-25
- Database tables used in the tests are (finally) created. This is done through standard SQL DDL statements.
It is a good practice to close the connection to the database after the tests have completed.
In-memory databases are only accessible during a JVM run. In a test context, this means this step could well be omitted with no side-effects. However, closing the connection should be a habit and makes the above snippet a template reusable in other scenarios.
As in previous tests, before each test case, the setup initializes the instance under test so as to have no side-effect from state changed in previous test cases.
The test itself does the following:
- Lines 4-8
- The database is set to the expected state. In this case, an existing file is loaded to populate the database. Since there are no other test cases, this could also have been done during setup. Also, it would have been enough to use INSERT instead of CLEAN_INSERT. However, as for closing the database connection, it makes sense to support more advanced usages, such as having multiple test cases (a likely occurrence in real-life projects).
- Line 9
- The method to be tested (at last!)
- Lines 10-14
- An expected dataset is created from a provided file. Then the actual data is read from the database and compared. If both are equal, the test passes; if not, it fails.
1.4.1.3 Setting up data with DBSetup
Using DBUnit to set up data at test initialization requires tons of XML, which is:
- Time-consuming
- DBUnit XML files mimic the database table structure. Columns are repeated as XML attributes on each line, so that writing an XML file involves copy-pasting lines and changing values.

  Also, XML is declarative only, allowing neither conditionals nor loops.
- Error-prone
- XML is not compiled. At best, it can be validated against a Document Type Definition (or an XML schema). Anyone who has ever generated a DTD (http://dbunit.sourceforge.net/faq.html#generatedtd) for DBUnit knows that getting an error-free configuration working in both tests and the IDE takes time.
DBSetup (http://dbsetup.ninja-squad.com/) is an initiative aimed at correcting those drawbacks for inserting data. However, it does not provide ways to validate data after test execution.
1.4.2 NoSQL Database integration
While Integration Testing benefits from SQL standards, NoSQL offers no such standardization: each product may or may not have an associated tool dedicated to helping with Integration Testing. At the time of this writing, only MongoDB (https://www.mongodb.org/) offers such a testing framework.
1.4.2.1 MongoDB integration
MongoDB is a NoSQL document database. From a developer point of view, once MongoDB has been installed on a server, using this particular database instance looks as follows:
This snippet assumes a MongoDB instance is running on localhost and listening on port 27017. For testing purposes, this means one has:
- To make sure the MongoDB binaries are available on the system
- If not, download and extract them
- To have permissions to setup MongoDB
- To get it up and running during test startup
- To shut it down during cleanup
It is possible, but not exactly worthwhile from a ROI perspective. This process can, however, be replaced by using Fongo (https://github.com/fakemongo/fongo).
Fongo usage is very straightforward: it provides a Fongo class that offers more or less the same methods as the standard MongoClient, but the former does not share any contract with the latter. However, Fongo also provides a MockMongoClient that both inherits from MongoClient and wraps a Fongo instance.
That requires application design to allow for the Mongo client to be injected. The starting point would look something like the following:
Updating the design would involve replacing the above constructor code with this:
Nothing fancy here: simply that the lessons for achieving a testing-friendly design from Chapter 3 - Test-Friendly Design have been applied. At this point, testing becomes very easy:
Here is the explanation of the code.
- Line 19
- A new Fongo instance is created. Notice the constructor requires a String parameter: it is the instance’s name, but it plays no role whatsoever.
- Line 20
- Create the Mongo client required by the class under test at line 24, using the MockMongoClient.create() factory method.
- Lines 26-30
- Set up the database state by adding a new DBObject to the Mongo collection.
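Put together, a Fongo-backed setup for a hypothetical orders collection might be sketched as follows (database, collection and field names are assumptions):

```java
import com.github.fakemongo.Fongo;
import com.mongodb.BasicDBObject;
import com.mongodb.DB;
import com.mongodb.DBCollection;
import com.mongodb.MockMongoClient;
import com.mongodb.MongoClient;

class OrderRepositoryDemo {

    static long run() {
        // The constructor argument is just the fake instance's name
        Fongo fongo = new Fongo("unit-test-server");
        // MockMongoClient extends MongoClient, so it can be injected
        // wherever the real client is expected
        MongoClient client = MockMongoClient.create(fongo);
        DB db = client.getDB("myapp");
        DBCollection orders = db.getCollection("orders");
        // Set up the database state
        orders.insert(new BasicDBObject("item", "book").append("quantity", 1));
        return orders.count();
    }
}
```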
1.5 eMail integration
Many applications have email-sending capabilities, while some have email-receiving ones. This is generally achieved by connecting to a Simple Mail Transfer Protocol server (respectively, a POP3/IMAP server) and issuing the relevant commands.
A huge Integration Testing issue occurs when application use-cases require sending or receiving emails. As with databases, the components responsible for those features can be mocked, but this does not guarantee that they will work correctly.
Fortunately, two fake SMTP servers are available for use during tests:
- Dumbster (http://quintanasoft.com/dumbster/) is a fake SMTP server, where outbound mails are stored for later retrieval and assertion
- Greenmail (http://www.icegreen.com/greenmail/) is an SMTP, POP3 and IMAP server with SSL support, all rolled into one
Greenmail has two big advantages over Dumbster: it is feature-complete regarding emails and is available as a Maven dependency (refer to Chapter 4 - Automated testing for a refresher if needed) on repo1.
The Greenmail root class is GreenMail, which represents the email server itself. It can be started and stopped with its start() and stop() methods, respectively.
- To get the messages received by the Greenmail fake server, call the receiveMessages() method on the GreenMail instance. The previous code gets the messages from the fake server, checks the number of received messages, gets the single message and then checks the sender. Other data could also be checked (recipient, subject, body).
- To send messages to the Greenmail fake server, call the static GreenMailUtil.sendTextEmailTest() method (or its secure-port counterpart GreenMailUtil.sendTextEmailSecureTest() instead). The above code uses Greenmail to send two dummy email messages.
1.6 FTP integration
Some applications require interacting with an FTP server, either uploading or downloading files. As for other components interacting with infrastructure resources, FTP-responsible components have to be tested by using the techniques already seen above.
In this area, the MockFTPServer (http://mockftpserver.sourceforge.net/) project is a great library that offers two different capabilities depending on the entry-point class used:
- org.mockftpserver.fake.FakeFtpServer represents a fake FTP server
- org.mockftpserver.stub.StubFtpServer is an abstraction over an FTP server that can be stubbed with behavior required for testing
As with the Greenmail server, the server’s lifecycle can be managed through the start() and stop() methods.
1.6.1 Fake FTP server
FakeFtpServer is one of the two components of MockFTPServer: the Fake server is set up with a virtual filesystem, complete with accounts and permissions.
When the server is issued FTP commands, it is this filesystem that is used. The API offers two concrete filesystem implementations, one *nix-like and one Windows-like. The good thing is that, since the filesystem is virtual, either implementation can be plugged in regardless of the physical Operating System the tests are executed on; this means one should target the production filesystem.
As an example, imagine a component that gets text files from FTP servers and that needs to be tested. Here is some sample code to achieve that, using the Fake server:
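A sketch of configuring such a Fake server could look like this (the account credentials and the /dummy.txt entry are assumptions):

```java
import org.mockftpserver.fake.FakeFtpServer;
import org.mockftpserver.fake.UserAccount;
import org.mockftpserver.fake.filesystem.FileEntry;
import org.mockftpserver.fake.filesystem.FileSystem;
import org.mockftpserver.fake.filesystem.UnixFakeFileSystem;

class FakeFtpServerDemo {

    static FakeFtpServer start() {
        FakeFtpServer server = new FakeFtpServer();
        // Port 0 lets the server pick a free port, avoiding clashes
        server.setServerControlPort(0);
        // Account the component under test will log in with
        server.addUserAccount(new UserAccount("user", "password", "/"));
        // Virtual *nix-like filesystem, pre-populated with one text file
        FileSystem fileSystem = new UnixFakeFileSystem();
        fileSystem.add(new FileEntry("/dummy.txt", "Hello from the fake FTP server"));
        server.setFileSystem(fileSystem);
        server.start();
        return server;
    }
}
```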
The Fake server is good enough for typical FTP server behavior. However, it cannot simulate error behavior. For this, custom-made stubs are required.
1.6.2 Stub FTP server
In order to provide the desired behavior during test execution, it is possible to provide custom CommandHandlers that are invoked when a specific FTP command is called.
MockFTPServer provides a default command handler implementation for each FTP command, and there are many. Note that command handler classes are also used internally by the Fake server. The following diagram displays the upper levels of the CommandHandler class hierarchy.
In regard to the previous example, instead of using a Fake server with a file added at the root, one could provide a command handler that returns the file on RETR commands. Line 28 from the previous code can be replaced with the following:
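A sketch of such a stubbed replacement might look like the following (the stubbed file content is an assumption):

```java
import org.mockftpserver.stub.StubFtpServer;
import org.mockftpserver.stub.command.RetrCommandHandler;

class StubFtpServerDemo {

    static StubFtpServer configure() {
        StubFtpServer server = new StubFtpServer();
        // Port 0 lets the server pick a free port
        server.setServerControlPort(0);
        // Stub the RETR command to always return the same content,
        // whatever file path is requested
        RetrCommandHandler handler = new RetrCommandHandler();
        handler.setFileContents("Hello from the stubbed FTP server");
        server.setCommandHandler("RETR", handler);
        return server;
    }
}
```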
Notes:
- The Stub server does not allow a filesystem to be set.
- A more advanced (and expensive) setup would be to extend RetrCommandHandler and override processData() to analyze the parameters of the Command argument. If it is "/dummy.txt", then return the content; otherwise, send the appropriate error reply (500).
At this point, using the handler is as simple as calling server.setCommandHandler("RETR", new AdvancedRetrCommandHandler()).
1.7 Summary
This chapter detailed how to test integration with some important infrastructure resources, such as filesystems, system time, databases, email servers and FTP servers, as well as tools that help in doing so.
There is no common lesson here, other than being aware of the faking tool appropriate to each resource. However, in this day and age, most resources that need integrating with are web services. Those will be covered in the next chapter.