Leanpub: Publish Early, Publish Often

Going into Production

In the preceding chapters we covered a lot of ground, from Go basics to advanced corner cases that production systems often need to concern themselves with. By now, your application should be running locally, well-tested, documented and ready for its first users. Good job! It’s time to show it to the world. This is an exciting phase, but it comes with a whole new set of challenges. Your code is about to move from development to production, changing things forever. In this chapter, we present a high-level checklist of what it takes to get a modern Go application production-ready and running on the Internet, hopefully serving many satisfied users.

Continuous Integration & Continuous Deployment (CI/CD)

Continuous integration is a general term that describes the process of running tests and builds automatically whenever a team member adds new code to the codebase. Continuous integration in a production-ready Go system often also includes running miscellaneous checks and linters.

It is possible to write your own continuous integration system, but the effort is likely not worth it. There are many existing continuous integration systems out there. Some, like Travis CI, are hosted for you and are free for open source projects. Others, like Drone, are open source and free for self-hosting, but also come with enterprise versions. The following table lists some popular options that currently support Go:

CI Service	Open Source	Configuration	Free version available
Travis CI	No	.travis.yml	Yes, for open source projects
Drone CI	Yes	.drone.yml that describes Docker build steps	Yes, as well as enterprise options. Always self-hosted
Jenkins	Yes	Jenkinsfile, either declarative or scripted	Yes, if self-hosted. Paid hosted versions also exist.
CircleCI	No	.circleci/config.yml	Yes, for open source projects hosted on GitHub or Bitbucket
GitHub Actions	No	.github/workflows/actions.yml	Yes, for open source projects hosted on GitHub

In larger organizations, the decision of which system to use will likely already have been made for you. But if you’re doing this on your own or in a team without an existing CI/CD solution, you should favor simplicity so that your energy can be focused on improving the application itself. GitHub Actions and Travis CI are two good, fully managed solutions for cost-conscious small projects. Jenkins and Drone CI, in contrast, are open source options that can be run on your own infrastructure for free. These are good solutions if you are willing to put in more time to save a few dollars. The best choice will be project-dependent, and we encourage you to do your own research.

Continuous Deployment (CD) is similar to Continuous Integration in that it is an automated process that kicks off as soon as code is pushed to the central repository. It is usually, but not always, run on the same system as CI. After the CI steps complete, a CD process will automatically deploy the latest version to Staging and/or Production environments, or make images available in a central repository for later deployment, ideally with the click of a button. Engineers often have differing preferences about how frequently, and how automatically, code should be deployed. This can even become a contentious topic if not handled with care. You (and your team) will need to decide where you would like to fall on the scale between manual ad-hoc deployments over SSH and fully automated deployments as soon as code is pushed. That said, there is strong evidence that high-performing teams are consistently pushing themselves to get closer to the automated side of this scale ¹³. Intuitively, fewer manual steps in a deployment lead to fewer mistakes, less time wasted and more frequent feedback from users.

Deployment

This section is a work in progress.

Logging

This section is a work in progress.

Monitoring

The key difference between running code in development and running code in production is that in production, you are no longer an active part of the loop. During development, you might notice an error or a bug and fix it. But in production, unless you have good observability over your system as it runs and others use it, you will not know when the system encounters problems. You may not be watching the logs, or you may be asleep! Besides, who has the time to stare at server logs all day long? Instead, you want to be sure that if something goes wrong, you will be notified and able to diagnose the issue quickly. This is a critical part of a production-ready web application, and is the focus of this section.

Metrics and Dashboards

For reasons of redundancy, latency or scale, a production web application is typically deployed to multiple servers. As soon as this happens, it becomes important to have high-level visibility over what is happening on each of the servers, without needing to log into them first. Centralized logging, discussed in Logging, is one way to send logs from all your servers to a centralized location for ease of use. This can allow you to discover errors and even get the information you need to debug them. However, logs alone don’t always tell the full story. If there are no error logs, does that mean users are not experiencing any issues? Or can it be that the server stopped running? Or stopped sending logs? How much disk and memory pressure do we have on each server, and is an outage likely soon? Are users experiencing acceptable latency?

A production application should be coupled with a dashboard that can answer these questions (and many others) from a high level. Dashboards like these can be powered by logs, but typically are not. Instead, engineers turn to specialized time-series databases to collect high-level metrics across all servers. These are then displayed on a web interface where a developer acquainted with the system can tell at a glance whether anything is out of place.

Alerting

This section is a work in progress.

Up next