The DevOps 2.5 Toolkit: Monitoring, Logging, and Auto-Scaling Kubernetes

Making Resilient, Self-Adaptive, And Autonomous Kubernetes Clusters

About the Book

Kubernetes is probably the biggest project we know. It is vast, and yet many think that after a few weeks or months of reading and practice they know all there is to know about it. It's much bigger than that, and it is growing faster than most of us can follow. How far have you gotten in your Kubernetes adoption?

From my experience, there are four main phases in Kubernetes adoption.

In the first phase, we create a cluster and learn the intricacies of the Kube API and the different types of resources (e.g., Pods, Ingress, Deployments, StatefulSets, and so on). Once we are comfortable with the way Kubernetes works, we start deploying and managing our applications. By the end of this phase, we can shout "look at me, I have things running in my production Kubernetes cluster, and nothing blew up!" I explained most of this phase in The DevOps 2.3 Toolkit: Kubernetes.

The second phase is often automation. Once we are comfortable with how Kubernetes works and are running production loads, we can start automating. We often adopt some form of continuous delivery (CD) or continuous deployment (CDP). We create Pods with the tools we need, we build our software and container images, we run tests, and we deploy to production. When we're finished, most of our processes are automated, and we do not perform manual deployments to Kubernetes anymore. We can say "things are working, and I'm not even touching my keyboard." I did my best to provide some insights into CD and CDP with Kubernetes in The DevOps 2.4 Toolkit: Continuous Deployment To Kubernetes.

The third phase is, in many cases, related to monitoring, alerting, logging, and scaling. The fact that we can run (almost) anything in Kubernetes, and that it will do its best to make it fault tolerant and highly available, does not mean that our applications and clusters are bulletproof. We need to monitor the cluster, and we need alerts that will notify us of potential issues. When we do discover that there is a problem, we need to be able to query the metrics and logs of the whole system. We can fix an issue only once we know what the root cause is. In highly dynamic distributed systems like Kubernetes, that is not as easy as it looks.

Further on, we need to learn how to scale (and de-scale) everything. The number of Pods of an application should change over time to accommodate fluctuations in traffic and demand. Nodes should scale as well to fulfill the needs of our applications.

Kubernetes already has the tools that provide metrics and visibility into logs. It allows us to create auto-scaling rules. Yet, we might discover that Kubernetes alone is not enough and that we need to extend our system with additional processes and tools. This phase is the subject of this book. By the time you finish reading it, you'll be able to say that your clusters and applications are truly dynamic and resilient and that they require minimal manual involvement. We'll try to make our system self-adaptive.

I mentioned the fourth phase. That, dear reader, is everything else. The last phase is mostly about keeping up with all the other goodies Kubernetes provides. It's about following its roadmap and adapting our processes to get the benefits of each new release.

About the Author

Viktor Farcic

Viktor Farcic is a Developer Advocate at CloudBees and a member of the Docker Captains group.

His big passions are Microservices, Continuous Deployment and Test-Driven Development (TDD).

He often speaks at community gatherings and conferences.

Table of Contents

  • Preface
  • Overview
  • About the Author
  • Dedication
  • Audience
  • Requirements
  • Autoscaling Deployments and StatefulSets Based On Resource Usage
    • Creating A Cluster
    • Observing Metrics Server Data
    • Auto-Scaling Pods Based On Resource Utilization
    • To Replicas Or Not To Replicas In Deployments And StatefulSets?
    • What Now?
  • Auto-Scaling Nodes Of A Kubernetes Cluster
    • Creating A Cluster
    • Setting Up Cluster Autoscaling
    • Scaling Up The Cluster
    • The Rules Governing Nodes Scale-Up
    • Scaling Down The Cluster
    • The Rules Governing Nodes Scale-Down
    • Can We Scale Up Too Much Or De-Scale To Zero Nodes?
    • Cluster Autoscaler Compared in GKE, EKS, and AKS
    • What Now?
  • Collecting And Querying Metrics And Sending Alerts
    • Creating A Cluster
    • Choosing The Tools For Storing And Querying Metrics And Alerting
    • A Quick Introduction To Prometheus And Alertmanager
    • Which Metric Types Should We Use?
    • Alerting On Latency-Related Issues
    • Alerting On Traffic-Related Issues
    • Alerting On Error-Related Issues
    • Alerting On Saturation-Related Issues
    • Alerting On Unschedulable Or Failed Pods
    • Upgrading Old Pods
    • Measuring Containers Memory And CPU Usage
    • Comparing Actual Resource Usage With Defined Requests
    • Comparing Actual Resource Usage With Defined Limits
    • What Now?
  • Debugging Issues Discovered Through Metrics And Alerts
    • Creating A Cluster
    • Facing A Disaster
    • Using Instrumentation To Provide More Detailed Metrics
    • Using Internal Metrics To Debug Potential Issues
    • What Now?
  • Extending HorizontalPodAutoscaler With Custom Metrics
    • Creating A Cluster
    • Using HorizontalPodAutoscaler Without Metrics Adapter
    • Exploring Prometheus Adapter
    • Creating HorizontalPodAutoscaler With Custom Metrics
    • Combining Metric Server Data With Custom Metrics
    • The Complete HorizontalPodAutoscaler Flow Of Events
    • Reaching Nirvana
    • What Now?
  • Visualizing Metrics And Alerts
    • Creating A Cluster
    • Which Tools Should We Use For Dashboards?
    • Installing And Setting Up Grafana
    • Importing And Customizing Pre-Made Dashboards
    • Creating Custom Dashboards
    • Creating Semaphore Dashboards
    • A Better Dashboard For Big Screens
    • Prometheus Alerts vs. Grafana Notifications vs. Semaphores vs. Graph Alerts
    • What Now?
  • Collecting And Querying Logs
    • Creating A Cluster
    • Exploring Logs Through kubectl
    • Choosing A Centralized Logging Solution
    • Exploring Logs Collection And Shipping
    • Exploring Centralized Logging Through Papertrail
    • Combining GCP StackDriver With A GKE Cluster
    • Combining AWS CloudWatch With An EKS Cluster
    • Combining Azure Log Analytics With An AKS Cluster
    • Exploring Centralized Logging Through Elasticsearch, Fluentd, and Kibana
    • Switching To Elasticsearch For Storing Metrics
    • What Should We Expect From Centralized Logging?
    • What Now?
  • What Did We Do?
  • Contributions
