The DevOps 2.2 Toolkit: Self-Sufficient Docker Clusters

Retired

This book is no longer available for sale.

Building Self-Adaptive And Self-Healing Docker Clusters

About the Book

It seems that with each new book the scope gets fuzzier and less precise. When I started writing Test-Driven Java Development, the scope of the whole book was defined in advance. I had a team working with me. We defined the index and a short description of each chapter. From there on, we worked on a schedule, as most technical authors do.

Then I started writing the second book. The scope was more obscure. I wanted to write about DevOps practices and processes and had only a very broad idea of what the outcome would be. I knew that Docker had to be there. I knew that configuration management was a must. Microservices, centralized logging, and a few other practices and tools that I used in my projects were part of the initial scope. For that book, I had no one behind me. There was no team but me, a lot of pizzas, an unknown number of cans of Red Bull, and many sleepless nights. The result is "The DevOps 2.0 Toolkit: Automating the Continuous Deployment Pipeline with Containerized Microservices".

With the third book, the initial scope became even more obscure. I started writing without a plan. It was supposed to be about cluster management. After a couple of months of work, I attended DockerCon in Seattle, where we were presented with the new Docker Swarm Mode. My immediate reaction was to throw everything I had written into the trash and start over. I did not know what the book would be about, except that it had to be something about Docker Swarm. I was impressed with the new design. That "something about Swarm" ended up being "The DevOps 2.1 Toolkit: Docker Swarm: Building, testing, deploying, and monitoring services inside Docker Swarm clusters".

While working on it, I decided to turn the books into the DevOps Toolkit Series. I thought that it would be great to record my experiences from different experiments, and from working with various companies and open source projects. So, naturally, I started thinking about and planning the third installment in the series: "The DevOps 2.2 Toolkit".

The only problem was that, this time, I honestly did not have a clue what it would be about. One idea was to do a deep comparison of different schedulers (e.g., Docker Swarm, Kubernetes, and Mesos/Marathon). Another was to explore serverless. Even though it is a terrible name (there are servers, we just don't manage them), it is a great subject. The ideas kept coming, but there was no clear winner. So, I decided not to define the scope. Instead, I defined some general objectives.

The goal I set for myself was to build a self-adaptive and self-healing system based on Docker. When I started writing this book, I did not know how I would do that. There were different bits of practices and tools I had been using, but there was no visible light at the end of the tunnel. Instead of defining what the book would be, I defined what I wanted to accomplish. You can think of this book as my recording of the journey. I had to explore a lot. I had to adopt some new tools and write some code myself. Think of this book as "Viktor's diary while trying to do stuff."

The objectives are to go beyond a simple setup of a cluster, services, continuous deployment, and all the other things you probably already know. If you don't, read my older books.

About the Author

Viktor Farcic

Viktor Farcic is a lead rapscallion at Upbound, a member of the CNCF Ambassadors, Google Developer Experts, CDF Ambassadors, and GitHub Stars groups, and a published author.

He is the host of the YouTube channel DevOps Toolkit and a co-host of DevOps Paradox.

Table of Contents

  • Preface
  • Overview
  • Audience
  • About the Author
  • Dedication
  • Introduction to Self-Adapting and Self-Healing Systems
    • What Is A Self-Adaptive System?
    • What Is A Self-Healing System?
    • What Now?
  • Choosing A Solution For Metrics Storage And Query
    • Non-Dimensional vs. Dimensional Metrics
    • Graphite
    • InfluxDB
    • Nagios and Sensu
    • Prometheus
    • Which Tool Should We Choose?
    • What Now?
  • Deploying And Configuring Prometheus
    • Deploying Prometheus Stack
    • Designing A More Dynamic Monitoring Solution
    • Deploying Docker Flow Monitor
    • Integrating Docker Flow Monitor With Docker Flow Proxy
    • What Now?
  • Scraping Metrics
    • Creating The Cluster And Deploying Services
    • Deploying Exporters
    • Exploring Exporter Metrics
    • Querying Metrics
    • Updating Service Constraints
    • Using Memory Reservations and Limits in Prometheus
    • What Now?
  • Defining Cluster-Wide Alerts
    • Creating The Cluster And Deploying Services
    • Creating Alerts Based On Metrics
    • Defining Multiple Alerts For A Service
    • Postponing Alerts Firing
    • Defining Additional Alert Information Through Labels And Annotations
    • Using Shortcuts To Define Alerts
    • What Now?
  • Alerting Humans
    • Creating The Cluster And Deploying Services
    • Setting Up Alertmanager
    • Using Templates In Alertmanager Configuration
    • What Now?
  • Alerting The System
    • The Four Quadrants of A Dynamic And Self-Sufficient System
  • Self-Healing Applied To Services
    • Creating The Cluster And Deploying Services
    • Using Docker Swarm For Self-Healing Services
    • Is It Enough To Have Self-Healing Applied To Services?
    • What Now?
  • Self-Adaptation Applied To Services
    • Choosing The Tool For Scaling
    • Creating The Cluster And Deploying Services
    • Preparing The System For Alerts
    • Creating A Scaling Pipeline
    • Preventing The Scaling Disaster
    • Notifying Humans That Scaling Failed
    • Integrating Alertmanager With Jenkins
    • What Now?
  • Painting The Big Picture: The Self-Sufficient System Thus Far
    • Developer’s Role In The System
    • Continuous Deployment Role In The System
    • Service Configuration Role In The System
    • Proxy Role In The System
    • Metrics Role In The System
    • Alerting Role In The System
    • Scheduler Role In The System
    • Cluster Role In The System
    • What Now?
  • Instrumenting Services
    • Defining Requirements Behind Service Specific Metrics
    • Differentiating Services Based On Their Types
    • Choosing Instrumentation Type
    • Creating The Cluster And Deploying Services
    • Instrumenting Services Using Counters
    • Instrumenting Services Using Gauges
    • Instrumenting Services Using Histograms And Summaries
    • What Now?
  • Self-Adaptation Applied to Instrumented Services
    • Setting Up The Objectives
    • Creating The Cluster And Deploying Services
    • Scraping Metrics From Instrumented Services
    • Querying Metrics From Instrumented Services
    • Firing Alerts Based On Instrumented Metrics
    • Scaling Services Automatically
    • Sending Error Notifications To Slack
    • What Now?
  • Setting Up A Production Cluster
    • Creating a Docker For AWS Cluster
    • Deploying Services
    • Securing Services
    • Persisting State
    • Alternatives to CloudStor Volume Driver
    • Setting Up Centralized Logging
    • Extending The Capacity Of The Cluster
    • What Now?
  • Self-Healing Applied To Infrastructure
    • Automating Cluster Setup
    • Exploring Fault Tolerance
    • What Now?
  • Self-Adaptation Applied To Infrastructure
    • Creating A Cluster
    • Scaling Nodes Manually
    • Creating Scaling Job
    • Scaling Cluster Nodes Automatically
    • Rescheduling Services After Scaling Nodes
    • Scaling Nodes When Replica State Is Pending
    • What Now?
  • Blueprint Of A Self-Sufficient System
    • Service Tasks
    • Infrastructure Tasks
    • Logic Matters, Tools Might Vary
    • What Now?
