Kubernetes on TV: Watch How Your Services are Operating in Production

This article introduces a way to watch Kubernetes services in production through a comprehensive visualization that gives quick yet effective insight into the health of your applications. It is a multi-cluster federation and consolidation approach that leverages the native health checks of Kubernetes containers and pods. On top of that, the visualization provides advanced aggregation and simple dashboards that let you quickly assess how services are operating in each individual namespace.

Illustration of the Proposed Namespace Activity Monitoring.

What You’ll Learn

This article describes a visualization for Kubernetes that provides:

  • Quick and easy root cause analysis when problems occur.
  • Comprehensive visualization for the proactive monitoring that organizations typically need for T1/T2 helpdesk and Network Operations Center (NOC) monitoring.
  • Federated multi-cluster and multi-tenant visualization.
  • Service level indicators that enable analytics of failure trends over time.
  • Email notification when a namespace's consolidated activity status changes.

Building Consolidated Namespace Status

As illustrated in the figure below, the approach rests on the following foundations:

  • Each namespace's microservice tree is built as follows: at the bottom are containers, which are bound to their pods; pods are in turn bound to their services; and services are finally bound to the namespace, which is expected to represent a virtual application space.
  • Within the microservice tree, component statuses are propagated bottom-up. The propagation is designed to always surface unusual behaviors or situations that may indicate a potential failure. For example, consider a service backed by replicated pods. If one pod has failed, the propagation shows a problem on the service so that you investigate the underlying pods, even if other running pods still match the service's selectors (see the illustration below). Likewise, you are always warned when a service has selectors that match no pod.
Illustration of Application’s Consolidated Status.
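
To make the bottom-up propagation concrete, here is a minimal Python sketch. It is not RealOpInsight's implementation: the Status levels, the Node class and the "worst status wins" rule are assumptions chosen for illustration, and the names web, shop and orphan are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List


class Status(IntEnum):
    # Higher value = more severe. These levels are illustrative only.
    NORMAL = 0
    WARNING = 1
    CRITICAL = 2


@dataclass
class Node:
    name: str
    own_status: Status = Status.NORMAL   # status reported by the node itself
    children: List["Node"] = field(default_factory=list)
    requires_children: bool = False      # e.g. a service should match at least one pod


def consolidated(node: Node) -> Status:
    """Return the worst status found in the node's subtree (bottom-up)."""
    if node.requires_children and not node.children:
        # A service whose selectors match no pod is always flagged.
        return max(node.own_status, Status.WARNING)
    return max([node.own_status] + [consolidated(c) for c in node.children])


# One failed container among three replicated pods still surfaces
# at the service level and then at the namespace level.
pods = [
    Node("web-1", children=[Node("nginx")]),
    Node("web-2", children=[Node("nginx", Status.CRITICAL)]),
    Node("web-3", children=[Node("nginx")]),
]
web = Node("web", children=pods, requires_children=True)
orphan = Node("orphan", requires_children=True)  # selectors match no pod
namespace = Node("shop", children=[web])

print(consolidated(namespace).name)  # CRITICAL
print(consolidated(orphan).name)     # WARNING
```

Taking the maximum severity is the simplest rule that never hides a failed replica behind healthy ones. A real tool may weight or dampen statuses differently.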

Practice

It is well accepted that practice beats theory. In this part, we demonstrate how to implement the approach presented above on your Kubernetes clusters using RealOpInsight. A few steps and a few minutes are enough to add it to your Kubernetes monitoring environment.

Installing RealOpInsight

RealOpInsight is open source and available on GitHub. It is released as Docker images and also ships with Helm manifests that deploy it on Kubernetes in a couple of minutes, or even seconds.

$ docker run -d \
--name realopinsight \
--network host \
--publish 4583:4583 \
rchakode/realopinsight

Accessing RealOpInsight UI

Once the container has started (check with docker ps), you should be able to access the RealOpInsight UI at http://127.0.0.1:4583/ui.
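
If you prefer to script this check, the following Python sketch polls the UI address until it answers. The helper name wait_for_ui and its retry parameters are hypothetical, and it assumes the UI returns HTTP 200 at that URL once it is ready.

```python
import time
import urllib.request


def wait_for_ui(url, attempts=10, delay=2.0, opener=urllib.request.urlopen):
    """Return True as soon as `url` answers with HTTP 200, else False."""
    for _ in range(attempts):
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection errors and HTTP errors (URLError is an
            # OSError subclass): the container may still be starting.
            pass
        time.sleep(delay)
    return False
```

For example, wait_for_ui("http://127.0.0.1:4583/ui") returns True once the page is served, or False after the attempts are exhausted.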

RealOpInsight — Administrator Home Page

Integration with a Kubernetes Cluster

RealOpInsight requires read-only access to the Kubernetes API. The integration involves the following steps:

  • Select the menu Monitoring Sources to open the source configuration page.
  • Set the API Endpoint URL to https://kubernetes.default/ (the in-cluster API URL) and, if the cluster uses a self-signed certificate, check Don't verify SSL certificate.
  • Leave the field Auth String Token empty, so that RealOpInsight authenticates against the Kubernetes API using its RBAC service account. That service account (named realopinsight) and its RBAC permissions are created during the deployment with Helm.
  • Click Add as source and select an ID for the source when prompted.
  • Click on Apply to finish the operation.
RealOpInsight — Configuration of a Kubernetes source.
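
To show what read-only, service-account-based access to the Kubernetes API looks like in practice, here is a short Python sketch that lists namespaces from inside a pod. It is not RealOpInsight's code. The function names are hypothetical, and it assumes the pod has the standard service account mount and that its account is allowed to list namespaces.

```python
import json
import ssl
import urllib.request

# Standard location where Kubernetes mounts the pod's service account.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def build_request(api_url, token):
    """Build a GET request for the namespace list, authenticated by bearer token."""
    url = api_url.rstrip("/") + "/api/v1/namespaces"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def namespace_names(payload):
    """Extract namespace names from a NamespaceList JSON document."""
    return [item["metadata"]["name"] for item in payload.get("items", [])]


def list_namespaces(api_url="https://kubernetes.default"):
    """Call the API from inside a pod using its mounted service account."""
    with open(f"{SA_DIR}/token") as f:
        token = f.read().strip()
    # Trust the cluster CA instead of disabling certificate verification.
    context = ssl.create_default_context(cafile=f"{SA_DIR}/ca.crt")
    request = build_request(api_url, token)
    with urllib.request.urlopen(request, context=context) as response:
        return namespace_names(json.load(response))
```

Only GET requests are issued, which matches the read-only permissions described above.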

Verify the Kubernetes Source

Select the menu Manage Operations Views to check that all the Kubernetes namespaces have been discovered and imported, as in the screenshot below (list on the right side).

RealOpInsight — List of namespaces automatically discovered and imported from Kubernetes (right side)

Preparing for the Final Visualization

At this step we're almost ready to visualize our services, but we first need to assign namespaces to the user who will view them:

  • Select the menu Manage Operations Views.
  • In the user list on the left side, select the username created previously (kops for this tutorial).
  • In the namespace list on the right side, select the items the user should visualize. You can hold the Ctrl key to select multiple items. Note that when you have several users, you can assign each of them a specific set of items to visualize. This capability is typically useful for multi-tenant monitoring environments.
  • Click the Assign button to validate your choice.
  • We're done and can move on to the visualization.
RealOpInsight — Assignment of namespaces for visualization for a user

Go To Visualization

Log into RealOpInsight as the user you created previously (kops for this tutorial). After login, the user's default dashboard loads and shows a comprehensive view like the one in the screenshot below. This dashboard includes:

  • A Reports section on the left side: for each namespace, it provides a history of pod status over a selected period of time (the last 30 days by default).
  • An Open Events section on the right side: it provides a feed of the latest pod failures, regardless of the affected namespace.
RealOpInsight — Tactical operations dashboard providing overview of namespaces’ status
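
A status history like the one in the Reports section can be turned into a simple service level indicator. The Python sketch below computes the share of time spent in each status. The input shape (one status sample per day) and the status names are assumptions made for illustration, not RealOpInsight's data model.

```python
from collections import Counter


def status_share(samples):
    """Return the percentage of samples spent in each status."""
    counts = Counter(samples)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: round(100.0 * n / total, 1) for status, n in counts.items()}


# Hypothetical 30-day history for one namespace, one sample per day.
history = ["normal"] * 27 + ["critical"] * 2 + ["warning"] * 1
print(status_share(history))
```

Tracking the "normal" share across successive periods gives a rough availability trend for the namespace.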

Explore Failure Impact and Root Causes

Each namespace’s microservice tree is backed by a console that simplifies the analysis of incident impact and the identification of problems’ root causes.

RealOpInsight — Namespace’s console for easy analysis of failure impact and root causes.

Conclusion/Next Steps

In this story, we've shown a way to watch Kubernetes services in production environments operated by helpdesk and Network Operations Center (NOC) teams. We first introduced the basic concepts behind the proposed approach, then demonstrated a step-by-step implementation based on RealOpInsight.

PhD Computer Science && Cloud Specialist && Entrepreneur && Sunday open source dev (https://github.com/rchakode - https://krossboard.app).
