Kubernetes on TV: Watch How Your Services are Operating in Production
This article introduces a way to watch Kubernetes services in production through a comprehensive visualization that provides quick yet effective insight into the health of your applications. It is a multi-cluster federation and consolidation approach that leverages Kubernetes’ native health checks for containers and pods. On top of that, the proposed visualization provides advanced aggregation and simple dashboards that let you quickly assess how services are operating in each individual namespace.
The practical part of this work is based on RealOpInsight, an open source (GitHub) business activity visualization add-on that works on top of the Kubernetes API to make operations monitoring easy.
Disclaimer: I’m the author of RealOpInsight.
What You’ll Learn
This work intends to describe a visualization for Kubernetes that provides:
- Tactical dashboards with a perspective of namespace activity monitoring.
- Easy/quick analysis of root causes in case of problems.
- Comprehensive visualization for the proactive monitoring that organizations typically need for T1/T2 helpdesk and Network Operations Center (NOC) monitoring.
- Federated multi-cluster and multi-tenant visualization.
- Service level indicators to enable analytics of failure trends over time.
- Email notification when a namespace’s consolidated activity status changes.
Building Consolidated Namespace Status
As illustrated in the figure below, the foundation of the approach is that:
- All namespaces are automatically discovered and, for each of them, its components (containers, pods and services) are also automatically discovered to generate a namespace microservice tree that binds the relationship among those components. Basically the discovery of components and the relationship binding within the microservice tree fully rely on pods’ labels and services’ selectors in Kubernetes.
- The logic behind each namespace’s microservice tree is as follows: at the bottom, containers are bound to their pods; pods are in turn bound to their services; and services are finally bound to the namespace, which is expected to represent a virtual application space.
- Within the microservice tree, the status of components is propagated bottom-up. The propagation is designed to always surface abnormal behaviors or situations that suggest potential failures. For illustration, imagine a service based on replicated pods. If one pod has failed, the propagation reports a problem on the service to highlight a potential failure on the underlying pods, even if there are still running pods matching the service’s selectors (see illustration below). Another interesting point is that we’ll always be warned when there is a service whose selectors match no pod.
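The discovery and propagation rules above can be sketched in a few lines of Python. This is a minimal illustration, not RealOpInsight’s actual implementation: the status names (normal, warning, critical), their ordering, the dictionary layout, and the choice to flag a service without pods as "warning" are all assumptions made for the example.

```python
# Illustrative severity order; these names are assumptions for the sketch,
# not RealOpInsight's internal status codes.
SEVERITY = {"normal": 0, "warning": 1, "critical": 2}


def worst(statuses):
    """Return the most severe status, or None when there is nothing to aggregate."""
    return max(statuses, key=SEVERITY.__getitem__, default=None)


def selector_matches(selector, labels):
    """A service selector matches a pod when every key/value pair is in the pod's labels."""
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def pod_status(pod):
    """A pod is as unhealthy as its worst container."""
    return worst(pod["containers"].values()) or "warning"


def service_status(service, pods):
    """A service is as unhealthy as its worst matching pod."""
    matched = [p for p in pods if selector_matches(service["selector"], p["labels"])]
    if not matched:
        # Assumption: a service whose selectors match no pod is flagged.
        return "warning"
    return worst(pod_status(p) for p in matched)


def namespace_status(services, pods):
    """The namespace inherits the worst status found among its services."""
    return worst(service_status(s, pods) for s in services) or "normal"


# Hypothetical namespace content used only for this example.
pods = [
    {"name": "web-1", "labels": {"app": "web"}, "containers": {"nginx": "normal"}},
    {"name": "web-2", "labels": {"app": "web"}, "containers": {"nginx": "critical"}},
    {"name": "db-1", "labels": {"app": "db"}, "containers": {"postgres": "normal"}},
]
services = [
    {"name": "web", "selector": {"app": "web"}},
    {"name": "db", "selector": {"app": "db"}},
    {"name": "cache", "selector": {"app": "cache"}},
]

web, db, cache = (service_status(s, pods) for s in services)
overall = namespace_status(services, pods)
```

Here the web service is reported as critical although one of its two replicas is still running, and the cache service is flagged because no pod carries the label it selects. Taking the worst status at each level is what makes a single failed replica visible at the namespace level.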
Practice
Practice is better than speech. In this part we’ll use RealOpInsight to demonstrate how the approach presented above can be implemented on your Kubernetes clusters. In a few steps, and within a few minutes, you’ll see how to use it in your Kubernetes monitoring environment.
Installing RealOpInsight
RealOpInsight is open source and available on GitHub. It’s also released as Docker images and comes with Helm manifests to deploy it on Kubernetes in a couple of minutes, or even seconds.
Assuming that you’re running Docker on your local machine, the following command starts an instance of RealOpInsight. See the installation guide for a deployment on Kubernetes.
$ docker run -d \
--name realopinsight \
--network host \
--publish 4583:4583 \
rchakode/realopinsight
Accessing RealOpInsight UI
Once the container has started (check with docker ps), you should be able to access the RealOpInsight UI at http://127.0.0.1:4583/ui.
The default credentials to log in are: admin/password.
The administrator home page looks like the following screenshot.
Integration with a Kubernetes Cluster
RealOpInsight requires read-only access to the Kubernetes API, and the integration involves the following steps.
- Sign in as administrator (default credentials: admin/password).
- Select the menu Monitoring Sources to open the source configuration page.
- Set the API Endpoint URL to https://kubernetes.default/ (the in-cluster API URL, which resolves only when RealOpInsight itself runs inside the cluster) and, optionally, if the cluster uses a self-signed certificate, check Don't verify SSL certificate.
- Leave the field Auth String Token empty, meaning that RealOpInsight authenticates against the Kubernetes API using its RBAC service account. That service account (named realopinsight), along with its RBAC permissions, is created during the deployment with Helm.
- Click Add as source and select an ID for the source when prompted.
- Click Apply to finish the operation.
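To make the read-only, service-account-based access more concrete, here is a minimal sketch of how any client running in a pod can list namespaces through the Kubernetes API. It is not RealOpInsight’s code: the function names are hypothetical, and the sketch assumes the pod’s service account is allowed to list namespaces. The token and CA paths are the standard locations where Kubernetes mounts service account credentials in a pod.

```python
import json
import ssl
import urllib.request

# Standard mount points of the service account credentials inside a pod.
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"
CA_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"


def build_namespaces_request(api_url, token):
    """Build a read-only GET request that lists namespaces (hypothetical helper)."""
    url = api_url.rstrip("/") + "/api/v1/namespaces"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
        method="GET",
    )


def list_namespaces(api_url="https://kubernetes.default",
                    token_path=TOKEN_PATH, ca_path=CA_PATH):
    """Return namespace names; only works from inside a cluster pod."""
    with open(token_path) as f:
        token = f.read().strip()
    # Trust the cluster CA instead of disabling certificate verification.
    context = ssl.create_default_context(cafile=ca_path)
    request = build_namespaces_request(api_url, token)
    with urllib.request.urlopen(request, context=context) as response:
        body = json.load(response)
    return [item["metadata"]["name"] for item in body["items"]]


# Building the request needs no cluster; the token here is a placeholder.
example = build_namespaces_request("https://kubernetes.default/", "PLACEHOLDER")
```

Only GET requests are issued, which is why read-only RBAC permissions are sufficient for this kind of discovery.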
Verify the Kubernetes Source
Select the menu Manage Operations Views to check that all the namespaces within Kubernetes have been successfully discovered and imported, as in the screenshot below (list on the right side).
Additionally, by using the menu Preview you can see what each namespace’s microservice tree looks like. But that’s not what we want in the end, so let’s move forward.
Preparing for the Final Visualization
At this step we’re almost ready to visualize our services as expected, but we first need to prepare the environment:
- Select the menu New User and fill in the form to create a new user. Set the required fields and take care to set the user profile to Operator; the password should be an alphanumeric string of at least 6 characters. For this tutorial we assume that the created user is named kops.
- Then select the menu Manage Operations Views.
- In the user list on the left side, select the username created previously (kops for this tutorial).
- In the namespace list on the right side, select the items the user should visualize. You can hold the Ctrl key to select multiple items. Note that when you have several users, you can assign each of them a specific set of items to visualize. This capability is typically useful for multi-tenant monitoring environments.
- Click the Assign button to validate your choice.
- We’re done and can move on to the visualization.
Go To Visualization
Log into RealOpInsight as the user you created previously (kops for this tutorial). Upon login, the user’s default dashboard is loaded and shows a comprehensive view that looks like the screenshot below. In this dashboard we have:
- A Tactical Overview section on the left side: it provides for each namespace a tile describing the overall status propagated by the underlying microservice tree. Clicking on a tile opens the microservice tree console, which provides details on containers, pods and services. This console is further introduced in the next section.
- A Reports section, also on the left side: it provides for each namespace a history of pods’ status over a selected period of time (the last 30 days by default).
- An Open Events section on the right side: it provides a feed of the latest failures on pods, regardless of the affected namespace.
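As a rough idea of what a status history over a period can yield, the sketch below computes a simple service level indicator: the fraction of status samples that were normal within the last 30 days. The article does not describe how RealOpInsight computes its reports, so the sample format, the function name, and the sample-based formula are all assumptions.

```python
from datetime import datetime, timedelta


def availability(samples, now, window_days=30):
    """Fraction of samples inside the window whose status is 'normal'.

    samples is a list of (timestamp, status) pairs; returns None when
    no sample falls inside the window.
    """
    start = now - timedelta(days=window_days)
    recent = [status for ts, status in samples if start <= ts <= now]
    if not recent:
        return None
    return sum(status == "normal" for status in recent) / len(recent)


# Hypothetical status history for one namespace.
now = datetime(2024, 1, 31)
samples = [
    (datetime(2023, 12, 1), "critical"),  # older than 30 days, ignored
    (datetime(2024, 1, 10), "normal"),
    (datetime(2024, 1, 20), "critical"),
    (datetime(2024, 1, 25), "normal"),
    (datetime(2024, 1, 30), "normal"),
]
sli = availability(samples, now)  # 3 normal samples out of 4 in the window
```

Tracking such a value per namespace over successive windows is one way to turn the status history into a failure trend.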
Explore Failure Impact and Root Causes
Each namespace’s microservice tree is backed by a console that simplifies the analysis of incident impact and the identification of problems’ root causes.
See the screenshot below for illustration.
The console provides a Tree View (left side) and a Map (top right side) that display the microservice tree from two exploration perspectives, and a Message Panel (bottom right side) that displays status messages related to containers and pods. There is also a pie chart (bottom left side) displaying the ratio of pods according to their status: running, failed, pending, etc.
Conclusion/Next Steps
In this story, we’ve shown a way to watch Kubernetes services in production environments operated by help-desk and Network Operations Centers (NOC) teams. We first introduced the basic concepts behind the proposed approach, then demonstrated step-by-step an implementation based on RealOpInsight.
Although we’ve mentioned a multi-cluster approach, we made the demonstration with only one cluster. It’s worth noting that integrating other Kubernetes sources in RealOpInsight is just as simple as what has been described above for one source. If you have multiple Kubernetes clusters, just integrate them the same way and things should just work. You can even set up a visualization for your users with namespace items coming from different Kubernetes clusters. Besides that, you can also use RealOpInsight’s Editor to combine imported items into a federated visualization item.
Note also that you can configure RealOpInsight to send email notifications when the overall status of a namespace’s microservice tree changes from a normal to a non-normal state, and vice versa. See the menu Notification from the administrator home.
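The notification rule, alerting only when a namespace crosses the boundary between normal and non-normal, can be sketched as follows. The status names and the function are hypothetical and only illustrate the rule; they are not taken from RealOpInsight.

```python
def is_normal(status):
    """Assumption for this sketch: only the 'normal' status counts as healthy."""
    return status == "normal"


def notifiable_transitions(history):
    """Return (previous, current) pairs that cross the normal/non-normal boundary."""
    events = []
    for previous, current in zip(history, history[1:]):
        if is_normal(previous) != is_normal(current):
            events.append((previous, current))
    return events


# Hypothetical sequence of overall statuses for one namespace.
history = ["normal", "normal", "critical", "warning", "normal"]
events = notifiable_transitions(history)
```

With this history, two notifications would be sent: one when the namespace leaves the normal state and one when it recovers. The change from critical to warning stays silent because both states are non-normal.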
Folks, that’s all for this story. Enjoy and don’t hesitate to share feedback!