Friday, September 14, 2018

Kubernetes Part V - HPA, Prometheus and Metrics Server

Now that I have a cluster built, I am trying to implement some of the more advanced functions of Kubernetes, such as Scaling.

I'm familiar with how Scaling works in OpenStack with an ETSI MANO Orchestrator (OpenBaton). Now, I would like to see how Kubernetes implements this kind of concept.

A previous engineer made reference to a project in GitHub, which is a Kubernetes Horizontal Pod Autoscaler, with Prometheus custom metrics.
So, I will start with that.  The  link in GitHub to the project I am referring to here is:
https://github.com/stefanprodan/k8s-prom-hpa

This GitHub site, unlike many, has some pretty extensive documentation about metrics in Kubernetes.

This site covers the following steps:

1. Creating a Basic Metrics Server

This allows you to "do things" (e.g. scale horizontally) based on "policies" (like CPU usage, memory usage, file system usage, etc.).

With the "basic" Metrics Server, it comes with the most commonly used metrics. From what I can gather, nodes and pods are queried (polled) for the common metrics by the HPA (Horizontal Pod Autoscaler), which in turn uses a Metrics API to send the metrics to the Metrics Server. Kubelet then pulls the metrics from the Metrics Server.

2. Creating a Custom Metrics Server

For custom metrics, applications and services expose their own metrics, which Prometheus scrapes and stores. A Prometheus adapter then serves those metrics to Kubernetes through the Custom Metrics API (custom.metrics.k8s.io). From there the process works pretty much the same: the HPA queries the API to make its scaling decisions, and the same data can be pulled with kubectl.

The scaling policies are YAML files, and kubectl applies them to the default namespace if one is not explicitly specified with the "-n" option.
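For reference, a custom-metrics scaling policy looks roughly like this. If memory serves, the example app in that GitHub project (podinfo) exposes an http_requests metric, but treat the names, the file name podinfo-hpa.yaml, and the autoscaling/v2beta1 API version here as illustrative and version-dependent rather than gospel:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: podinfo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: podinfo
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      # scale out when the average http_requests rate per pod goes above 10
      metricName: http_requests
      targetAverageValue: 10

# kubectl apply -f podinfo-hpa.yaml -n default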


So in summary, I ran through these examples and was able to run the kubectl get --raw commands to pull metrics.
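For the record, the raw queries are just HTTP GETs against the aggregated metrics APIs (piping to jq is optional, purely for pretty-printing):

# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .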

What I have not done yet is run the actual load and scale tests. I will update this blog post once I have done that.

Wednesday, August 8, 2018

Kubernetes - Part IV - Kubernetes Dashboard


I did some more work on Kubernetes.

So the way Kubernetes was set up here was that SD-WAN traffic would be "routed" through Kubernetes nodes. The traffic wouldn't be controlled from Kubernetes (no control plane elements run there), nor would it be sourced or terminated on Kubernetes nodes.

In other words, Kubernetes is only being used as a traffic relay: traffic loops back through Kubernetes as though it were a cloud of its own.

I noticed the Python scripts that provision everything on the control plane element were not working, so I took the REST API client library I built for OpenStack and ported it into the project. That works like a charm now.

Now, we can spin up a control plane element, and deploy two traffic relay nodes into Kubernetes.

There is an issue with the traffic relay nodes that I'm having trouble understanding, so now I am trying to learn how to install and use the GUI administrative tools for Kubernetes.

The first thing I tried to do was install the dashboard. It installed, but wouldn't come up properly (namespace errors). I found a page discussing this issue:
https://github.com/kubernetes/dashboard/wiki/Creating-sample-user

I followed these steps to create the user and the binding and that worked successfully. Once you do this, you can generate a token, and use that token to log onto the Dashboard.

NOTE: The Dashboard will not work without running "kubectl proxy", which proxies requests from localhost to the API server. Once the proxy is running, you can put the URL in the browser and the Dashboard comes up successfully. This is all documented at the dashboard site on GitHub: https://github.com/kubernetes/dashboard
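For reference, the whole sequence boils down to something like the following. The admin-user name and the kube-system namespace are the ones the sample-user page uses, and the URL is the one the dashboard README documented at the time, so double-check both against the version you install:

# kubectl create serviceaccount admin-user -n kube-system
# kubectl create clusterrolebinding admin-user --clusterrole=cluster-admin --serviceaccount=kube-system:admin-user
# kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
# kubectl proxy

Then browse to:
http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
and paste in the token printed by the describe command.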


Wednesday, July 25, 2018

Ciena 3906mvi Service Virtualization Switch

Someone dropped one of these off at my desk today and asked me to reverse engineer it.

Completely unfamiliar and untrained on this device, I decided to go to the web first, and I downloaded a data sheet on the product.

What this beast is, is a typical switch with an optional Network Function Virtualization (NFV) server module.

There is no video port on this device, which is typical for a network appliance like this, so getting to a boot menu on it is painful, requiring a CAT5-to-serial cable with a specific pin-out.

The first thing I did was plug a CAT 5 cable from my router into various ports and then dump traffic so I could see what these ports were trying to do.

When I plugged the CAT 5 into the Console port, or the Management port, nothing happened. All I saw was my router sending an ARP request that went unanswered.

When I moved the CAT 5 into a data port labeled "port 1", I observed my router responding to a DHCP request and assigning an IP configuration. With an IP assigned, I was able to run an nmap scan on the device and saw an SSH port open.
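For anyone repeating this, the poking around amounted to nothing more exotic than the following: tcpdump to watch the ARP and DHCP chatter, and nmap to find the open ports. The interface name and the address my router handed out are placeholders for whatever your setup uses:

# tcpdump -i eth0 -n
# nmap 192.168.1.50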


From there I was able to log onto the device, which ran a slimmed-down Linux operating system and a background daemon called ONIE (Open Network Install Environment) that kept trying to contact some URLs. That is when I realized I had logged into the NFV server module.

UPDATE:
I was able to learn that I would need to load a specific binary image onto the device manually, using an ONIE utility, because there was no system set up to serve the image over the TFTP protocol that I kept seeing the ONIE daemon trying to use.
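From what I have read in the ONIE documentation, the manual install from the ONIE shell is done with onie-nos-install, pointed at wherever you have staged the image. The URL and filename below are placeholders; the actual image file comes from the vendor:

# onie-nos-install http://192.168.1.1/vendor-nos-image.bin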

Friday, July 20, 2018

Kubernetes Part III - The etcd package

In this post, I wanted to remark on a package called etcd.

Most installation documents for Kubernetes tend to abstract away the underlying dependency packages and components.

When I installed the Flannel network, I noticed that it used a package called etcd. I had no clue what this package was. I decided to look into it.

Etcd is essentially a distributed key-value store: it lets you keep configuration parameters (JSON, in this case) in a database rather than on the file system. Flannel uses it because the network configuration has to be stored under a key in etcd.

This is GOOD TO KNOW if you happen to make a typo or enter incomplete or incorrect network configuration parameters.


The link I found useful for this package is located here:

https://coreos.com/etcd/docs/latest/getting-started-with-etcd.html

In Yogesh Mehta's video, he uses a painful approach to entering his etcd parameters:

# etcdctl mkdir /kube-centos/network
# etcdctl mk /kube-centos/network/config "{ \"Network\": \"172.30.0.0/16\", \"SubnetLen\":24, \"Backend\": ( \"Type\": \"vxlan\" ) }"

This 2nd command did not work for me. I kept getting an error on the Backend portion of the JSON. (Looking at it again, the parentheses around the Backend value should be curly braces, which would explain the error.)

I found another post that made a lot more sense, where they put the JSON into an actual file and then fed that file to etcdctl using the following approach instead:
# etcdctl mk /kube-centos/network/config < networkparms.json
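The networkparms.json file in that approach is just the same parameters written as proper JSON, along these lines:

{
  "Network": "172.30.0.0/16",
  "SubnetLen": 24,
  "Backend": {
    "Type": "vxlan"
  }
}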

Another tip: if you screw up the entry, you can simply remove it by typing:
# etcdctl rm /kube-centos/network/config
At that point you can re-enter a corrected entry.



Kubernetes Part II - Installation on CentOS 7

Initially, I started to follow some instructions on installing Kubernetes that someone sent to me in an email.

I had trouble with those, so naturally I went looking for others, and proceeded to use the instructions at this link:

https://www.howtoforge.com/tutorial/centos-kubernetes-docker-cluster/

These seemed to work for the most part, but I kept noticing that all of the DNS was failing. I was convinced it was an issue with these particular instructions.

At wit's end, I finally wound up using Yogesh Mehta's instructions on creating a cluster.

https://www.youtube.com/watch?v=lTyKeNRopqY&t=82s

The YouTube process Yogesh covers is a bit different from what you see in the link I provided above. But one of the things I learned in following Yogesh's instructions was that I had inadvertently put the entries in the hosts file backwards on all three nodes. In other words, I put the hostname first, followed by the IP. This was "caught" by Yogesh's process because he has a sensible step to ping each node by hostname.

But this I can tell you: it is an easy mistake to make, and you can pull your hair out trying to understand what the issue is, because when you pull up three hosts files that all look alike and all have entries in them, it's not obvious that the order is wrong!
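For what it's worth, the correct order on all three machines is IP first, then hostname; something like this (the addresses and names here are just placeholders for your own):

192.168.1.10   kube-master
192.168.1.11   kube-node1
192.168.1.12   kube-node2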

In the end, I was able to get a Master and two Nodes (or Workers, or Minions, or whatever the catchphrase of the day is) up and running.

Now, the Master could not run "kubectl get nodes" at first. This is because Yogesh's instructions do not make it clear that the "kubectl config" commands he covers in his last step also apply to the Master (his text states that these commands are only run on the nodes, and not the Master). Once I ran these commands on the Master, it could run "kubectl get nodes" and get a proper status.
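For reference, the "kubectl config" step amounts to pointing kubectl at the API server and selecting that context. It looks roughly like this; the server address and the cluster/context names are placeholders rather than Yogesh's exact values, and they assume the old insecure API port used in this kind of manual setup:

# kubectl config set-cluster default-cluster --server=http://kube-master:8080
# kubectl config set-context default-context --cluster=default-cluster
# kubectl config use-context default-context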

NOTE: It is also worth mentioning that in this kind of manual setup the Master does not need to run kubelet, yet many instructions I saw had kubelet being installed on the master.

So far so good. I have a working Master and two Nodes (workers)....

Kubernetes - Part I - Getting Started

After finishing up my last project, I was asked to reverse engineer a bunch of work a departing developer had done on Kubernetes.

Immediately I found Kubernetes a bit tricky because, unlike OpenStack, which has extremely thorough documentation, the Kubernetes documentation is all over the web, in bits and pieces. And most of what I found was "how to" recipes that didn't explain any of the big-picture concepts that one normally wants to see before they start 'cooking the recipes'.

So, it took me a while to get focused and make some progress (admittedly, I had some other distractions going on at the time or I might have done this considerably faster). But slowly, I found some good information, and I'll share some of that here.

First, Architecture...

Rather than refer you to a web link, I am going to refer you to a YouTube link. I have found YouTube to be an increasingly valuable place to learn things. Rather than "read", you can kick back on the couch with a beverage of choice and let some stuff sink in visually. This can be difficult if you're not following along on a laptop or keyboard, but there's definite merit to "seeing" things, or seeing someone do something, a la classroom training.

So after watching a number of different YouTube videos on Kubernetes, I settled on a couple from a gentleman named Yogesh Mehta. I found THESE videos allowed me to get the thorough understanding of the architecture I needed, and I even got a Kubernetes cluster up and running (I did have to fix a few things, which I will comment on).

So the first link is:
https://www.youtube.com/watch?v=o8SpqqKJtFw&t=289s

And this link is entitled:
What is Kubernetes? And what are its Key Components?
 Thanks Yogesh...for asking this fundamental first question and making a video about it.

Next, Building a Cluster....

This link can be found at:
https://www.youtube.com/watch?v=lTyKeNRopqY&t=82s

Here, Yogesh takes you through the process of setting up a very simple cluster, with the following elements:
  • Master 
  • Node 1
  • Node 2
One of the things that Yogesh "brushes over" in this video is the fact that he installs a "Flannel" network. I found myself asking, "What the hell is a Flannel network?"

After all, I'm a Networks guy, and I knew that there had to be some underlying networking in this thing. The Networking is actually one of the most complex and sophisticated aspects of OpenStack (see my posts on this blog regarding Neutron, ML2, OpenvSwitch, et al).

It took me a while, but I finally found a site that lists all of the Kubernetes network plugins, with a description of the distinctions between them.

https://chrislovecnm.com/kubernetes/cni/choosing-a-cni-provider/

It turns out that Flannel seems to be the "default" networking plugin or Container Network Interface (CNI) on just about every setup that I looked at. I wasn't exactly sure why this was the case, how the networking worked, what Flannel brought to the table in terms of features, etc. Now - after reading this site, I was able to get that education.
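As an aside, on kubeadm-style clusters Flannel is usually deployed as a DaemonSet from a single manifest rather than configured by hand; something like the following (check the Flannel GitHub repo for the current manifest location, since the URL moves around):

# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml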


Tuesday, July 3, 2018

Introduction to SCAP for Hardening Linux Systems


This could be a long post...I could write a book about this based on what I have been doing recently.

Let's start with the concept of "System Hardening" on Linux.

Most people are familiar with SELinux (Security-Enhanced Linux). The NSA came up with it originally, and it is now part of mainstream Linux distributions (CentOS, Ubuntu, RHEL, et al). It is centered around policy files that are written and loaded, and these essentially govern what an application is and is not allowed to do.

Above and beyond this, some additional frameworks such as SCAP have been established.

https://en.wikipedia.org/wiki/Security_Content_Automation_Protocol

The open source implementation of this protocol is OpenSCAP

https://www.open-scap.org/

So what is SCAP about? It is a set of markup-language-driven (XML-based) policies and checks that bring a system up to a target compliance level for security and hardening.

There are two primary packages you need to run this:
1. scap-security-guide
2. scap-workbench

The first package pulls in some dependency packages, so it is best to use yum or another repository-based install method, or you will be installing a bunch of RPMs in succession. The scap-security-guide package provides the security policies themselves: the profiles and rules that get interpreted, and the content you can author, compile, or customize.

The second package is a graphical front end for working with that content. Everything you can do with the GUI can be done on the command line, but the GUI does add some value.

For instance:

  • You get some nice graphs of compliance percentages. 
  • You get a listing of policy rules and descriptions
  • You can run or test policies in various manners (Dry Run, Full Mode)
Keep in mind that many hardened systems can't or don't run graphical support (i.e. X Window Clients or Servers). The GUI allows you to probe a system remotely using SSH from a workhorse system that does indeed have the GUI installed!
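As a concrete example, a command-line scan against the CentOS 7 content looks something like this. The datastream path and profile ID are the ones scap-security-guide typically installs on CentOS 7; verify them on your own system with the info command first:

# oscap info /usr/share/xml/scap/ssg/content/ssg-centos7-ds.xml
# oscap xccdf eval --profile xccdf_org.ssgproject.content_profile_standard --results results.xml --report report.html /usr/share/xml/scap/ssg/content/ssg-centos7-ds.xml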

With this stated, let's say that you have 100 Linux systems, and you have a policy. For instance, let's say the policy is for a CentOS7 Minimal Install system and that policy has 211 rules.

Let's assume that a baseline CentOS7 system is 80% compliant with these rules out of the box after installation, but that you, as an administrator, bring it up to 90% compliance with some post-installation steps.

Then, as you deploy this baseline image onto those servers, you check periodically and find that the compliance level keeps dropping due to system administrators installing packages or making other changes.

You can run a "Remediate" feature in SCAP that can pull those systems back into compliance with the baseline. The Remediate feature allows bash, Ansible, or Puppet scripts to be run.
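The command-line form of that remediation run is just the same scan with an extra flag (again, verify the profile ID and datastream path against your own installation):

# oscap xccdf eval --remediate --profile xccdf_org.ssgproject.content_profile_standard --results remediation-results.xml /usr/share/xml/scap/ssg/content/ssg-centos7-ds.xml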

NOTE: In my initial testing, using bash, this did not work. But I have not played with it much.

In order to run the baselines, it is best to install these scap packages FIRST on your systems, and take your initial baseline. Then, periodically after additional changes to the system, you can run it again and compare the results to the baseline.

Okay - this is my first introduction to this topic.
