Friday, September 14, 2018
Kubernetes - Firewall Rules Causing the Kubernetes API Server to Crash
We hired a guy here who knows a lot about Docker and Apache Mesos. He also has some Kubernetes expertise (a lot more than I have).
I was showing him an "annoying phenomenon" in which I would repeatedly get "Connection Refused" errors printing in a loop in syslog, on port 6443 (the Kubernetes api-server port).
We did a TON of debugging on this, and I'm STILL not sure we have pinpointed the issue, but I think it has "something" to do with FirewallD and iptables.
What we wound up doing that SEEMS to have fixed the issue, is this:
1. Build a BRAND SPANKING NEW CentOS 7 Virtual Machine (from ISO)
2. Reinstall Packages from Scratch
3. Install a set of Firewall Rules
It turns out that the firewall rules in this architecture are rather complex. Docker puts in a set of Firewall Rules, Kubernetes puts in its own set of rules, and then on top of that there are some rules I see being added that are *not* added by default.
For the Master:
port 6443/tcp
port 2379-2380/tcp
port 10250/tcp
port 10251/tcp
port 10252/tcp
port 10255/tcp
For the Worker Nodes:
port 10250/tcp
port 10255/tcp
port 30000-32767/tcp
port 6783/tcp
Getting familiar with what uses which ports and why is an important part of understanding this kind of technology. 6443 is obviously the api-server. As for the others: 2379-2380 is etcd, 10250 is the kubelet API, 10251 is the scheduler, 10252 is the controller-manager, 10255 is the read-only kubelet port, 30000-32767 is the NodePort service range, and 6783 is used by the Weave Net CNI.
Now in FirewallD, you can NOT put these rules in the direct.xml file. I did that, thinking that was the way to go, and they did not work (I have not debugged why). I had to put each rule in with:
firewall-cmd --permanent --add-port=XXXX/tcp (and then do a firewall-cmd --reload at the end so they apply).
Putting the rules in this way puts the rules into the default zone, which is public with FirewallD. I would imagine if you monkeyed around with your zones, you could easily break these rules and they wouldn't work anymore. So Firewalling with this technology is nothing to take lightly.
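For reference, the full sequence on the Master looks like the sketch below (the port list comes from the rules above; trim it down accordingly for the worker nodes, and run it as root):

```shell
# Open the Kubernetes master ports in the default (public) zone.
for port in 6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 10255/tcp; do
    firewall-cmd --permanent --add-port=${port}
done

# Rules added with --permanent only take effect after a reload.
firewall-cmd --reload

# Verify what is now open in the active zone.
firewall-cmd --list-ports
```

The loop is just shorthand for running firewall-cmd once per port, exactly as described above.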
Kubernetes Part V - HPA, Prometheus and Metrics Server
Now that I have a cluster built, I am trying to implement some of the more advanced functions of Kubernetes, such as Scaling.
I'm familiar with how Scaling works in OpenStack with an ETSI MANO Orchestrator (OpenBaton). Now, I would like to see how Kubernetes implements this kind of concept.
A previous engineer made reference to a project in GitHub, which is a Kubernetes Horizontal Pod Autoscaler, with Prometheus custom metrics.
So, I will start with that. The link in GitHub to the project I am referring to here is:
https://github.com/stefanprodan/k8s-prom-hpa
This GitHub site, unlike many, has some pretty extensive documentation about metrics in Kubernetes.
What this site covers is the following steps:
1. Create a Metrics Server (Basic)
This allows you to take actions (e.g., scale horizontally) based on "policies" driven by metrics like CPU usage, memory usage, and file system usage.
The "basic" Metrics Server comes with the most commonly used metrics. From what I can gather, the kubelet on each node exposes resource metrics for the node and its pods; the Metrics Server polls the kubelets, aggregates the results, and serves them through the Metrics API. The HPA (Horizontal Pod Autoscaler) then queries that API to make its scaling decisions.
2. Create a Custom Metrics Server
For Custom Metrics, applications and services expose their own metrics, Prometheus scrapes and stores them, and an adapter then serves them through the Custom Metrics API. From there the process works pretty much the same: the HPA polls these metrics, and they are also available through the kubectl command.
The scaling policies are YAML files, and they are applied to the default namespace unless one is explicitly specified with the "-n" option.
So in summary, I ran through these examples and was able to run the kubectl get --raw commands to pull metrics.
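The raw queries look like this (the first two paths are the Metrics API served by the Metrics Server; the third is the Custom Metrics API served by the Prometheus adapter):

```shell
# Resource metrics for every node and pod, via the Metrics Server:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"

# Discover what the custom metrics adapter is exposing:
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
```

The output is JSON, so piping through something like `python -m json.tool` makes it much easier to read.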
What I have not done yet is run the actual load and scale tests. I will update this blog post once I have done that.
Wednesday, August 8, 2018
Kubernetes - Part IV - Kubernetes Dashboard
I did some more work on Kubernetes.
So the way Kubernetes was set up here, SD-WAN traffic would be "routed" through Kubernetes nodes. The traffic wouldn't be controlled from Kubernetes (no control plane elements run there), nor would it be sourced or terminated on Kubernetes nodes.
In other words, Kubernetes is only being used as a traffic relay, such that traffic loops back through Kubernetes as though Kubernetes were a cloud of its own.
I noticed the Python scripts to provision everything on the control plane element were not working, so I took my REST API Client library that I built for OpenStack, and ported that into the project and that works like a charm now.
Now, we can spin up a control plane element, and deploy two traffic relay nodes into Kubernetes.
There is an issue with the traffic relay nodes, and I'm having trouble understanding it. So now I am trying to learn how to install and use the GUI administrative tools for Kubernetes.
The first thing I tried to do is install the dashboard. It installed, but wouldn't come up properly (namespace errors). I found a website discussing this issue:
https://github.com/kubernetes/dashboard/wiki/Creating-sample-user
I followed these steps to create the user and the binding and that worked successfully. Once you do this, you can generate a token, and use that token to log onto the Dashboard.
NOTE: The Dashboard will not work without running "kubectl proxy" which redirects to localhost. Once you run the proxy you can put the url in the browser and it comes up successfully. This can all be found documented at the dashboard website in GitHub. https://github.com/kubernetes/dashboard
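The user-and-binding steps boil down to something like the following sketch (the "admin-user" name follows the sample-user wiki page; the namespace and proxy URL match the Dashboard versions of that era, so double-check them against your install):

```shell
# Create a service account and grant it cluster-admin rights.
kubectl -n kube-system create serviceaccount admin-user
kubectl create clusterrolebinding admin-user \
    --clusterrole=cluster-admin \
    --serviceaccount=kube-system:admin-user

# Print the bearer token to paste into the Dashboard login screen.
kubectl -n kube-system describe secret \
    $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')

# The Dashboard is only reachable through the API server proxy:
kubectl proxy
```

With the proxy running, the Dashboard URL is under http://localhost:8001/ as documented on the dashboard GitHub site.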
Wednesday, July 25, 2018
Ciena 3906mvi Service Virtualization Switch
Someone dropped one of these off at my desk today and asked me to reverse engineer it.
Completely unfamiliar and untrained on this device, I decided to go to the web first, and I downloaded a data sheet on the product.
This beast is a typical switch with an optional Network Function Virtualization (NFV) server module.
There is no video port on this device, as one might expect from a network device like this. So getting to a boot menu on it is painful, requiring a CAT5-to-serial cable with a specific pin-out.
The first thing I did, was to plug a CAT 5 cable from my router into various ports, and then dump traffic so I could see what these ports were trying to do.
When I plugged the CAT 5 into the Console port, or the Management port, nothing happened. All I saw was my router sending an ARP request that went unanswered.
When I moved the CAT 5 into a data port labeled "port 1", I observed my router responding to a DHCP request and assigning an IP configuration. With an IP in hand, I was able to run an nmap scan on the device, and I saw an ssh port open.
From there I was able to log onto the device, which ran a slimmed-down Linux operating system and a background daemon called ONIE (Open Network Install Environment) that kept trying to contact some URLs. That's when I realized I had logged into the NFV server module.
UPDATE:
I was able to learn that I would need to load a specific binary image onto the device manually, using an ONIE utility, because there was no server set up to deliver the image over TFTP, which I kept seeing the ONIE daemon trying to use.
[Image: Ciena 3906mvi Service Virtualization Switch]
Friday, July 20, 2018
Kubernetes Part III - The etcd package
In this post, I wanted to remark on a package called etcd.
Most installation documents for Kubernetes tend to abstract away the underlying dependency packages and components.
When I installed the Flannel network, I noticed that it used a package called etcd. I had no clue what this package was. I decided to look into it.
Etcd is a distributed key-value store: it essentially allows you to keep configuration parameters (often JSON) in a database, as opposed to storing them on the file system. Flannel uses it because its network configuration needs to be stored in etcd, where every node can read it.
This is GOOD TO KNOW, if you happen to make a typo, or enter incomplete or incorrect network configuration parameters.
The link I found useful for this package is located here:
https://coreos.com/etcd/docs/latest/getting-started-with-etcd.html
In Yogesh Mehta's video, he uses a rather painful approach to entering the etcd parameters:
# etcdctl mkdir /kube-centos/network
# etcdctl mk /kube-centos/network/config "{ \"Network\": \"172.30.0.0/16\", \"SubnetLen\":24, \"Backend\": ( \"Type\": \"vxlan\" ) }"
This 2nd command did not work for me; I kept getting an error on the Backend portion of the JSON. (Looking at it now, the Backend value is wrapped in parentheses instead of curly braces, which is invalid JSON.)
I found another post that made a lot more sense: they crafted the JSON into an actual file, and then fed that file to etcdctl using the following approach instead:
# etcdctl mk /kube-centos/network/config < networkparms.json
Another tip is that if you screw up the entry, you can simply remove the old entry by typing:
# etcdctl rm /kube-centos/network/config
At this point you can re-enter a new corrective entry.
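Putting it all together, the file-based approach looks like this. The file name comes from the post above; the JSON is the same configuration as the one-liner, with the Backend braces fixed, and I'm assuming a python3 binary is available for the sanity check:

```shell
# Write the Flannel network configuration to a file, where quoting
# mistakes are far easier to spot than in an escaped shell one-liner.
cat > networkparms.json <<'EOF'
{
  "Network": "172.30.0.0/16",
  "SubnetLen": 24,
  "Backend": {
    "Type": "vxlan"
  }
}
EOF

# Sanity-check the JSON before handing it to etcdctl; this fails
# loudly on errors like the parentheses problem above.
python3 -m json.tool networkparms.json
```

Once the file validates, it is loaded with `etcdctl mk /kube-centos/network/config < networkparms.json` exactly as shown above.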
Kubernetes Part II - Installation on CentOS 7
Initially, I started to follow some instructions on installing Kubernetes that someone sent to me in an email.
I had trouble with those, so naturally I went looking for others, and then proceeded to use these at this link:
https://www.howtoforge.com/tutorial/centos-kubernetes-docker-cluster/
These seemed to work for the most part, but I kept noticing that all of the DNS was failing. I was convinced it was an issue with these particular instructions.
At wit's end, I finally wound up using Yogesh Mehta's instructions on creating a cluster.
https://www.youtube.com/watch?v=lTyKeNRopqY&t=82s
The process Yogesh covers on youtube is a bit different from what you see in the howtoforge link above. But one of the things I learned in following his instructions was that I had inadvertently put the entries in the hosts file backwards on all three nodes; in other words, I put the hostname first, followed by the IP. This was "caught" with Yogesh's process because he has a sensible step to ping each node by hostname.
But this I can tell you - this is an easy mistake to make, and you can pull your hair out trying to understand what the issue is, because when you pull up three hosts files that all look alike and have entries in them, it's not obvious that the order is wrong!
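For anyone following along, the entries should look like this on all three machines, IP address first. (The addresses and hostnames here are hypothetical placeholders; use your own.)

```
# /etc/hosts -- identical on the Master and both Nodes
192.168.1.10   k8s-master
192.168.1.11   k8s-node1
192.168.1.12   k8s-node2
```

A quick `ping k8s-node1` by hostname from each machine confirms the entries, which is exactly the step that caught my mistake.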
In the end, I was able to get a Master and two Nodes (or Workers, or Minions, or whatever the catchphrase of the day is) up and running.
At first, the Master could not run "kubectl get nodes". This is because Yogesh's instructions do not make it clear that the "kubectl config" commands he covers in his last step also apply to the Master (his text states that these commands are only run on the nodes, not the Master). Once I ran these commands on the Master, it could run "kubectl get nodes" and get a proper status.
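The commands in question follow the standard kubectl config pattern, something like the sketch below. The server address and the cluster/context names here are hypothetical, and Yogesh's exact invocations may differ; a kubeadm-based install would instead copy /etc/kubernetes/admin.conf to ~/.kube/config.

```shell
# Point kubectl at the cluster's API server and select that context.
kubectl config set-cluster default-cluster --server=http://k8s-master:8080
kubectl config set-context default-context --cluster=default-cluster
kubectl config use-context default-context

# The Master should now report itself and both workers:
kubectl get nodes
```

Run this on every machine where you want kubectl to work, including the Master.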
NOTE: It is worth mentioning also that the Master does NOT need to run kubelet; yet many instructions I saw had kubelet being installed on the master.
So far so good. I have a working Master and two Nodes (workers)....
Kubernetes - Part I - Getting Started
After finishing up my last project, I was asked to reverse engineer a bunch of work a departing developer had done on Kubernetes.
Immediately I found Kubernetes a bit tricky because, unlike OpenStack, which has extremely thorough documentation, the Kubernetes documentation is scattered all over the web in bits and pieces. And most of what I found was "how to" recipes that didn't explain any of the big-picture concepts one normally wants to see before 'cooking the recipes'.
So, it took me a while to get focused and make some progress (admittedly, I had some other distractions going on at the time or I might have done this considerably faster). But slowly, I found some good information, and I'll share some of that here.
First, Architecture...
Rather than refer you to a web link, I am going to refer you to a youtube link. I have found youtube to be an increasingly valuable place to learn things. Rather than "read", you can kick back on the couch with a beverage of choice, and let some stuff sink in visually. This can be difficult if you're not following along on a laptop or keyboard, but there's some definite merit to "seeing" things visually, or seeing someone do something visually, a la classroom training.
So after watching a number of different youtube videos on Kubernetes, I settled on a couple from a gentleman named Yogesh Mehta. I found THESE videos allowed me to get the thorough understanding of the architecture I needed, and even got a Kubernetes cluster up and running (I did have to fix a few things, which I will comment on).
So the first link is:
https://www.youtube.com/watch?v=o8SpqqKJtFw&t=289s
And this link is entitled:
What is Kubernetes? And what are its Key Components?
Thanks Yogesh...for asking this fundamental first question and making a video about it.
Next, Building a Cluster....
This link can be found at:
https://www.youtube.com/watch?v=lTyKeNRopqY&t=82s
Here, Yogesh takes you through the process of setting up a very simple cluster, with the following elements:
- Master
- Node 1
- Node 2
Next, Networking....
Then I wanted to understand the networking. After all, I'm a Networks guy, and I knew that there had to be some underlying networking in this thing. The Networking is actually one of the most complex and sophisticated aspects of OpenStack (see my posts on this blog regarding Neutron, ML2, OpenvSwitch, et al).
It took me a while, but I finally found a site that lists all of the Kubernetes network plugins, with a description of the distinctions between them.
https://chrislovecnm.com/kubernetes/cni/choosing-a-cni-provider/
It turns out that Flannel seems to be the "default" networking plugin or Container Network Interface (CNI) on just about every setup that I looked at. I wasn't exactly sure why this was the case, how the networking worked, what Flannel brought to the table in terms of features, etc. Now - after reading this site, I was able to get that education.