Friday, September 21, 2018

Can ETSI-MANO Architecture Framework work with Kubernetes?


This rather simple question has been difficult to answer.

The ONLY reference on the web about this is a PDF I found that discusses using "Containerized" OpenStack with OPNFV.

What is not clear to me is whether this solution uses the "standards-based" descriptors, such as Network Service Descriptors (NSDs).

I finally went out on the OpenBaton board on Gitter and asked about this. Makes sense to hear what they have to say (if they respond) before we invest time going down that road.

Ideally, Kubernetes would be a "VIM", I think, rather than an OpenStack "VIM".

But I'm not sure. I need some help on this one.

Kubernetes Networking - A More In-Depth look

I see a lot of people using Flannel and Weave-Net for their Kubernetes networking implementations.

I came across a reasonable attempt to explain the distinctions between them at this blog here:
https://chrislovecnm.com/kubernetes/cni/choosing-a-cni-provider/

I think there were about ten or twelve listed there, but Flannel and Weave-Net are the two most prevalent ones.

Flannel currently has more Git activity, but Weave-Net apparently offers more in the way of robustness and features, while Flannel has simplicity going for it.

There is no shortage of good blogs out there on how these work, but one link I came across had some nice packet flow diagrams, and those aren't easy to do, so I will include them here for future reference (for me or anyone else who consults this blog).

Here is Part I:
https://medium.com/@ApsOps/an-illustrated-guide-to-kubernetes-networking-part-1-d1ede3322727

In Part I, the packet flow described is independent of which particular Kubernetes network implementation you use. In other words, this flow is "Kubernetes-centric". It deals with how pods communicate with each other on a single node, and how they communicate with each other across nodes.

One of the main aspects is that all nodes in a Kubernetes cluster get a routing table that is updated with the pod CIDRs.
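If you want to see this for yourself on a running cluster, a quick sketch is to dump each node's pod CIDR and then look at a node's routing table; you should see routes covering those pod CIDRs:

# kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.podCIDR}{"\n"}{end}'
# ip route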

NOTE: This does not address pods going out of Kubernetes and back into Kubernetes. Something I need to look into.

and Part II:
https://medium.com/@ApsOps/an-illustrated-guide-to-kubernetes-networking-part-2-13fdc6c4e24c

In Part II, he shows how a Flannel overlay network "bolts on" to the networking implementation in Part I above. Flannel uses a "flannel0" interface that essentially encapsulates and tunnels packets to the respective pods. A daemon, flanneld, consults the cluster (etcd, or the Kubernetes API) for the subnet-to-node mapping it uses when it adds the outer source and destination IP addresses needed to deliver packets to the right pods.
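If you want to poke at this on a node running Flannel, a couple of things to look at (a sketch; depending on the backend, the interface may show up as flannel0 for the UDP backend or flannel.1 for VXLAN, and the subnet.env file is only there if flanneld wrote it):

# ip -d addr show flannel0
# cat /run/flannel/subnet.env

The subnet.env file shows the pod subnet lease that flanneld obtained for the local node.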

Docker: Understanding Docker Images and Containers


So this week has been focused on understanding how Docker works.

First, I learned about Docker images - how to create images, etc.

There is a pretty good blog that can be used to get going on this topic, which can be found here:
https://osric.com/chris/accidental-developer/2017/08/running-centos-in-a-docker-container/

Docker images are actually created in Layers. So you generally start off by pulling in a Docker image for CentOS, and then running it.

This is done as follows:
# docker pull centos
# docker run centos
# docker image ls

Note: If you have only pulled the image, "docker container ls" won't show anything, because no container has been created from it yet. Once you run the image, a container is created, and THEN "docker container ls" will show it (add the -a flag to also see containers that have exited).
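As an aside on the "Layers" point above, you can see the layers that make up an image with docker history:

# docker history centos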

# docker run -it centos

Once you run the image, you are now "in" the container, and you get a new prompt whose hostname is the container ID, as shown below:

[root@4f0b435cbdb6 /]#

Now you can make changes to this container as you see fit: yum install packages, or copy things into the running container using the "docker cp" command.
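For example, from another shell on the host, a "docker cp" into the running container looks something like this (using the container ID from the prompt above and a made-up file name):

# docker cp ./myscript.sh 4f0b435cbdb6:/root/myscript.sh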

Once you get a container the way you want it, you can exit that container, and then use the container ID (don't lose it!) to "commit" it as a new image.
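A rough sketch of the commit step, using the container ID from the prompt above and the image name and tag that show up later in this post:

# docker commit 4f0b435cbdb6 centos-revised-k8s:10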

Once committed, you need to push it to a registry.

Registries are another topic. If you use Docker Hub (https://hub.docker.com), you need to create an account, and you can create a public repository or a private repository. If you use a private one, you need to authenticate to use it. If you use a public one, anyone can see, take or use whatever you upload.

JFrog Artifactory is another artifact repository that can be used.

Ultimately, what we wound up doing is creating a new local registry in a container, using the following command:
# docker run -d -p 5000:5000 --restart=always --name registry registry:2

Then, you can push your newly saved container images to this registry by using a command such as:
# docker push kubernetes-master:5000/centos-revised-k8s:10

So essentially we did the push to a specific host (kubernetes-master) on port 5000, and gave it the image name and a new tag.
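In other words, the full sequence is to tag the committed image with the registry host and port, and then push it (a sketch, reusing the names from above):

# docker tag centos-revised-k8s:10 kubernetes-master:5000/centos-revised-k8s:10
# docker push kubernetes-master:5000/centos-revised-k8s:10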


Friday, September 14, 2018

Jumping into Kubernetes without understanding Docker and Containers


Clearly, it's best to understand Containers and Docker before you jump right into the deep water of Kubernetes.

I went in and started following cookbooks and how-to guides on Kubernetes without "stepping back" and trying to learn Docker first, and this has bitten me, forcing me to go back now and learn a bit about Docker.

As it turns out, the Containers I was launching with Kubernetes kept failing with a CrashLoopBackOff error.

It took me a while to learn how to debug this effectively in Kubernetes. I finally learned how to use kubectl to look at the logs, show the container information and events, and so forth. I eventually came to realize that the container we were pulling from jFrog was running a python script that was failing because someone had hard-coded IP addresses into it that weren't in use.
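For reference, these are the kinds of kubectl commands that turned out to be useful for this (the pod name here is hypothetical):

# kubectl get pods -o wide
# kubectl describe pod my-failing-pod
# kubectl logs my-failing-pod --previous

The --previous flag is handy with CrashLoopBackOff, because it shows the logs of the last container instance that crashed.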

I decided to build a new container image to fix this problem. Going into jFrog is where I learned that Docker images are built in "Layers", and that I would have to build the new image from scratch.

Doing all of this with no Docker knowledge means that I am essentially starting from the ground up with Docker 101.

Thankfully, we have this new guy we hired who is teaching me Docker (and some Kubernetes), which is cool. It's a good two-way exchange: I teach him about Networking, SD-WAN, and OpenStack/OpenBaton, and he teaches me about Docker and Kubernetes.

Kubernetes - Firewall Rules Causing the Kubernetes API Server to Crash


We hired a guy in here who knows a lot about Docker and Apache Mesos. He also has some Kubernetes expertise (a lot more expertise than I have).

I was showing him an "annoying phenomenon" in which I would repeatedly get "Connection Refused" errors printing in a loop in syslog, on port 6443 (which is the port the Kubernetes api-server listens on).

We did a TON of debugging on this, and I'm STILL not clear we have pinpointed this issue, but I think the issue has "something" to do with FirewallD and iptables.
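For anyone chasing a similar problem, checks along these lines (a sketch, not the exact steps we ran) help narrow down whether the api-server is actually listening and whether the firewall is in the way:

# systemctl status kubelet
# ss -tlnp | grep 6443
# curl -k https://localhost:6443/healthz
# firewall-cmd --list-all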

What we wound up doing that SEEMS to have fixed the issue, is this:

1. Build a BRAND SPANKING NEW CentOS 7 Virtual Machine (from ISO)

2. Reinstall Packages from Scratch

3. Install a set of Firewall Rules


It turns out that the firewall rules in this architecture are rather complex. Docker puts in a set of Firewall Rules, Kubernetes puts in its own set of rules, and then on top of that there are some rules I see being added that are *not* added by default.

For the Master:
port 6443/tcp
port 2379-2380/tcp
port 10250/tcp
port 10251/tcp
port 10252/tcp
port 10255/tcp

For the Worker Nodes:
port 10250/tcp
port 10255/tcp
port 30000-32767/tcp
port 6783/tcp

Getting familiar with what uses what ports and why is an important part of understanding this kind of technology. 6443 is obviously the api-server. The other ports, honestly, I need to look up and get a better understanding of.

Now in FirewallD, you can NOT put these rules in the direct.xml file. I did that, thinking that was the way to go, and they did not work (I have not debugged why). I had to put each rule in with:

# firewall-cmd --permanent --add-port=XXXX/tcp

(and then do a "firewall-cmd --reload" at the end so they apply).
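So on the master, for example, the full sequence looks something like this (using the ports listed above):

# firewall-cmd --permanent --add-port=6443/tcp
# firewall-cmd --permanent --add-port=2379-2380/tcp
# firewall-cmd --permanent --add-port=10250/tcp
# firewall-cmd --permanent --add-port=10251/tcp
# firewall-cmd --permanent --add-port=10252/tcp
# firewall-cmd --permanent --add-port=10255/tcp
# firewall-cmd --reload
# firewall-cmd --list-ports

The last command just verifies that the ports actually show up in the (default) zone.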

Putting the rules in this way puts them into the default zone, which is "public" with FirewallD. I would imagine that if you monkeyed around with your zones, you could easily break these rules so they no longer apply. So firewalling with this technology is nothing to take lightly.

Kubernetes Part V - HPA, Prometheus and Metrics Server

Now that I have a cluster built, I am trying to implement some of the more advanced functions of Kubernetes, such as Scaling.

I'm familiar with how Scaling works in OpenStack with an ETSI MANO Orchestrator (OpenBaton). Now, I would like to see how Kubernetes implements this kind of concept.

A previous engineer made reference to a project on GitHub: a Kubernetes Horizontal Pod Autoscaler with Prometheus custom metrics.
So, I will start with that. The link to the GitHub project I am referring to is:
https://github.com/stefanprodan/k8s-prom-hpa

This GitHub site, unlike many, has some pretty extensive documentation about metrics in Kubernetes.

What this site covers is the following steps:

1. Create a Metrics Server (Basic)

This allows you to "do things" (i.e. Scale Horizontally) based on "policies" (like CPU Usage, Memory Usage, File System Usage, etc).

With the "basic" Metrics Server, it comes with the most commonly used metrics. From what I can gather, nodes and pods are queried (polled) for the common metrics by the HPA (Horizontal Pod Autoscaler), which in turn uses a Metrics API to send the metrics to the Metrics Server. Kubelet then pulls the metrics from the Metrics Server.

2. Create a Custom Metrics Server

For Custom Metrics, applications and services (the pods) expose their own metrics, Prometheus scrapes them into its database, and an adapter makes them available through a Custom Metrics API. From there the process works pretty much the same: the HPA polls these metrics via the API, and they are also visible through the kubectl command.

The scaling policies are YAML files, and they get applied to the default namespace if one is not explicitly specified with the "-n" option.
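As a rough sketch (the file name here is hypothetical, not necessarily what the GitHub project uses), applying and checking a scaling policy looks like this:

# kubectl apply -f podinfo-hpa.yaml -n default
# kubectl get hpa -n default
# kubectl describe hpa podinfo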


So in summary, I ran through these examples and was able to run the kubectl get --raw commands to pull metrics.
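For reference, the raw queries look like this; the first hits the resource Metrics API and the second hits the Custom Metrics API (the jq pipe is optional, just for readability):

# kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes" | jq .
# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq .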

What I have not done yet is run the actual load and scale tests. I will update this blog post once I have done that.
