It's been a while since I have posted on here. What have I been up to?
Interestingly enough, I have had to reverse-engineer a Kubernetes project. I was initially involved with this, but got pulled off of it, and the project had grown immensely in its layers, complexity and sophistication in my absence. The chief developer on it left, so I had to work with a colleague to try and get the solution working, deployable and tested.
Right off the bat, the issue was related to Kubernetes Networking. That was easy to see.
The project uses Multus to create multi-homed pods (pods with multiple network interface adaptors).
By default, a Kubernetes pod only gets a single network interface (i.e. eth0). If you need two interfaces or more, there is a project called Multus (Intel sponsors this) that accommodates this requirement.
Multus is not a simple thing to understand. Think about it. You have Linux running on baremetal hosts. You have KVM virtual machines running on those hosts (virtualized networking). You have Kubernetes, and its Container Network Interface (CNI) plugins that supply a networking fabric amongst pods (Flannel, Weave, Calico, et al). And now, on top of that, you have Multus.
Multus is not a network fabric itself. Strictly speaking it is a CNI plugin, but a "meta-plugin": it does not "replace" Flannel or Weave, but instead inserts itself between Kubernetes and Flannel or Weave, much like a proxy or a broker would, and hands the actual work off to them.
This article here has some good diagrams and exhibits that show this:
https://neuvector.com/network-security/advanced-kubernetes-networking/
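To picture the broker role described above, here is a toy model in Python. It is not Multus's real code or API. The function names (`flannel`, `macvlan`, `meta_plugin`) are made up for illustration. The only detail borrowed from Multus is that the default network shows up as eth0 and additional attachments are named net1, net2, and so on.

```python
# Toy model only: this is NOT Multus's real code or API, just the shape of the idea.

def flannel(pod, ifname):
    """Hypothetical stand-in for the cluster's default network plugin."""
    return f"{pod}/{ifname} via flannel"


def macvlan(pod, ifname):
    """Hypothetical stand-in for a secondary macvlan attachment."""
    return f"{pod}/{ifname} via macvlan"


def meta_plugin(pod, default_delegate, extra_delegates):
    """Call the default plugin for eth0, then each extra plugin for net1, net2, ..."""
    interfaces = [default_delegate(pod, "eth0")]
    for index, delegate in enumerate(extra_delegates, start=1):
        interfaces.append(delegate(pod, f"net{index}"))
    return interfaces


if __name__ == "__main__":
    for line in meta_plugin("demo-pod", flannel, [macvlan]):
        print(line)
```

The point of the sketch is only that the meta-plugin does no networking of its own. It decides which plugin to call for which interface and collects the results.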
[ I am skipping some important information about the Multus Daemonset here - and how that all works. But may come back to it. ]
One issue we ran into was that we had two macvlan (Layer 2) configurations.
One used static, host-local address assignment:
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-conf
spec:
  config: '{
      "cniVersion": "0.3.1",
      "type": "macvlan",
      "master": "eth1",
      "mode": "bridge",
      "ipam": {
        "type": "host-local",
        "subnet": "10.10.20.0/24",
        "rangeStart": "10.10.20.1",
        "rangeEnd": "10.10.20.254",
        "routes": [
          { "dst": "0.0.0.0/0" }
        ],
        "gateway": ""
      }
    }'
while the other used DHCP:
{
  "cniVersion": "0.3.1",
  "name": "macvlan-conf-2",
  "type": "macvlan",
  "master": "eth1",
  "mode": "bridge",
  "ipam": {
    "type": "dhcp",
    "routes": [ { "dst": "192.168.0.0/16", "gw": "192.168.0.1" } ]
  },
  "dns": {
    "nameservers": [ "4.4.4.4", "8.8.8.8" ]
  }
}
The DHCP directive is interesting, because it will NOT work unless you have ANOTHER CNI plugin called cni-dhcp deployed into Kubernetes, so that it is installed and running on each Kubernetes node that receives pods using this configuration. This took me a WHILE to understand. I didn't even know the plugin existed.
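A quick way to catch this earlier would have been to scan the configs and flag which ones rely on DHCP. Below is a minimal sketch, assuming the CNI config JSON has already been pulled out of the NetworkAttachmentDefinition. The two embedded configs are trimmed copies of the ones in this post, and `check_ipam` is a hypothetical helper of my own, not part of Multus or any CNI tooling.

```python
import ipaddress
import json

# Trimmed copies of the two CNI configs from this post (illustrative only).
HOST_LOCAL_CONF = """
{
  "cniVersion": "0.3.1",
  "type": "macvlan",
  "master": "eth1",
  "mode": "bridge",
  "ipam": {
    "type": "host-local",
    "subnet": "10.10.20.0/24",
    "rangeStart": "10.10.20.1",
    "rangeEnd": "10.10.20.254"
  }
}
"""

DHCP_CONF = """
{
  "cniVersion": "0.3.1",
  "name": "macvlan-conf-2",
  "type": "macvlan",
  "master": "eth1",
  "mode": "bridge",
  "ipam": { "type": "dhcp" }
}
"""


def check_ipam(conf_text):
    """Return a list of human-readable notes about a CNI config's IPAM section."""
    conf = json.loads(conf_text)
    ipam = conf.get("ipam", {})
    ipam_type = ipam.get("type")
    notes = []
    if ipam_type == "dhcp":
        # The dhcp IPAM type depends on a DHCP daemon running on the node itself.
        notes.append("dhcp IPAM: needs the DHCP daemon running on every node")
    elif ipam_type == "host-local":
        subnet = ipaddress.ip_network(ipam["subnet"])
        for key in ("rangeStart", "rangeEnd"):
            if key in ipam and ipaddress.ip_address(ipam[key]) not in subnet:
                notes.append(f"{key} {ipam[key]} is outside {subnet}")
        if not notes:
            notes.append(f"host-local IPAM: range fits inside {subnet}")
    else:
        notes.append(f"unrecognized or missing IPAM type: {ipam_type!r}")
    return notes


if __name__ == "__main__":
    for name, text in (("macvlan-conf", HOST_LOCAL_CONF), ("macvlan-conf-2", DHCP_CONF)):
        for note in check_ipam(text):
            print(f"{name}: {note}")
```

This does not talk to the cluster at all. It only reads the JSON, so it tells you which attachments depend on the DHCP daemon, not whether that daemon is actually healthy.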
We were running into an issue where the DHCP Multus pods (those that used this macvlan-conf-2) were stuck in an Initializing state. After considerable debugging, I figured out the issue was with DHCP.
Once I realized the plugin existed, I knew the issue had to be either with the plugin (which requests leases) or with the upstream DHCP server (which responds). In the end, it turned out that the upstream DHCP server was returning routes that the dhcp plugin could not handle. By removing these routes, and letting the upstream DHCP server just worry about IP assignment, the pods came up successfully.
Thursday, April 18, 2019
Monday, March 4, 2019
Artificial Intelligence and Deep Learning - Tensorflow
This is a tool that someone told me about, which could be a good way to get started hands-on with AI, should the spirit move you to do so.
Tensorflow
FPGA
FPGA stands for Field Programmable Gate Array.
Per the Wikipedia definition, an FPGA is "an integrated circuit designed to be configured by a customer or a designer after manufacturing – hence the term field-programmable".
https://en.wikipedia.org/wiki/Field-programmable_gate_array
Wednesday, February 13, 2019
Hairpin NAT
A lot of folks don't understand Hairpin NAT, meaning what it is, why it exists, or the specific use cases in which it applies.
This is an awesome site that explains it nicely - although you have to read the very very last paragraph to get to the bottom of it:
Hairpin NAT Explained
Friday, February 1, 2019
NOSQL databases - are we taking a step backwards?
One of the solutions I am looking at happens to be utilizing Cassandra, a NOSQL database project from the Apache Software Foundation.
I am pretty deep with SQL databases, but not so much with NOSQL databases. I may have done a couple remark-based blogs on the topic of NOSQL databases in the past, but really have not looked into them in any kind of depth.
However, in noticing a java process running and realizing it was Cassandra, I went to the Cassandra website and started to take a closer look. When I went to the site and clicked:
- Documentation
- Architecture
- Overview
So, if I want more introductory information, I will probably have to blog surf.
But, I did find this very interesting Quora page, entitled: What are the pros and cons of the Cassandra database? It can be found at this link: What-are-the-pros-and-cons-of-using-the-Cassandra-database?
This reminds me of the old Object Oriented database days, when products like Versant hit the scene. Speedy databases that made it easy to get your data IN, but when it came to getting it OUT, it was an absolute nightmare.
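Aggregate functions (SUM, AVG, etc.) were missing for a long time and only arrived in later Cassandra releases. There are no table joins, and filtering is restricted to what a table's keys and indexes support. It uses a query syntax called CQL (Cassandra Query Language) that looks somewhat like SQL, but will result in confusion because it does not naturally support ANSI-SQL concepts.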
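To make the "getting it OUT" pain concrete: when the database will not join tables for you, the join (and often the totals) ends up in application code. Here is a minimal sketch in plain Python. It does not use any Cassandra driver, and the `customers` and `orders` rows are invented sample data standing in for the results of two separate queries.

```python
from collections import defaultdict

# Hypothetical rows, as they might come back from two separate queries.
customers = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 250.0},
    {"order_id": 11, "customer_id": 1, "amount": 100.0},
    {"order_id": 12, "customer_id": 2, "amount": 75.0},
]


def total_by_customer(customers, orders):
    """Join orders to customers and sum the amounts, all in application code."""
    names = {row["customer_id"]: row["name"] for row in customers}
    totals = defaultdict(float)
    for order in orders:
        totals[names[order["customer_id"]]] += order["amount"]
    return dict(totals)


if __name__ == "__main__":
    print(total_by_customer(customers, orders))
```

In a SQL database this would be a single statement with a JOIN and a GROUP BY. Here the application has to fetch both result sets and stitch them together itself.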
Makes me wonder. Are we taking a big step backwards with these kinds of databases becoming so pervasive?
Friday, November 9, 2018
There are other container platforms besides Docker? Like LXC?
I'm relatively new to container technology. I didn't even realize there were alternatives to Docker (although I hadn't really thought about it).
A colleague of mine knew this, though, and sent me this interesting link.
https://robin.io/blog/linux-containers-comparison-lxc-docker/
This link is a discussion about a more powerful container platform called LXC, which could be used as an alternative to Docker.
I'm still in the process of learning about it. Will update the blog later.
Wednesday, October 31, 2018
Data Plane Development Kit (DPDK)
I kept noticing that a lot of the carrier OEMs are implementing their "own" Virtual Switches.
I wasn't really sure why, and decided to look into the matter. After all, there is a fast-performing OpenVSwitch, which while fairly complex, is powerful, flexible, and, well, open source.
As it turns out, there is actually a faster way to do networking than with native OpenVSwitch.
OpenVSwitch tries to minimize the context switching between user space and kernel space when it takes packets from a physical port and forwards them to virtualized network functions (VNFs) and back.
But DPDK provides a means to circumvent the kernel altogether, and have practically everything run in user space, interacting directly with the hardware.
This is fast, indeed, if you can do it. But it bypasses all of the purposes of a kernel network stack, so there has to be some sacrifice (which I need to look into and understand better). Based on some limited reading, one of the ways it bypasses the kernel is through Direct Memory Access (DMA). Frankly, digesting this material usually requires several reads and a bit of concentration, as this stuff gets very complex very fast.
The other question I have is this: if DPDK is bypassing the kernel en route to a physical NIC, what about other kernel-based networking services that are using that same NIC? How does that work?
I've got questions. More questions.
But up to now, I was unaware of this DPDK and its role in the new generation of virtual switches coming out. Even OpenVSwitch itself has a DPDK version.