Wednesday, January 1, 2020

Cloudify

I have been doing some work with Cloudify.

First, someone gave me access to an instance. With any new tool, I always try to see if I can figure it out intuitively before spending up-front time reading about it.

Not the case with Cloudify.

I had to take some steps to "get into" Cloudify, and I will recap some of those.

1. I went to YouTube, and watched a couple of videos.

This was somewhat valuable, but I felt this was "hands-on" technology. I knew I would need to install this in my home lab to get proficient with it; that was clear from watching the videos.

2. I logged onto a Cloudify instance, and looked through the UI

I saw the Blueprints, but couldn't read any of the meta information. Finally I figured out that if I switched browsers, I could scroll down and see the descriptors.

3. Reading up on TOSCA - and Cloudify TOSCA specifically

In examining the descriptors, I realized they were Greek to me, and had to take a step back and read and learn. So I first started reading up on some of the TOSCA standards, and standards like these are tedious and, frankly, quite boring after a while. But - as a result of doing this, I started to realize that Cloudify has extended the TOSCA descriptors. So, there is a degree of proprietary extension in Cloudify. From reading a few blogs, Cloudify "sorta kinda" follows the ETSI MANO standards, but in extending (and probably changing) some of the TOSCA YAML descriptors, they are going to create some vendor lock-in. They tout this as "value add" and "innovation", of course. Hey - that is how people try to make money with standards.

4. Finally, I decided to stick my toe in the water

I logged onto Cloudify Manager, and decided I would try the openstack-example-network.

It wouldn't upload, so I had to look into why. We had the v3.x version of the OpenStack Plugin, which requires a compat.xml file that was not loaded. In figuring this out, I realized we probably shouldn't even be using that version of the plugin since the plugin is supported on version 5.x of Cloudify Manager, and we were running version 4.6.

So, I uninstalled version 3.x of the OpenStack plugin, tried to upload the example blueprint again, and voilà, success. I stopped there, because I wanted to see if I could create my own blueprint.

5. Created my own Blueprint

Our initial use case was not to deploy services per se, but to onboard new customers onto an OpenStack platform. So, I looked at the palette in Cloudify Composer for the OpenStack Plugin v2.14.7, which allows you to create all kinds of OpenStack objects. I decided to put a User and a Project on the canvas. I used some web documentation to learn about secrets (which were already created by a Cloudify Consultant who set up the system), and used those to configure the openstack_config items on the project and user. I then configured the user and project sections.
  1. I saved the blueprint,
  2. validated the blueprint (no complaints from Composer on that),
  3. and then uploaded the blueprint to Cloudify Manager.

6. I then attempted to Deploy the Blueprint

This seemed to work, but I did not see a new project or user on the system. I saw a bunch of console messages on the Cloudify Manager GUI, but didn't really see any errors.

It is worth noting that I don't see any examples on the web of people trying to "onboard" an OpenStack tenant. Just about all examples are people instantiating some kind of VM on an already-configured OpenStack platform (tenants, users, projects, et al already created).

7. Joined the Cloudify Slack Community

At this point, I signed up for the Cloudify Slack Community, and I am seeking some assistance there in figuring out why my little blueprint did not seem to execute on the target system.

...Meanwhile, I spun up a new thread, and did some additional things:

8. Installed the Cloudify qcow2 image

If you try to do this, it directs you to the Cloudify Sales page. But there is a link to get the back versions, and I downloaded version 4.4 of the qcow2 image. 

NOTE: I did not launch this in OpenStack. It surprised me that this was what they seemed to want you to do, because most Orchestrators I have seen operate from outside the OpenStack domain (as a VM outside of OpenStack).

This qcow2 is a CentOS7 image, and I could not find a password to get into the operating system image itself (i.e. as root). What they ask you to do instead is just hit the IP address from a browser using http (not https!), and see if you get a GUI for Cloudify Manager (I did). Then use your default login. I did log in successfully, and that is as far as I have gotten for now.

9. Installed the CLI

The CLI is an rpm, and it installed successfully. So I plan to configure it and use it to learn the CLI commands and interact with Cloudify Manager.

So, let's see what I learn to get to the next steps. More on this later.

SDN-NFV Certified from Metro Ethernet Forum

It has been a while since I have blogged any updates, so I'll knock out a few!

First, I just completed the course and certification from Metro Ethernet Forum for SDN-NFV.

This was a three-day course, and it was surprisingly hands-on as it focused heavily on OpenFlow and OpenDaylight. I had always wanted to learn more about these, so I found this quite rewarding.

One interesting stumbling block in the labs was the fact that there is a -O option that needs to be used to specify the proper version of OpenFlow. 

The course seemed to focus on the use case and context of using OpenFlow (and OpenDaylight) to configure switches - but not "everything else" out there in the field of networking that could be configured with something like OpenFlow.

For example, it was my understanding that the primary beneficiary of something like OpenFlow (and OpenDaylight) was in the Wireless (802.11x) domain, where people had scores, hundreds or even thousands of access points that had to be configured or upgraded, and it was extremely difficult to do this by hand.

But, the course focused on switches - OpenVSwitch switches to be precise. And that was probably because the OpenVSwitch keeps things simple enough for the course and instructor.

Problem is, in my shop, everyone is using Juniper switches, and Juniper does not play ball with OpenFlow and OpenVSwitch. So I'm not sure how much this can or will be put to use in our specific environment. I do, however, use OpenVSwitch in my own personal OpenVSwitch-based OpenStack environment, and since OpenVSwitch works well with DPDK and VPP, this knowledge can come in handy as I need to start doing more sophisticated things with packet flows.

Nonetheless, I found the course interesting and valuable. The exam also centered around the ETSI-MANO Reference Architecture. I was familiar with this architecture, but as with all exams like this, I missed questions because of time, or overthinking things, or picking the wrong one of two correct answers (not the best answer), et al. But, I passed the exam, and I guess that's what matters most.

Tuesday, December 3, 2019

Virtualized Networking Acceleration Technologies - Part II


In Part I of this series of posts, I recapped my research on these virtualized networking technologies, with the aim to build an understanding of:

  • what they are
  • the history and evolution between them

What I did not cover, was a couple of further questions:

  1. When to Use Them
  2. Can you Combine Them?

The link below is a fantastic discussion of item number one. Now, I can't tell how "right" or "accurate" the author is, and I typically look down in the comments for rebuttals and refutations (I didn't see any, and most commenters seemed relatively uninformed on this topic).

He concludes that in East-West (intra-data center) traffic, DPDK wins, and in North-South traffic, SR-IOV wins.

https://www.telcocloudbridge.com/blog/dpdk-vs-sr-iov-for-nfv-why-a-wrong-decision-can-impact-performance/

Friday, November 15, 2019

How LibVirt Networking Works - Under the Hood

This is the best link on this topic that I have found.

Lots of great pictures. Pictures are worth a thousand words.

https://www.redhat.com/en/blog/introduction-virtio-networking-and-vhost-net

OpenContrail - Part 1

When I came to this shop and found out that they were running OpenStack but were not running Neutron, I about panicked. Especially when I found out they were running OpenContrail.

OpenContrail uses BGP and XMPP as its control plane protocols for route advertisements/exchanges. And it uses MPLS over GRE/UDP to direct packets. The documentation says it CAN use VXLAN - which Neutron also seems to favor (over GRE tunneling). But here at least, it is being run the way the designers of OpenContrail wanted it to run - which is as an MPLS L3VPN.

I am going to drop some links in here real quick and come back and flesh this blog entry out.

Here is an Architectural Guide on OpenContrail. Make sure you have time to digest this.

https://www.juniper.net/us/en/local/pdf/whitepapers/2000535-en.pdf

Once you read the architecture, here is a Gitbook on OpenContrail that can be used to get more familiarity.

https://sureshkvl.gitbooks.io/opencontrail-beginners-tutorial/content/

Perhaps the stash of gold was finding a 2013 video from one of the developers of vRouter itself. It turns out most of the material in this video is still relevant for OpenContrail several years later. I could not find the slides anywhere, so I made my own slide deck that highlights the important discussions that took place in this video, as well as some of the key concepts shown.

https://www.youtube.com/watch?v=xhn7AYvv2Yg

If you read these, you are halfway there. Maybe more than halfway actually.

High Packet Loss in the Tx of TAP Interfaces



I was seeing some bond interfaces that had high dropped counts, but these were all Rx drops.

I noticed that the tap interfaces on OpenStack compute hosts - which were hooked to OpenContrail's vRouter - had drops on the Tx.

So, in trying to understand why we would be dropping packets on Tap interfaces, I did some poking around and found this link.

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/13/html/ovs-dpdk_end_to_end_troubleshooting_guide/high_packet_loss_in_the_tx_queue_of_the_instance_s_tap_interface

An excerpt from this article:
"TX drops occur because of interference between the instance’s vCPU and other processes on the hypervisor. The TX queue of the tap interface is a buffer that can store packets for a short while in case that the instance cannot pick up the packets. This would happen if the instance’s CPU is prevented from running (or freezes) for a long enough time."

The article goes on to elaborate on diagnosis, and on how to fix the problem by adjusting the Tx Queue Length.
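To make the "Rx drops on bonds, Tx drops on taps" observation concrete, here is a minimal Python sketch that pulls the Tx drop counter for tap interfaces out of text in the /proc/net/dev layout. The sample data and interface names are made up for illustration, and the helper name tx_drops is my own; on a real compute host you would feed it the contents of /proc/net/dev instead.

```python
# Sketch: find interfaces with Tx drops in /proc/net/dev-style text.
# SAMPLE is invented data; the interface names are hypothetical.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo: 1000 10 0 0 0 0 0 0 1000 10 0 0 0 0 0 0
 bond0: 900000 7000 0 42 0 0 0 0 800000 6500 0 0 0 0 0 0
tap1234abcd-ef: 500000 4000 0 0 0 0 0 0 600000 4500 0 137 0 0 0 0
"""

TX_DROP_INDEX = 11  # 8 receive counters come first, then bytes, packets, errs, drop


def tx_drops(text, prefix="tap"):
    """Return {interface: tx_drop_count} for interfaces whose name starts with prefix."""
    drops = {}
    for line in text.splitlines()[2:]:  # skip the two header lines
        name, _, counters = line.partition(":")
        name = name.strip()
        fields = counters.split()
        if name.startswith(prefix) and len(fields) >= 16:
            drops[name] = int(fields[TX_DROP_INDEX])
    return drops


if __name__ == "__main__":
    for iface, dropped in tx_drops(SAMPLE).items():
        print(iface, dropped)
```

Running this periodically and comparing the counts would show whether the drops are still growing after a queue length change.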

SaltStack


I had heard of Puppet. I had heard of Chef. And I knew Ansible quite well because someone I know looked at all three (Puppet, Chef and Ansible) and chose Ansible for our organization.

I had never heard of Salt.

Until now.

Mirantis uses Salt to manage OpenStack infrastructure.

So, having some familiarity with Ansible, it made sense to type into the search engine:
"ansible vs salt".

Well, sure enough. Someone has done a comparison.

Ansible vs Salt

What I see a number of people doing with Salt, is running remote commands on nodes that they otherwise might not have access to. But - recently, I have started looking more into Salt, and it appears to be architected quite similarly to Ansible, and is also quite powerful.

One of the features I have recently played around with is "Salt Grains". You can get all kinds of "grains of information" from a host with Salt Grains. In my case, I am calling Salt and telling it to give me all of the grains for all of the hosts in JSON format - and then I parse the JSON and make a CSV spreadsheet. Pretty cool.
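Here is a rough Python sketch of that JSON-to-CSV step. It is not my actual script. It assumes the grains arrive as a single JSON object keyed by minion ID, and the sample minion names, the chosen grain columns, and the helper name grains_to_csv are all just for illustration.

```python
import csv
import io
import json

# Invented sample data: one JSON object mapping minion ID -> grains.
# (Assumption: the real Salt output has been collected into this shape.)
SAMPLE_JSON = """
{
  "cmp001": {"os": "Ubuntu", "osrelease": "16.04", "num_cpus": 48, "mem_total": 257000},
  "ctl01": {"os": "Ubuntu", "osrelease": "16.04", "num_cpus": 16}
}
"""


def grains_to_csv(raw_json, columns):
    """Flatten selected grains into CSV text, one row per minion."""
    grains_by_minion = json.loads(raw_json)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["minion"] + list(columns))
    for minion in sorted(grains_by_minion):
        grains = grains_by_minion[minion]
        # A grain missing on a host becomes an empty cell rather than an error.
        writer.writerow([minion] + [grains.get(col, "") for col in columns])
    return out.getvalue()


if __name__ == "__main__":
    print(grains_to_csv(SAMPLE_JSON, ["os", "osrelease", "num_cpus", "mem_total"]))
```

Picking the columns explicitly keeps the spreadsheet readable, since the full set of grains per host is large and some values are nested.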

There's a lot more. Like Salt States (closer to Ansible playbooks than to Ansible modules, I think?). There are Salt Pillars.

They use the "salt" theme pretty well in naming all of their extensions.

This link is called Salt in Ten Minutes. It gives a pretty good overview.

Salt in Ten Minutes

The link below is quite handy in figuring out how to target your minions using regular expressions.
https://docs.saltstack.com/en/latest/topics/targeting/globbing.html#regular-expressions

SLAs using Zabbix in a VMware Environment

Zabbix 7 introduced some better support for SLAs. It also had better support for VMware. VMware, of course now owned by Broadcom, has prio...