Monday, November 27, 2017

Elasticity and Autoscaling - More Testing

Today I went back and did some further testing on AutoScaling.

Here are some new things that I tested:

1. Two Scaling policies in a single descriptor. 

It does no good to have "just" a Scale Out, if you don't have a corresponding "Scale In"!

You cannot have true Elasticity without the expansion and contraction - obviously - right?

So I did this, and it parsed just fine, as it should have.
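For reference, here is a sketch of what the two policies look like side by side in the descriptor, using the thresholds from my tests. This is from memory of the OpenBaton VNFD schema, so treat the exact field names and structure as illustrative rather than authoritative:

```json
"auto_scale_policy": [
  {
    "name": "scale-out-on-cpu",
    "period": 30,
    "cooldown": 60,
    "mode": "REACTIVE",
    "type": "VOTED",
    "alarms": [
      { "metric": "cpu_util", "statistic": "avg",
        "comparisonOperator": ">", "threshold": 35, "weight": 1 }
    ],
    "actions": [ { "type": "SCALE_OUT", "value": "3" } ]
  },
  {
    "name": "scale-in-on-cpu",
    "period": 30,
    "cooldown": 60,
    "mode": "REACTIVE",
    "type": "VOTED",
    "alarms": [
      { "metric": "cpu_util", "statistic": "avg",
        "comparisonOperator": "<", "threshold": 15, "weight": 1 }
    ],
    "actions": [ { "type": "SCALE_IN_TO", "value": "2" } ]
  }
]
```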

I also learned you can put these scaling directives in different levels of descriptors - like the NSD. If you do this, I presume that what will happen is that it will factor in all instances across VNFMs. But I did not test this.

2. I tested to make sure that the scaling MAXED OUT where it should. 

If the cumulative average CPU across instances was greater than 35%, then the SCALE_OUT 3 would take effect. This seemed to work. I started with 2 instances, and as I added load to the CPUs to boost the cumulative average up, it would scale out 3 - and then scale out 3 more for a total of 8, no matter what load was on the CPUs after that. So it maxed out at 8 and stayed put. This test passed.
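The cap behavior can be sketched in a few lines - a toy model, not the engine's actual code - using my test's numbers: start at 2, step of 3, max of 8:

```python
# Toy model of the scale-out cap I observed (not OpenBaton code).
def scale_out(current: int, step: int, max_instances: int) -> int:
    """Add `step` instances, never exceeding `max_instances`."""
    return min(current + step, max_instances)

n = 2                       # initial instance count
for _ in range(4):          # repeated high-CPU alarms
    n = scale_out(n, step=3, max_instances=8)
print(n)                    # capped at 8: 2 -> 5 -> 8 -> 8 -> 8
```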

I was curious to see whether the engine would instantiate one VM at a time, instantiate in bunches of 3 (per the descriptor), or just instantiate up to the max (which would be errant behavior). Nova in OpenStack staggers the instantiations, so it APPEARS to be doing one at a time up to three (i.e. 1-1-1), at which point re-processing may kick off another series of 1-1-1. So this is probably to-be-expected behavior. The devil is in the details when it comes to the Orchestrator, OpenStack, and the OpenStack Nova API in terms of whether, and to what extent, you can instantiate VMs simultaneously.

When a new VM comes up, it takes a while for it to participate in measurements. The scaling engine would actually skip the interval due to a "measurements received less than measurements requested" exception, and would not start evaluating until all of the expected VMs were reporting measurements. I have to think about whether I like this or not.
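The interval-skipping behavior can be modeled roughly like this. This is an illustrative sketch, not the engine's actual code; the function and parameter names are mine:

```python
# Illustrative sketch of the "skip the interval" behavior (names are mine,
# not OpenBaton's): the evaluator bails out unless every expected instance
# has reported a measurement for the window.
def evaluate_window(measurements, expected_instances, threshold):
    """Return 'SCALE_OUT', or None if no action is needed or the window is skipped."""
    if len(measurements) < expected_instances:
        # "measurements received less than measurements requested":
        # e.g. a freshly booted VM has not started reporting yet.
        return None
    avg = sum(measurements) / len(measurements)
    return "SCALE_OUT" if avg > threshold else None

print(evaluate_window([80.0, 90.0], expected_instances=3, threshold=35.0))        # None (skipped)
print(evaluate_window([80.0, 90.0, 70.0], expected_instances=3, threshold=35.0))  # SCALE_OUT
```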


3. Elasticity Contraction - using the SCALE_IN_TO parameter.

I set things up so that it would scale in to 2 instances - to ensure at least two instances would always be running. I would do this when cumulative average CPU was less than 15% CPU across instances.

This test actually failed. I saw the alarm get generated, and I saw the engine attempting to scale in, but some kind of decision-making policy was rejecting the scale in "because conditions are not met".

We will need to go into the code and debug this, and see what is going on.


Thursday, November 23, 2017

Ubiquiti Edge Router ER-X - Impressive


I just love this router.

  • The icon on the top that shows colorized Ethernet plugs (colorization related to status). Cool.
  • It was sooooo easy to configure.
  • It has a shell that takes you into Ubuntu Linux (uses Ubuntu = impressive). It does not even seem to be using BusyBox or some slimmed-down quasi-Linux; it looks like a custom compile of Ubuntu.
  • One cool feature is that it can support link aggregation. I am not using that feature, but it's cool.
  • It has excellent support for IPv6.

It can also switch ports on the router internally, so that you don't need an L3 switch to go with it.

So for example, you can set up:

  • eth0 as the management port
  • eth1 as the WAN port, and 
  • eth2, eth3 and eth4 are switched, so that anything plugged into them is on the same network (you define the network).
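For example, the switched-port layout above might be configured roughly like this in the EdgeOS CLI. This is a sketch from memory - the address is illustrative, and the exact command tree should be checked against the EdgeOS documentation:

```
configure
# eth2-eth4 become ports on the internal switch (switch0)
set interfaces switch switch0 switch-port interface eth2
set interfaces switch switch0 switch-port interface eth3
set interfaces switch switch0 switch-port interface eth4
# give the switched network its address (address is illustrative)
set interfaces switch switch0 address 192.168.10.1/24
commit ; save
```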

I don't currently have this router configured as a router that actually learns routes. In other words, I am not running BGP, RIP or OSPF on it. I don't really need to learn networks, nor advertise networks, dynamically. All I need is to get out to the internet.

I have it set up with a hairpin NAT, and the firewall rules configured on it are rather trivial at the moment but designed to protect ingress through iptables rules.

This is truly a "power user" routing device, and it can fit into the palm of your hand; it is no bigger than a Raspberry Pi device.

This router also comes with some interesting Wizards that allow you to configure the router for certain use cases, like the WAN+LAN wizard.

So I have not done anything in-depth, but I spent an hour messing around with this device and I'm pretty impressed with it.


Security - Antivirus specifically

My McAfee just expired on this computer and I am now getting a bunch of intrusive "buy me" pop-ups. I have never thought McAfee to be top of the line when it comes to Anti-Virus, but the question is: is anybody really stopping viruses these days?

I have started to get smarter about security. I went to RSA in 2016, and I bought a book on exploits. It is a very, very hardcore book, and I have not managed to get through it all yet. It requires Assembler and C programming, and teaches you how hackers actually exploit code. I am about halfway through it, and I think once I finish it the knowledge will be awesome. I got pulled off of it due to longer hours at work playing with virtualization and orchestration.

So - I am not current on malware. So I spent some time looking around this morning, reading anti-virus reviews.

It does not appear that there is much out there in the way of OpenSource AV. ClamAV looks like the only thing actively maintained. This is a bit of a surprise.

There are some free packages out there, but I am sure they probably nag you incessantly to buy or upgrade. The big question is this: Can you really trust FREE?

I also see some interesting Cloud-based packages out there that work from outside your network. This would have been an absolute no-no for me in earlier times, but considering the danger of today's malware, maybe this kind of approach is worth re-examining, if good results are coming from it. One such company is Crystal Security.

I see some products like VoodooShield. And some new ones I had not previously encountered like GlarySoft Malware Hunter.

Of course, Kaspersky, ESET - these guys always get good reviews.

It is probably good to stay up to speed on this stuff - to take an hour here and there and stay current.

OpenBaton Fault Management and AutoScaling

It has been a while since I have taken any notes or blogged anything. That doesn't mean I haven't been doing anything, though. 😎

Over the last month or so I have been testing some of the more advanced features of OpenBaton.
- Fault Management
- Auto Scaling
- Network Slicing

These have taken time to test. I was informed by the development team that the "release" code was not suitable for the kind of rigorous testing I planned to do, and that I needed to use the development branch.

This led me down a road of having to familiarize myself with the "git" software management utility. I know git has been around for a while and has quietly become almost a de facto standard for code repositories and source management. In many shops it has replaced classic stalwarts like ClearCase, CVS, and SVN - software that had been in use for decades. Even in my own company's shop, they brought in a "git guy", and of course, since that is the recipe he cooks, we now use it. But up to this point, I had not really needed to do more than "git clone". Now I am having to work with different branches, and as I said, this took some time. Git is fairly simple if you do simple things, but it is far more complex "under the hood" than it looks, especially if you are doing non-simple things with it. I could do a post just on git alone. I'm not an authority on it, but I have picked up a few things - including opinions - on it (some favorable, some not).
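As a concrete (local, toy) example of the kind of branch juggling I mean - the repo and branch names here are illustrative:

```shell
# Toy local repo to illustrate basic branch juggling (names illustrative).
git init demo
cd demo
git config user.email "you@example.com"     # local identity so commits work
git config user.name  "You"
git commit --allow-empty -m "initial commit"
git branch develop             # create a development branch
git checkout develop           # switch to it
git checkout -b my-patches     # keep local patches on their own branch
git branch                     # lists: develop, master (or main), my-patches
```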

The first thing I tested was Fault Management (FM). Fault Management is essentially the ability to identify faults and trigger actions around those faults. The actions can be an attempt to heal - or it can be an attempt to Scale, or it can be an attempt to fail-over based on a configured redundancy mechanism. The ETSI standard descriptors allow you to specify all of this.  The interesting thing about FM in a virtualized context is that it gets into the "philosophy" of whether it makes sense to spend effort healing something, as opposed to just killing it and re-instantiating it. This is called the "Cattle vs Pets" argument.  I think there ARE cases where you need to fix and heal VMs (Pets), but in most cases I think VMs can be treated as Cattle. When VMs are treated as Pets, the nodes are generally going to be important (i.e. they manage something important, as in a control plane or signaling plane element), and cannot just be taken down and re-instantiated due to load or function.

I then tested AutoScaling - or, using a better term, Elasticity. This allows a virtualized network to expand and contract based on real-time utilization. This feature took me a while to get working due to having to compile different modules of code from different moving-target git branches over and over until I could finally get some code that wanted to work, with slight modifications and patches. When I finally got this working, it was super cool to see this work. I could do a separate post on this feature alone. After I got it working I wound up helping some other guys in a German network integration company get the feature working.

Network Slicing has been more difficult to get to work. That is probably a separate post altogether, related to and intertwined with topics such as QoS.
