Wednesday, June 26, 2024

Rocky Generic Cloud Image 9.4 - Image Prep, Cloud-Init and VMware Tools

 

I just fixed an issue with these Rocky 9.x generic cloud images not booting up properly on a VMware platform. It turns out that if your vmx file doesn't include a CD-ROM - in a connected state - then cloud-init will not run. Which of course makes perfect sense once you think about it (how can cloud-init configure the VM if there is no virtual CD-ROM carrying its seed data?).
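For reference, the relevant vmx entries look something like the following - just a sketch, since the device node (ide1:0 here) and the seed.iso filename are assumptions that will vary with your build:

    ide1:0.present = "TRUE"
    ide1:0.deviceType = "cdrom-image"
    ide1:0.fileName = "seed.iso"
    ide1:0.startConnected = "TRUE"

The ISO itself is the cloud-init seed (user-data and meta-data files, typically burned with the volume label cidata so the NoCloud datasource picks it up), which is what gives cloud-init something to read at first boot.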

I thought the CMP (cloud management platform) we were using was auto-adding the CD-ROM and mounting the cloud-init ISO. And indeed, it may be trying to do this. But if the OVF you upload initially contains no XML describing a CD-ROM drive, then bad things happen as far as initializing the VM properly.

Without cloud-init, you cannot log into one of these generic cloud images. There is no default user and password baked into the image like the old days. So cloud-init is quite important.

Also, Rocky doesn't include VMware Tools (open-vm-tools) as a package on these images. That's a big hassle if you're on VMware. You have to convert the qcow2 to VMware format (a vmdk and vmx file, using ovftool, and also generating a cloud-init ISO to mount), then load it up on a network that can reach the internet, install open-vm-tools, clean and shut down the VM, then export the OVF files back out - and then reload the thing back into your CMP. Quite the hassle, though yes, most of these steps can be (and have been) automated.
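For what it's worth, the front half of that workflow can be scripted. Here is a minimal sketch (my own, with made-up filenames, assuming qemu-img, genisoimage, and ovftool are on the path):

    import subprocess

    # Convert the Rocky generic cloud qcow2 into a stream-optimized VMDK
    subprocess.run(
        ["qemu-img", "convert", "-f", "qcow2", "-O", "vmdk",
         "-o", "subformat=streamOptimized",
         "Rocky-9-GenericCloud.qcow2", "rocky9.vmdk"],
        check=True,
    )

    # Build a NoCloud seed ISO; the volume label must be "cidata" so
    # cloud-init recognizes it. user-data and meta-data are files you
    # author yourself (default user, ssh keys, etc.).
    subprocess.run(
        ["genisoimage", "-output", "seed.iso", "-volid", "cidata",
         "-joliet", "-rock", "user-data", "meta-data"],
        check=True,
    )

    # After the open-vm-tools install and cleanup, export back out, e.g.:
    # subprocess.run(["ovftool", "rocky9.vmx", "rocky9-vmtools.ova"], check=True)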

UPDATED NOTE: 

Morpheus VMs need a CD-ROM, per their documentation:

https://docs.morpheusdata.com/en/latest/getting_started/guides/vmware_guide.html

See the section "Creating a CentOS/RHEL Image" in that guide.

 

Thursday, June 20, 2024

New AI Book Arrived - Machine Learning for Algorithmic Trading

This thing is like 900 pages long.

You want to take a deep breath and make sure you're committed before you even open it.

I did check the Table of Contents and scrolled quickly through, and I see it's definitely a hands-on applied technology book using the Python programming language.

I will be blogging more about it when I get going.

 




Tuesday, June 4, 2024

What Makes an AI Chip?

I haven't been able to understand why the original chip pioneers, like Intel and AMD, have not been able to pivot in order to compete with NVidia (Stock Symbol: NVDA).

I know a few things, like the fact that when gaming became popular, NVidia made the graphics chips that had graphics acceleration and such. Graphics workloads draw polygons, and drawing polygons is geometric and trigonometric - which requires floating point arithmetic (non-integer math). Floating point is hard for a general-purpose CPU, so much so that classical CPUs either offloaded it to a separate math coprocessor (the old x87 FPU) or employed other tricks to handle these kinds of computations.

Now, these graphics chips are all the "rage" for AI. And NVidia stock has gone through the roof while Intel and AMD have been left behind.

So what does an AI chip have, that is different from an older CPU?

  • Graphics processing units (GPUs) - used mainly for training AI models
  • Field-programmable gate arrays (FPGAs) - used mainly for inference
  • Application-specific integrated circuits (ASICs) - used in various capacities of AI

Systems built around general-purpose CPUs can use all three of these in some form or another, but an AI chip combines them in a highly optimized and accelerated design - heavy parallelism, specialized high-bandwidth memory, and so on. They're simply better at running the repetitive numeric "algorithms" that AI workloads consist of.

This link, by the way, from NVidia, discusses the distinction between Training and Inference:
https://blogs.nvidia.com/blog/difference-deep-learning-training-inference-ai/

The CPU makers were so bent on running Microsoft for so long, carrying forward revision after revision of the same instruction set to run Windows (286 --> 386 --> 486 --> Pentium --> and on and on), that they just never went back and "rearchitected" or came up with new chip architectures. They sat back and collected money, along with Microsoft, giving you incremental versions of the same thing - for YEARS.

When you are doing training for an AI model, and you are running algorithmic loops millions upon millions of times, the efficiency and time start to add up - and make a huge difference in $$$ (MONEY). 

So the CPU companies, in order to "catch up" with NVidia, would, I think, need to come up with a whole bunch of chip design software. Then there are the software kits necessary to develop for the chips. You also have the foundry (which uses manufacturing equipment, much of it custom to the design), etc. Meanwhile, NVidia has its rocket off the ground, with decreasing G forces (so to speak), accelerating its orbit. It is easy to see why an increasing gap would occur.

But - when you have everyone (China, Russia, Intel, AMD, ARM, et al) all racing to catch up, they will at some point, catch up. I think. When NVidia slows down. We shall see.

Tuesday, April 16, 2024

What is an Application Binary Interface (ABI)?

After someone mentioned Alma Linux to me, it seemed similar to Rocky Linux, and I wondered why there would be two Linux distros doing the same thing (picking up from CentOS and remaining RHEL compatible).

I read that "Rocky Linux is a 1-to-1 binary to RHEL while AlmaLinux is Application Binary Interface-compatible with RHEL".

Wow. Now, not only did I learn about a new Linux distro, but I also have to run down what an Application Binary Interface, or ABI is.

Referring to this Stack Overflow post: https://stackoverflow.com/questions/2171177/what-is-an-application-binary-interface-abi, I liked this "oversimplified summary":

API: "Here are all the functions you may call."

ABI: "This is how to call a function."
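To make that concrete with a quick sketch of my own (not from the post): when Python's ctypes calls into the C library, the API is the knowledge that a function called strlen(s) exists; the ABI is the binary-level contract you have to satisfy to call it - the exported symbol name in the shared library, the calling convention, and the machine-level argument and return types:

    import ctypes
    import ctypes.util

    # Load the compiled C library itself (the binary, not the headers)
    libc = ctypes.CDLL(ctypes.util.find_library("c"))

    # API-level knowledge: "there is a function called strlen(s)"
    # ABI-level knowledge: the exact exported symbol, plus the binary
    # argument and return types (char* in, size_t out)
    libc.strlen.argtypes = [ctypes.c_char_p]
    libc.strlen.restype = ctypes.c_size_t

    print(libc.strlen(b"hello"))   # prints 5

As I understand it, that is the distinction being drawn: Rocky aims for bit-for-bit identical packages, while Alma guarantees compatibility at the level of that binary contract.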

Friday, March 1, 2024

I thought MacOS was based on Linux - and apparently I was wrong!

I came across this link, which discusses some things I found interesting to learn:

  • Linux is a monolithic kernel - I thought that because you can load and unload kernel modules, the Linux kernel had morphed into more of a microkernel architecture. But apparently not?
  • The macOS kernel is officially known as XNU, which stands for "X is Not Unix." According to Apple's GitHub page: "XNU is a hybrid kernel combining the Mach kernel developed at Carnegie Mellon University with components from FreeBSD and C++ API for writing drivers."

Very interesting. I stand corrected on MacOS being based on Linux.

Neural Network Architecture - Sizing and Dimensioning the Network

In my last blog post, I posed the question of how many hidden layers should be in a neural network, and how many hidden neurons should be in each hidden layer. This is related to the Neural Network Design, or Neural Network Architecture.

Well, I found the answer, I think, in the book entitled An Introduction to Neural Networks for Java, authored by Jeff Heaton. I noticed, incidentally, that Jeff was doing AI and writing about it as early as 2008 - fifteen years prior to the current AI firestorm we see today - and possibly before that, using languages like Java and C# (C Sharp) and a framework called Encog (which I am unfamiliar with).

In this book, in Table 5.1 (Chapter 5), Jeff states (quoted):

"Problems that require two hidden layers are rarely encountered. However, neural networks with two hidden layers can represent functions with any kind of shape. There is currently no theoretical reason to use neural networks with any more than two hidden layers. In fact, for many practical problems, there is no reason to use any more than one hidden layer. Table 5.1 summarizes the capabilities of neural network architectures with various hidden layers." 

Jeff then lays out that table, and follows it with this guidance:

"There are many rule-of-thumb methods for determining the correct number of neurons to use in the hidden layers, such as the following:

  • The number of hidden neurons should be between the size of the input layer and the size of the output layer.
  • The number of hidden neurons should be 2/3 the size of the input layer, plus the size of the output layer.
  • The number of hidden neurons should be less than twice the size of the input layer."

Simple - and useful! Now, this is obviously a general rule of thumb, a starting point.
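Just to see how those three rules play out, here is a tiny sketch of my own (not Heaton's code), using a made-up network with 10 inputs and 2 outputs:

    # Heaton's three rules of thumb, applied to an example network.
    # The input/output sizes below are made-up for illustration.
    n_inputs, n_outputs = 10, 2

    # Rule 1: somewhere between the output size and the input size
    rule1 = range(min(n_inputs, n_outputs), max(n_inputs, n_outputs) + 1)

    # Rule 2: two-thirds of the input size, plus the output size
    rule2 = round((2 / 3) * n_inputs + n_outputs)

    # Rule 3: less than twice the input size
    rule3_max = 2 * n_inputs - 1

    print(f"Rule 1: between {rule1.start} and {rule1[-1]} hidden neurons")
    print(f"Rule 2: about {rule2} hidden neurons")
    print(f"Rule 3: fewer than {2 * n_inputs}, so at most {rule3_max}")

For this example, rules 1 and 2 both land around 9 or 10 hidden neurons, and rule 3 just caps you below 20 - a sane starting guess, not a final answer.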

There is a Goldilocks aspect to choosing the right size for a Neural Network. If the number of neurons is too small, you get higher bias and underfitting. If you choose too many, you get the opposite problem of overfitting - not to mention the issue of wasting precious and expensive computational cycles on floating point processors (GPUs).

In fact, the process of calibrating a Neural Network leads to a concept called Pruning, where you examine which Neurons actually affect the total output, and prune out those whose contribution doesn't make a significant difference to the end result.
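As a rough illustration of the idea (a simple magnitude-based approach of my own, which is only one of several pruning strategies), you could score each hidden neuron by the size of its outgoing weights and drop the weakest ones:

    import numpy as np

    rng = np.random.default_rng(0)

    # Pretend weight matrix: 8 hidden neurons feeding 2 output neurons
    w_hidden_to_output = rng.normal(size=(8, 2))

    # Score each hidden neuron by the L2 norm of its outgoing weights
    scores = np.linalg.norm(w_hidden_to_output, axis=1)

    # Keep only the neurons scoring at or above the median; a real
    # pruning pass would retrain (fine-tune) the network after this step
    keep = scores >= np.median(scores)
    pruned = w_hidden_to_output[keep]

    print(f"Kept {keep.sum()} of {len(scores)} hidden neurons")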

AI - Neural Networks and Deep Learning - Nielsen - Chap 5 - Vanishing and Exploding Gradient

When training a Neural Net, it is important to have what is referred to as a Key Performance Indicator - a KPI. This is an objective, often numerical, way of "scoring" the aggregate output so that you can actually tell that the model is learning - that it is being trained, and that the act of training it is improving the output. This seems almost innate, but it is important to always step back and keep it in mind.

Chapter 5 discusses the effort that goes into training a Neural Net, but from the perspective of efficiency: how well is the Neural Net actually learning as you run through a specified number of Epochs, with whatever batch sizes you choose, etc.?

In this chapter, Michael Nielsen discusses the Vanishing Gradient. He graphs the "speed of learning" on each Hidden Layer, and it is super interesting to notice that these Hidden Layers do not learn at the same rate! 

In fact, in his examples the Hidden Layer closest to the Output consistently outperforms the preceding Hidden Layers in terms of speed of learning.
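You can see the effect with a small sketch of your own (the one below is mine, not Nielsen's code): build a little sigmoid network, backpropagate one batch, and compare the size of the gradient on each layer's biases. The earlier layers typically come out smaller:

    import numpy as np

    rng = np.random.default_rng(1)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

    # A small sigmoid network: 10 inputs -> 30 -> 30 -> 30 -> 1 output
    sizes = [10, 30, 30, 30, 1]
    weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
    biases = [rng.normal(size=(m, 1)) for m in sizes[1:]]

    # Forward pass on a random batch of 32 inputs
    x = rng.normal(size=(10, 32))
    activations, zs = [x], []
    for w, b in zip(weights, biases):
        z = w @ activations[-1] + b
        zs.append(z)
        activations.append(sigmoid(z))

    # Backward pass with a simple quadratic cost against a random target
    y = rng.random(size=(1, 32))
    delta = (activations[-1] - y) * activations[-1] * (1 - activations[-1])
    grad_norms = [np.linalg.norm(delta.mean(axis=1))]
    for layer in range(2, len(sizes)):
        sp = sigmoid(zs[-layer]) * (1 - sigmoid(zs[-layer]))
        delta = (weights[-layer + 1].T @ delta) * sp
        grad_norms.insert(0, np.linalg.norm(delta.mean(axis=1)))

    # "Speed of learning" per layer, measured as the norm of the bias gradient
    labels = ["hidden layer 1", "hidden layer 2", "hidden layer 3", "output layer"]
    for label, g in zip(labels, grad_norms):
        print(f"{label}: gradient norm {g:.6f}")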

So after reading this, the next questions in my mind - ones that I don't believe Michael Nielsen addresses head-on in his book - are:

  • how many Hidden Layers does one need?
  • how many Neurons are needed in a Hidden Layer?

I will go back and re-scan, but I don't think any rules of thumb or general guidance are tossed out in this regard - in either book I have covered thus far. I believe that in the examples chosen in the books, the decisions about how to size (dimension) the Neural Network are more or less arbitrary.

So my next line of inquiry and research will be on the topic of how to "design" a Neural Network, at least from the outset, with respect to the sizing and dimensions.  That might well be my next post on this topic.
