Friday, February 23, 2024

AI - Neural Networks and Deep Learning - Nielsen - Chap 3 - Learning Improvement

As if Chapter 2 wasn't heavy enough, I moved on to Chapter 3, which introduced some great concepts that I will mention in this blog post. But I couldn't really follow the detail very well, largely due, again, to the heavy mathematical expressions and notation used.

But I will summarize what he covers in this chapter, and I think each one of these topics will require its own "separate study", preferably in a simpler way.

Chapter 3 discusses more efficient ways to learn.

It starts out discussing the Cost Function. 

In previous chapters, Nielsen uses the Quadratic Cost Function. In Chapter 3, he introduces the Cross-Entropy Cost Function and discusses how using it avoids learning slowdown. Unfortunately, I can't comment much further on this because, frankly, I got completely lost in this discussion.

He spends a GREAT DEAL of text discussing Cross-Entropy, including the fact that he uses different Learning Rates for the Quadratic Cost Function (0.15) versus the Cross-Entropy Cost Function (0.005) - and he argues that the difference doesn't matter, because the point is how the speed of learning changes over time, not the absolute speed of learning.
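For my own notes (I'm piecing this together after the fact, so take it with a grain of salt): for a single sigmoid output neuron, the Cross-Entropy cost is

C = -\frac{1}{n} \sum_x \left[ y \ln a + (1 - y) \ln(1 - a) \right]

and the appeal, as I understand it, is that when you work out the gradient, the sigmoid's derivative cancels out, so the weight updates stay proportional to the error (a - y) instead of stalling when the output neuron saturates.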

After a while, he mentioned an alternative to Cross-Entropy called Softmax. Now, this term seemed familiar. Checking back, I found that Softmax was used in the first book I read, AI Crash Course by Hadelin de Ponteves. I remembered both Softmax and Argmax being mentioned there.

Softmax defines a different kind of output layer. So if you have 4 output neurons, their raw weighted inputs get passed through the Softmax function, and what comes out is a Probability Distribution: the 4 outputs all add up to 1, so if one of them decreases, there has to be a corresponding increase somewhere among the others. This could be useful, for example, in cases where the AI is guessing which type of animal it is seeing: Dog, Cat, Parrot, or Snake. You might see higher probabilities for Dog and Cat, and lower ones for Parrot and Snake.
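Just to convince myself of the "adds up to 1" behavior, here is a tiny sketch of the Softmax calculation in Python - my own toy numbers for the Dog/Cat/Parrot/Snake example, not from the book:

    import numpy as np

    def softmax(z):
        # Shift by the max for numerical stability, then exponentiate and normalize
        e = np.exp(z - np.max(z))
        return e / e.sum()

    # Hypothetical raw output-layer values for Dog, Cat, Parrot, Snake
    z = np.array([2.0, 1.6, 0.3, 0.1])
    probs = softmax(z)
    print(probs)        # roughly [0.50, 0.33, 0.09, 0.07] - highest for Dog
    print(probs.sum())  # sums to 1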

Nielsen then goes on to discuss Overfitting (Overtraining) and Regularization, which is designed to combat Overfitting. He discusses four approaches to Regularization, which I won't echo here, as I will clearly need to consult other sources for simpler discussions, definitions and examples of these.
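(One small note for future me, since I'm fairly sure of this much: the simplest of those approaches, L2 regularization - also called weight decay - just adds a penalty term to the cost,

C = C_0 + \frac{\lambda}{2n} \sum_w w^2

which punishes large weights and tends to reduce overfitting.)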



Wednesday, February 21, 2024

AI - Neural Networks and Deep Learning - Nielsen - Chap 2 - Backpropagation

Backpropagation is the "secret sauce" of Neural Networks. And therefore, very important to understand.

Why? 

Because it is how Neural Networks adapt and, well, learn.  

Backpropagation is responsible, essentially, for updating the weights (and, potentially, the biases too, I suppose) after calculating the differences between predicted results and actual results, so that over many iterations of training the weights (and biases) are optimized and the cost is minimized.
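In shorthand (my notation, not Nielsen's): backpropagation computes ∂C/∂w for every weight (and ∂C/∂b for every bias), and gradient descent then nudges each one as w → w - η·∂C/∂w, where η is the learning rate.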

Doing so is rather difficult and tedious, and requires an understanding of Mathematics at several levels (e.g. Linear Algebra, Calculus, and even Trigonometry if you truly want to understand Sigmoid functions).

In Chapter 2 of this book, I was initially tracking along, and was following the discussion on Notation (for weights and biases in Neural Network nodes). But that was as far as I got before I became confused and stuck in a thicket of intimidating mathematical equations.

I was able to push through and read it, but found that I didn't understand it, and after several attempts to reinforce it by re-reading, I had to hit the eject button and look elsewhere for a simpler discussion of Backpropagation.

This decision to eject and look elsewhere for answers paid huge dividends, and allowed me to come back and comment on why I found this chapter so difficult:

  1. His Cost function was unnecessarily complex
  2. He did not need to consider biases, necessarily, in teaching the fundamentals of backpropagation
  3. The notational symbols introduced are head-spinning

In the end, I stopped reading this chapter, because I don't think trying to understand all of his jargon is necessary to get the gist and essence of Backpropagation, even from the standpoint of getting some knowledge, at a mathematical level, of how it's calculated.

To give some credit where credit is due, this video from Mikael Lane helped me get a very good initial understanding of BackPropagation: Simplest Neural Network Backpropagation Example

Now, I did have a problem trying to understand where, at around 3:30 in the video, he comes up with ∂C/∂w = 1.5 * 2(a-y) = 4.5 * w - 1.5
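(After staring at it for a while, I think I can reconstruct it, though I'm guessing at the numbers the video uses: if the input is x = 1.5, the target is y = 0.5, the output is a = w*x with no bias, and the cost is C = (a - y)², then the chain rule gives ∂C/∂w = 2(a - y) * ∂a/∂w = 1.5 * 2(a - y) = 3(1.5w - 0.5) = 4.5w - 1.5, which matches what he writes.)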

But, aside from that, his example helped me understand because he removed the bias from the equation! You don't really need a bias! Nobody else I watched had dumbed things down this way, and it was extremely helpful. His explanation of how the Chain Rule of differentiation is applied was also timely and helpful.
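To make the idea stick, it helps (me, anyway) to see the whole loop written out. Here is a minimal sketch of that single-neuron, no-bias setup in Python - my own toy numbers and variable names, not taken from the video:

    # Single neuron, no bias: output a = w * x, cost C = (a - y)^2
    x, y = 1.5, 0.5      # input and target (assumed values)
    w = 0.8              # arbitrary starting weight
    lr = 0.1             # learning rate

    for step in range(50):
        a = w * x                   # forward pass
        dC_dw = 2 * (a - y) * x     # chain rule: dC/da * da/dw
        w -= lr * dC_dw             # gradient descent update

    print(w, w * x)   # w*x ends up very close to the target 0.5

Run it and w*x walks its way to the target; that, in miniature, is the whole game.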

NOTE: Mikael also has a 2nd follow-up video on the same topic: 

Another Simple Backpropagation Example

From there, I went and watched another video, which does use the bias, but walks you through backpropagation in a way that makes it easier to grasp and understand, even with the Calculus used.

Credit for this video goes to Bevan Smith, and the video link can be found here:

Back Propagation in training neural networks step by step 

Bevan gives a more thorough walk-through of calculations than the initial Mikael Lane video does. Both videos use the Least Squares method of Cost. 

The cost function at the final output, is: Cost = (Ypredicted - Yactual)²

The derivative of this with respect to Ypredicted is quite simple, which helps in understanding the examples: 2(Ypredicted - Yactual)

Nielsen, unfortunately, goes into none of this simple explanation, and instead chooses a Quadratic Cost Function that, for any newbie with rusty math skills, is downright scary to comprehend:

C = \frac{1}{2} \lVert y - a^L \rVert^2 = \frac{1}{2} \sum_j (y_j - a^L_j)^2

Nielsen then goes on to cover the 4 Equations of Backpropagation which, frankly, look PhD level as far as I am concerned. There is some initial fun in reading this, as though you are a codebreaker trying to reverse-engineer the Enigma (the German cipher machine used in WWII). But, after a while, you throw your hands up in the air. He even goes into some mathematical proofs of the equations (yikes). So this stuff is for very, very heavy "Math People".
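For the record (and so I can find them again later), the four equations, as I understand them from the book, boil down to:

\delta^L = \nabla_a C \odot \sigma'(z^L)
\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l)
\partial C / \partial b^l_j = \delta^l_j
\partial C / \partial w^l_{jk} = a^{l-1}_k \delta^l_j

where δ is the "error" at each layer: the first equation computes it at the output layer, the second propagates it backwards layer by layer, and the last two turn it into the actual gradients for the biases and weights.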

At the end, he does dump some source code in Python that you can run, which is cool, and it all looked like it worked fine when I ran it.
 
 


Thursday, February 8, 2024

Linux Phones Are Mature - But US Carriers Won't Allow Them

Today I looked into the status of some of the Linux phones, which are mature now.

Librem is the one most people have heard about, but its price point puts it out of reach for most people daring enough to jump in the pool and start swimming with a Linux phone.

Pine64 looks like it has a pretty darn nice Linux phone now with the PinePhone, but after watching a few reviews, it is pretty clear that you need to go with the PinePhone Pro and put a fast(er) Linux OS on it.

The main performance issue with these phones has to do with graphics rendering. If you are running the GNOME desktop, for example, the GUI is going to take up most of the cycles and resources that you want for your applications. I learned this years ago running regular Linux on desktop machines, and got into the habit of installing a more lightweight KDE desktop to try and get some of my resources back under my control.

Today, I found a German phone that is apparently really gaining popularity in Europe - especially Germany. It is called the Volla Phone. Super nice phone, and they have done the work of selecting the hardware components and optimizing the Linux distro for you, so that you don't have to spend hours and hours tweaking, configuring, and flashing different OS images onto the phone to squeeze performance out of it.

Volla Phone - Linux Privacy Phone

 

Problem is - United States carriers don't allow these phones! They are not on the "Compatibility List". Now, I understand there might be an FCC cost to certifying devices on a cellular network (I have not verified this). The frequencies matter, of course, but the SIM cards also matter. The Volla Phone will apparently work on T-Mobile, for instance, but only if you have an older SIM card. If you are on T-Mobile and have a new SIM card, it won't work because of some fields that aren't exchanged (if I understand correctly).

Carriers that are in bed with Google and Apple, such as AT&T and Verizon, are going to do everything they can to prevent a Linux BYOD (Bring Your Own Device) phone from hitting their network. They make too much $$$$$$$$$$$$ off of Apple and Android. T-Mobile is German-owned, of course, so maybe they have a bit more of the European mindset. Those are your three nationwide networks in the United States, and all of the mom-and-pop cellular plays (e.g. Spectrum Mobile, Cricket, et al) are just MVNOs riding on that infrastructure.

So if you have one of these Linux phones, you can use it in your home. On WiFi. But if you carry it outdoors, it's a brick apparently. Here we are in 2024, and that STILL seems to be the case.

Wednesday, February 7, 2024

ZWave Plus - First Alert Combo - CO and Fire

I thought an ideal use case for ZWave devices would be a Carbon Monoxide detector and/or Fire Detector that would send you an alert under certain conditions.

Well, sure enough, there is one. The First Alert Combo.

 

I bought one of these, and it was priced right - about $40. I had a First Alert fire detector already mounted in the spot for it, and the bracket was exactly the same as the new one (how nice is that?). So mounting it took 2 seconds.

The next step was to pair it with the ZWave hub, and from experience, that can be a hassle with certain devices. I was a little bit worried that they were marketing it for a Ring hub, with the concern that it might work with a Ring hub but not with a more standard hub like a Wink 2 or a Hubitat.

But it only took 3 tries to get it to pair. 

First Attempt:

  1. I added the batteries to the tray.
  2. I slid the tray in, holding the Test button.
  3. I released the Test button.
  4. I added a device on my hub (by vendor - First Alert was listed), and clicked "start inclusion".

This didn't work. So, I re-attempted.

  1. I slid out the battery tray.
  2. I went to my hub app, added the device by vendor, and clicked "start inclusion".
  3. I pulled out one of the two batteries.
  4. I reinserted the battery I pulled out.
  5. I slid the battery tray back in, holding the Test button while sliding.
  6. I released the Test button.

At this point, I watched the app, and it sat there on "Start Inclusion", as though it were processing. I decided to leave it, and then came back, re-launched the app, hit "Connect to Hub" and then "Devices", and there it was! I hit Refresh on the app, and all of the pertinent statistics showed up.

Not too bad at all.
