I may have posted on this topic before, because this is not the first time I have run into this issue.
I was setting up OpenStack for a colleague of mine, and had all sorts of issues getting it to work.
A couple of the problems came from services that were not enabled, so they did not start up when the unit(s) were rebooted. These were easy to fix.
The difficult issue to find and fix - which took me almost a full business day - had to do with how OpenStack Nova configures itself.
In the /etc/nova/nova.conf file, there is a variable called state_path. This variable is set to /var/lib/nova - a directory Nova creates upon installation and assigns to the nova user and group.
In this directory is a subdirectory called "instances", where Nova puts running instances.
The problem is that Nova, on installation, does not seem to check or care about partition and file system sizes. It just assumes there is enough room.
The issue we had was that on a default CentOS 7 installation, the /var directory is part of the root file system, which is very small (15-20 GB), as it normally should be (you generally separate the root file system from apps and data).
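A check like the following, run before installation, would have shown how little room there was. It is only a minimal sketch in Python: the helper names free_gib and has_room and the 100 GiB threshold are my own choices for illustration, not anything Nova provides.

```python
import os
import shutil


def free_gib(path):
    """Return the free space, in GiB, on the file system that holds `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / (1024 ** 3)


def has_room(path, needed_gib):
    """True if the file system holding `path` has at least `needed_gib` GiB free."""
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    # state_path as described in this post; fall back to "/" if Nova is not installed.
    state_path = "/var/lib/nova"
    target = state_path if os.path.exists(state_path) else "/"
    print(f"{target}: {free_gib(target):.1f} GiB free")
    # 100 GiB is an arbitrary example threshold, not a Nova requirement.
    if not has_room(target, 100):
        print("Probably too small to hold instances; consider moving state_path.")
```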
When you started Nova, even in debug mode, you never saw an ERROR saying that Nova had a problem with any of its filters (disk, RAM, compute, et al.). The failures were written to the log as DEBUG and WARNING messages, and you only saw them at all after enabling debug in the /etc/nova/nova.conf file. This made finding the problem like finding a needle in a haystack.
Eventually, after enabling debug and combing through the logs (on both the Controller and the Compute node), we found a message on the Controller node (NOT THE COMPUTE NODE WHERE YOU WOULD EXPECT IT TO BE) about the disk filter returning 0/1 hosts.
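Combing through the logs by hand was the slow part, so a small script to pull out the filter results could help. This is a rough sketch, not a tested tool: the message wording in the regular expression ("Filter <name> returned <n> host...") is my assumption about how the line looks and may differ between releases, and the two sample lines are made up for illustration rather than copied from a real log.

```python
import re

# Assumed wording of the filter result message; adjust it to match your own logs.
FILTER_RE = re.compile(r"Filter (\w+) returned (\d+) host")


def filters_rejecting_all(lines):
    """Return the names of filters that returned 0 hosts, in the order first seen."""
    names = []
    for line in lines:
        match = FILTER_RE.search(line)
        if match and int(match.group(2)) == 0 and match.group(1) not in names:
            names.append(match.group(1))
    return names


# Hypothetical sample lines, only to show the idea.
sample = [
    "DEBUG nova.filters Filter RamFilter returned 1 host(s)",
    "WARNING nova.filters Filter DiskFilter returned 0 hosts",
]
print(filters_rejecting_all(sample))  # prints ['DiskFilter']
```

To use it on a real log, pass an open file object instead of the sample list, and remember to run it against the logs on the Controller node, since that is where we found the message.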
So we moved /var/lib/nova to /home/nova (the /home file system had hundreds of GB). We also changed the home directory of the nova user in /etc/passwd from /var/lib/nova to /home/nova.
We got further...but it was STILL FAILING.
Further debugging showed that when we moved the directory, we forgot about another variable in /etc/nova/nova.conf called lock_file_path. Used for RabbitMQ communication, this variable was still pointing to a lock file under the old /var/lib location (the directory we had moved to /home). This caused Compute Filter issues, which also showed up as DEBUG and WARNING messages and not as errors.
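To avoid missing a setting like this after moving a directory, you can scan the config for any value that still mentions the old path. Here is a minimal sketch using Python's configparser, assuming the file is plain INI-style text. The function name options_under is mine, and the sample config (including the lock_file_path value) is hypothetical, written only to mirror the situation described above.

```python
import configparser


def options_under(conf_text, old_path):
    """Return (section, option, value) for every value that mentions old_path."""
    parser = configparser.ConfigParser(
        interpolation=None,           # leave % and $ in values untouched
        strict=False,                 # tolerate duplicate options
        allow_no_value=True,          # tolerate bare option names
        default_section="__unused__"  # treat [DEFAULT] as an ordinary section
    )
    parser.read_string(conf_text)
    hits = []
    for section in parser.sections():
        for option, value in parser.items(section):
            if value and old_path in value:
                hits.append((section, option, value))
    return hits


# Hypothetical config contents after the move; the values are illustrative only.
sample_conf = """
[DEFAULT]
debug = True
state_path = /home/nova
lock_file_path = /var/lib/nova/tmp
"""

for section, option, value in options_under(sample_conf, "/var/lib/nova"):
    print(f"[{section}] {option} = {value}")
```

On a real system you would read the text of the config file and pass it in as conf_text. Anything the script prints is a setting that still points at the directory you moved.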