Grasping Technology: When a Percona Cluster Node Stops Working

Wednesday, September 18, 2024

When a Percona Cluster Node Stops Working

I had a horrible problem where a Percona node (node 2 of 3) went down and wouldn't start.

I finally ran a command:

> mysqld_safe --wsrep-recover --tc-heuristic-recover=ROLLBACK

This didn't work, so I ran journalctl -xe and found that the startup log for Percona is actually written to a temporary file: /var/lib/mysql/wsrep_recovery.xxxxx

From this, I could see pending transactions. A pending transaction has to be either committed or rolled back.

The rollback didn't work, so I tried the commit, which DID work.
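If you want to dig the pending transactions out of that recovery file without scrolling through it, something like the sketch below might help. The function names are my own, and I'm assuming the log uses the word "prepared" for those transactions; the exact wording may differ by version, so adjust the pattern to whatever your file actually says.

```shell
# Print the most recently modified wsrep_recovery.* file in a data directory.
# The directory argument defaults to /var/lib/mysql, the path from this post.
newest_recovery_log() {
    datadir="${1:-/var/lib/mysql}"
    # ls -t sorts newest first; head keeps only the first entry.
    ls -t "$datadir"/wsrep_recovery.* 2>/dev/null | head -n 1
}

# Show the lines of a log file that mention prepared (pending) transactions.
# Assumption: the log describes them with the word "prepared".
show_pending() {
    grep -i 'prepared' "$1"
}
```

Usage would be along the lines of: show_pending "$(newest_recovery_log)"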

> mysqld_safe --wsrep-recover --tc-heuristic-recover=COMMIT

You can also edit your /etc/my.cnf file and add the option there, in this format:

[mysqld]

tc-heuristic-recover = COMMIT

After the commit, which seemed to run fine, I tried to start the mysql service again:

> systemctl start mysql

Fortunately, it came up!

Now, a quick way to check that your Percona node is working properly is to log into mysql and run the following query:

mysql> show status like 'wsrep%';

The wsrep_cluster_conf_id value should be the same on all of your cluster nodes!
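To compare that value across nodes without eyeballing three terminals, here is a rough sketch. The helper names are mine, and it assumes you feed it the tab-separated output that the mysql client produces with the -N and -B options.

```shell
# Read tab-separated "name<TAB>value" status rows on stdin and print the
# value of the variable named in the first argument.
status_value() {
    awk -v name="$1" -F '\t' '$1 == name { print $2 }'
}

# Succeed only if every argument is the same non-empty value.
all_same() {
    first="$1"
    [ -n "$first" ] || return 1
    for v in "$@"; do
        [ "$v" = "$first" ] || return 1
    done
}
```

For example, with hypothetical host names node1, node2 and node3, you could collect each value with: mysql -h node1 -N -B -e "SHOW STATUS LIKE 'wsrep%'" | status_value wsrep_cluster_conf_id, and then pass the three results to all_same. If all_same fails, one of the nodes is seeing a different cluster configuration than the others.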
