MariaDB Operations
Tips and tricks for managing and operating the MariaDB cluster within a Genestack environment.
Connect to the database
Sometimes an operator may need to connect to the database to troubleshoot things or otherwise make modifications to the databases in place. The following command can be used to connect to the database from a node within the cluster.
mysql -h $(kubectl -n openstack get service mariadb-cluster-primary -o jsonpath='{.spec.clusterIP}') \
-p$(kubectl --namespace openstack get secret mariadb -o jsonpath='{.data.root-password}' | base64 -d) \
-u root
Info
The following command will leverage your kube configuration and dynamically source the needed information to connect to the MySQL cluster. You will need to ensure you have installed the mysql client tools on the system you're attempting to connect from.
Manually dumping and restoring databases
When running mysqldump
or mariadbdump
the following commands can be useful for generating a quick backup.
Individual Database Backups
mysqldump --host=$(kubectl -n openstack get service mariadb-cluster -o jsonpath='{.spec.clusterIP}')\
--user=root \
--password=$(kubectl --namespace openstack get secret mariadb -o jsonpath='{.data.root-password}' | base64 -d) \
--single-transaction \
--routines \
--triggers \
--events \
--column-statistics=0 \
${DATABASE_NAME} \
--result-file=/tmp/${DATABASE_NAME}-$(date +%s).sql
Column Statistics
With some versions of mysqldump
the --column-statistics=0
flag maybe be required. If required the following error will be thrown:
All Databases Backup
Run the /opt/genestack/bin/backup-mariadb.sh
script to dump all databases as individual files in ~/backup/mariadb/$(date +%s)
.
Database Backup Script: /opt/genestack/bin/backup-mariadb.sh
#!/bin/bash
# shellcheck disable=SC2124,SC2145,SC2294,SC2086
# The script is used to backup the mariadb database in the openstack namespace
# The script will create a backup directory in the HOME directory with the current timestamp
# The script will dump all the databases except the performance_schema and information_schema
# The script will use the root password from the mariadb secret to connect to the database
# The script will use the clusterIP of the mariadb-cluster service to connect to the database
# The script will use the --column-statistics=0 option if available in the mysqldump command
# The script will create a separate dump file for each database
set -e
set -o pipefail
BACKUP_DIR="${HOME}/backup/mariadb/$(date +%s)"
MYSQL_PASSWORD="$(kubectl --namespace openstack get secret mariadb -o jsonpath='{.data.root-password}' | base64 -d)"
MYSQL_HOST=$(kubectl -n openstack get service mariadb-cluster -o jsonpath='{.spec.clusterIP}')
if mysqldump --help | grep -q column-statistics; then
MYSQL_DUMP_COLLUMN_STATISTICS="--column-statistics=0"
else
MYSQL_DUMP_COLLUMN_STATISTICS=""
fi
mkdir -p "${BACKUP_DIR}"
pushd "${BACKUP_DIR}"
mysql -h ${MYSQL_HOST} \
-u root \
-p${MYSQL_PASSWORD} \
-e 'show databases;' \
--column-names=false \
--vertical | \
awk '/[:alnum:]/ && ! /performance_schema/ && ! /information_schema/' | \
xargs -i mysqldump --host=${MYSQL_HOST} ${MYSQL_DUMP_COLLUMN_STATISTICS} \
--user=root \
--password=${MYSQL_PASSWORD} \
--single-transaction \
--routines \
--triggers \
--events \
--result-file={} \
{}
popd
echo -e "backup complete and available at ${BACKUP_DIR}"
Individual Database Restores
Ensure the destination database exists
The destination database must exist prior to restoring individual SQL
backups. If it does not already exist, it's important to create the
database with the correct charset and collate values. Failing to do so can
result in errors such as Foreign Key Constraint is Incorrectly Formed
during DB upgrades.
Restoring a database
Restore using the MariaDB CRD
To restore the most recent successful backup, create the following resource to spawn a job that will mount the same storage as the backup and apply the dump to your MariaDB database.
Refer to the mariadb-operator restore documentation for more information.
Operator Restore Tips
- If you have multiple backups available, the operator is able to infer
which backup to restore based on the
spec.targetRecoveryTime
field discussed in the operator documentation here. - The referred database (db1 in the example below) must previously exist for the Restore to succeed.
- The mariadb CLI invoked by the operator under the hood only supports
selecting a single database to restore via the
--one-database
option, restoration of multiple specific databases is not supported.
Restore All Databases
The following command may lead to data loss
Restore Single Database
The following command may lead to data loss
Check Restore Progress
Simply get the restore object previously created
Fixing Replication
The MariaDB Operator can handle most cluster issues automatically, but sometimes you’ll need to roll up your sleeves and step in to fix things. This guide walks you through repairing replication on a broken slave to keep your deployment up and running.
In this example, mariadb-cluster-0 is the broken slave, and we’ll use a backup from the current primary, mariadb-cluster-1, to kickstart replication again on the busted pod.
Prepare and Restore Backup
-
Take a full backup of the primary: mariadb-cluster-1
-
Copy the backup off of the pod, onto your machine
-
Copy the backup to the broken slave, mariadb-cluster-0
-
Restore the backup, depending on its contents it may take a while, be patient.
Stop and Reset the Slave
Execute on the broken slave pod, mariadb-cluster-0:
Find Master Log and Position
Identify master log file and position from the backup file:
[SJC3] ubuntu@bastion:~/backups$ grep "CHANGE MASTER TO MASTER_LOG_FILE='mariadb-cluster-bin." mariadb-cluster-1.sql
-- CHANGE MASTER TO MASTER_LOG_FILE='mariadb-cluster-bin.000206', MASTER_LOG_POS=405;
Update and Restart Slave
-
Change the values in the following command to include the master log file and position from your previous grep result, making sure to also replace the master password value with the one from your cluster along with the real MASTER_HOST from your environment, then execute it on the broken slave pod (in our example, that is mariadb-cluster-0).
CHANGE MASTER TO MASTER_HOST='mariadb-cluster-1.mariadb-cluster-internal.openstack.svc.cluster.local', MASTER_USER='repl', MASTER_PASSWORD='<FIND ME IN K8s secret repl-password-mariadb-cluster>', MASTER_LOG_FILE='mariadb-cluster-bin.000206', MASTER_LOG_POS=405;
If
CHANGE MASTER
fails...If the previous command to CHANGE MASTER fails, one may need to
FLUSH PRIVILEGES;
first. -
Start the slave process again
-
Verify replication status is OK
-
Wait for replication to be caught up, then kill the slave pod. We are doing this to ensure it comes back online as expected (the operator should automatically execute CHANGE MASTER for mariadb-operator on the slave). When the pod has started; logs should contain something like the following:
2025-01-28 22:22:55 61 [Note] Master connection name: 'mariadb-operator' Master_info_file: 'master-mariadb@002doperator.info' Relay_info_file: 'relay-log-mariadb@002doperator.info' 2025-01-28 22:22:55 61 [Note] 'CHANGE MASTER TO executed'. Previous state master_host='', master_port='3306', master_log_file='', master_log_pos='4'. New state master_host='mariadb-cluster-1.mariadb-cluster-internal.openstack.svc.cluster.local', master_port='3306', master_log_file='', master_log_pos='4'. 2025-01-28 22:22:55 61 [Note] Previous Using_Gtid=Slave_Pos. New Using_Gtid=Current_Pos 2025-01-28 22:22:55 63 [Note] Master 'mariadb-operator': Slave I/O thread: Start semi-sync replication to master 'repl@mariadb-cluster-1.mariadb-cluster-internal.openstack.svc.cluster.local:3306' in log '' at position 4 2025-01-28 22:22:55 64 [Note] Master 'mariadb-operator': Slave SQL thread initialized, starting replication in log 'FIRST' at position 4, relay log './mariadb-cluster-relay-bin-mariadb@002doperator.000001' position: 4; GTID position '0-11-638858622' 2025-01-28 22:22:55 63 [Note] Master 'mariadb-operator': Slave I/O thread: connected to master 'repl@mariadb-cluster-1.mariadb-cluster-internal.openstack.svc.cluster.local:3306',replication starts at GTID position '0-11-638858622'