mgr-on-k8s

Deploying MySQL Group Replication on K8s

MySQL is usually deployed on its own dedicated hosts; here we run it directly inside k8s, which makes unified management easier. Because the database is I/O-sensitive, network-backed PVs such as Ceph are a poor fit, so this article uses manually created local volumes as the backing storage. The parts of the MySQL configuration that differ per node are generated by a shell script, and a mysql-exporter sidecar is included for Prometheus scraping.

1. Prerequisites

Following the MGR tutorial in the official MySQL documentation, the settings below are required:

server_id=1
gtid_mode=ON
enforce_gtid_consistency=ON
master_info_repository=TABLE
relay_log_info_repository=TABLE
binlog_checksum=NONE
log_slave_updates=ON
log_bin=binlog
binlog_format=ROW

transaction_write_set_extraction=XXHASH64
loose-group_replication_group_name="aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
loose-group_replication_start_on_boot=off
loose-group_replication_local_address="mysql-0.mysql:24901"
loose-group_replication_group_seeds="mysql-0.mysql:24901,mysql-1.mysql:24901,mysql-2.mysql:24901"
loose-group_replication_bootstrap_group=off

In addition, because a bare hostname cannot be resolved inside k8s, the following setting is also needed:

report_host=mysql-0.mysql

The server_id, loose-group_replication_local_address, and report_host settings must be different for each pod. Following the example in the official docs at https://kubernetes.io/docs/tasks/run-application/run-replicated-stateful-application/, an initContainer runs a shell script to generate these dynamic values.

2. Create the configuration

apiVersion: v1
data:
  int-mysql.sh: |
    set -ex
    # Generate mysql server-id from pod ordinal index.
    [[ `hostname` =~ -([0-9]+)$ ]] || exit 1
    ordinal=${BASH_REMATCH[1]}
    echo [mysqld] > /etc/mysql/conf.d/server-id.cnf
    # Add an offset to avoid reserved server-id=0 value.
    echo server-id=$((100 + $ordinal)) >> /etc/mysql/conf.d/server-id.cnf
    # Append the per-pod group replication address and report_host.
    echo loose-group_replication_local_address=`hostname`.mysql:24901 >> /etc/mysql/conf.d/server-id.cnf
    echo report_host=`hostname`.mysql >> /etc/mysql/conf.d/server-id.cnf
    # Copy the static conf.d file from the config-map volume to the emptyDir.
    cp /mnt/conf/mysql-conf.cnf /etc/mysql/conf.d/
    mkdir -p /var/lib/mysql/{data,tmp,blog,rlog,ulog}
    chown -R mysql:mysql /var/lib/mysql
  mysql-conf.cnf: |
    [mysqld]
    skip-host-cache
    skip-name-resolve
    gtid_mode=ON
    enforce_gtid_consistency=ON
    master_info_repository=TABLE
    relay_log_info_repository=TABLE
    binlog_checksum=NONE
    log_slave_updates=ON
    #log_bin=binlog
    binlog_format=ROW
    transaction_write_set_extraction=XXHASH64
    loose-group_replication_group_name="75320cff-9d1f-4aea-883a-8f5406513a9c"
    loose-group_replication_start_on_boot=off
    loose-group_replication_bootstrap_group=off
    loose-group_replication_ip_whitelist=172.30.0.0/16
    loose-group_replication_group_seeds="mysql-0.mysql:24901,mysql-1.mysql:24901,mysql-2.mysql:24901"
    default_authentication_plugin=mysql_native_password

    event_scheduler=ON
    #************** basic ***************
    datadir=/var/lib/mysql/data

kind: ConfigMap
metadata:
  name: mysql-config
  namespace: test

First create the configmap. It holds two items: the static part of the configuration, and the shell script that generates the dynamic part. The setting default_authentication_plugin=mysql_native_password selects the pre-8.0 password scheme; it is recommended when running MySQL 8.0, otherwise many client libraries will be unable to connect.
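
Assuming the manifest above is saved as mysql-config.yaml (the file name here is just an example), it can be applied and inspected like this:

kubectl -n test apply -f mysql-config.yaml
kubectl -n test describe configmap mysql-config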

3. Create the headless service

apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
  namespace: test
spec:
  ports:
  - port: 3306
    name: client
  - port: 9104
    name: exporter
  clusterIP: None
  selector:
    app: mysql

Every StatefulSet needs a headless service like this; nothing special here.
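
With the headless service in place, each pod (once running) gets a stable DNS name of the form mysql-<ordinal>.mysql.test.svc.cluster.local, which is exactly what the group seeds above rely on. A quick resolution check from a throwaway busybox pod, sketched here:

kubectl -n test run dns-test --rm -ti --image=busybox --restart=Never -- nslookup mysql-0.mysql.test.svc.cluster.local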

4. Configure local volumes

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mysql-volume-a
  labels:
    volume-type: mysql
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-storage
  local:
    path: /home/dockerdata/mysql-a
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
          - 10.168.136.193

Adjust the path and the node name to match your environment. The directory in path must be created by hand; k8s will not create it for you. If you want each mysql pod scheduled onto a specific host, create the StatefulSet below first and wait until mysql-0 is Pending, then create the local volume on the host you want mysql-0 to land on; once mysql-1 turns Pending, create the second volume, and so on for the remaining pods.
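
A sketch of that flow for the first volume, assuming ssh access to the node and using the host and path from the example above (mysql-volume-a.yaml is whatever file you saved the PV manifest to):

ssh 10.168.136.193 mkdir -p /home/dockerdata/mysql-a
kubectl apply -f mysql-volume-a.yaml
kubectl -n test get pods -w    # mysql-0 should go from Pending to Running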

5. Create the StatefulSet

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    app: mysql
  name: mysql
  namespace: test
spec:
  replicas: 3
  serviceName: mysql
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "9104"
    spec:
      tolerations:
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
      initContainers:
      - name: init-mysql
        image: 10.168.136.193:5000/library/mysql:8.0.12
        command:
        - bash
        - "/mnt/conf/int-mysql.sh"
        volumeMounts:
        - name: config-volume
          mountPath: /mnt/conf/
          readOnly: true
        - name: conf
          mountPath: /etc/mysql/conf.d/
        - name: mysql-data
          mountPath: /var/lib/mysql/
      containers:
      - name: mysql
        image: 10.168.136.193:5000/library/mysql:8.0.12
        #command:
        #  - "mysql-server"
        #  - "/appconfig/mysql-conf"
        resources:
          limits:
            cpu: "1"
          requests:
            memory: "1024Mi"
        volumeMounts:
        - name: conf
          mountPath: /etc/mysql/conf.d/
        - name: mysql-data
          mountPath: /var/lib/mysql/
        ports:
        - containerPort: 3306
          name: client
        - containerPort: 24901
        env:
        - name: TZ
          value: "Asia/Shanghai"
        - name: MYSQL_ROOT_PASSWORD
          value: "1234asdf"
        - name: MYSQL_INITDB_SKIP_TZINFO
          value: "1"
      - name: mysql-exporter
        env:
        - name: DATA_SOURCE_NAME
          value: exporter:018P4NUDZNaqeStsSkmO0A@tcp(127.0.0.1:3306)/performance_schema
        image: 10.168.136.193:5000/prom/mysqld-exporter:v0.11.0
        ports:
        - containerPort: 9104
          name: exporter
      volumes:
      - name: config-volume
        configMap:
          name: mysql-config
      - name: conf
        emptyDir: {}
  volumeClaimTemplates:
  - metadata:
      name: mysql-data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: local-storage
      resources:
        requests:
          storage: 10Gi
      selector:
        matchLabels:
          volume-type: mysql

The pod contains three containers: init-mysql initializes the MySQL configuration, mysql is the database itself, and mysql-exporter exports metrics for monitoring.
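
Deploying it is the usual apply-and-watch, sketched below (the file name is ours); combined with step 4, create each local PV as its pod turns Pending:

kubectl -n test apply -f mysql-statefulset.yaml
kubectl -n test get pods -l app=mysql -w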

Note: the MYSQL_INITDB_SKIP_TZINFO environment variable must be set. The official image loads time zone data while initializing the container and does not disable the binlog while doing so, so those writes become local transactions; a secondary that later tries to join the group then errors out, complaining that it contains transactions not present in the group.
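
A quick way to check whether a freshly started instance has already accumulated local transactions before joining (standard MySQL; an empty result is what you want):

SELECT @@GLOBAL.gtid_executed;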

mysql-exporter takes its username and password from an environment variable here; alternatively, the credentials can be provided as a .my.cnf file from a configmap.
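
A minimal sketch of that alternative: put a .my.cnf key like the one below into a configmap, mount it into the exporter container, and point the exporter at it with its --config.my-cnf flag (the mount path and configmap layout are up to you; only the [client] section below is what mysqld-exporter reads):

[client]
user=exporter
password=018P4NUDZNaqeStsSkmO0A
host=127.0.0.1
port=3306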

6. Start MGR

6.1 Bootstrap the first node

Attach to the first pod:

kubectl -n test exec -ti mysql-0 -c mysql -- mysql

Then run the following MySQL statements:

SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;

CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';

SET GLOBAL group_replication_bootstrap_group=ON;
START GROUP_REPLICATION;
SET GLOBAL group_replication_bootstrap_group=OFF;

After those commands complete, check the result with the following query:

SELECT * FROM performance_schema.replication_group_members;

The member state shown should be ONLINE.
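
For a quicker read, a narrower projection works too (these columns all exist in MySQL 8.0's performance_schema):

SELECT MEMBER_HOST, MEMBER_STATE, MEMBER_ROLE FROM performance_schema.replication_group_members;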

6.2 Join the remaining nodes to the group

Run the following commands on the other two nodes:

SET SQL_LOG_BIN=0;
CREATE USER rpl_user@'%' IDENTIFIED BY 'password';
GRANT REPLICATION SLAVE ON *.* TO rpl_user@'%';
INSTALL PLUGIN group_replication SONAME 'group_replication.so';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;


CHANGE MASTER TO MASTER_USER='rpl_user', MASTER_PASSWORD='password' FOR CHANNEL 'group_replication_recovery';
START GROUP_REPLICATION;

Compared with the first node, only the group_replication_bootstrap_group toggle is omitted.
With that, MGR is up and running.

7. Create the exporter user

Log in to the primary node and create the exporter user; the change replicates to all members automatically.

CREATE USER 'exporter'@'127.0.0.1' IDENTIFIED BY '018P4NUDZNaqeStsSkmO0A';
GRANT PROCESS, REPLICATION CLIENT ON *.* TO 'exporter'@'127.0.0.1';
GRANT SELECT ON performance_schema.* TO 'exporter'@'127.0.0.1';

Then check whether the exporter is working with curl 172.30.77.6:9104/metrics | grep mysql_up, where 172.30.77.6 is the pod's IP.
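
If the pod network is not reachable from where you are, kubectl port-forward is an alternative, sketched here:

kubectl -n test port-forward mysql-0 9104:9104 &
curl -s localhost:9104/metrics | grep mysql_up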

8. Follow-up

Time zones: the setup above skipped loading the time zone tables. To fill them in later, attach to the primary pod and run:

kubectl -n test exec -ti mysql-0 -c mysql -- bash
mysql_tzinfo_to_sql /usr/share/zoneinfo | sed 's/Local time zone must be set--see zic manual page/FCTY/' | mysql

Honestly, I'm not entirely sure about the details here; roughly, mysql_tzinfo_to_sql converts the system zoneinfo files into SQL that populates the mysql time zone tables, and the sed rewrites one zone description that would otherwise break the import.

Once the group is running, it is best to add super_read_only to the configuration file. If a mysql instance crashes and restarts, there is a window before it rejoins the group; without this setting it will accept writes during that window, the data diverges, and the node will most likely never be able to rejoin the group.
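
A sketch of the extra line for the mysql-conf.cnf section of the configmap (super_read_only is a standard MySQL system variable; in single-primary mode, group replication clears it automatically on the elected primary):

# Refuse client writes until this node rejoins the group.
super_read_only=ON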