Etcd Cluster Management

etcd

2017/11/12

Etcd Cluster

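An etcd cluster uses a typical leader-follower model: the Raft protocol guarantees that at most one member acts as leader during any given term. Because electing a leader requires the votes of a majority of members, an odd number of members is recommended, as with most other distributed clusters, and the minimum is 3.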
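The arithmetic behind the odd-number recommendation can be sketched as follows. The function names `quorum` and `fault_tolerance` are illustrative helpers, not part of etcd; the sketch only assumes the standard Raft majority rule.

```shell
#!/bin/sh
# Majority (quorum) size for a cluster of n members.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a quorum still exists.
fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 2 3 4 5; do
    echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

A cluster of 4 tolerates the same single failure as a cluster of 3, so the extra member adds replication cost without adding fault tolerance.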

Building the Cluster

node1 10.211.55.6
node2 10.211.55.43
node3 10.211.55.4

Static Configuration

As with any other cluster, first update /etc/hosts on every node and synchronize the clocks.

node1

~]# ./etcd --name n1 \
    --initial-cluster-token cluster1 \
    --initial-cluster-state new \
    --listen-client-urls http://127.0.0.1:2379,http://10.211.55.6:2379 \
    --listen-peer-urls http://10.211.55.6:2380 \
    --advertise-client-urls http://10.211.55.6:2379 \
    --initial-advertise-peer-urls http://10.211.55.6:2380 \
    --initial-cluster n1=http://10.211.55.6:2380,n2=http://10.211.55.43:2380,n3=http://10.211.55.4:2380

node2

~]# ./etcd --name n2 \
    --initial-cluster-token cluster1 \
    --initial-cluster-state new \
    --listen-client-urls http://127.0.0.1:2379,http://10.211.55.43:2379 \
    --listen-peer-urls http://10.211.55.43:2380 \
    --advertise-client-urls http://10.211.55.43:2379 \
    --initial-advertise-peer-urls http://10.211.55.43:2380 \
    --initial-cluster n1=http://10.211.55.6:2380,n2=http://10.211.55.43:2380,n3=http://10.211.55.4:2380

node3

~]# ./etcd --name n3 \
    --initial-cluster-token cluster1 \
    --initial-cluster-state new \
    --listen-client-urls http://127.0.0.1:2379,http://10.211.55.4:2379 \
    --listen-peer-urls http://10.211.55.4:2380 \
    --advertise-client-urls http://10.211.55.4:2379 \
    --initial-advertise-peer-urls http://10.211.55.4:2380 \
    --initial-cluster n1=http://10.211.55.6:2380,n2=http://10.211.55.43:2380,n3=http://10.211.55.4:2380

~]# ./etcdctl cluster-health
member 126d5057628cf9e1 is healthy: got healthy result from http://10.211.55.43:2379
member 6da33978247d0f4c is healthy: got healthy result from http://10.211.55.6:2379
member d143f97740999fa6 is healthy: got healthy result from http://10.211.55.4:2379
cluster is healthy
~]# ./etcdctl member list
126d5057628cf9e1: name=n2 peerURLs=http://10.211.55.43:2380 clientURLs=http://10.211.55.43:2379 isLeader=true
6da33978247d0f4c: name=n1 peerURLs=http://10.211.55.6:2380 clientURLs=http://10.211.55.6:2379 isLeader=false
d143f97740999fa6: name=n3 peerURLs=http://10.211.55.4:2380 clientURLs=http://10.211.55.4:2379 isLeader=false
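All three start commands share the same --initial-cluster value, so it helps to generate it once rather than type it on each node. A minimal sketch, assuming every member uses peer port 2380 over plain HTTP as in this article; `initial_cluster` is a hypothetical helper name:

```shell
#!/bin/sh
# Build the --initial-cluster value from "name=ip" pairs.
# Assumes plain HTTP and peer port 2380 for every member.
initial_cluster() {
    out=""
    for pair in "$@"; do
        name=${pair%%=*}   # text before the first '='
        ip=${pair#*=}      # text after the first '='
        out="${out:+$out,}$name=http://$ip:2380"
    done
    echo "$out"
}

initial_cluster n1=10.211.55.6 n2=10.211.55.43 n3=10.211.55.4
```

The printed string can be passed unchanged to --initial-cluster on every node.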

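Dynamic Discovery

As shown above, static configuration requires listing every cluster member by hand in --initial-cluster, which becomes inconvenient when there are many nodes. For this reason CoreOS also provides an etcd discovery service.

First, request a unique discovery URL for the cluster; the size parameter is the expected number of members: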

~]# curl 'https://discovery.etcd.io/new?size=3'
https://discovery.etcd.io/21b5178b7787340a178d56cb11c8f685

node1

~]# ./etcd --name n1 \
    --initial-cluster-token cluster1 \
    --initial-cluster-state new \
    --listen-client-urls http://127.0.0.1:2379,http://10.211.55.6:2379 \
    --listen-peer-urls http://10.211.55.6:2380 \
    --advertise-client-urls http://10.211.55.6:2379 \
    --initial-advertise-peer-urls http://10.211.55.6:2380 \
    --discovery https://discovery.etcd.io/21b5178b7787340a178d56cb11c8f685

node2

~]# ./etcd --name n2 \
    --initial-cluster-token cluster1 \
    --initial-cluster-state new \
    --listen-client-urls http://127.0.0.1:2379,http://10.211.55.43:2379 \
    --listen-peer-urls http://10.211.55.43:2380 \
    --advertise-client-urls http://10.211.55.43:2379 \
    --initial-advertise-peer-urls http://10.211.55.43:2380 \
    --discovery https://discovery.etcd.io/21b5178b7787340a178d56cb11c8f685

node3

~]# ./etcd --name n3 \
    --initial-cluster-token cluster1 \
    --initial-cluster-state new \
    --listen-client-urls http://127.0.0.1:2379,http://10.211.55.4:2379 \
    --listen-peer-urls http://10.211.55.4:2380 \
    --advertise-client-urls http://10.211.55.4:2379 \
    --initial-advertise-peer-urls http://10.211.55.4:2380 \
    --discovery https://discovery.etcd.io/21b5178b7787340a178d56cb11c8f685

~]# ./etcdctl member list
126d5057628cf9e1: name=n2 peerURLs=http://10.211.55.43:2380 clientURLs=http://10.211.55.43:2379 isLeader=true
6da33978247d0f4c: name=n1 peerURLs=http://10.211.55.6:2380 clientURLs=http://10.211.55.6:2379 isLeader=false
d143f97740999fa6: name=n3 peerURLs=http://10.211.55.4:2380 clientURLs=http://10.211.55.4:2379 isLeader=false
~]# ./etcdctl cluster-health
member 126d5057628cf9e1 is healthy: got healthy result from http://10.211.55.43:2379
member 6da33978247d0f4c is healthy: got healthy result from http://10.211.55.6:2379
member d143f97740999fa6 is healthy: got healthy result from http://10.211.55.4:2379
cluster is healthy

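Cluster Parameter Configuration

Clustering gives us the ability to scale out, but it also exposes the service to additional factors: network jitter between members, clock synchronization, and the disk and network load caused by data replication. A cluster therefore needs more careful management.

Time Synchronization

Clock synchronization matters for every distributed cluster. In an etcd cluster, a clock difference of more than 1s between members is treated as abnormal and can disrupt the Raft protocol, so the clocks must be kept in sync.

Heartbeat Interval and Election Timeout

The election timeout must always be longer than the heartbeat interval; at least 5 times longer is generally recommended. Both values are given in milliseconds and are set with the --heartbeat-interval and --election-timeout flags.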
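A small pre-flight check of the 5x rule can catch a bad pair of values before etcd is started. This is only a sketch: `check_timeouts` is a hypothetical helper, and the values 100 and 1000 are etcd's default heartbeat interval and election timeout in milliseconds.

```shell
#!/bin/sh
# Succeeds when the election timeout is at least 5x the heartbeat interval.
# Both arguments are in milliseconds.
check_timeouts() {
    heartbeat_ms=$1
    election_ms=$2
    [ "$election_ms" -ge $(( heartbeat_ms * 5 )) ]
}

if check_timeouts 100 1000; then
    echo "ok: --heartbeat-interval 100 --election-timeout 1000"
else
    echo "election timeout is too short for this heartbeat interval" >&2
fi
```

On a high-latency network both values would typically be raised together, keeping the ratio intact.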

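Snapshot Frequency

etcd periodically persists its data as a snapshot, by default once every 10,000 changes (configurable with --snapshot-count). Writing a snapshot produces a burst of disk writes, which can affect cluster performance.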

Updating a Member

$ etcdctl member list
6e3bd23ae5f1eae0: name=node2 peerURLs=http://localhost:23802 clientURLs=http://127.0.0.1:23792
924e2e83e93f2560: name=node3 peerURLs=http://localhost:23803 clientURLs=http://127.0.0.1:23793
a8266ecf031671f3: name=node1 peerURLs=http://localhost:23801 clientURLs=http://127.0.0.1:23791

In this example, suppose we want to change the peerURLs of the member with ID a8266ecf031671f3 to http://10.0.1.10:2380:

$ etcdctl member update a8266ecf031671f3 http://10.0.1.10:2380
Updated member with ID a8266ecf031671f3 in cluster

Removing a Member

$ etcdctl member remove a8266ecf031671f3
Removed member a8266ecf031671f3 from cluster

Adding a Member

$ etcdctl member add infra3 http://10.0.1.13:2380
added member 9bf1b35fc7761a23 to cluster

ETCD_NAME="infra3"
ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
ETCD_INITIAL_CLUSTER_STATE=existing

Then run the following on the new etcd node:

$ export ETCD_NAME="infra3"
$ export ETCD_INITIAL_CLUSTER="infra0=http://10.0.1.10:2380,infra1=http://10.0.1.11:2380,infra2=http://10.0.1.12:2380,infra3=http://10.0.1.13:2380"
$ export ETCD_INITIAL_CLUSTER_STATE=existing
$ etcd --listen-client-urls http://10.0.1.13:2379 \
    --advertise-client-urls http://10.0.1.13:2379 \
    --listen-peer-urls http://10.0.1.13:2380 \
    --initial-advertise-peer-urls http://10.0.1.13:2380 \
    --data-dir %data_dir%
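The ETCD_INITIAL_CLUSTER value has to list the existing members plus the new one. If the output of etcdctl member add is no longer at hand, the existing part can be rebuilt from etcdctl member list. The sketch below assumes the v2 output format shown in this article (one member per line with name= and peerURLs= fields); `members_to_initial_cluster` is a hypothetical helper, and the sample input is the listing from the static configuration section.

```shell
#!/bin/sh
# Read `etcdctl member list` output on stdin and print "name=peerURL,..." pairs.
# Lines without both a name= and a peerURLs= field are skipped.
members_to_initial_cluster() {
    awk '
        {
            name = ""; peer = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^name=/)     name = substr($i, 6)
                if ($i ~ /^peerURLs=/) peer = substr($i, 10)
            }
            if (name != "" && peer != "")
                out = out (out == "" ? "" : ",") name "=" peer
        }
        END { print out }
    '
}

existing=$(members_to_initial_cluster <<'EOF'
126d5057628cf9e1: name=n2 peerURLs=http://10.211.55.43:2380 clientURLs=http://10.211.55.43:2379 isLeader=true
6da33978247d0f4c: name=n1 peerURLs=http://10.211.55.6:2380 clientURLs=http://10.211.55.6:2379 isLeader=false
d143f97740999fa6: name=n3 peerURLs=http://10.211.55.4:2380 clientURLs=http://10.211.55.4:2379 isLeader=false
EOF
)
echo "$existing"
```

In practice the input would come from `etcdctl member list | members_to_initial_cluster`, with the new member's name=peerURL pair appended to the result by hand.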

Failure Recovery and Migration

First, back up the data from a healthy node:

$ ./etcdctl backup --data-dir /var/lib/etcd --backup-dir /tmp/etcd_backup
$ tar -zcf backup.etcd.tar.gz -C /tmp/etcd_backup .

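Then restore the etcd data onto any one node of the new cluster and start etcd with the --force-new-cluster flag. This flag resets the cluster ID and all membership information; the node's listen address is reset to localhost:2379, meaning that the cluster contains only one member.

$ tar -zxvf backup.etcd.tar.gz -C /var/lib/etcd
$ etcd --data-dir=/var/lib/etcd --force-new-cluster ...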

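Once the single-node etcd is running, first verify the integrity of the data. After confirming that it is correct, use the etcd API to change the node's peer address (peerURLs) to the node's external IP address, in preparation for adding other members. For example:

Use etcdctl to find the ID of the current node.

$ etcdctl member list
98f0c6bf64240842: name=cd-2 peerURLs=http://127.0.0.1:2580 clientURLs=http://127.0.0.1:2579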

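The update below is made through the members API; on etcdctl versions that provide member update, as used earlier in this article, that command achieves the same result. Set peerURLs in the request body to the node's externally reachable peer URL (the example keeps the loopback address from the listing above).

$ curl http://127.0.0.1:2579/v2/members/98f0c6bf64240842 -XPUT \
    -H "Content-Type:application/json" -d '{"peerURLs":["http://127.0.0.1:2580"]}'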
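Hand-writing the JSON body is error-prone once a member has more than one peer URL. A minimal sketch of generating it is shown here; `peer_urls_json` is a hypothetical helper, and the address 10.0.1.10:2380 is only a stand-in for the node's real external peer URL.

```shell
#!/bin/sh
# Print the JSON body expected by PUT /v2/members/<id> for the given peer URLs.
peer_urls_json() {
    list=""
    for url in "$@"; do
        list="${list:+$list,}\"$url\""
    done
    printf '{"peerURLs":[%s]}\n' "$list"
}

peer_urls_json http://10.0.1.10:2380
```

The output can be passed to curl with -d "$(peer_urls_json ...)" in place of the literal string.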

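Note: the etcd documentation recommends first restoring the cluster into a temporary directory, starting etcd from that directory, and verifying that the new data is correct and complete. Only then should etcd be stopped and the data restored into the regular directory.

Finally, once the first member is running, the remaining members can be added with the etcdctl member add command, following the usual procedure for growing a cluster.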

Reference: https://www.cnblogs.com/breg/p/5728237.html

Author: Zhang Jusene

Original link: http://jusene.github.io/2017/11/12/etcd-cluster/

Published: November 12th 2017, 2:56:50 pm

Updated: November 30th 2019, 2:11:17 am
