Deploying ZooKeeper and etcd as stateful services

docker etcd zookeeper kubernetes

I. Overview

Kubernetes supports stateful applications such as ZooKeeper and etcd through StatefulSet, which provides:

stable, unique network identities for pods
stable persistent storage, implemented via PV/PVC
ordered startup and shutdown of pods
ordered, graceful deployment and scaling

This article walks through deploying ZooKeeper and etcd as stateful services on a k8s cluster, with Ceph providing data persistence.

II. Summary

k8s StatefulSet, StorageClass, PV and PVC combined with Ceph RBD support ZooKeeper- and etcd-style stateful services on Kubernetes well. k8s never deletes the PV/PVC objects it has created, to guard against accidental deletion; if you are sure you want a PV/PVC gone, you must also delete the corresponding RBD image on the Ceph side by hand.

A pitfall I hit: the Ceph client user referenced in the StorageClass must have mon rw and rbd rwx capabilities. Without mon write permission, releasing the RBD lock fails and the image cannot be mapped on another k8s worker node.

ZooKeeper: probes check each node's health; when a node is unhealthy, k8s deletes the pod and automatically recreates it, which amounts to restarting that ZooKeeper node. Because ZooKeeper 3.4 configures its ensemble by statically loading zoo.cfg, every node in the cluster must be restarted whenever a ZooKeeper pod IP changes.

etcd: the deployment method needs improvement. This experiment deploys the etcd cluster statically, so any membership change requires manual etcdctl member remove/add commands, which severely limits automatic failure recovery and scale-out/scale-in. Switching to a dynamic deployment based on DNS or etcd discovery would let etcd run much better on k8s.

III. ZooKeeper cluster deployment

1. Download the image

docker pull gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10
docker tag gcr.mirrors.ustc.edu.cn/google_containers/kubernetes-zookeeper:1.0-3.4.10 172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10
docker push 172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10

2. Define the Ceph secret

cat <<EOF | kubectl create -f -
apiVersion: v1
data:
  key: QVFBYy9ndGFRUno4QlJBQXMxTjR3WnlqN29PK3VrMzI1a05aZ3c9PQo=
kind: Secret
metadata:
  creationTimestamp: "2017-11-20T10:29:05Z"
  name: ceph-secret
  namespace: default
  resourceVersion: "2954730"
  selfLink: /api/v1/namespaces/default/secrets/ceph-secret
  uid: a288ff74-cddd-11e7-81cc-000c29f99475
type: kubernetes.io/rbd
EOF

3. Define the StorageClass (RBD storage)

cat <<EOF | kubectl create -f -
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph
parameters:
  adminId: admin
  adminSecretName: ceph-secret
  adminSecretNamespace: default
  fsType: ext4
  imageFormat: "2"
  imageFeatures: layering
  monitors: 172.16.13.223
  pool: k8s
  userId: admin
  userSecretName: ceph-secret
provisioner: kubernetes.io/rbd
reclaimPolicy: Delete
EOF

4. Create the ZooKeeper cluster

ZooKeeper node data is stored on RBD.

cat <<EOF | kubectl create -f -
---
apiVersion: v1
kind: Service
metadata:
  name: zk-hs
  labels:
    app: zk
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: Service
metadata:
  name: zk-cs
  labels:
    app: zk
spec:
  ports:
  - port: 2181
    name: client
  selector:
    app: zk
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  selector:
    matchLabels:
      app: zk
  maxUnavailable: 1
---
apiVersion: apps/v1beta2 # for versions before 1.8.0 use apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  selector:
    matchLabels:
      app: zk
  serviceName: zk-hs
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  podManagementPolicy: Parallel
  template:
    metadata:
      labels:
        app: zk
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - zk
            topologyKey: kubernetes.io/hostname
      containers:
      - name: kubernetes-zookeeper
        imagePullPolicy: Always
        image: 172.16.18.100:5000/gcr.io/google_containers/kubernetes-zookeeper:1.0-3.4.10
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        command:
        - sh
        - -c
        - "start-zookeeper \
          --servers=3 \
          --data_dir=/var/lib/zookeeper/data \
          --data_log_dir=/var/lib/zookeeper/data/log \
          --conf_dir=/opt/zookeeper/conf \
          --client_port=2181 \
          --election_port=3888 \
          --server_port=2888 \
          --tick_time=2000 \
          --init_limit=10 \
          --sync_limit=5 \
          --heap=512M \
          --max_client_cnxns=60 \
          --snap_retain_count=3 \
          --purge_interval=12 \
          --max_session_timeout=40000 \
          --min_session_timeout=4000 \
          --log_level=INFO"
        readinessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - sh
            - -c
            - "zookeeper-ready 2181"
          initialDelaySeconds: 10
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.beta.kubernetes.io/storage-class: ceph
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi
EOF

Check the result:

[root@172 zookeeper]# kubectl get no
NAME STATUS ROLES AGE VERSION
172.16.20.10 Ready <none> 50m v1.8.2
172.16.20.11 Ready <none> 2h v1.8.2
172.16.20.12 Ready <none> 1h v1.8.2

[root@172 zookeeper]# kubectl get po -owide
NAME READY STATUS RESTARTS AGE IP NODE
zk-0 1/1 Running 0 8m 192.168.5.162 172.16.20.10
zk-1 1/1 Running 0 1h 192.168.2.146 172.16.20.11

[root@172 zookeeper]# kubectl get pv,pvc
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pv/pvc-226cb8f0-d322-11e7-9581-000c29f99475 1Gi RWO Delete Bound default/datadir-zk-0 ceph 1h
pv/pvc-22703ece-d322-11e7-9581-000c29f99475 1Gi RWO Delete Bound default/datadir-zk-1 ceph 1h

NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
pvc/datadir-zk-0 Bound pvc-226cb8f0-d322-11e7-9581-000c29f99475 1Gi RWO ceph 1h
pvc/datadir-zk-1 Bound pvc-22703ece-d322-11e7-9581-000c29f99475 1Gi RWO ceph 1h

The RBD lock held for the zk-0 pod's image:

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker ID Address
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350

5. Test pod migration

Cordon node 172.16.20.10 (mark it unschedulable) so that zk-0 is forced to migrate to 172.16.20.12:

kubectl cordon 172.16.20.10

[root@172 zookeeper]# kubectl get no
NAME STATUS ROLES AGE VERSION
172.16.20.10 Ready,SchedulingDisabled <none> 58m v1.8.2
172.16.20.11 Ready <none> 2h v1.8.2
172.16.20.12 Ready <none> 1h v1.8.2

kubectl delete po zk-0

Watch zk-0 migrate:

[root@172 zookeeper]# kubectl get po -owide -w
NAME READY STATUS RESTARTS AGE IP NODE
zk-0 1/1 Running 0 14m 192.168.5.162 172.16.20.10
zk-1 1/1 Running 0 1h 192.168.2.146 172.16.20.11
zk-0 1/1 Terminating 0 16m 192.168.5.162 172.16.20.10
zk-0 0/1 Terminating 0 16m <none> 172.16.20.10
zk-0 0/1 Terminating 0 16m <none> 172.16.20.10
zk-0 0/1 Terminating 0 16m <none> 172.16.20.10
zk-0 0/1 Terminating 0 16m <none> 172.16.20.10
zk-0 0/1 Terminating 0 16m <none> 172.16.20.10
zk-0 0/1 Pending 0 0s <none> <none>
zk-0 0/1 Pending 0 0s <none> 172.16.20.12
zk-0 0/1 ContainerCreating 0 0s <none> 172.16.20.12
zk-0 0/1 Running 0 3s 192.168.3.4 172.16.20.12

zk-0 has migrated to 172.16.20.12 as expected. Check the RBD lock again:

[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker ID Address
client.24146 kubelet_lock_magic_172.16.20.10 172.16.20.10:0/1606152350
[root@ceph1 ceph]# rbd lock list kubernetes-dynamic-pvc-227b45e5-d322-11e7-90ab-000c29f99475 -p k8s --user admin
There is 1 exclusive lock on this image.
Locker ID Address
client.24154 kubelet_lock_magic_172.16.20.12 172.16.20.12:0/3715989358

When I previously tested this zk pod migration on another Ceph cluster, it always failed with an error about being unable to release the lock. Analysis pointed at the Ceph account in use lacking the required permissions, which made the lock release fail. The recorded errors:

Nov 27 10:45:55 172 kubelet: W1127 10:45:55.551768 11556 rbd_util.go:471] rbd: no watchers on kubernetes-dynamic-pvc-f35a411e-d317-11e7-90ab-000c29f99475
Nov 27 10:45:55 172 kubelet: I1127 10:45:55.694126 11556 rbd_util.go:181] remove orphaned locker kubelet_lock_magic_172.16.20.12 from client client.171490: err exit status 13, output: 2017-11-27 10:45:55.570483 7fbdbe922d40 -1 did not load config file, using default settings.
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600816 7fbdbe922d40 -1 Errors while parsing config file!
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600824 7fbdbe922d40 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600825 7fbdbe922d40 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.600825 7fbdbe922d40 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602492 7fbdbe922d40 -1 Errors while parsing config file!
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602494 7fbdbe922d40 -1 parse_file: cannot open /etc/ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602495 7fbdbe922d40 -1 parse_file: cannot open ~/.ceph/ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.602496 7fbdbe922d40 -1 parse_file: cannot open ceph.conf: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.651594 7fbdbe922d40 -1 auth: unable to find a keyring on /etc/ceph/ceph.client.k8s.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,: (2) No such file or directory
Nov 27 10:45:55 172 kubelet: rbd: releasing lock failed: (13) Permission denied
Nov 27 10:45:55 172 kubelet: 2017-11-27 10:45:55.682470 7fbdbe922d40 -1 librbd: unable to blacklist client: (13) Permission denied

The relevant k8s rbd volume code:

	if lock {
		// check if lock is already held for this host by matching lock_id and rbd lock id
		if strings.Contains(output, lock_id) {
			// this host already holds the lock, exit
			glog.V(1).Infof("rbd: lock already held for %s", lock_id)
			return nil
		}
		// clean up orphaned lock if no watcher on the image
		used, statusErr := util.rbdStatus(&b)
		if statusErr == nil && !used {
			re := regexp.MustCompile("client.* " + kubeLockMagic + ".*")
			locks := re.FindAllStringSubmatch(output, -1)
			for _, v := range locks {
				if len(v) > 0 {
					lockInfo := strings.Split(v[0], " ")
					if len(lockInfo) > 2 {
						args := []string{"lock", "remove", b.Image, lockInfo[1], lockInfo[0], "--pool", b.Pool, "--id", b.Id, "-m", mon}
						args = append(args, secret_opt...)
						cmd, err = b.exec.Run("rbd", args...)
						// the rbd lock remove command returned an error here
						glog.Infof("remove orphaned locker %s from client %s: err %v, output: %s", lockInfo[1], lockInfo[0], err, string(cmd))
					}
				}
			}
		}
		// hold a lock: rbd lock add
		args := []string{"lock", "add", b.Image, lock_id, "--pool", b.Pool, "--id", b.Id, "-m", mon}
		args = append(args, secret_opt...)
		cmd, err = b.exec.Run("rbd", args...)
	}

As the log shows, the rbd lock remove operation was rejected for lack of permission: rbd: releasing lock failed: (13) Permission denied.

6. Test scale-out

Grow the ZooKeeper cluster from 2 nodes to 3. With 2 nodes, zoo.cfg defines two servers:

zookeeper@zk-0:/opt/zookeeper/conf$ cat zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888

Use kubectl edit statefulset zk to set replicas=3 and start-zookeeper --servers=3, then watch the pods:

[root@172 zookeeper]# kubectl get po -owide -w
NAME READY STATUS RESTARTS AGE IP NODE
zk-0 1/1 Running 0 1h 192.168.5.170 172.16.20.10
zk-1 1/1 Running 0 1h 192.168.3.12 172.16.20.12
zk-2 0/1 Pending 0 0s <none> <none>
zk-2 0/1 Pending 0 0s <none> 172.16.20.11
zk-2 0/1 ContainerCreating 0 0s <none> 172.16.20.11
zk-2 0/1 Running 0 1s 192.168.2.154 172.16.20.11
zk-2 1/1 Running 0 11s 192.168.2.154 172.16.20.11
zk-1 1/1 Terminating 0 1h 192.168.3.12 172.16.20.12
zk-1 0/1 Terminating 0 1h <none> 172.16.20.12
zk-1 0/1 Terminating 0 1h <none> 172.16.20.12
zk-1 0/1 Terminating 0 1h <none> 172.16.20.12
zk-1 0/1 Terminating 0 1h <none> 172.16.20.12
zk-1 0/1 Pending 0 0s <none> <none>
zk-1 0/1 Pending 0 0s <none> 172.16.20.12
zk-1 0/1 ContainerCreating 0 0s <none> 172.16.20.12
zk-1 0/1 Running 0 2s 192.168.3.13 172.16.20.12
zk-1 1/1 Running 0 20s 192.168.3.13 172.16.20.12
zk-0 1/1 Terminating 0 1h 192.168.5.170 172.16.20.10
zk-0 0/1 Terminating 0 1h <none> 172.16.20.10
zk-0 0/1 Terminating 0 1h <none> 172.16.20.10
zk-0 0/1 Terminating 0 1h <none> 172.16.20.10
zk-0 0/1 Terminating 0 1h <none> 172.16.20.10
zk-0 0/1 Pending 0 0s <none> <none>
zk-0 0/1 Pending 0 0s <none> 172.16.20.10
zk-0 0/1 ContainerCreating 0 0s <none> 172.16.20.10
zk-0 0/1 Running 0 2s 192.168.5.171 172.16.20.10
zk-0 1/1 Running 0 12s 192.168.5.171 172.16.20.10

Both zk-0 and zk-1 were restarted, so they pick up the new zoo.cfg and the cluster stays correctly configured.

The new zoo.cfg lists 3 servers:

[root@172 ~]# kubectl exec zk-0 -- cat /opt/zookeeper/conf/zoo.cfg
#This file was autogenerated DO NOT EDIT
clientPort=2181
dataDir=/var/lib/zookeeper/data
dataLogDir=/var/lib/zookeeper/data/log
tickTime=2000
initLimit=10
syncLimit=5
maxClientCnxns=60
minSessionTimeout=4000
maxSessionTimeout=40000
autopurge.snapRetainCount=3
autopurge.purgeInteval=12
server.1=zk-0.zk-hs.default.svc.cluster.local:2888:3888
server.2=zk-1.zk-hs.default.svc.cluster.local:2888:3888
server.3=zk-2.zk-hs.default.svc.cluster.local:2888:3888

7. Test scale-in

During scale-in, the zk cluster again restarted all of its nodes. The process:

[root@172 ~]# kubectl get po -owide -w
NAME READY STATUS RESTARTS AGE IP NODE
zk-0 1/1 Running 0 5m 192.168.5.171 172.16.20.10
zk-1 1/1 Running 0 6m 192.168.3.13 172.16.20.12
zk-2 1/1 Running 0 7m 192.168.2.154 172.16.20.11
zk-2 1/1 Terminating 0 7m 192.168.2.154 172.16.20.11
zk-1 1/1 Terminating 0 7m 192.168.3.13 172.16.20.12
zk-2 0/1 Terminating 0 8m <none> 172.16.20.11
zk-1 0/1 Terminating 0 7m <none> 172.16.20.12
zk-2 0/1 Terminating 0 8m <none> 172.16.20.11
zk-1 0/1 Terminating 0 7m <none> 172.16.20.12
zk-1 0/1 Terminating 0 7m <none> 172.16.20.12
zk-1 0/1 Terminating 0 7m <none> 172.16.20.12
zk-1 0/1 Pending 0 0s <none> <none>
zk-1 0/1 Pending 0 0s <none> 172.16.20.12
zk-1 0/1 ContainerCreating 0 0s <none> 172.16.20.12
zk-1 0/1 Running 0 2s 192.168.3.14 172.16.20.12
zk-2 0/1 Terminating 0 8m <none> 172.16.20.11
zk-2 0/1 Terminating 0 8m <none> 172.16.20.11
zk-1 1/1 Running 0 19s 192.168.3.14 172.16.20.12
zk-0 1/1 Terminating 0 7m 192.168.5.171 172.16.20.10
zk-0 0/1 Terminating 0 7m <none> 172.16.20.10
zk-0 0/1 Terminating 0 7m <none> 172.16.20.10
zk-0 0/1 Terminating 0 7m <none> 172.16.20.10
zk-0 0/1 Pending 0 0s <none> <none>
zk-0 0/1 Pending 0 0s <none> 172.16.20.10
zk-0 0/1 ContainerCreating 0 0s <none> 172.16.20.10
zk-0 0/1 Running 0 3s 192.168.5.172 172.16.20.10
zk-0 1/1 Running 0 13s 192.168.5.172 172.16.20.10

IV. etcd cluster deployment

1. Create the etcd cluster

cat <<EOF | kubectl create -f -
apiVersion: v1
kind: Service
metadata:
  name: etcd
  annotations:
    # Create endpoints also if the related pod isn't ready
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  ports:
  - port: 2379
    name: client
  - port: 2380
    name: peer
  clusterIP: None
  selector:
    component: etcd
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: etcd
  labels:
    component: etcd
spec:
  serviceName: etcd
  # changing replicas value will require a manual etcdctl member remove/add
  # command (remove before decreasing and add after increasing)
  replicas: 3
  template:
    metadata:
      name: etcd
      labels:
        component: etcd
    spec:
      containers:
      - name: etcd
        image: 172.16.18.100:5000/quay.io/coreos/etcd:v3.2.3
        ports:
        - containerPort: 2379
          name: client
        - containerPort: 2380
          name: peer
        env:
        - name: CLUSTER_SIZE
          value: "3"
        - name: SET_NAME
          value: etcd
        volumeMounts:
        - name: data
          mountPath: /var/run/etcd
        command:
        - /bin/sh
        - -ecx
        - |
          IP=$(hostname -i)
          for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
            while true; do
              echo "Waiting for ${SET_NAME}-${i}.${SET_NAME} to come up"
              ping -W 1 -c 1 ${SET_NAME}-${i}.${SET_NAME}.default.svc.cluster.local > /dev/null && break
              sleep 1s
            done
          done
          PEERS=""
          for i in $(seq 0 $((${CLUSTER_SIZE} - 1))); do
            PEERS="${PEERS}${PEERS:+,}${SET_NAME}-${i}=http://${SET_NAME}-${i}.${SET_NAME}.default.svc.cluster.local:2380"
          done
          # start etcd. If cluster is already initialized the --initial-* options will be ignored.
          exec etcd --name ${HOSTNAME} \
            --listen-peer-urls http://${IP}:2380 \
            --listen-client-urls http://${IP}:2379,http://127.0.0.1:2379 \
            --advertise-client-urls http://${HOSTNAME}.${SET_NAME}:2379 \
            --initial-advertise-peer-urls http://${HOSTNAME}.${SET_NAME}:2380 \
            --initial-cluster-token etcd-cluster-1 \
            --initial-cluster ${PEERS} \
            --initial-cluster-state new \
            --data-dir /var/run/etcd/default.etcd
  ## We are using dynamic pv provisioning using the standard storage class so
  ## this resource can be directly deployed without changes to minikube (since
  ## minikube defines this class for its minikube hostpath provisioner). In
  ## production define your own way to use pv claims.
  volumeClaimTemplates:
  - metadata:
      name: data
      annotations:
        volume.beta.kubernetes.io/storage-class: ceph
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 1Gi
EOF

The resulting pod list:

[root@172 etcd]# kubectl get po -owide
NAME READY STATUS RESTARTS AGE IP NODE
etcd-0 1/1 Running 0 15m 192.168.5.174 172.16.20.10
etcd-1 1/1 Running 0 15m 192.168.3.16 172.16.20.12
etcd-2 1/1 Running 0 5s 192.168.5.176 172.16.20.10

2. Test scale-in

kubectl scale statefulset etcd --replicas=2

[root@172 ~]# kubectl get po -owide -w
NAME READY STATUS RESTARTS AGE IP NODE
etcd-0 1/1 Running 0 17m 192.168.5.174 172.16.20.10
etcd-1 1/1 Running 0 17m 192.168.3.16 172.16.20.12
etcd-2 1/1 Running 0 1m 192.168.5.176 172.16.20.10
etcd-2 1/1 Terminating 0 1m 192.168.5.176 172.16.20.10
etcd-2 0/1 Terminating 0 1m <none> 172.16.20.10

Check cluster health:

kubectl exec etcd-0 -- etcdctl cluster-health

failed to check the health of member 42c8b94265b9b79a on http://etcd-2.etcd:2379: Get http://etcd-2.etcd:2379/health: dial tcp: lookup etcd-2.etcd on 10.96.0.10:53: no such host
member 42c8b94265b9b79a is unreachable: [http://etcd-2.etcd:2379] are all unreachable
member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379
member c799a6ef06bc8c14 is healthy: got healthy result from http://etcd-0.etcd:2379
cluster is healthy

After the scale-in, etcd-2 was not automatically removed from the etcd cluster, so this etcd image's support for automatic scale-out/scale-in is clearly limited. Remove etcd-2 manually:

[root@172 etcd]# kubectl exec etcd-0 -- etcdctl member remove 42c8b94265b9b79a
Removed member 42c8b94265b9b79a from cluster
[root@172 etcd]# kubectl exec etcd-0 -- etcdctl cluster-health
member 9869f0647883a00d is healthy: got healthy result from http://etcd-1.etcd:2379
member c799a6ef06bc8c14 is healthy: got healthy result from http://etcd-0.etcd:2379
cluster is healthy 3. 测试扩容 从etcd.yaml的启动脚本中可以看出扩容时新启动一个etcd pod时参数--initial-cluster-state new该etcd镜像并不支持动态扩容可以考虑使用基于dns动态部署etcd集群的方式来修改启动脚本这样才能支持etcd cluster动态扩容。
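The manual cleanup can be scripted: the id of the dead member is simply the second whitespace-separated field of the matching cluster-health line. A minimal sketch, using the sample line from the output above (the final etcdctl call is left commented out because it needs the live cluster):

```shell
# Sample unhealthy-member line, as printed by `etcdctl cluster-health` above.
LINE='member 42c8b94265b9b79a is unreachable: [http://etcd-2.etcd:2379] are all unreachable'

# The member id is the second whitespace-separated field.
MEMBER_ID=$(echo "$LINE" | awk '{print $2}')
echo "$MEMBER_ID"    # prints 42c8b94265b9b79a

# Against the live cluster (assumes etcd-0 is still healthy):
# kubectl exec etcd-0 -- etcdctl member remove "$MEMBER_ID"
```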