StatefulSet
StatefulSet is designed for stateful services (whereas Deployments and ReplicaSets are designed for stateless services). Its use cases include:
    Stable persistent storage: a Pod can access the same persistent data after being rescheduled, implemented with PVCs
    Stable network identity: a Pod keeps its PodName and HostName after being rescheduled, implemented with a Headless Service (a Service without a Cluster IP)
    Ordered deployment and scaling: Pods are ordered, and deployment or scale-up proceeds strictly in the defined order (from 0 to N-1; all preceding Pods must be Running and Ready before the next Pod starts), implemented with init containers
    Ordered scale-down and deletion (from N-1 to 0)
These use cases show that a StatefulSet consists of the following parts:
    A Headless Service that defines the network identity (DNS domain)
    volumeClaimTemplates used to create PersistentVolumes
    The StatefulSet itself, which defines the application
The DNS name of each Pod in a StatefulSet has the form statefulSetName-{0..N-1}.serviceName.namespace.svc.cluster.local, where
    serviceName is the name of the Headless Service
    0..N-1 is the Pod's ordinal, from 0 to N-1
    statefulSetName is the name of the StatefulSet
    namespace is the namespace the service lives in; the Headless Service and the StatefulSet must be in the same namespace
    .cluster.local is the cluster domain
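As a minimal illustration of this naming scheme (assuming a StatefulSet named web with Headless Service nginx in the default namespace, matching the example later on this page), the resulting Pod DNS names can be printed like so:

```shell
# Illustrative only: DNS names for a 2-replica StatefulSet "web"
# whose Headless Service is "nginx" in namespace "default"
for i in 0 1; do
  echo "web-${i}.nginx.default.svc.cluster.local"
done
```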

API version mapping

| Kubernetes version | API version        |
| ------------------ | ------------------ |
| v1.5-v1.6          | extensions/v1beta1 |
| v1.7-v1.15         | apps/v1beta1       |
| v1.8-v1.15         | apps/v1beta2       |
| v1.9+              | apps/v1            |

A simple example

Take a simple nginx service, web.yaml, as an example:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```
```shell
$ kubectl create -f web.yaml
service "nginx" created
statefulset "web" created

# Check the created headless service and statefulset
$ kubectl get service nginx
NAME      CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
nginx     None         <none>        80/TCP    1m
$ kubectl get statefulset web
NAME      DESIRED   CURRENT   AGE
web       2         2         2m

# PVCs are created automatically from volumeClaimTemplates
# (on GCE, volumes of type kubernetes.io/gce-pd are provisioned automatically)
$ kubectl get pvc
NAME        STATUS    VOLUME                                     CAPACITY   ACCESSMODES   AGE
www-web-0   Bound     pvc-d064a004-d8d4-11e6-b521-42010a800002   1Gi        RWO           16s
www-web-1   Bound     pvc-d06a3946-d8d4-11e6-b521-42010a800002   1Gi        RWO           16s

# Check the created Pods; note they are ordered
$ kubectl get pods -l app=nginx
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          5m
web-1     1/1       Running   0          4m

# Use nslookup to inspect the Pods' DNS records
$ kubectl run -i --tty --image busybox dns-test --restart=Never --rm /bin/sh
/ # nslookup web-0.nginx
Server:    10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-0.nginx
Address 1: 10.244.2.10
/ # nslookup web-1.nginx
Server:    10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-1.nginx
Address 1: 10.244.3.12
/ # nslookup web-0.nginx.default.svc.cluster.local
Server:    10.0.0.10
Address 1: 10.0.0.10 kube-dns.kube-system.svc.cluster.local

Name:      web-0.nginx.default.svc.cluster.local
Address 1: 10.244.2.10
```
Other operations are also possible:

```shell
# Scale up
$ kubectl scale statefulset web --replicas=5

# Scale down
$ kubectl patch statefulset web -p '{"spec":{"replicas":3}}'

# Update the image (updating the image directly is not yet supported; use patch as a workaround)
$ kubectl patch statefulset web --type='json' -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"gcr.io/google_containers/nginx-slim:0.7"}]'

# Delete the StatefulSet and the Headless Service
$ kubectl delete statefulset web
$ kubectl delete service nginx

# PVCs are retained after the StatefulSet is deleted;
# delete them as well once the data is no longer needed
$ kubectl delete pvc www-web-0 www-web-1
```

Updating a StatefulSet

v1.7+ supports automatic StatefulSet updates, configured through spec.updateStrategy. Two strategies are currently supported:
    OnDelete: when .spec.template changes, old Pods are not deleted automatically; new Pods are created only after the user manually deletes the old ones. This is the default strategy and is compatible with the v1.6 behavior
    RollingUpdate: when .spec.template changes, old Pods are deleted and replaced with new ones automatically. Pods are updated in reverse ordinal order, one at a time: each Pod must become Ready before the next one is updated
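As a sketch, the update strategy is declared on the StatefulSet spec; the fragment below (not a complete manifest) opts in to rolling updates:

```yaml
# Fragment of a StatefulSet spec
spec:
  updateStrategy:
    type: RollingUpdate   # or OnDelete (the default, v1.6-compatible behavior)
```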

Partitions

RollingUpdate also supports partitions, set via .spec.updateStrategy.rollingUpdate.partition. When a partition is set, only Pods with an ordinal greater than or equal to the partition are rolled when .spec.template changes; all other Pods stay unchanged (even if deleted, they are recreated from the previous version).
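The same setting can also be written declaratively; the spec fragment below uses partition 3 as an illustrative value:

```yaml
# Fragment: only Pods with ordinal >= 3 are rolled on template changes
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 3
```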
```shell
# Set partition to 3
$ kubectl patch statefulset web -p '{"spec":{"updateStrategy":{"type":"RollingUpdate","rollingUpdate":{"partition":3}}}}'
statefulset "web" patched

# Update the StatefulSet
$ kubectl patch statefulset web --type='json' -p='[{"op":"replace","path":"/spec/template/spec/containers/0/image","value":"gcr.io/google_containers/nginx-slim:0.7"}]'
statefulset "web" patched

# Verify the update
$ kubectl delete po web-2
pod "web-2" deleted
$ kubectl get po -lapp=nginx -w
NAME      READY     STATUS              RESTARTS   AGE
web-0     1/1       Running             0          4m
web-1     1/1       Running             0          4m
web-2     0/1       ContainerCreating   0          11s
web-2     1/1       Running             0          18s
```

Pod management policies

v1.7+ can set a Pod management policy via .spec.podManagementPolicy. Two policies are supported:
    OrderedReady: the default policy; Pods are created one at a time in ordinal order, each waiting until the previous one is Ready
    Parallel: Pods are created or deleted in parallel (all Pods are started without waiting for earlier ones to become Ready)

Parallel example

```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: "nginx"
  podManagementPolicy: "Parallel"
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: gcr.io/google_containers/nginx-slim:0.8
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```
As you can see, all the Pods are created in parallel:

```shell
$ kubectl create -f webp.yaml
service "nginx" created
statefulset "web" created

$ kubectl get po -lapp=nginx -w
NAME      READY     STATUS              RESTARTS   AGE
web-0     0/1       Pending             0          0s
web-0     0/1       Pending             0          0s
web-1     0/1       Pending             0          0s
web-1     0/1       Pending             0          0s
web-0     0/1       ContainerCreating   0          0s
web-1     0/1       ContainerCreating   0          0s
web-0     1/1       Running             0          10s
web-1     1/1       Running             0          10s
```

zookeeper

Another example that better demonstrates the power of StatefulSet is zookeeper.yaml:
```yaml
---
apiVersion: v1
kind: Service
metadata:
  name: zk-headless
  labels:
    app: zk-headless
spec:
  ports:
  - port: 2888
    name: server
  - port: 3888
    name: leader-election
  clusterIP: None
  selector:
    app: zk
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: zk-config
data:
  ensemble: "zk-0;zk-1;zk-2"
  jvm.heap: "2G"
  tick: "2000"
  init: "10"
  sync: "5"
  client.cnxns: "60"
  snap.retain: "3"
  purge.interval: "1"
---
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-budget
spec:
  selector:
    matchLabels:
      app: zk
  minAvailable: 2
---
apiVersion: apps/v1beta1
kind: StatefulSet
metadata:
  name: zk
spec:
  serviceName: zk-headless
  replicas: 3
  template:
    metadata:
      labels:
        app: zk
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"
        scheduler.alpha.kubernetes.io/affinity: >
            {
              "podAntiAffinity": {
                "requiredDuringSchedulingRequiredDuringExecution": [{
                  "labelSelector": {
                    "matchExpressions": [{
                      "key": "app",
                      "operator": "In",
                      "values": ["zk-headless"]
                    }]
                  },
                  "topologyKey": "kubernetes.io/hostname"
                }]
              }
            }
    spec:
      containers:
      - name: k8szk
        imagePullPolicy: Always
        image: gcr.io/google_samples/k8szk:v1
        resources:
          requests:
            memory: "4Gi"
            cpu: "1"
        ports:
        - containerPort: 2181
          name: client
        - containerPort: 2888
          name: server
        - containerPort: 3888
          name: leader-election
        env:
        - name: ZK_ENSEMBLE
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: ensemble
        - name: ZK_HEAP_SIZE
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: jvm.heap
        - name: ZK_TICK_TIME
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: tick
        - name: ZK_INIT_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: init
        - name: ZK_SYNC_LIMIT
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: sync
        - name: ZK_MAX_CLIENT_CNXNS
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: client.cnxns
        - name: ZK_SNAP_RETAIN_COUNT
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: snap.retain
        - name: ZK_PURGE_INTERVAL
          valueFrom:
            configMapKeyRef:
              name: zk-config
              key: purge.interval
        - name: ZK_CLIENT_PORT
          value: "2181"
        - name: ZK_SERVER_PORT
          value: "2888"
        - name: ZK_ELECTION_PORT
          value: "3888"
        command:
        - sh
        - -c
        - zkGenConfig.sh && zkServer.sh start-foreground
        readinessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 15
          timeoutSeconds: 5
        livenessProbe:
          exec:
            command:
            - "zkOk.sh"
          initialDelaySeconds: 15
          timeoutSeconds: 5
        volumeMounts:
        - name: datadir
          mountPath: /var/lib/zookeeper
      securityContext:
        runAsUser: 1000
        fsGroup: 1000
  volumeClaimTemplates:
  - metadata:
      name: datadir
      annotations:
        volume.alpha.kubernetes.io/storage-class: anything
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 20Gi
```

```shell
kubectl create -f zookeeper.yaml
```
See zookeeper stateful application for detailed usage instructions.

StatefulSet considerations

    1. Recommended for use on Kubernetes v1.9 or later
    2. All Pod volumes must use PersistentVolumes or be created by an administrator in advance
    3. To keep data safe, deleting a StatefulSet does not delete its volumes
    4. A StatefulSet requires a Headless Service to define its DNS domain; the Service must be created before the StatefulSet