# 步骤10 可用性-健康检查 默认情况下,如果容器出于任何原因崩溃,Kubernetes将重新它。它使用Liveness和Readiness探针,可以将其配置识别运行状况良好的容器以向其发送流量并在需要时重启它们,保证应用程序运行健壮。 1. Kubernetes中使用Liveness探针来了解Pod是alive or dead。Pod可能由于各种原因而处于dead状态。当Liveness探针未通过时,Kubernetes将杀死并重新创建Pod。 2. Kubernetes中使用了Readiness探针,以了解Pod何时准备接收流量。仅当Readiness探针通过时,Pod才会从服务中接收流量。如果就绪探针失败,则不会将流量发送到Pod。 > 本节目的 1. 我们将了解如何定义Liveness和Readiness探针,并针对Pod的不同状态对其进行测试 10.1 配置 Liveness Probe 1. Liveness Probe determines how kubelet should check the container in order to consider whether it is healthy or not. 2. The kubelet uses the periodSeconds field to do frequent check on the Container. 3. The initialDelaySeconds field is used to tell kubelet that it should wait for several seconds before doing the first probe. 4. To perform a probe, in this case, kubelet sends a HTTP GET request to the server hosting this pod and if the handler for the servers /health returns a success code, then the container is considered healthy. If the handler returns a failure code, the kubelet kills the container and restarts it. ```bash # 部署 Liveness Probe kubectl apply -f healthchecks/liveness-app.yaml # Check status kubectl get pod liveness-app NAME READY STATUS RESTARTS AGE liveness-app 1/1 Running 0 6s kubectl describe pod liveness-app Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 41s default-scheduler Successfully assigned default/liveness-app to ip-192-168-14-19.cn-northwest-1.compute.internal Normal Pulling 40s kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Pulling image "brentley/ecsdemo-nodejs" Normal Pulled 37s kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Successfully pulled image "brentley/ecsdemo-nodejs" Normal Created 37s kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Created container liveness Normal Started 36s kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Started container liveness ``` ### 模拟 Liveness 失败 ```bash # Simulate failure kubectl exec -it liveness-app -- /bin/kill -s SIGUSR1 1 # check how liveness probe work kubectl describe pod liveness-app Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 6m6s default-scheduler Successfully assigned default/liveness-app to ip-192-168-14-19.cn-northwest-1.compute.internal Warning Unhealthy 57s (x3 over 67s) kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Liveness probe failed: Get http://192.168.29.229:3000/health: net/http: request canceled (Client.Timeout exceeded while awaiting headers) Normal Killing 57s kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Container liveness failed liveness probe, will be restarted Normal Pulling 27s (x2 over 6m5s) kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Pulling image "brentley/ecsdemo-nodejs" Normal Pulled 24s (x2 over 6m2s) kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Successfully pulled image "brentley/ecsdemo-nodejs" Normal Created 24s (x2 over 6m2s) kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Created container liveness Normal Started 24s (x2 over 6m1s) kubelet, ip-192-168-14-19.cn-northwest-1.compute.internal Started container liveness kubectl get pod liveness-app NAME READY STATUS RESTARTS AGE liveness-app 1/1 Running 1 6m26s # check logs kubectl logs liveness-app kubectl logs liveness-app --previous ``` 10.2 配置 Readiness Probe ```bash # 部署 Readiness Probe kubectl apply -f healthchecks/readiness-deployment.yaml # check status of pod and replicas kubectl get pods -l app=readiness-deployment NAME READY STATUS RESTARTS AGE readiness-deployment-7d8df88986-sgtvz 1/1 Running 0 5m31s readiness-deployment-7d8df88986-xxvt4 1/1 Running 0 5m31s readiness-deployment-7d8df88986-zqczd 1/1 Running 0 5m31s kubectl describe deployment readiness-deployment | grep Replicas: Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable ``` ### 模拟 Readiness 失败 ```bash kubectl exec -it -- rm /tmp/healthy kubectl get pods -l app=readiness-deployment NAME READY STATUS RESTARTS AGE readiness-deployment-7d8df88986-sgtvz 0/1 Running 0 6m37s readiness-deployment-7d8df88986-xxvt4 1/1 Running 0 6m37s readiness-deployment-7d8df88986-zqczd 1/1 Running 0 6m37s # Traffic will not be routed to the first pod in the above deployment. The ready column confirms that the readiness probe for this pod did not pass and hence was marked as not ready. # check for the replicas that are available to serve traffic when a service is pointed to this deployment. kubectl describe deployment readiness-deployment | grep Replicas: Replicas: 3 desired | 3 updated | 3 total | 2 available | 1 unavailable # Make pod recreate the /tmp/healthy file. Then the pod can pass the probe, it getting marked as ready and will begin to receive traffic again. kubectl exec -it -- touch /tmp/healthy kubectl get pods -l app=readiness-deployment NAME READY STATUS RESTARTS AGE readiness-deployment-7d8df88986-sgtvz 1/1 Running 0 7m30s readiness-deployment-7d8df88986-xxvt4 1/1 Running 0 7m30s readiness-deployment-7d8df88986-zqczd 1/1 Running 0 7m30s kubectl describe deployment readiness-deployment | grep Replicas: Replicas: 3 desired | 3 updated | 3 total | 3 available | 0 unavailable ``` ## Cleanup ```bash kubectl delete -f healthchecks/liveness-app.yaml kubectl delete -f healthchecks/readiness-deployment.yaml ```