Prometheus Operator?

Concepts

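  • Operator : the Prometheus Operator, which watches ServiceMonitor resources and configures Prometheus scraping from them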
  • Service : an application running on Kubernetes; in this post, the Redis cluster
  • ServiceMonitor : a resource that describes how to scrape the Service above, that is, the Redis cluster

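Pods on Kubernetes come and go, but a ServiceMonitor picks its scrape targets by label, not by individual Pod.
The Operator only has to watch the ServiceMonitor, so from the operator's point of view application deployment and day-to-day monitoring management are effectively automated.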
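
To make the label-based selection concrete, here is a minimal sketch of what such a ServiceMonitor could look like, written to a local file for review. The resource name, the `metrics` port name, and the 30s interval are assumptions for illustration, not values taken from the chart.

```shell
# Write a sketch ServiceMonitor manifest to a local file (nothing is applied to a cluster).
cat > servicemonitor-sketch.yaml <<'EOF'
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: redis-cluster-sketch          # hypothetical name
spec:
  selector:
    matchLabels:                      # Services carrying this label are scraped
      app.kubernetes.io/name: redis-cluster
  endpoints:
    - port: metrics                   # assumed name of the exporter port on the Service
      interval: 30s                   # assumed scrape interval
EOF

echo "wrote servicemonitor-sketch.yaml"
```

Because the selector matches on labels, Pods behind the Service can be added or removed without touching this manifest.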

Let's build the monitoring

Modify the existing redis-cluster Helm chart

  • Edit the Helm values.yaml
## Prometheus Exporter / Metrics

metrics:
  enabled: true
  
  # Enable this if you're using https://github.com/coreos/prometheus-operator
  serviceMonitor:
    enabled: true

=> This setting runs a Prometheus Exporter inside each redis-cluster Pod.
Enabling both metrics and serviceMonitor sets up metric collection for the cluster.
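
One way to apply these two settings without editing the chart's own values.yaml is a small override file. This is a sketch under assumptions: the release name `kimdubi-test` and chart version 4.3.1 are inferred from the Pod names and labels shown in this post, the `bitnami/redis-cluster` chart reference is a guess at where the chart came from, and `apply_redis_metrics` is a hypothetical helper.

```shell
# Override file containing only the metrics settings shown above.
cat > redis-metrics-values.yaml <<'EOF'
metrics:
  enabled: true
  serviceMonitor:
    enabled: true
EOF

# Hypothetical helper: call it only against a real cluster.
# --reuse-values keeps the values of the existing release and merges the override on top.
apply_redis_metrics() {
  helm upgrade kimdubi-test bitnami/redis-cluster \
    --version 4.3.1 \
    --reuse-values \
    -f redis-metrics-values.yaml
}
```

Running `apply_redis_metrics` should roll the Pods so that each one gains the exporter container.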

  • Verify redis-exporter
$ kubectl get pod
NAME                                              READY   STATUS      RESTARTS   AGE
kimdubi-test-redis-cluster-0                      2/2     Running     0          3m21s
kimdubi-test-redis-cluster-1                      2/2     Running     0          3m53s
kimdubi-test-redis-cluster-2                      2/2     Running     0          4m28s
kimdubi-test-redis-cluster-3                      2/2     Running     0          5m12s
kimdubi-test-redis-cluster-4                      2/2     Running     0          5m53s
kimdubi-test-redis-cluster-5                      2/2     Running     0          6m18s

$ kubectl exec -it kimdubi-test-redis-cluster-0 -c metrics -- /bin/sh

$ ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
1001         1     0  0 07:35 ?        00:00:00 redis_exporter

=> The READY count of each Pod went from 1/1 to 2/2, meaning two containers now run in a single Pod.
When connecting to a Pod, the -c option selects the container by name.

Install prometheus-operator

  • Download prometheus-operator
$ helm fetch stable/prometheus-operator
$ tar zxvpf prometheus-operator-9.3.2.tgz
  • Edit the Helm chart
### vi ~/prometheus-operator/charts/grafana

persistence:
  type: pvc
  enabled: true
  storageClassName: kimdubi-test
  accessModes:
    - ReadWriteOnce
  size: 1Gi
  • helm install
$ helm install --namespace monitoring --create-namespace monitoring ./prometheus-operator
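
As an alternative to editing the unpacked Grafana subchart, the same persistence settings can usually be passed from the parent chart under a `grafana:` key, since Helm forwards values under a dependency's name to that subchart. The sketch below assumes the subchart is registered under the name `grafana`; the file name and the `install_monitoring` helper are hypothetical.

```shell
# Override file: the persistence block from above, nested under the subchart name.
cat > grafana-persistence-values.yaml <<'EOF'
grafana:
  persistence:
    type: pvc
    enabled: true
    storageClassName: kimdubi-test
    accessModes:
      - ReadWriteOnce
    size: 1Gi
EOF

# Hypothetical helper: same install command as above, plus the override file.
install_monitoring() {
  helm install --namespace monitoring --create-namespace monitoring ./prometheus-operator \
    -f grafana-persistence-values.yaml
}
```

Keeping the override in a separate file leaves the downloaded chart untouched, which makes later chart upgrades easier to compare.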
  • Verify the deployment
$ helm ls -n monitoring

NAME      	NAMESPACE 	REVISION	UPDATED                             	STATUS  	CHART                    	APP VERSION
monitoring	monitoring	1       	2021-03-17 16:13:07.047828 +0900 KST	deployed	prometheus-operator-9.3.2	0.38.1

$ kubectl get all -n monitoring
NAME                                                         READY   STATUS    RESTARTS   AGE
pod/alertmanager-monitoring-prometheus-oper-alertmanager-0   2/2     Running   0          4m13s
pod/monitoring-grafana-5694798c88-dhnqf                      2/2     Running   0          4m20s
pod/monitoring-kube-state-metrics-5f4d9ddc46-qfmrj           1/1     Running   0          4m20s
pod/monitoring-prometheus-node-exporter-6vd26                1/1     Running   0          4m20s
pod/monitoring-prometheus-node-exporter-r99gf                1/1     Running   0          4m20s
pod/monitoring-prometheus-node-exporter-x4qfw                1/1     Running   0          4m20s
pod/monitoring-prometheus-oper-operator-8bd4fb5b8-7xr9q      2/2     Running   0          4m20s
pod/prometheus-monitoring-prometheus-oper-prometheus-0       3/3     Running   1          4m3s

NAME                                              TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                      AGE
service/alertmanager-operated                     ClusterIP   None             <none>        9093/TCP,9094/TCP,9094/UDP   4m13s
service/monitoring-grafana                        ClusterIP   10.254.108.201   <none>        80/TCP                       4m20s
service/monitoring-kube-state-metrics             ClusterIP   10.254.87.45     <none>        8080/TCP                     4m20s
service/monitoring-prometheus-node-exporter       ClusterIP   10.254.160.157   <none>        9100/TCP                     4m20s
service/monitoring-prometheus-oper-alertmanager   ClusterIP   10.254.129.73    <none>        9093/TCP                     4m20s
service/monitoring-prometheus-oper-operator       ClusterIP   10.254.163.131   <none>        8080/TCP,443/TCP             4m20s
service/monitoring-prometheus-oper-prometheus     ClusterIP   10.254.29.59     <none>        9090/TCP                     4m20s
service/prometheus-operated                       ClusterIP   None             <none>        9090/TCP                     4m3s

NAME                                                 DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/monitoring-prometheus-node-exporter   3         3         3       3            3           <none>          4m20s

NAME                                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/monitoring-grafana                    1/1     1            1           4m20s
deployment.apps/monitoring-kube-state-metrics         1/1     1            1           4m20s
deployment.apps/monitoring-prometheus-oper-operator   1/1     1            1           4m20s

NAME                                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/monitoring-grafana-5694798c88                   1         1         1       4m20s
replicaset.apps/monitoring-kube-state-metrics-5f4d9ddc46        1         1         1       4m20s
replicaset.apps/monitoring-prometheus-oper-operator-8bd4fb5b8   1         1         1       4m20s

NAME                                                                    READY   AGE
statefulset.apps/alertmanager-monitoring-prometheus-oper-alertmanager   1/1     4m13s
statefulset.apps/prometheus-monitoring-prometheus-oper-prometheus       1/1     4m13s


$ kubectl get servicemonitor -n monitoring
NAME                                                 AGE
monitoring-prometheus-oper-alertmanager              23h
monitoring-prometheus-oper-apiserver                 23h
monitoring-prometheus-oper-coredns                   23h
monitoring-prometheus-oper-grafana                   23h
monitoring-prometheus-oper-kube-controller-manager   23h
monitoring-prometheus-oper-kube-etcd                 23h
monitoring-prometheus-oper-kube-proxy                23h
monitoring-prometheus-oper-kube-scheduler            23h
monitoring-prometheus-oper-kube-state-metrics        23h
monitoring-prometheus-oper-kubelet                   23h
monitoring-prometheus-oper-node-exporter             23h
monitoring-prometheus-oper-operator                  23h
monitoring-prometheus-oper-prometheus                23h


$ kubectl get crd -n monitoring
NAME                                    CREATED AT
alertmanagers.monitoring.coreos.com     2021-03-15T15:13:43Z
podmonitors.monitoring.coreos.com       2021-03-15T15:13:43Z
probes.monitoring.coreos.com            2021-03-15T15:13:43Z
prometheuses.monitoring.coreos.com      2021-03-15T15:13:44Z
prometheusrules.monitoring.coreos.com   2021-03-15T15:13:44Z
servicemonitors.monitoring.coreos.com   2021-03-15T15:13:44Z
thanosrulers.monitoring.coreos.com      2021-03-15T15:13:44Z

Integrate with prometheus-operator

  • Edit the grafana Service
$ kubectl edit service/monitoring-grafana -n monitoring

spec:
  clusterIP: 10.254.108.201
  externalTrafficPolicy: Cluster
  ports:
  - name: service
    nodePort: 31000
    port: 80
    protocol: TCP
    targetPort: 3000
  selector:
    app.kubernetes.io/instance: monitoring
    app.kubernetes.io/name: grafana
  sessionAffinity: None
  type: NodePort

=> To reach the grafana Pod through an external port, change the type of its Service from ClusterIP to NodePort
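
The same change can be made non-interactively with a patch file, which is easier to repeat than `kubectl edit`. This is a sketch, assuming a kubectl version that supports `--patch-file`; the file name and the `patch_grafana_service` helper are hypothetical.

```shell
# Patch that switches the Service type and pins the node port used in this post.
cat > grafana-nodeport-patch.yaml <<'EOF'
spec:
  type: NodePort
  ports:
    - port: 80          # identifies the existing port entry to modify
      nodePort: 31000
EOF

# Hypothetical helper: call it only against a real cluster.
patch_grafana_service() {
  kubectl patch service monitoring-grafana -n monitoring \
    --patch-file grafana-nodeport-patch.yaml
}
```

After `patch_grafana_service`, `kubectl get service monitoring-grafana -n monitoring` should report the type as NodePort with port 80 mapped to 31000.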

  • Edit the redis-cluster ServiceMonitor
$ kubectl edit servicemonitor kimdubi-test-redis-cluster

  labels:
    app.kubernetes.io/instance: kimdubi-test
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: redis-cluster
    helm.sh/chart: redis-cluster-4.3.1
    release: monitoring

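=> Adding release: monitoring to the labels lets prometheus-operator recognize this ServiceMonitor.
There is no need to touch the individual redis Pods. Only the ServiceMonitor that the Operator watches has to be edited, so scaling the Pods out or in does not affect monitoring.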
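
Why this one label matters can be sketched locally. The snippet below imitates a matchLabels check in plain shell, assuming Prometheus selects ServiceMonitors labeled `release=monitoring` as the edit above implies. `matches_selector` is a hypothetical helper for illustration, not a kubectl feature.

```shell
# Hypothetical helper: succeeds when every key=value pair in the selector ($2)
# also appears in the resource's labels ($1). Both are comma-separated lists.
matches_selector() {
  labels=",$1,"
  old_ifs=$IFS
  IFS=,
  for pair in $2; do
    case "$labels" in
      *",$pair,"*) ;;                 # this selector pair is present
      *) IFS=$old_ifs; return 1 ;;    # a pair is missing: not selected
    esac
  done
  IFS=$old_ifs
  return 0
}

before="app.kubernetes.io/instance=kimdubi-test,app.kubernetes.io/name=redis-cluster"
after="$before,release=monitoring"

matches_selector "$before" "release=monitoring" && echo "before: selected" || echo "before: ignored"
matches_selector "$after" "release=monitoring" && echo "after: selected" || echo "after: ignored"
# prints "before: ignored" then "after: selected"
```

On a real cluster, `kubectl label servicemonitor kimdubi-test-redis-cluster release=monitoring` adds the label without opening an editor.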

  • For external access, assign a floating IP (FIP) to one of the Kubernetes nodes

  • Confirm access to the Grafana dashboard

  • Default login: admin / prom-operator