K8s | Linuxer Blog

helm-sentry-install-fail

Installing Sentry with Helm fails:

    helm install sentry sentry/sentry
    coalesce.go:175: warning: skipped value for kafka.config: Not a table.
    coalesce.go:175: warning: skipped value for kafka.zookeeper.topologySpreadConstraints: Not a table.
    W1023 08:00:35.276931 15594 warnings.go:70] spec.template.spec.containers[0].env[39]: hides previous definition of "KAFKA_ENABLE_KRAFT"
    Error: INSTALLATION FAILED: failed post-install: 1 error occurred:
    * job failed: DeadlineExceeded

The error is "job failed: DeadlineExceeded". The failing job is the one that checks whether the databases have come up properly.

    k get job
    NAME              COMPLETIONS   DURATION   AGE
    sentry-db-check   0/1           5m23s      5m23s

This Job verifies the following.

      name: sentry-db-check
      namespace: sentry
      resourceVersion: "4700657"
      uid: 12533bba-b35b-4b7d-9007-8c625b389a98
    spec:
      activeDeadlineSeconds: 1000
      backoffLimit: 6
      completionMode: NonIndexed
      completions: 1
      parallelism: 1
      selector:
        matchLabels:
          batch.kubernetes.io/controller-uid: 12533bba-b35b-4b7d-9007-8c625b389a98
      suspend: false
      template:
        metadata:
          creationTimestamp: null
          labels:
            app: sentry
            batch.kubernetes.io/controller-uid: 12533bba-b35b-4b7d-9007-8c625b389a98
            batch.kubernetes.io/job-name: sentry-db-check
            controller-uid: 12533bba-b35b-4b7d-9007-8c625b389a98
            job-name: sentry-db-check
            release: sentry
          name: sentry-db-check
        spec:
          containers:
          - command:
            - /bin/sh
            - -c
            - |
              echo "Checking if clickhouse is up"
              CLICKHOUSE_STATUS=0
              while [ $CLICKHOUSE_STATUS -eq 0 ]; do
                CLICKHOUSE_STATUS=1
                CLICKHOUSE_REPLICAS=3
                i=0; while [ $i -lt $CLICKHOUSE_REPLICAS ]; do
                  CLICKHOUSE_HOST=sentry-clickhouse-$i.sentry-clickhouse-headless
                  if ! nc -z "$CLICKHOUSE_HOST" 9000; then
                    CLICKHOUSE_STATUS=0
                    echo "$CLICKHOUSE_HOST is not available yet"
                  fi
                  i=$((i+1))
                done
                if [ "$CLICKHOUSE_STATUS" -eq 0 ]; then
                  echo "Clickhouse not ready. Sleeping for 10s before trying again"
                  sleep 10;
                fi
              done
              echo "Clickhouse is up"
              echo "Checking if kafka is up"
              KAFKA_STATUS=0
              while [ $KAFKA_STATUS -eq 0 ]; do
                KAFKA_STATUS=1
                KAFKA_REPLICAS=3
                i=0; while [ $i -lt $KAFKA_REPLICAS ]; do
                  KAFKA_HOST=sentry-kafka-$i.sentry-kafka-headless
                  if ! nc -z "$KAFKA_HOST" 9092; then
                    KAFKA_STATUS=0
                    echo "$KAFKA_HOST is not available yet"
                  fi
                  i=$((i+1))
                done
                if [ "$KAFKA_STATUS" -eq 0 ]; then
                  echo "Kafka not ready. Sleeping for 10s before trying again"
                  sleep 10;
                fi
              done
              echo "Kafka is up"
            image: subfuzion/netcat:latest
            imagePullPolicy: IfNotPresent
            name: db-check
            resources:
              limits:
                memory: 64Mi
              requests:
                cpu: 100m
                memory: 64Mi
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          terminationGracePeriodSeconds: 30

The job can only succeed once ClickHouse and Kafka are running. Bringing them up takes a long time, so giving the hook more time lets the job wait longer. Increase activeDeadlineSeconds in the Helm values.yaml. ...
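As a rough sketch of the change described above, the snippet below writes a small values override and passes it to the install. The key path hooks.activeDeadlineSeconds is an assumption about the chart layout, not something the excerpt confirms, so check it with `helm show values sentry/sentry` first. The helper names and the 1800-second value are hypothetical.

```shell
#!/bin/sh
# Write a values override that raises the hook deadline.
# ASSUMPTION: the chart reads the deadline from hooks.activeDeadlineSeconds.
# Confirm with: helm show values sentry/sentry | grep -n activeDeadlineSeconds
write_hook_override() {
  # $1 = output file, $2 = deadline in seconds
  cat > "$1" <<EOF
hooks:
  activeDeadlineSeconds: $2
EOF
}

# Install with the override. --timeout is Helm's own limit for waiting on
# hooks (default 5m0s), so it may need to be raised together with the
# job deadline.
install_sentry() {
  # $1 = override file, $2 = Helm timeout such as 30m
  helm install sentry sentry/sentry -f "$1" --timeout "$2"
}
```

Usage would look like `write_hook_override override.yaml 1800` followed by `install_sentry override.yaml 30m`.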

PKOS-5Week-Final!

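This is the final week. Week 5 was about security. I'd like to start with the Kubernetes security incident that comes to mind first. Rancher is an open-source tool that more people use than you might expect. A cluster with Rancher installed started showing strange symptoms. The deployed Pods were not performing properly, and CPU usage on the worker nodes was extremely high. We could not find the cause from outside the cluster, so in the end we connected to the worker nodes over SSH to check. We found a large number of mining tools on the worker nodes. Because this was a private environment where SSH from outside was not even possible, the breach was raised as a platform problem. Then we checked the dashboards running in the cluster. At the time nobody was really familiar with Kubernetes, so checking took a while. ...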

PKOS-kOps-2Week

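ㅠㅠ I'm going to start with tears. I intend to focus on the study. Gasida, I'm sorry I did so little of the homework all this time... (sobbing) With the apology out of the way, let me get a bit more serious and start by explaining kOps.

kOps stands for Kubernetes Operations. It is an open-source tool that makes it easy to install, upgrade, and manage Kubernetes clusters on AWS (Amazon Web Services). With kOps you can configure a cluster through the CLI (Command Line Interface), and you can easily define the cluster in YAML files.

kOps provides a number of features.

Cluster configuration: kOps makes it easy to set up a Kubernetes cluster. You define the cluster configuration in YAML files, and kOps provisions the AWS resources and deploys the configuration. ...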

AKOS-Study-Manual-EKS-istio

I provisioned the cluster first. It takes more than 30 minutes, so I kicked it off and waited.

    eksctl create cluster --vpc-public-subnets $WKSubnets --name $CLUSTER_NAME --region $AWS_REGION --version 1.21 \
      --nodegroup-name $CLUSTER_NAME-nodegroup --node-type t3.medium --nodes 3 --nodes-min 3 --nodes-max 6 \
      --with-oidc --node-volume-size=20 --ssh-access --ssh-public-key $MySSHKeypair
    2021-09-04 11:29:11 [ℹ] eksctl version 0.63.0
    2021-09-04 11:29:11 [ℹ] using region ap-northeast-2
    2021-09-04 11:29:12 [✔] using existing VPC (vpc-094808933b68add7c) and subnets (private:map[] public:map[ap-northeast-2a:{subnet-0a603a222db0cce10 ap-northeast-2a 10.0.11.0/24} ap-northeast-2b:{subnet-007964ce4a003361a ap-northeast-2b 10.0.12.0/24} ap-northeast-2c:{subnet-007813cf58631ef3b ap-northeast-2c 10.0.13.0/24}])
    2021-09-04 11:29:12 [!] custom VPC/subnets will be used; if resulting cluster doesn't function as expected, make sure to review the configuration of VPC/subnets
    2021-09-04 11:29:12 [ℹ] nodegroup "first-eks-nodegroup" will use "" [AmazonLinux2/1.21]
    2021-09-04 11:29:12 [ℹ] using EC2 key pair %!q(*string=<nil>)
    2021-09-04 11:29:12 [ℹ] using Kubernetes version 1.21
    2021-09-04 11:29:12 [ℹ] creating EKS cluster "first-eks" in "ap-northeast-2" region with managed nodes
    2021-09-04 11:29:12 [ℹ] will create 2 separate CloudFormation stacks for cluster itself and the initial managed nodegroup
    2021-09-04 11:29:12 [ℹ] if you encounter any issues, check CloudFormation console or try 'eksctl utils describe-stacks --region=ap-northeast-2 --cluster=first-eks'
    2021-09-04 11:29:12 [ℹ] CloudWatch logging will not be enabled for cluster "first-eks" in "ap-northeast-2"
    2021-09-04 11:29:12 [ℹ] you can enable it with 'eksctl utils update-cluster-logging --enable-types={SPECIFY-YOUR-LOG-TYPES-HERE (e.g. all)} --region=ap-northeast-2 --cluster=first-eks'
    2021-09-04 11:29:12 [ℹ] Kubernetes API endpoint access will use default of {publicAccess=true, privateAccess=false} for cluster "first-eks" in "ap-northeast-2"
    2021-09-04 11:29:12 [ℹ] 2 sequential tasks: { create cluster control plane "first-eks", 3 sequential sub-tasks: { 4 sequential sub-tasks: { wait for control plane to become ready, associate IAM OIDC provider, 2 sequential sub-tasks: { create IAM role for serviceaccount "kube-system/aws-node", create serviceaccount "kube-system/aws-node" }, restart daemonset "kube-system/aws-node" }, 1 task: { create addons }, create managed nodegroup "first-eks-nodegroup" } }
    2021-09-04 11:29:12 [ℹ] building cluster stack "eksctl-first-eks-cluster"
    2021-09-04 11:29:12 [ℹ] deploying stack "eksctl-first-eks-cluster"
    2021-09-04 11:29:42 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:30:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:31:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:32:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:33:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:34:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:35:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:36:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:37:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:38:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:39:12 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:40:13 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:41:13 [ℹ] waiting for CloudFormation stack "eksctl-first-eks-cluster"
    2021-09-04 11:45:14 [ℹ] building iamserviceaccount stack "eksctl-first-eks-addon-iamserviceaccount-kube-system-aws-node"
    2021-09-04 11:45:14 [ℹ] deploying stack "eksctl-first-eks-addon-iamserviceaccount-kube-system-aws-node"

For the EKS setup process itself, please refer to the previous post. ...

NKS-Linuxer-Blog-trouble-shooting-lifecycle-not-working

It hasn't been long since I migrated the blog, so this is an intensive monitoring period. First I look at resources.

    k top node
    NAME                  CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
    nks-pool-1119-w-gzi   223m         5%     1265Mi          16%
    nks-pool-1119-w-kvi   172m         4%     1540Mi          20%

    k top pod
    NAME                                             CPU(cores)   MEMORY(bytes)
    php-fpm-nginx-deployment-6bc7b6df77-fbdx9        9m           138Mi
    storage-nfs-client-provisioner-5b88c7c55-dvtlj   2m           8Mi

Resource usage is low, but in case it grows I scaled the deployment out.

    k scale deployment php-fpm-nginx-deployment --replicas=3
    deployment.apps/php-fpm-nginx-deployment scaled

Then I checked the pods...

    k get pod
    NAME                                             READY   STATUS              RESTARTS   AGE
    php-fpm-nginx-deployment-6bc7b6df77-bpf2g        2/2     Running             0          19s
    php-fpm-nginx-deployment-6bc7b6df77-fbdx9        2/2     Running             3          32h
    php-fpm-nginx-deployment-6bc7b6df77-rfpb2        0/2     ContainerCreating   0          19s
    storage-nfs-client-provisioner-5b88c7c55-dvtlj   1/1     Running             0          10h

One pod was stuck in the creation stage. When I checked its status ...

NKS-Linuxer-Blog-Rebuilding

I decided to rebuild the blog.

https://www.linuxer.name/posts/aws-linuxer의-블로그-톺아보기/

That is the blog architecture I completed in February 2020, which means I have been milking it for well over a year already. I thought about moving the blog to a lighter, more convenient structure, but... I failed. ㅠㅠ It is not about ability or anything like that; I simply lost to laziness. I started this post to beat that laziness. The goal is to rebuild the blog on K8S, and I wanted to take advantage of what K8S offers.

The first thing I worked on was PHP, the foundation of WordPress. For PHP I wrote the Dockerfile first.

    FROM php:7.4-fpm
    RUN apt-get update \
        && apt-get install -y --no-install-recommends \
        libpng-dev \
        libzip-dev \
        libicu-dev \
        libzip4 \
        && pecl install xdebug \
        && docker-php-ext-install opcache \
        && docker-php-ext-enable xdebug \
        && docker-php-ext-install pdo_mysql \
        && docker-php-ext-install exif \
        && docker-php-ext-install zip \
        && docker-php-ext-install gd \
        && docker-php-ext-install intl \
        && docker-php-ext-install mysqli
    # Clear cache
    RUN apt-get clean && rm -rf /var/lib/apt/lists/*
    WORKDIR /srv/app
    RUN cp /usr/local/etc/php/php.ini-production /usr/local/etc/php/php.ini
    RUN echo "date.timezone=Asia/Seoul" >> /usr/local/etc/php/php.ini
    RUN sed -i --follow-symlinks 's|127.0.0.1:9000|/run/php-fpm.sock|g' /usr/local/etc/php-fpm.d/www.conf
    RUN sed -i --follow-symlinks 's|short_open_tag = Off|short_open_tag = On|g' /usr/local/etc/php/php.ini
    RUN sed -i --follow-symlinks 's|9000|/run/php-fpm.sock|g' /usr/local/etc/php-fpm.d/zz-docker.conf
    CMD ["php-fpm"]

There were a few changes. First, I used a Unix socket instead of a TCP socket. It is also commonly called a file socket, and it speeds up the socket communication between nginx and php-fpm. It only works when nginx and php-fpm are on the same server. Also, zz-docker.conf gets installed when you use the docker package to install extensions in the php image, and this conf file contains a setting that prevents the Unix socket from being used. ...
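To see what the www.conf substitution in the Dockerfile does, here is a minimal sketch that applies the same sed expression to a throwaway file instead of editing in place. The helper name to_unix_socket and the sample listen line are assumptions for illustration only.

```shell
#!/bin/sh
# Rewrite a php-fpm pool config from a TCP listener to a Unix socket.
# Same expression as the Dockerfile, but printed to stdout rather than
# edited in place, so the input file is left untouched.
to_unix_socket() {
  # $1 = path to a pool config such as www.conf
  sed 's|127.0.0.1:9000|/run/php-fpm.sock|g' "$1"
}
```

For example, a file containing `listen = 127.0.0.1:9000` comes out as `listen = /run/php-fpm.sock`. On the nginx side the matching setting would be along the lines of `fastcgi_pass unix:/run/php-fpm.sock;`, assuming both processes can see the same /run path.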

Certified Kubernetes Administrator-CKA-Review

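It has been some ten months since I made up my mind to get the CKA.. I have finally reached the destination I had been mulling over since last July. First, let me talk about where I stood before taking the exam. I handled containers reasonably well and mostly designed ECS-based architectures. As for EKS, I had used it on my own and could talk about it in rough terms. While changing jobs I studied NKS, and I felt my understanding of managed K8S had improved to some degree. I had also studied in my own way through DKOS (Docker Kubernetes Online Study), so I was confident. I failed the first attempt with a score of 49. I had actually answered every question, so the score puzzled me. Looking into it after failing, I learned that the exam expects specific answers. ...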

K8s-one-line-Challenge

I'm the one who threw the stone into the calm lake. A K8s Service sends traffic to pods that match the label given in its selector. Ironically, though, you cannot list the pods attached to a service in a single step. You have to look at the service's selector or endpoints and then check the labels. Let's walk through it. I'll look up the pods used by the service my-service1.

    k get svc -o wide
    NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE     SELECTOR
    kubernetes    ClusterIP   198.19.128.1     <none>        443/TCP        2d13h   <none>
    my-service1   NodePort    198.19.231.233   <none>        80:30001/TCP   2d12h   app=my-nginx1
    my-service2   NodePort    198.19.172.176   <none>        80:30002/TCP   2d12h   app=my-nginx2
    my-service3   NodePort    198.19.200.20    <none>        80:30003/TCP   2d12h   app=my-nginx3

    k get pods -l app=my-nginx1 --show-labels
    NAME                         READY   STATUS    RESTARTS   AGE   LABELS
    my-nginx1-67f499d79c-g7vr7   1/1     Running   0          26h   app=my-nginx1,pod-template-hash=67f499d79c
    my-nginx1-67f499d79c-j4f9k   1/1     Running   0          26h   app=my-nginx1,pod-template-hash=67f499d79c
    my-nginx1-67f499d79c-mqxzs   1/1     Running   1          26h   app=my-nginx1,pod-template-hash=67f499d79c

Getting svc with kubectl and the -o wide option shows the selector. From there you have to run get pod -l app=my-nginx1, specifying the label by hand, before you can see the pods. It only takes two commands, but it's a hassle. Having come this far, there's no backing down from doing it in one line. ...
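One possible way to collapse the two commands into one is sketched below; it is not necessarily the one-liner the post goes on to use. It assumes the SELECTOR column is always the last field of the -o wide output, and the helper names selector_of and pods_of_service are hypothetical.

```shell
#!/bin/sh
# Pull the SELECTOR column (the last field) out of one line of
# `kubectl get svc <name> -o wide --no-headers` output.
selector_of() {
  awk '{ print $NF }'
}

# List the pods behind a service by feeding its selector to -l.
# A service without a selector prints <none>, which matches no pods.
pods_of_service() {
  # $1 = service name
  kubectl get pods -l "$(kubectl get svc "$1" -o wide --no-headers | selector_of)" --show-labels
}
```

With these defined, `pods_of_service my-service1` would run the same label query as the second command above without copying the selector by hand.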

gcp- Google Kubernetes Engine

https://cloud.google.com/kubernetes-engine/docs/quickstart?hl=ko

First, in Cloud Shell, select the project and set the region (not quite). Correction: set the zone.

    linuxer@cloudshell:~ (elated-ranger-263505)$ gcloud config set compute/zone us-central1-a
    Updated property [compute/zone].

Then create the container cluster.

    linuxer@cloudshell:~ (elated-ranger-263505)$ gcloud container clusters create linuxer-k8s
    WARNING: Currently VPC-native is not the default mode during cluster creation. In the future, this will become the default mode and can be disabled using --no-enable-ip-alias flag. Use --[no-]enable-ip-alias flag to suppress this warning.
    WARNING: Newly created clusters and node-pools will have node auto-upgrade enabled by default. This can be disabled using the --no-enable-autoupgrade flag.
    WARNING: Starting in 1.12, default node pools in new clusters will have their legacy Compute Engine instance metadata endpoints disabled by default. To create a cluster with legacy instance metadata endpoints disabled in the default node pool, run clusters create with the flag --metadata disable-legacy-endpoints=true.
    WARNING: Your Pod address range (--cluster-ipv4-cidr) can accommodate at most 1008 node(s).
    This will enable the autorepair feature for nodes. Please see https://cloud.google.com/kubernetes-engine/docs/node-auto-repair for more information on node autorepairs.
    ERROR: (gcloud.container.clusters.create) ResponseError: code=403, message=Kubernetes Engine API is not enabled for this project. Please ensure it is enabled in Google Cloud Console and try again: visit https://console.cloud.google.com/apis/api/container.googleapis.com/overview?project=elated-ranger-26 to do so. ...
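The 403 above names the API that is switched off, so besides the console link it can also be enabled from Cloud Shell with `gcloud services enable`. The sketch below pulls the service name out of the URL in the error text and passes it on; the helper names are hypothetical, and the URL pattern is assumed to look like the one in this error.

```shell
#!/bin/sh
# Extract the API service name (for example container.googleapis.com)
# from the console URL that gcloud prints in the 403 error.
api_from_error() {
  sed -n 's|.*/apis/api/\([^/]*\)/overview.*|\1|p'
}

# Enable whichever API the error message points at.
enable_api_from_error() {
  # $1 = the error text printed by gcloud
  api=$(printf '%s\n' "$1" | api_from_error)
  [ -n "$api" ] && gcloud services enable "$api"
}
```

For this error the equivalent direct command is `gcloud services enable container.googleapis.com`, after which the cluster creation can be retried.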