Tri asked:

cass-operator-v1.3 failed to start on microk8s

I'm using microk8s 1.18.6. After installing the Cassandra operator 1.3 as instructed in README_CASSANDRA.MD, the pod is still not ready after 20+ minutes:

kubectl -n cass-operator apply -f ./cassandra/11-install-cass-operator-v1.3.yaml

$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
cass-operator-56fcb9ff47-4njb5   0/1     Running   0          24m

The cause of the error appears to be "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted". Can you please help me sort this out?

$ kubectl describe pod -l name=cass-operator

Name:         cass-operator-56fcb9ff47-4njb5
Namespace:    cass-operator
Priority:     0
Node:         silverbullet/192.168.1.98
Start Time:   Wed, 29 Jul 2020 19:16:01 -0400
Labels:       name=cass-operator
              pod-template-hash=56fcb9ff47
Annotations:  
Status:       Running
IP:           10.1.4.181
IPs:
  IP:           10.1.4.181
Controlled By:  ReplicaSet/cass-operator-56fcb9ff47
Containers:
  cass-operator:
    Container ID:   containerd://c968cb4b1dd2507090f0cffd71dada5742fe9a3d2144679f6ddf9b16b8840428
    Image:          datastax/cass-operator:1.3.0
    Image ID:       docker.io/datastax/cass-operator@sha256:6eb92d0e819cd8243dd8b1892561e319f3aa7e62b435f531a2d765c172a1dbb3
    Port:           
    Host Port:      
    State:          Running
      Started:      Wed, 29 Jul 2020 19:16:10 -0400
    Ready:          False
    Restart Count:  0
    Liveness:       exec [pgrep .*operator] delay=5s timeout=5s period=5s #success=1 #failure=3
    Readiness:      exec [stat /tmp/operator-sdk-ready] delay=5s timeout=5s period=5s #success=1 #failure=1
    Environment:
      WATCH_NAMESPACE:          cass-operator (v1:metadata.namespace)
      POD_NAME:                 cass-operator-56fcb9ff47-4njb5 (v1:metadata.name)
      OPERATOR_NAME:            cass-operator
      SKIP_VALIDATING_WEBHOOK:  FALSE
    Mounts:
      /tmp/ from tmpconfig-volume (rw)
      /tmp/k8s-webhook-server/serving-certs from cass-operator-certs-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cass-operator-token-hk6zj (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  tmpconfig-volume:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:     Memory
    SizeLimit:  
  cass-operator-certs-volume:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cass-operator-webhook-config
    Optional:    false
  cass-operator-token-hk6zj:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cass-operator-token-hk6zj
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age                  From                   Message
  ----     ------     ----                 ----                   -------
  Normal   Scheduled  11m                  default-scheduler      Successfully assigned cass-operator/cass-operator-56fcb9ff47-4njb5 to silverbullet
  Normal   Pulling    11m                  kubelet, silverbullet  Pulling image "datastax/cass-operator:1.3.0"
  Normal   Pulled     11m                  kubelet, silverbullet  Successfully pulled image "datastax/cass-operator:1.3.0"
  Normal   Created    11m                  kubelet, silverbullet  Created container cass-operator
  Normal   Started    11m                  kubelet, silverbullet  Started container cass-operator
  Warning  Unhealthy  11m                  kubelet, silverbullet  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "de2cbca4fc07bb19b0042b91c8e2a4ada1f398688740792ef10c17e671f1e35f": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  11m                  kubelet, silverbullet  Liveness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "43cad3eaf92cb3d926f304c372c18c551244d12b4208564d2cad19543db70150": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  11m                  kubelet, silverbullet  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "eb8bb2825e48a98ba9987815ab83eabb137bbb23ff80346ea050cb80e60f9c7c": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  11m                  kubelet, silverbullet  Liveness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "d248b9c04bdbc8718ab1fb45bf00dceef61ee27f64a06b98dc1bd2144299d003": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  11m                  kubelet, silverbullet  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "adddeb6f7e909eb4d2d19bde8d68cbb1ce90339bd7ff73026e34a46c44cfb82b": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  11m                  kubelet, silverbullet  Liveness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "5bae60b7185ef9d19cad57484979a5963a01296bf00ccb57ee5f9c5183632ba6": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  10m                  kubelet, silverbullet  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "3a9ee77df2999af4db4d4d6b286ba946cdd0de96a4dfa3acec6af3aa9374eab6": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  10m                  kubelet, silverbullet  Liveness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "96471636575d0281febfee1c2b4691b577e4713d138e702caab205732e25a62d": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  10m                  kubelet, silverbullet  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "a6560a06f0fa5a50be1d957002022425c0eec2e530a27210cb9168f61c7be549": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
  Warning  Unhealthy  87s (x226 over 10m)  kubelet, silverbullet  (combined from similar events): Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "1626f0efe87a8268f8bd37f63e443a77e990732a3723033c2f31ddfe75c25a37": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown
Tags: cass-operator, kubernetes, microk8s

1 Answer

Erick Ramirez answered:

Symptom

This particular error is a generic microk8s problem:

  Warning  Unhealthy  11m                  kubelet, silverbullet  Readiness probe errored: rpc error: code = Unknown desc = failed to exec in container: failed to start exec "de2cbca4fc07bb19b0042b91c8e2a4ada1f398688740792ef10c17e671f1e35f": OCI runtime exec failed: exec failed: container_linux.go:349: starting container process caused "apparmor failed to apply profile: write /proc/self/attr/exec: operation not permitted": unknown

Many microk8s users run into this problem, and it has nothing to do with the cass-operator.
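To confirm that every failed probe shares this one cause, the distinct causes can be pulled out of the event messages. This is a minimal sketch, assuming the events follow the format shown above; the function name probe_failure_causes is hypothetical and not part of kubectl or microk8s.

```shell
# Hypothetical helper: reads `kubectl describe pod` output on stdin and
# prints each distinct cause quoted after "starting container process caused".
probe_failure_causes() {
  sed -n 's/.*starting container process caused "\(.*\)": unknown[[:space:]]*$/\1/p' | sort -u
}

# Usage (assumes kubectl points at the microk8s cluster):
#   kubectl -n cass-operator describe pod -l name=cass-operator | probe_failure_causes
```

If the helper prints only the AppArmor message, the probes are failing for the reason described here and not because of the operator itself.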

Cause

When installing the cass-operator in microk8s, the operator pod fails to start with an AppArmor error due to a bug in microk8s (see https://github.com/ubuntu/microk8s/issues/784). This problem was identified by John Trimble in cass-operator issue #176.

There is an underlying issue with how microk8s handles AppArmor profiles when allowPrivilegeEscalation is set to false for the security context of a container.

Workaround

To get around the bug in microk8s, remove allowPrivilegeEscalation from the operator manifest.

For example, in cass-operator v1.3.0, edit your copy of cass-operator-manifests-v1.18.yaml and remove the line:

          allowPrivilegeEscalation: false

then apply the change to your cluster.
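The manual edit above could also be scripted. This is a minimal sketch, assuming the setting sits on a line of its own as shown; the function name strip_privilege_escalation and the output file name are hypothetical.

```shell
# Hypothetical helper: writes a copy of a manifest with every
# "allowPrivilegeEscalation: false" line removed. The input file is not modified.
# Usage: strip_privilege_escalation <input.yaml> <output.yaml>
strip_privilege_escalation() {
  sed '/^[[:space:]]*allowPrivilegeEscalation:[[:space:]]*false[[:space:]]*$/d' "$1" > "$2"
}

# Usage (assumes kubectl points at the microk8s cluster):
#   strip_privilege_escalation cass-operator-manifests-v1.18.yaml patched.yaml
#   kubectl -n cass-operator apply -f patched.yaml
```

Writing to a separate file keeps the original manifest intact, so the setting can be restored once microk8s no longer needs the workaround.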

Recommendation

If you're not experienced with Kubernetes, my suggestion is to stick with using KiND so you're not wasting time fighting microk8s and just learn how to use the operator. Cheers!


Tri commented:

@Erick Ramirez I'm doing OK with Kubernetes and prefer microk8s over Kind. The issue is specific to microk8s. It is explained in detail in the knative HelloWorld Serving Code Example and in microk8s issue #784, along with this solution:

sudo apparmor_parser -R /var/lib/snapd/apparmor/profiles/snap.microk8s.daemon-containerd
sudo apparmor_parser -a /var/lib/snapd/apparmor/profiles/snap.microk8s.daemon-containerd

Now the Cassandra operator is in working order:

$ kubectl -n cass-operator get pod

NAME                             READY   STATUS    RESTARTS   AGE
cass-operator-56fcb9ff47-4njb5   1/1     Running   0          3h26m
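
After reloading the profile, the READY column can also be checked from a script. This is a minimal sketch, assuming the table layout shown above; the function name not_ready_pods is hypothetical.

```shell
# Hypothetical helper: reads `kubectl get pod` table output on stdin and prints
# the name of every pod whose READY column is not fully ready (for example 0/1).
# Prints nothing when all listed pods are ready.
not_ready_pods() {
  awk '$1 != "NAME" && NF >= 2 { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage (assumes kubectl points at the microk8s cluster):
#   kubectl -n cass-operator get pod | not_ready_pods
```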

Erick Ramirez commented (in reply to Tri):

Right, I accidentally hit "post" before I finished my answer. It looks like you arrived at the same conclusion and realised it's not an operator issue. Cheers!

Tri commented (in reply to Erick Ramirez):

The solution differs, however. Instead of lowering the security in the manifest (removing allowPrivilegeEscalation), I prefer to manually reload the AppArmor profile.

Tri commented:

Did you mean /cassandra/11-install-cass-operator-v1.3.yaml instead of cass-operator-manifests-v1.18.yaml?

Erick Ramirez commented (in reply to Tri):

If you look closely, 11-install-cass-operator-v1.3.yaml in the workshop repo is a copy of cass-operator-manifests-v1.18.yaml from the cass-operator repo.

I mentioned the cass-operator manifest because other users who stumble onto this post are unlikely to have come across it through the workshop. They'll run into this problem when they're trying out the operator in the normal course of their day.

We need to provide answers that work for most users. Cheers!
