Insufficient memory to satisfy memory quota


Below is the error we’re getting from the operator:

time="2020-02-24T21:55:12Z" level=error msg="Cluster setup failed: still failing after 5 retries: [Server Error 400 (psqp1216-cb1-0000.psqp1216-cb1.playground-psrin14.svc:8091/node/controller/setupServices): [error - insufficient memory to satisfy memory quota for the services (requested quota is 2560MB, maximum allowed quota for the node is 2457MB)]]" cluster-name=psqp1216-cb1 module=cluster

Where is this 2457 MB number coming from?


Well, containers have to run somewhere, and those hosts only have a finite amount of memory. You can see this for yourself with the kubectl describe nodes command:

Name:               gke-penguin-default-pool-def03ee4-t14v
Roles:              <none>
Capacity:
  attachable-volumes-gce-pd:  127
  cpu:                        4
  ephemeral-storage:          98868448Ki
  hugepages-2Mi:              0
  memory:                     3630500Ki
  pods:                       110
Allocatable:
  attachable-volumes-gce-pd:  127
  cpu:                        3920m
  ephemeral-storage:          47093746742
  hugepages-2Mi:              0
  memory:                     2584996Ki
  pods:                       110
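
To make the two memory figures in that output easier to compare with the MB values in the error, here is a small Python sketch that converts them to MiB. It assumes the first figure (3630500Ki) is the node's capacity and the second (2584996Ki) is what is allocatable to pods, which is how kubectl describe nodes normally groups them. The helper name ki_to_mib is made up for this example.

```python
def ki_to_mib(quantity: str) -> float:
    """Convert a Kubernetes quantity with a Ki suffix (e.g. '2584996Ki') to MiB."""
    if not quantity.endswith("Ki"):
        raise ValueError(f"expected a Ki quantity, got {quantity!r}")
    return int(quantity[:-2]) / 1024


# Values copied from the kubectl describe nodes output above.
capacity_mib = ki_to_mib("3630500Ki")     # assumed to be the Capacity figure
allocatable_mib = ki_to_mib("2584996Ki")  # assumed to be the Allocatable figure

print(f"capacity:    {capacity_mib:.0f} MiB")                    # capacity:    3545 MiB
print(f"allocatable: {allocatable_mib:.0f} MiB")                 # allocatable: 2524 MiB
print(f"reserved:    {capacity_mib - allocatable_mib:.0f} MiB")  # reserved:    1021 MiB
```

So roughly 1Gi of this node is held back for the system, and only about 2524 MiB is available to pods before Couchbase takes its own overhead into account.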

Containers aren’t VMs, so they see the underlying host’s resources rather than the limits you set on them. Containers can also interfere with one another (aka noisy neighbors), so one container may be consuming memory you expect to be available to another.

Containers are assigned resource limits and requests, and that is what they get.
Why isn’t Kubernetes scheduling the pod on a node that has the resources?
What QoS class is used when scheduling the pod? Can we change it?

Actually, Kubernetes is working perfectly fine. As far as Kubernetes is concerned, the node has N memory available and you have told it to use M for the pod. It may look like N > M, which is totally fine for Kubernetes scheduling. However, somewhere in the Couchbase Server documentation you will note that the server has a memory overhead.

Your memory allocation is subject to the additional formula M = Ms + O (where M is the memory the pod needs, Ms is the total allocation for the services running in that container, and O is an overhead of ~20% of Ms, I think). So while you may want a 1Gi pod with 512Mi for data and 512Mi for index, in reality Couchbase Server will say: actually no, I need (512 + 512) * 1.2 ≈ 1229Mi.
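
The arithmetic can be sketched in Python as below. This is only an illustration: the 20% figure is the rough estimate from this post, not a documented constant, and the function names are made up. The second function reads the same ~20% the other way round (the server keeping 20% of the memory it sees). The 3072 MiB (3Gi) input is a hypothetical value that does not appear anywhere in this thread; it is used only because it happens to reproduce the 2457MB from the error message.

```python
import math

OVERHEAD_FRACTION = 0.20  # assumption: the "~20%" estimate from the post


def pod_memory_needed_mib(service_quotas_mib, overhead=OVERHEAD_FRACTION):
    """M = Ms + O, with O taken as a fraction of the summed service quotas Ms."""
    ms = sum(service_quotas_mib)
    return math.ceil(ms * (1 + overhead))


def max_service_quota_mib(memory_mib, overhead=OVERHEAD_FRACTION):
    """Alternative reading: the server keeps `overhead` of the memory it sees
    and allows the rest as the maximum total service quota."""
    return math.floor(memory_mib * (1 - overhead))


# 512Mi data + 512Mi index, as in the example above.
print(pod_memory_needed_mib([512, 512]))  # 1229

# Hypothetical 3Gi of visible memory (not stated in the thread).
print(max_service_quota_mib(3072))        # 2457
```

Under that second reading, a requested quota of 2560MB is above the 2457MB ceiling, which is the shape of the failure in the operator log. Either way, the fix is the same: lower the service quotas or give the pod more memory.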

Pod priority classes can be set in Operator 2.0 onward.