Cloud native config help?

I changed the password and domain in the file I uploaded (sorry for the hassle). Everything works fine on an earlier version of Couchbase with the same values, so I know that the usernames/passwords and cert are fine.

sure, thanks @justin.ashworth , see the attached output from the get pods command
getpod.zip (3.5 KB)

I’m not sure I have seen a secret that was needed, did I miss that in the documentation? I reference a TLS definition (cert) as part of the native gateway definition as shown in the docs, but didn’t see anything about a secret for user/password…can you point me at the docs or an example config?

@ksully

It will automatically create the secret for you, and populate it with a user.

ah yes, I see one out there


Name:         couchbase-cloud-native-gateway-admin-secret-couchbasedev
Namespace:    couchbase
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
password:  69 bytes
username:  22 bytes

and I see the password and username values…they’re base64-encoded, I assume.
Should those work to log into Couchbase Server? (I did a base64 decode on the values, but no luck logging in through the CB UI, and the UI shows no users when logged in with the admin account.)
Let me know how you think the 401 is happening and how it relates to this generated secret…
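
(For reference, this is roughly how I pulled the values out, assuming the secret name shown above and the couchbase namespace.)

# decode the generated username and password from the secret
kubectl -n couchbase get secret couchbase-cloud-native-gateway-admin-secret-couchbasedev \
  -o jsonpath='{.data.username}' | base64 -d; echo
kubectl -n couchbase get secret couchbase-cloud-native-gateway-admin-secret-couchbasedev \
  -o jsonpath='{.data.password}' | base64 -d; echo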

Try either creating that user within Couchbase Server, or using the admin username/password for the secret and seeing if it boots up after that. I imagine that’s the problem, not sure why the user isn’t being created inside Couchbase Server.
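
For example, a rough sketch of doing that by hand (assuming the server container in the pod is named couchbase-server, placeholder admin credentials, and full admin rights just for testing):

# create the user from the generated secret inside Couchbase Server
# (container name, admin placeholders, and the admin role are assumptions for testing only)
kubectl -n couchbase exec couchbasedev-0000 -c couchbase-server -- \
  couchbase-cli user-manage -c localhost:8091 -u <admin-user> -p '<admin-password>' \
  --set --rbac-username '<username-from-secret>' --rbac-password '<password-from-secret>' \
  --roles admin --auth-domain local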

will do and report back (will be tomorrow). Any particular rights to define for the user for the gateway? (Just give it full admin rights to test it out?)

Some progress: both containers show running, but I’m not sure it’s actually running correctly. I created the username/password that were in the secret via the UI as a full admin. I assume both values were base64-encoded, and it did seem to progress.
See describe of the gateway container attached.
getpod.zip (3.5 KB)
Also, I don’t see port 18098, which the docs say you should use for gRPC health checks; here are the ports exposed:

NAME                                                                  READY   STATUS    RESTARTS   AGE
pod/couchbasedev-0000                                                 2/2     Running   0          26m
pod/couchbaseoperator-couchbase-admission-controller-67b47b4647tgnt   1/1     Running   0          27m
pod/couchbaseoperator-couchbase-operator-849d9b8cd4-x68k5             1/1     Running   0          27m

NAME                                                       TYPE           CLUSTER-IP     EXTERNAL-IP    PORT(S)                                                                                                                                                                                                                                                                                                                                                                                         AGE
service/couchbasedev                                       ClusterIP      None           <none>         4369/TCP,8091/TCP,8092/TCP,8093/TCP,8094/TCP,8095/TCP,8096/TCP,9100/TCP,9101/TCP,9102/TCP,9103/TCP,9104/TCP,9105/TCP,9110/TCP,9111/TCP,9112/TCP,9113/TCP,9114/TCP,9115/TCP,9116/TCP,9117/TCP,9118/TCP,9120/TCP,9121/TCP,9122/TCP,9130/TCP,9140/TCP,9999/TCP,11207/TCP,11209/TCP,11210/TCP,18091/TCP,18092/TCP,18093/TCP,18094/TCP,18095/TCP,18096/TCP,19102/TCP,19130/TCP,21100/TCP,21150/TCP   27m
service/couchbasedev-0000                                  LoadBalancer   10.244.0.200   172.16.1.172   18091:31801/TCP,18095:32271/TCP,11207:32208/TCP,18092:32742/TCP,18096:30809/TCP,18093:31433/TCP,18094:32723/TCP                                                                                                                                                                                                                                                                                 26m
service/couchbasedev-srv                                   ClusterIP      None           <none>         11210/TCP,11207/TCP                                                                                                                                                                                                                                                                                                                                                                             27m
service/couchbasedev-ui                                    LoadBalancer   10.244.0.96    172.16.1.177   18091:31669/TCP,11207:31290/TCP                                                                                                                                                                                                                                                                                                                                                                 27m
service/couchbaseoperator-couchbase-admission-controller   ClusterIP      10.244.0.42    <none>         443/TCP                                                                                                                                                                                                                                                                                                                                                                                         27m
service/couchbaseoperator-couchbase-operator               ClusterIP      10.244.0.176   <none>         8080/TCP,8383/TCP                                                                                                                                                                                                                                                                                                                                                                               27m

Please advise?

Hey @ksully,

Could you tell me what version of the operator you are using? A service called couchbasedev-cloud-native-gateway-service that exposes port 443 should have been created. It can take a bit for this service to propagate, as the cluster has to be up and healthy for it to appear.
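
A couple of quick checks, using the resource names from your earlier output, should show the deployed operator image and whether the gateway Service has appeared yet (sketch only; adjust the namespace as needed):

# which operator image is actually running
kubectl -n couchbase get deployment couchbaseoperator-couchbase-operator \
  -o jsonpath='{.spec.template.spec.containers[0].image}'; echo
# check whether the gateway Service has been created
kubectl -n couchbase get svc | grep cloud-native-gateway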

I didn’t see that service. Do you see something in the describe that indicates it should be there? I can try it again and let it sit for longer…

in my helm file, I have (attached as well)
couchbaseOperator using image 2.6.2
admissionController using image 2.4.2
couchbaseserver using image 7.6.0 (have tried 7.6.1 as well I recall)
logging busybox 1.33.1
cloudnativegateway 1.0.0
couchbase-values-1instance.support.zip (2.7 KB)

so I ran it again, manually created the user defined in the native gateway secret, and saw good things in the log for couchbasedev-0000:

cloud-native-gateway {"level":"warn","ts":"2024-06-04T20:36:02.790Z","logger":"gateway","caller":"gateway/gateway.go:193","msg":"failed to ping cluster","error":"failed to get cluster config: server error: access denied (status: 401, body: ``)","errorVerbose":"server error: access denied (status: 401, body: ``)\nfailed to get cluster config\ngithub.com/couchbase/stellar-gateway/gateway.pingCouchbaseCluster\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/couchbase-cloud-native-gateway/gateway/gateway.go:138\ngithub.com/couchbase/stellar-gateway/gateway.(*Gateway).Run\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/couchbase-cloud-native-gateway/gateway/gateway.go:191\nmain.startGateway\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/couchbase-cloud-native-gateway/cmd/gateway/main.go:399\nmain.glob..func1\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/couchbase-cloud-native-gateway/cmd/gateway/main.go:53\ngithub.com/spf13/cobra.(*Command).execute\n\t/home/couchbase/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:987\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/home/couchbase/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1115\ngithub.com/spf13/cobra.(*Command).Execute\n\t/home/couchbase/go/pkg/mod/github.com/spf13/cobra@v1.8.0/command.go:1039\nmain.main\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/couchbase-cloud-native-gateway/cmd/gateway/main.go:463\nruntime.main\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/toolstjyY1/go1.21.6/src/runtime/proc.go:267\nruntime.goexit\n\t/home/couchbase/jenkins/workspace/couchbase-k8s-microservice-build/toolstjyY1/go1.21.6/src/runtime/asm_amd64.s:1650"}
cloud-native-gateway {"level":"info","ts":"2024-06-04T20:36:02.790Z","logger":"gateway","caller":"gateway/gateway.go:202","msg":"sleeping before trying to ping cluster again","period":5}
cloud-native-gateway {"level":"info","ts":"2024-06-04T20:36:07.841Z","logger":"gateway","caller":"gateway/gateway.go:254","msg":"connected agent manager","agent":{}}
cloud-native-gateway {"level":"info","ts":"2024-06-04T20:36:07.841Z","logger":"gateway","caller":"gateway/gateway.go:257","msg":"connected to couchbase cluster"}
cloud-native-gateway {"level":"info","ts":"2024-06-04T20:36:07.842Z","logger":"gateway","caller":"gateway/gateway.go:341","msg":"initializing protostellar system"}
cloud-native-gateway {"level":"info","ts":"2024-06-04T20:36:07.847Z","logger":"gateway","caller":"gateway/gateway.go:416","msg":"starting to run protostellar system","advertisedPortPS":18098,"advertisedPortSD":18099}
cloud-native-gateway {"level":"info","ts":"2024-06-04T20:36:07.848Z","caller":"webapi/webapi.go:65","msg":"system health marked as healthy"}

The log indicated it was sleeping before checking the cluster again, so I waited at least 5 (minutes, I guess).
And even though the log says that it is running protostellar on 18098, I don’t see that port listed on any of the pods or services.
But no service (as you said, couchbasedev-cloud-native-gateway-service or similar) appeared after waiting for quite a while (20 minutes).
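
For reference, something like this should list which ports each container in the pod actually declares (names taken from the output above):

# show each container in the pod and its declared containerPorts
kubectl -n couchbase get pod couchbasedev-0000 \
  -o jsonpath='{range .spec.containers[*]}{.name}{": "}{.ports[*].containerPort}{"\n"}{end}'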

I do see this in the operator log, talking about a DNS entry that isn’t resolving. I don’t have this defined; is this supposed to point to the service specifically for that 0000 instance? I’m confused as to how this is supposed to work, especially with the native gateway, which should be the single interface to all server instances…maybe you can clue me in on that part?


{"level":"info","ts":"2024-06-04T20:44:12Z","logger":"cluster","msg":"Reconciliation failed","cluster":"couchbase/couchbasedev","error":"could not reach hostname: couchbasedev-0000.couchbasedev.meus.global, dial  │
│ tcp: lookup couchbasedev-0000.couchbasedev.meus.global on 10.244.0.10:53: no such host","stack":"github.com/couchbase/couchbase-operator/pkg/util/netutil.WaitForHostPort\n\tgithub.com/couchbase/couchbase-operator │
│ /pkg/util/netutil/netutil.go:37\ngithub.com/couchbase/couchbase-operator/pkg/cluster.waitAlternateAddressReachable\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/networking.go:76\ngithub.com/couchbase/cou │
│ chbase-operator/pkg/cluster.(*Cluster).reconcileMemberAlternateAddresses\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/networking.go:228\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMa │
│ chine).handleNodeServices\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/nodereconcile.go:1323\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*ReconcileMachine).exec\n\tgithub.com/couchbase/couchba │
│ se-operator/pkg/cluster/nodereconcile.go:313\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcileMembers\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:268\ngithub.com/co │
│ uchbase/couchbase-operator/pkg/cluster.(*Cluster).reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/reconcile.go:177\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).runReconcile\n\t │
│ github.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:492\ngithub.com/couchbase/couchbase-operator/pkg/cluster.(*Cluster).Update\n\tgithub.com/couchbase/couchbase-operator/pkg/cluster/cluster.go:535\ngit │
│ hub.com/couchbase/couchbase-operator/pkg/controller.(*CouchbaseClusterReconciler).Reconcile\n\tgithub.com/couchbase/couchbase-operator/pkg/controller/controller.go:90\nsigs.k8s.io/controller-runtime/pkg/internal/ │
│ controller.(*Controller).Reconcile\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\tsig │
│ s.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\tsigs.k8s.io/controller-runtime@v0.16.3/pk │
│ g/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\tsigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"}    │
│ {"level":"info","ts":"2024-06-04T20:44:13Z","logger":"cluster","msg":"Waiting for DNS propagation","cluster":"couchbase/couchbasedev","hostname":"couchbasedev-0000.couchbasedev.meus.global"}


While that DNS issue is odd, I wouldn’t worry too much about it. This might come down to a K8S DNS issue on your platform. This is where the Couchbase Operator is doing its regular reconciliation loop and can’t resolve the name of one of the pods associated with the CouchbaseCluster you have defined. It’s not great, but if it’s just one entry and not a continuing failure, we can probably ignore it as a DNS caching issue of some sort.
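
If you want to rule out cluster DNS, a quick check from inside K8S (reusing the busybox image already in your values file) might look like this; the hostname is the one from the operator log:

# run a throwaway pod and try to resolve the hostname the operator complained about
kubectl -n couchbase run dns-check --rm -it --restart=Never \
  --image=busybox:1.33.1 -- nslookup couchbasedev-0000.couchbasedev.meus.global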

I’m confused as to how this is supposed to work, especially with the native gateway which should be the single interface to all server instances…maybe you can clue me in on that part?

After the Couchbase Cloud Native Gateway (CNG) is up and running, you would expose it outside your K8S cluster using a ‘gateway’, to use the modern Kubernetes terminology. Effectively any load balancer over HTTPS should work. We test OpenShift Routes (docs) and standard K8S LoadBalancers (docs) from cloud service providers. Other environments may have special load balancer options.
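
As a rough sketch (not a drop-in config), an external Service in front of the gateway port might look something like the following; the selector labels and targetPort here are assumptions, so mirror whatever the operator-generated couchbasedev-cloud-native-gateway-service uses:

kubectl -n couchbase apply -f - <<'EOF'
# Hypothetical LoadBalancer over the CNG sidecar port.
# Selector labels and targetPort are assumptions; copy them from the
# operator-generated couchbasedev-cloud-native-gateway-service.
apiVersion: v1
kind: Service
metadata:
  name: couchbasedev-cng-external
spec:
  type: LoadBalancer
  selector:
    app: couchbase
    couchbase_cluster: couchbasedev
  ports:
  - name: cloud-native-gateway
    port: 443
    targetPort: 18098
EOF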

The endpoint from that Route or LoadBalancer is what you would bootstrap against with a couchbase2:// style connection string in the supporting SDKs, like Java.

I hope that helps, and I’m glad for any feedback on how we can make this simpler. Maybe a tutorial for OpenShift would help.

In this case, I am running only one instance of the server just to get things working. But why does each server need a DNS entry? Is this a requirement of the native gateway (I haven’t seen this warning before)?
On the gateway itself, if the services don’t expose the required port mentioned in the docs, you certainly can’t assign a gateway to it…so I’m still wondering if anyone can tell me why that port is not showing up if the native gateway container is indeed up (see last post with the logs).

When CNG is running in K8S, it runs as a sidecar and is exposed as its own Service. I tried to look back at the logs attached, but I only seem to see resources in YAML.

I think what I’ll do is put together a setup that I’d recommend and show you what it looks like. I’ll try to do that today or tomorrow.

One question: what’s your K8S platform? Where do you intend to run Couchbase? If you have multiple environments, that’s good too-- we just might want to treat development locally differently than production.

Thanks Matt (@ingenthr)
My platform is on-prem, built using kubeadm on Ubuntu, and it runs Couchbase and a number of other products and app APIs all in that cluster. Thanks much for the assist!

Hey @ksully - apologies for the delay. I had a look at this, and I think what you may want to be looking into, assuming you’re talking about a production deployment on multiple machines, is MetalLB.

To show this in a diagram, it’d look something like this for a 4-node Couchbase cluster across 2 K8S nodes (more would be better). You would create a K8S Service of type LoadBalancer across the cloud-native-gateway ports from the pods. MetalLB has some options: the BGP option would let you spread the traffic across the nodes. Since the Operator takes care not to place active and replica data on the same K8S node (see the docs), you’d be highly available, though you should really have 3+ nodes to ensure you’d have quorum if there is a failure (more in the docs).
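
For reference, a minimal MetalLB sketch might look like the following; the address range is a placeholder, the metallb-system namespace is assumed, and BGP mode also needs a BGPPeer resource for your upstream router (L2Advertisement is the simpler alternative if you don’t run BGP):

kubectl apply -f - <<'EOF'
# Placeholder address pool plus a BGP advertisement for it.
# A BGPPeer for your router is also required in BGP mode (not shown).
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: couchbase-pool
  namespace: metallb-system
spec:
  addresses:
  - 172.16.1.240-172.16.1.250
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: couchbase-bgp
  namespace: metallb-system
spec:
  ipAddressPools:
  - couchbase-pool
EOF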

Now, if you don’t need to access it externally at all, the CNG service can be accessed by other pods internal to K8S directly.

In my case, I use Cilium, which is quite superior to MetalLB and provides ingress, L2 advertisement, an API gateway, kernel-level routing, etc. It’s good stuff, and I’ve been using it for a while for all my APIs, etc. For example, today I can expose Couchbase outside the cluster through the configuration, which Cilium picks up and supports in exposing it outside the cluster, including DNS (which I maintain manually via a hosts file for this on-prem cluster).
We can talk more about this. I get that you don’t support Cilium, but I don’t think that’s the problem we’re facing. I’m hopeful that we can get the CNG exposed successfully, and if so, wrapping it in a service will be a simple next step.

After meeting with Matt, here’s a quick summary of findings:

  1. The user in the secret created for CNG isn’t defined as a user in Couchbase; this was causing the 401 error. Manually reading the secret’s username/password and creating that user in Couchbase solved this. The Couchbase team is looking into how to specify this username/password for CNG in the helm values file, and I assume they’ll update the docs as applicable.
  2. Port 18098 is not listed in the Couchbase services (with the other ports), but if the two containers in the Couchbase pod start OK, that port is actually open. A get -o yaml on the pod will show the port listed, and it can be port-forwarded (or set up behind a loadbalancer, etc.) as needed; see the commands sketched below.
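
For anyone following along, a sketch of those two checks with the names used earlier in this thread:

# confirm the CNG port is declared on the pod, then forward it locally for testing
kubectl -n couchbase get pod couchbasedev-0000 -o yaml | grep -n 18098
kubectl -n couchbase port-forward pod/couchbasedev-0000 18098:18098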

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.