(GKE) Autonomous Operator: Eventing service node terminated and another created with a new name forever

I'm building a Couchbase cluster with one pod per service, but when I add the eventing service to the cluster config YAML, Kubernetes goes crazy and terminates the eventing pod and creates a new one, forever. When I delete the eventing service from the config YAML, everything is fine.

Operator version: 1.0.0
Couchbase Server version: 5.5.1

Logs from any of the pods:
2019-03-25 03:06:44.765 UTC-8 Starting Couchbase Server – Web UI available at http://<ip>:8091
2019-03-25 03:06:44.765 UTC-8 and logs available in /opt/couchbase/var/lib/couchbase/logs

Hey,

We cannot go any further without logs to diagnose the error (both server and Kubernetes/Operator). It's impossible to do this with version 1.0.0, especially with the symptoms you describe, which is why we brought out version 1.1.0 :smiley: So here's what we need you to do:

  • Use GKE 1.12 or greater.
  • Create a storage class with the volume binding mode set to WaitForFirstConsumer (there's a sketch after this list).
  • When you deploy your cluster with that storage class, follow this guide to have your cluster use persistent storage.
    • Our best practice guide tells you which volume mount configurations are supported.
    • In this particular instance, use the logs mount for your eventing nodes (also shown in the sketch below).
  • Finally, when the error happens, run the support tool to collect logs with cbopinfo --collect-info. We are particularly interested in the logs for the eventing pods that crashed; the tool will give you the option to download those files from an orphaned log volume.
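
To make the storage class and volume mount bullets concrete, here is a rough sketch of what I mean. The names (standard-wait, couchbase, the eventing server group), the GCE provisioner and disk type are all placeholders, and the CouchbaseCluster fields are written from memory of the 1.x spec, so check them against the persistent storage guide for your operator version:

# Storage class with delayed binding, so each volume is provisioned in the
# zone the consuming pod is scheduled to (name and disk type are placeholders).
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: standard-wait
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer
---
# Fragment of a CouchbaseCluster spec giving the eventing server group a
# persistent logs mount backed by that class. Field names are assumptions
# based on the 1.x persistent storage docs, so verify them for your version.
spec:
  servers:
    - name: eventing
      size: 1
      services:
        - eventing
      pod:
        volumeMounts:
          default: couchbase
          logs: couchbase
  volumeClaimTemplates:
    - metadata:
        name: couchbase
      spec:
        storageClassName: standard-wait
        resources:
          requests:
            storage: 1Gi

With WaitForFirstConsumer the claim isn't bound until the pod is scheduled, which avoids the volume landing in a different zone from the eventing pod.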

Given the potential size of the files that need transferring, it may be beneficial to open a support case with our team and go that route.

Regards, Si

Hi Simon, thanks for the advice.
I solved my problem. I was already using the latest GKE version; I just updated the Operator from 1.0 to 1.1 and the server from 5.5.1 to 5.5.2.
Now it's working.

Excellent news. Eventing was very new back in 5.5.1, so I’m not surprised an upgrade helped.

Feel free to reach out at any time!

Hi Simon, I have another problem :slight_smile:

You know my issue: after the update I can't log in to my Couchbase Server, so I tried a clean setup. I deleted everything and set it up again with no changes, just following the docs, but nothing changed. According to the secret, the username and password should be Administrator and password.

This is from the secret YAML:
username: QWRtaW5pc3RyYXRvcg==
password: cGFzc3dvcmQ=

Is there any problem?

Looks fine to me:

[Thu 28 Mar 09:18:46 GMT 2019] simon@symphony ~ echo -n QWRtaW5pc3RyYXRvcg== | base64 -d; echo
Administrator
[Thu 28 Mar 09:18:50 GMT 2019] simon@symphony ~ echo -n cGFzc3dvcmQ= | base64 -d; echo
password
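
If you want to rule out a hand-encoding mistake entirely, you can also let kubectl do the base64 for you when creating the secret. The name cb-example-auth is just a placeholder for whatever secret your cluster spec actually references:

kubectl create secret generic cb-example-auth \
  --from-literal=username=Administrator \
  --from-literal=password=password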

You can also verify it works via the CLI…

kubectl exec -ti ${my_pod_name} -- curl http://localhost:8091/pools/default -u Administrator:password
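
To be thorough you can also read back what Kubernetes actually has stored, rather than trusting the YAML you applied (same placeholder secret name as above):

kubectl get secret cb-example-auth -o jsonpath='{.data.username}' | base64 -d; echo
kubectl get secret cb-example-auth -o jsonpath='{.data.password}' | base64 -d; echo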

I can reach the cluster and get data using the CLI with your command, but when I try it on the web page I get a
“Login failed. Please try again.” error.

Very odd. Just to be absolutely sure, you can also try:

kubectl port-forward ${my_pod_name} 8091

Then log in via:

http://localhost:8091
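
While the port-forward is running you can also hit the REST API through it from your workstation; if this succeeds but the browser still rejects the login, that points at the browser rather than the credentials:

curl -u Administrator:password http://localhost:8091/pools/default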

Unless you are already using that method to access the UI?

Exactly what I did, but I can't log in :confused:

Hmm, the only other idea I can think of is a browser issue. Have you tried a different one, or at least running an incognito session? Failing that, it's probably a ripe question for our support engineers.

Yes, same error. I'm creating a new GKE cluster; maybe it will work.

Still doesn't work, even on the new cluster.

Okay, talk to Support Engineering; they'll guide you through log collection and hopefully get you up and running in short order. Feel free to reference this conversation as context for your issue.

Hi @stuff.095,

Can you please create a JIRA ticket, follow the instructions under collecting logs to collect the logs, and attach them to the ticket? In the description, add a link to this forum post so we have the background on the issue.

Thanks!