High indexer CPU on index server after move to different VMware cluster

ken.lang · April 13, 2018, 6:13pm

We have a 3-node cluster running 4.5.1.-2884. After restoring the VMs from a failed VMware cluster to a new cluster, we are experiencing constant problems with the indexer service consuming high CPU and eventually grinding to a halt. So far our only solution has been to restart the server hosting the indexer service (running Global Indexes), after which the problem slowly recurs, usually within four hours. This configuration had been rock-solid for over a year before the move.

The servers are provisioned with 6 CPUs, 16 GB, and 500 GB drives, which are about 90% free.

The indexer.log shows a few somewhat generic errors:
couchbase Err projector.topicMissing
status:(error = Index scan timed out)
2018-04-11T09:35:18.788-06:00 [Error] StartDcpFeedOver(): MCResponse status=KEY_ENOENT, opcode=0x89, opaque=0, msg: Not found

What additional information would be helpful to troubleshoot?

Thanks,

deepkaran.salooja · April 16, 2018, 5:38pm

Please make sure all the required ports in your new cluster are open. https://developer.couchbase.com/documentation/server/current/install/install-ports.html

The above errors seem to point to configuration issues related to ports not being open. If it doesn’t help, you can share the full log.

ken.lang · April 16, 2018, 6:08pm

The firewalls on all servers are currently disabled while we troubleshoot the situation. Should all of these ports be visible on each Couchbase server? It seems that they are visible only on the one running the indexer role, while the others have tcp port 999 for cluster communication visible.

Since I am new to the forum, I am unable to upload the indexer.log at this time.

Thanks,

deepkaran.salooja · April 17, 2018, 8:59pm

From the index service perspective, all ports listed for “Indexer Service” need to be open on nodes where the index service has been enabled, except port 9999, which needs to be open on all data service nodes.
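
A quick way to verify this from another node is a plain TCP connect test. The following is only a minimal sketch: the port numbers in `INDEXER_PORTS` are an assumption based on the ports page linked above and should be checked against that page for your version, and the host names in the usage comment are hypothetical.

```python
import socket

# Assumption: port numbers taken from the Couchbase ports documentation linked
# earlier in this thread; confirm them for your server version.
INDEXER_PORTS = [9100, 9101, 9102, 9103, 9104, 9105]  # index service nodes
PROJECTOR_PORT = 9999                                  # data service nodes


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_ports(host, ports, timeout=2.0):
    """Map each port to True (reachable) or False (not reachable)."""
    return {port: port_open(host, port, timeout) for port in ports}


# Example usage (hypothetical host names):
#   print(check_ports("index-node.example.com", INDEXER_PORTS))
#   print(check_ports("data-node-1.example.com", [PROJECTOR_PORT]))
```

A `False` result only says the connection could not be made from the machine running the script; it does not distinguish a firewall from a service that is not listening.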

Has the memory quota for index service been set correctly on the new cluster?

You can upload the log file using:
curl -X PUT -T indexer.log https://forumlogs.s3-us-west-1.amazonaws.com/indexer.log

ken.lang · April 18, 2018, 4:08am

I just uploaded indexer.log file.

deepkaran.salooja · April 18, 2018, 8:43pm

I don’t see a lot of activity in this log snippet. Do you know the time window when the indexer process was consuming a lot of CPU?

A couple of things you can try:

Change the compaction setting in UI to make it run only on Sunday rather than all the days it is currently set to.
Increase the RAM quota to 2GB.

ken.lang · April 19, 2018, 3:36am

We had another failure today, so I’ll get the relevant logs and add them to the case early tomorrow MDT.

ken.lang · April 19, 2018, 4:04pm

I just uploaded yesterday’s indexer.log

deepkaran.salooja · April 19, 2018, 8:47pm

There is nothing that stands out from the log file. Can you indicate the time window when you observe the issue? How much cpu usage do you see for the indexer process?

Next time it happens, you can capture the cpu profile and share:

go tool pprof -seconds=60 -svg /opt/couchbase/bin/indexer http://localhost:9102/debug/pprof/profile > cpu_prof.svg

You may need to install graphviz on your machine.
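
If Go and graphviz are not available on the server, or there is no time to render the graph before restarting, the raw profile can be saved first and analyzed later on another machine. This is a rough sketch, assuming the URL above is served by Go's standard pprof HTTP handler (which accepts a `seconds` query parameter) and needs no credentials, as the command above suggests; `save_profile` and the output file name are illustrative.

```python
import shutil
import urllib.request
from urllib.parse import urlencode


def profile_url(base="http://localhost:9102", seconds=60):
    """Build the CPU profile URL for the indexer's pprof endpoint."""
    return f"{base}/debug/pprof/profile?{urlencode({'seconds': seconds})}"


def save_profile(path, base="http://localhost:9102", seconds=60):
    """Download a raw CPU profile to `path` and return the path.

    The request blocks for roughly `seconds` while the profile is sampled,
    so the timeout is set somewhat longer than the sampling window.
    """
    url = profile_url(base, seconds)
    with urllib.request.urlopen(url, timeout=seconds + 30) as resp, \
            open(path, "wb") as out:
        shutil.copyfileobj(resp, out)
    return path


# Example usage on the index node while CPU is high:
#   save_profile("cpu.prof")
```

The saved file can later be rendered with `go tool pprof -svg /opt/couchbase/bin/indexer cpu.prof > cpu_prof.svg` on a machine that has a copy of the same indexer binary.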

ken.lang · April 20, 2018, 1:50am

The time frame is from about 9:00 AM to 9:30. I wasn’t able to view processes but in the previous incident top showed aggregate CPU at 99% while indexer ran at 400 to 500% (the top view).
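
For reference, the two numbers are consistent with each other if top is in its default mode, where a per-process figure is relative to a single core (an assumption about how top was configured here). A small sketch of the conversion for a 6-CPU server:

```python
def share_of_machine(process_cpu_percent, cores):
    """Convert a per-core CPU percentage (as top reports it) to a share of the whole machine."""
    return process_cpu_percent / cores


for pct in (400, 500):
    print(f"{pct}% in top = {share_of_machine(pct, 6):.0f}% of a 6-CPU server")
# prints:
# 400% in top = 67% of a 6-CPU server
# 500% in top = 83% of a 6-CPU server
```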

I’ll look at running pprof tomorrow if it’s possible. Since this is servicing a clinical app we can’t let it just hang.

deepkaran.salooja · April 20, 2018, 10:19pm

The log file you shared has logs from 2018-04-17T23:52:44 till 2018-04-18T08:25:41.
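
Before uploading, the time range a log file covers can be checked with a short script. This is a minimal sketch that assumes every relevant line starts with a timestamp in the same format as the indexer.log excerpt earlier in the thread (for example `2018-04-11T09:35:18.788-06:00`); the function names are illustrative.

```python
import re
from datetime import datetime

# Leading timestamp as seen in indexer.log, e.g. "2018-04-11T09:35:18.788-06:00 [Error] ..."
TS = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{2}:\d{2})")


def log_time_range(lines):
    """Return (first, last) timestamps found in the lines, or (None, None)."""
    first = last = None
    for line in lines:
        match = TS.match(line)
        if not match:
            continue  # continuation lines and other formats are skipped
        ts = datetime.fromisoformat(match.group(1))
        if first is None:
            first = ts
        last = ts
    return first, last


def covers(lines, start, end):
    """Return True if the log spans the whole window from start to end."""
    first, last = log_time_range(lines)
    return first is not None and first <= start and end <= last


# Example usage:
#   with open("indexer.log") as f:
#       print(log_time_range(f))
```

`covers` expects timezone-aware `datetime` values, since the log timestamps carry a UTC offset.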

ken.lang · April 20, 2018, 10:33pm

I must have grabbed the wrong version. I’ll find the correct one.

ken.lang · April 21, 2018, 5:09pm

We did make the changes that you suggested, and will keep you informed about the stability over the next few days.
