Hi All,
We frequently see c.c.c.d.i.n.u.i.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 3204448263, max: 3221225472) errors once our application reaches 100% traffic.
Our application runs in GCP with the following Java configuration:
ENV JAVA_OPTS="-Xms3g -Xmx3g -XX:+HeapDumpOnOutOfMemoryError"
ENV GC_OPTS=" -verbose:gc -XX:+PrintGCDetails -XX:+ParallelRefProcEnabled -XX:+PrintGCDateStamps -XX:+UseG1GC -XX:MaxGCPauseMillis=100"
The pod has 5 GB for the application container and 2 GB for Istio.
We see frequent spikes in container memory up to 4 GB, at which point the container crashes, even though JVM memory stays below 2 GB. During those periods we also observe Couchbase read operations of up to 2.5k per node (10 nodes in total). This happens very frequently.
Below is the Couchbase configuration:
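To put these numbers side by side, here is a rough sketch of the memory budget as we understand it. It assumes that, because -XX:MaxDirectMemorySize is not set, the direct memory limit falls back to the heap limit (the max reported in the error, 3221225472 bytes, is exactly 3 GiB, the same as -Xmx3g), and that the 5gb container limit means 5 GiB. The class and method names are only for illustration.

```java
public class MemoryBudget {
    static final long GIB = 1024L * 1024L * 1024L;

    /** Worst-case heap plus direct memory, in bytes (ignores metaspace, thread stacks, etc.). */
    static long worstCaseBytes(long maxHeapBytes, long maxDirectBytes) {
        return maxHeapBytes + maxDirectBytes;
    }

    /** True if heap plus direct memory can never exceed the container limit. */
    static boolean fitsInContainer(long maxHeapBytes, long maxDirectBytes, long containerLimitBytes) {
        return worstCaseBytes(maxHeapBytes, maxDirectBytes) <= containerLimitBytes;
    }

    public static void main(String[] args) {
        long heap = 3 * GIB;        // -Xmx3g from JAVA_OPTS
        long direct = 3221225472L;  // "max" reported in the OutOfDirectMemoryError
        long container = 5 * GIB;   // application container limit (assumption: 5gb means 5 GiB)

        System.out.println("direct limit equals heap limit: " + (direct == heap));
        System.out.println("worst case in GiB: " + worstCaseBytes(heap, direct) / (double) GIB);
        System.out.println("fits in container: " + fitsInContainer(heap, direct, container));
    }
}
```

If that assumption holds, heap plus direct memory can in the worst case add up to 6 GiB, which is more than the container limit even before metaspace and thread stacks are counted.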
/**
 * Create the cluster and connection from the configured properties.
 * Try to connect to the host and bucket name of the Couchbase database.
 */
@PostConstruct
public void setup() {
    CouchbaseEnvironment env = DefaultCouchbaseEnvironment.builder()
            .keepAliveTimeout(aliveTimeout)
            .connectTimeout(connectionTimeout)
            .socketConnectTimeout(socketTimeout)
            .queryTimeout(queryTimeout)
            .build();
    try {
        couchbaseCluster = CouchbaseCluster.create(env, nodes);
        couchbaseCluster.authenticate(userName, password);
        asyncPromoBucket = couchbaseCluster.openBucket(bucketName).async();
    } catch (Exception e) {
        log.error("failed trying to connect to couchbase Cluster ", e);
    }
}
/**
 * Disconnect from the Couchbase server,
 * releasing resources during shutdown.
 */
@PreDestroy
public void preDestroy() {
    try {
        if (this.couchbaseCluster != null) {
            this.couchbaseCluster.disconnect();
        }
    } catch (Exception e) {
        log.error("failed trying to disconnect from couchbase Cluster ", e);
    }
}
aliveTimeout: 10000000
socketTimeout: 30000
connectionTimeout: 50000
requestTimeout: 2000
poolSize: 250
queryTimeout: 1000
Here is the complete log entry with the exception details:
{"timestamp":"2020-07-07T19:06:58.850-04:00","logger_name":"com.couchbase.client.deps.io.netty.channel.DefaultChannelPipeline","severity":"WARN",
"message":"An exceptionCaught() event was fired, and it reached at the tail of the pipeline.
It usually means the last handler in the pipeline did not handle the exception.
","stack_hash":"e21e277a","stack_trace":"c.c.c.d.i.n.u.i.OutOfDirectMemoryError: failed to allocate 16777216
byte(s) of direct memory (used: 3204448263, max: 3221225472)\n\tat c.c.c.d.i.n.u.i.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:535)\n\tat c.c.c.d.i.n.u.i.PlatformDependent.allocateDirectNoClean
er(PlatformDependent.java:489)\n\tat c.c.c.d.i.n.b.PoolArena$DirectArena.allocateDirect(PoolArena.java:766)\n\tat c.c.c.d.i.n.b.PoolArena$DirectArena.newChunk(PoolArena.java:742)\n\tat c.c.c.d.i.n.b.PoolArena.allo
cateNormal(PoolArena.java:244)\n\tat c.c.c.d.i.n.b.PoolArena.allocate(PoolArena.java:226)\n\tat c.c.c.d.i.n.b.PoolArena.allocate(PoolArena.java:146)\n\tat c.c.c.d.i.n.b.PooledByteBufAllocator.newDirectBuffer(Poole
dByteBufAllocator.java:333)\n\tat c.c.c.d.i.n.b.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:183)\n\tat c.c.c.d.i.n.b.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:174)
\n\tat c.c.c.d.i.n.b.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:135)\n\tat c.c.c.d.i.n.c.AdaptiveRecvByteBufAllocator$HandleImpl.allocate(AdaptiveRecvByteBufAllocator.java:104)\n\tat c.c.c.d.i
.n.c.n.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:117)\n\tat c.c.c.d.i.n.c.n.NioEventLoop.processSelectedKey(NioEventLoop.java:646)\n\tat c.c.c.d.i.n.c.n.NioEventLoop.processSelectedKeys
Optimized(NioEventLoop.java:581)\n\tat c.c.c.d.i.n.c.n.NioEventLoop.processSelectedKeys(NioEventLoop.java:498)\n\tat c.c.c.d.i.n.c.n.NioEventLoop.run(NioEventLoop.java:460)\n\tat c.c.c.d.i.n.u.c.SingleThreadEventE
xecutor$2.run(SingleThreadEventExecutor.java:131)\n\tat c.c.c.d.i.n.u.c.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.lang.Thread.run(Thread.java:748)\n","trace":"","span":"","parent":"",
"class":"c.c.c.d.i.n.c.DefaultChannelPipeline","service":"PromotionsExecutionService"}
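The figures in the message can be pulled out and compared with a small sketch like the one below. The message text is copied from the log above; the class name, the regular expression, and the helper names are our own and not part of Netty or the Couchbase SDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DirectMemoryError {
    // Matches the numeric fields of the OutOfDirectMemoryError message.
    private static final Pattern MESSAGE = Pattern.compile(
            "failed to allocate (\\d+) byte\\(s\\) of direct memory \\(used: (\\d+), max: (\\d+)\\)");

    /** Returns {requested, used, max} in bytes, or null if the message does not match. */
    static long[] parse(String message) {
        Matcher m = MESSAGE.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /** Bytes still available below the limit at the time of the failure. */
    static long headroom(long[] parsed) {
        return parsed[2] - parsed[1];
    }

    public static void main(String[] args) {
        long[] p = parse("c.c.c.d.i.n.u.i.OutOfDirectMemoryError: failed to allocate 16777216 "
                + "byte(s) of direct memory (used: 3204448263, max: 3221225472)");
        System.out.println("requested: " + p[0]);
        System.out.println("headroom:  " + headroom(p));
    }
}
```

For this entry the headroom is 16777209 bytes, 7 bytes short of the 16777216 bytes requested, so direct memory was effectively at its limit when the allocation failed.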
This is the method that fetches data from Couchbase:
final AsyncBucket asyncPromoBucket = cbConfiguration.getAsyncPromoBucket();
if (asyncPromoBucket == null) {
    log.error("Error in Couchbase connection");
    throw new PromoExecutionException("Error in connection couchbase");
}
// Note: t1 and t2 only bracket the assembly of the Observable, not the fetch itself.
long t1 = System.nanoTime();
Observable<ItemPromo> itemPromo = asyncPromoBucket.get(docId)
        // Retry transient failures with a fixed delay, up to maxAttempt times.
        .retryWhen(RetryBuilder
                .anyOf(BackpressureException.class, RequestCancelledException.class,
                        TimeoutException.class)
                .delay(Delay.fixed(delay, TimeUnit.MILLISECONDS)).max(maxAttempt).build())
        .doOnError(e -> {
            // Pass the exception as the last argument so the stack trace is logged.
            log.error("Error while retrieving data from CB", e);
            throw new PromoExecutionException("Error while retrieving data from CB...");
        })
        .map(RepositoryHelper.parseItemPromoJson).switchIfEmpty(Observable.just(new ItemPromo()))
        .timeout(timeout, TimeUnit.MILLISECONDS, Schedulers.io());
long t2 = System.nanoTime();
return itemPromo;
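One thing worth noting about the method above: t1 and t2 only bracket the creation of the asynchronous pipeline, not the fetch itself, so they cannot tell us how long a read takes. The sketch below illustrates the same effect with a plain CompletableFuture instead of RxJava, so it runs without the SDK; fetch, assemblyNanos, and completionNanos are hypothetical names, and the sleep merely simulates a lookup.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class AsyncTiming {
    /** Simulated asynchronous lookup that completes after delayMillis. */
    static CompletableFuture<String> fetch(long delayMillis) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                TimeUnit.MILLISECONDS.sleep(delayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "doc";
        });
    }

    /** Time spent only creating the future, analogous to t2 - t1 in the method above. */
    static long assemblyNanos(long delayMillis) {
        long t1 = System.nanoTime();
        CompletableFuture<String> future = fetch(delayMillis);
        long t2 = System.nanoTime();
        future.join(); // wait afterwards so the task does not outlive the measurement
        return t2 - t1;
    }

    /** Time until the result is actually available. */
    static long completionNanos(long delayMillis) {
        long start = System.nanoTime();
        fetch(delayMillis).join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        System.out.println("assembly (ns):   " + assemblyNanos(200));
        System.out.println("completion (ns): " + completionNanos(200));
    }
}
```

To measure real read latency in our code, the end timestamp would have to be taken when the Observable emits or fails rather than right after it is built.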
We need help with this urgently.
Thanks in advance.