[MB-7737] [Doc'd] Swap usage shoots up with view querying or indexing Created: 13/Feb/13  Updated: 20/Jun/13  Resolved: 20/Jun/13

Status: Closed
Project: Couchbase Server
Component/s: couchbase-bucket, documentation
Affects Version/s: 2.0.1
Fix Version/s: 2.1.0
Security Level: Public

Type: Bug Priority: Critical
Reporter: Abhinav Dangeti Assignee: kzeller
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Seen on the latest few builds
Observed on:
2.0.1-153 Linux
4 Core 30GB SSD machines

Attachments: Zip Archive 10_6_2_68.zip     PNG File Screen Shot 2013-02-13 at 4.46.23 PM.png     PNG File Screen Shot 2013-02-18 at 2.38.32 PM.png     PDF File vperf-lnx.loop_2.0.1-153-rel-enterprise_2.0.1-153-rel-enterprise_VESTA_Feb-11-2013_15-40-09.pdf    

 Description   
- Set up a 7-node cluster.
- RAM (for each physical machine): ~32GB
- Create 2 buckets: default (bucket quota: 12000MB per node) and saslbucket (bucket quota: 7000MB per node).
- Run a create-only load until the item count reaches a target mark.
- Create 2 views on each of the buckets.
- Run a load that pushes the system into DGM: ~80% resident ratio for default and ~60% resident ratio for saslbucket.
- Run queries continuously against the views while the load continues (a query-loop sketch follows this description), for example:
curl -X GET 'http://10.1.3.235:8092/default/_design/d1/_view/v1?startkey=0&stale=ok&limit=1000'
- Increased swap usage was observed.
- Start a new mixed load on both buckets (creates, updates, deletes, expires).
- Set up XDCR to another 5-node cluster.
- Continue querying; a considerable increase in swap usage was observed.


This behavior is also seen on a cluster with views alone (no XDCR involved).
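
A minimal sketch of the kind of continuous query loop described above, assuming the same host, design document (d1), view (v1), and query parameters as the curl example; the actual load/query harness used for this test is not included in this ticket.

# Hypothetical query loop; host, ddoc/view names, and parameters are taken
# from the curl example in the description and may differ from the real test.
while true; do
  curl -s -X GET \
    'http://10.1.3.235:8092/default/_design/d1/_view/v1?startkey=0&stale=ok&limit=1000' \
    > /dev/null
  sleep 1
done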

 Comments   
Comment by Jin Lim [ 13/Feb/13 ]
Abhinav,
If available, please provide more information regarding the test environment setup and test scenario (manual steps to reproduce, test suite, etc.).
Also, can you please confirm whether this is a regression or has existed since 2.0.0? Please assign back to Jin after providing the requested info. Thanks!
Comment by Farshid Ghods (Inactive) [ 13/Feb/13 ]
This is existing behavior from 2.0; the bug was opened as a placeholder where the system test team keeps adding more information based on the weekly runs.

Abhinav,


please add the test spec and the hardware this test runs on (things like how many buckets, how much RAM is left for indexing).
Comment by Mike Wiederhold [ 14/Feb/13 ]
I also want to note here that memcached is pretty good about using roughly the amount of memory it has been told it can use. If ns_server or couchdb are using too much memory when running view-related tasks and pushing memcached into swap, then this issue should be addressed from the ns_server/couchdb side. On the other hand, if memcached is using more memory than it has been told it can use, then the ep-engine team should be looking at this.
Comment by Farshid Ghods (Inactive) [ 18/Feb/13 ]
This is seen in performance tests as well.

test case

vperf-lnx.conf
# View performance test for Linux:
# 8K ops/sec
# 80% reads, 20% writes (12% updates/deletes, 8% inserts)
# 8M dataset (non-DGM)
# Stop after 6M total queries
# 3 ddocs with 3 views per ddoc
performance.iperf.MultiClientTests.test_vperf
params:
# general
batch=50
kind=json
mem_quota=15000
# load phase
items=8000000
hot_init_items=1000
# index phase
views=[3, 3, 3]
# access phase
ratio_sets=0.2
ratio_misses=0.04
ratio_creates=0.40
ratio_deletes=0.50
ratio_hot=0.2
ratio_hot_gets=0.95
ratio_hot_sets=0.95
ratio_expirations=0.03
bg_max_ops_per_sec=500
fg_max_ops=6000000
total_clients=16
# control (defaults: pytests/performance/perf_defaults.py)
load_wait_until_drained=1
loop_wait_until_drained=0
mcsoda_heartbeat=3
tear_down=1
tear_down_proxy=1
tear_down_bucket=0
tear_down_cluster=1
tear_down_on_setup=0
Comment by Farshid Ghods (Inactive) [ 19/Feb/13 ]
abhinav,

please rephrase:

- create 2 buckets: default (RAM: 12000MB per node) and saslbucket (RAM: 7000MB per node)
to

- create 2 buckets: default (bucket quota: 12000MB per node) and saslbucket (quota: 7000MB per node)

Also, how much RAM does each physical machine have?
Comment by Abhinav Dangeti [ 20/Feb/13 ]
Added to the description.
Comment by Jin Lim [ 20/Feb/13 ]
Per bug scrubs:
* not a regression from 2.0 (2.0 must have this issue as well)
* swappiness set = 10 & swap memory = 5G
* will need to start looking at it now
Comment by Abhinav Dangeti [ 06/Jun/13 ]
Cluster with views, with compaction, indexing, and continuous querying: swap usage shoots up with the system in DGM. A rebalance operation causes swap to max out. This high swap usage (about 100% at times) causes rebalance operations to fail. This particular time, during a (failed) swap rebalance, a node went down with a memcached core. (10.6.2.68: the node being rebalanced in during the swap rebalance.)

High swap usage is observed on all later 2.0.2 builds and 2.1.0 builds.
The current crash happened on 2.0.2-820.
The cluster is still in high swap, with no front-end load, compaction, or indexing.

Backtrace of the memcached core (core at /data/core.memcached.3925 on 10.6.2.68):

Thread 11 (Thread 0x7f8df70fc700 (LWP 21383)):
#0 0x00007f8e002627bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfa0ed5c1 in wait (this=0x6d8f860) at src/syncobject.hh:57
#2 ExecutorThread::run (this=0x6d8f860) at src/scheduler.cc:139
#3 0x00007f8dfa0ede4d in launch_executor_thread (arg=<value optimized out>) at src/scheduler.cc:34
#4 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 10 (Thread 0x7f8df5cfa700 (LWP 21385)):
#0 0x00007f8e002627bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfa0ed5c1 in wait (this=0x6db6d00) at src/syncobject.hh:57
#2 ExecutorThread::run (this=0x6db6d00) at src/scheduler.cc:139
#3 0x00007f8dfa0ede4d in launch_executor_thread (arg=<value optimized out>) at src/scheduler.cc:34
#4 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 9 (Thread 0x7f8df7afd700 (LWP 21382)):
#0 0x00007f8e0026243c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfeb95fed in add_log_entry (severity=<value optimized out>, client_cookie=<value optimized out>, fmt=<value optimized out>) at extensions/loggers/file_logger.c:178
#2 logger_log (severity=<value optimized out>, client_cookie=<value optimized out>, fmt=<value optimized out>) at extensions/loggers/file_logger.c:280
#3 0x00007f8dfa0c373e in LOG (severity=EXTENSION_LOG_WARNING, fmt=0x7f8dfa14f48c "%s") at src/ep_engine.cc:1197
#4 0x00007f8dfa0c1e2f in PersistenceCallback::callback(std::pair<int, long>&) () from /opt/couchbase/lib/memcached/ep.so
#5 0x00007f8dfa1227b4 in CouchKVStore::commitCallback (this=0x6d64300, committedReqs=<value optimized out>, numReqs=39988, errCode=COUCHSTORE_ERROR_WRITE) at src/couch-kvstore/couch-kvstore.cc:1590
#6 0x00007f8dfa1242c6 in CouchKVStore::commit2couchstore (this=0x6d64300) at src/couch-kvstore/couch-kvstore.cc:1417
#7 0x00007f8dfa12449a in CouchKVStore::commit (this=0x6d64300) at src/couch-kvstore/couch-kvstore.cc:806
#8 0x00007f8dfa0b5de6 in EventuallyPersistentStore::flushVBucket (this=0x6d4e000, vbid=361) at src/ep.cc:1920
#9 0x00007f8dfa0ded52 in doFlush (this=0x2487040, tid=<value optimized out>) at src/flusher.cc:222
#10 completeFlush (this=0x2487040, tid=<value optimized out>) at src/flusher.cc:197
#11 Flusher::step (this=0x2487040, tid=<value optimized out>) at src/flusher.cc:170
#12 0x00007f8dfa0ed731 in ExecutorThread::run (this=0x6d8fa00) at src/scheduler.cc:153
#13 0x00007f8dfa0ede4d in launch_executor_thread (arg=<value optimized out>) at src/scheduler.cc:34
#14 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#15 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 8 (Thread 0x7f8df66fb700 (LWP 21384)):
#0 0x00007f8e002627bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfa0ed5c1 in wait (this=0x6db6ea0) at src/syncobject.hh:57
#2 ExecutorThread::run (this=0x6db6ea0) at src/scheduler.cc:139
#3 0x00007f8dfa0ede4d in launch_executor_thread (arg=<value optimized out>) at src/scheduler.cc:34
#4 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 7 (Thread 0x7f8df52f9700 (LWP 21386)):
#0 0x00007f8e002627bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfa0ed5c1 in wait (this=0x6db6b60) at src/syncobject.hh:57
#2 ExecutorThread::run (this=0x6db6b60) at src/scheduler.cc:139
#3 0x00007f8dfa0ede4d in launch_executor_thread (arg=<value optimized out>) at src/scheduler.cc:34
#4 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#5 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 6 (Thread 0x7f8dfad80700 (LWP 21380)):
#0 0x00007f8dfff7097d in nanosleep () from /lib64/libc.so.6
#1 0x00007f8dfffa5b34 in usleep () from /lib64/libc.so.6
#2 0x00007f8dfa0e8d75 in updateStatsThread (arg=0x247c4c0) at src/memory_tracker.cc:31
#3 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f8df84fe700 (LWP 21381)):
#0 0x00007f8e0026243c in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfeb95fed in add_log_entry (severity=<value optimized out>, client_cookie=<value optimized out>, fmt=<value optimized out>) at extensions/loggers/file_logger.c:178
#2 logger_log (severity=<value optimized out>, client_cookie=<value optimized out>, fmt=<value optimized out>) at extensions/loggers/file_logger.c:280
#3 0x00007f8dfa0c373e in LOG (severity=EXTENSION_LOG_WARNING, fmt=0x7f8dfa14f48c "%s") at src/ep_engine.cc:1197
#4 0x00007f8dfa0c1e2f in PersistenceCallback::callback(std::pair<int, long>&) () from /opt/couchbase/lib/memcached/ep.so
#5 0x00007f8dfa1227b4 in CouchKVStore::commitCallback (this=0x6d64c00, committedReqs=<value optimized out>, numReqs=13023, errCode=COUCHSTORE_ERROR_WRITE) at src/couch-kvstore/couch-kvstore.cc:1590
#6 0x00007f8dfa1242c6 in CouchKVStore::commit2couchstore (this=0x6d64c00) at src/couch-kvstore/couch-kvstore.cc:1417
#7 0x00007f8dfa12449a in CouchKVStore::commit (this=0x6d64c00) at src/couch-kvstore/couch-kvstore.cc:806
#8 0x00007f8dfa0b5de6 in EventuallyPersistentStore::flushVBucket (this=0x6d4e000, vbid=570) at src/ep.cc:1920
#9 0x00007f8dfa0decd9 in doFlush (this=0x2486f00, tid=8) at src/flusher.cc:222
#10 Flusher::step (this=0x2486f00, tid=8) at src/flusher.cc:152
#11 0x00007f8dfa0ed731 in ExecutorThread::run (this=0x6d8fba0) at src/scheduler.cc:153
#12 0x00007f8dfa0ede4d in launch_executor_thread (arg=<value optimized out>) at src/scheduler.cc:34
#13 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#14 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f8e0139b720 (LWP 3925)):
#0 0x00007f8dfff7097d in nanosleep () from /lib64/libc.so.6
#1 0x00007f8dfffa5b34 in usleep () from /lib64/libc.so.6
#2 0x00007f8dfa0ddffa in Flusher::wait (this=0x2487040) at src/flusher.cc:32
#3 0x00007f8dfa0af5aa in stopFlusher (this=0x6d4e000, __in_chrg=<value optimized out>) at src/ep.cc:400
#4 EventuallyPersistentStore::~EventuallyPersistentStore (this=0x6d4e000, __in_chrg=<value optimized out>) at src/ep.cc:354
#5 0x00007f8dfa0daecf in EventuallyPersistentEngine::~EventuallyPersistentEngine() () from /opt/couchbase/lib/memcached/ep.so
#6 0x00007f8dfa0cbed5 in EvpDestroy (handle=0x6d3c000, force=false) at src/ep_engine.cc:133
#7 0x00007f8dfdf87bb5 in bucket_shutdown_engine (key=<value optimized out>, nkey=<value optimized out>, val=0x6cea2a0, nval=<value optimized out>, args=<value optimized out>) at bucket_engine.c:1290
#8 0x00007f8dfdf8e68c in genhash_iter (h=0x6cee000, iterfunc=0x7f8dfdf87b80 <bucket_shutdown_engine>, arg=0x0) at genhash.c:275
#9 0x00007f8dfdf88f46 in bucket_destroy (handle=0x7f8dfe191660, force=<value optimized out>) at bucket_engine.c:1327
#10 0x000000000040980b in main (argc=<value optimized out>, argv=<value optimized out>) at daemon/memcached.c:7935

Thread 3 (Thread 0x7f8dfeb92700 (LWP 3933)):
#0 0x00007f8dfff9f43d in write () from /lib64/libc.so.6
#1 0x00007f8dfff36033 in _IO_new_file_write () from /lib64/libc.so.6
#2 0x00007f8dfff35efa in _IO_new_file_xsputn () from /lib64/libc.so.6
#3 0x00007f8dfff2c0bd in fwrite () from /lib64/libc.so.6
#4 0x00007f8dfeb963e7 in flush_pending_io (arg=0x247c040) at extensions/loggers/file_logger.c:319
#5 logger_thead_main (arg=0x247c040) at extensions/loggers/file_logger.c:357
#6 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#7 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f8df34f6700 (LWP 21389)):
#0 0x00007f8e002627bb in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f8dfa0a91c8 in wait (this=0x6cd82d0, d=...) at src/syncobject.hh:57
#2 IdleTask::run (this=0x6cd82d0, d=...) at src/dispatcher.cc:342
#3 0x00007f8dfa0abd2a in Dispatcher::run (this=0x6d20a80) at src/dispatcher.cc:184
#4 0x00007f8dfa0ac4ed in launch_dispatcher_thread (arg=<value optimized out>) at src/dispatcher.cc:28
#5 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#6 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f8df3ef7700 (LWP 21388)):
#0 0x00007f8dffef78a5 in raise () from /lib64/libc.so.6
#1 0x00007f8dffef9085 in abort () from /lib64/libc.so.6
#2 0x00007f8dffef0a1e in __assert_fail_base () from /lib64/libc.so.6
#3 0x00007f8dffef0ae0 in __assert_fail () from /lib64/libc.so.6
#4 0x00007f8dfa115dd1 in writeFully (this=0x39a82000) at src/mutation_log.cc:46
#5 MutationLog::writeInitialBlock (this=0x39a82000) at src/mutation_log.cc:163
#6 0x00007f8dfa117fe4 in MutationLog::open (this=0x39a82000, _readOnly=<value optimized out>) at src/mutation_log.cc:273
#7 0x00007f8dfa09999b in ItemAccessVisitor::ItemAccessVisitor (this=0x6cd8c60, _store=<value optimized out>, _stats=<value optimized out>, sfin=<value optimized out>) at src/access_scanner.cc:22
#8 0x00007f8dfa09747a in AccessScanner::callback (this=0x2489ec0, d=..., t=...) at src/access_scanner.cc:108
#9 0x00007f8dfa0abd2a in Dispatcher::run (this=0x6d20c40) at src/dispatcher.cc:184
#10 0x00007f8dfa0ac4ed in launch_dispatcher_thread (arg=<value optimized out>) at src/dispatcher.cc:28
#11 0x00007f8e0025e851 in start_thread () from /lib64/libpthread.so.0
#12 0x00007f8dfffac6dd in clone () from /lib64/libc.so.6

The crash is on Thread 1.
Attachments: 10_6_2_68.zip: cbcollect_info logs from the node
Comment by Wayne Siu [ 07/Jun/13 ]
Assigning to 2.1.0 so that it will be included in the current scrub meeting.
Comment by Maria McDuff (Inactive) [ 07/Jun/13 ]
Abhinav and Alk are working together to investigate this issue.
Comment by Abhinav Dangeti [ 11/Jun/13 ]
With swappiness set to 0, swap usage was reduced by a considerable amount compared to when swappiness was at 10. Swap usage averaged around 1G per node during rebalance operations and the access phase.
Comment by Maria McDuff (Inactive) [ 12/Jun/13 ]
Abhinav,

please provide exact steps (to Karen) on how to set swappiness to 0. Does it require a restart, or any other settings? Pass it on to Karen Z for documentation.
Comment by Abhinav Dangeti [ 12/Jun/13 ]
Setting swappiness (to zero) on Linux:
sudo sysctl vm.swappiness=0

To see what the swappiness of the system is, use either sysctl vm.swappiness or cat /proc/sys/vm/swappiness.

I think it would be wise to note how performance is affected by this setting.
I'll try to set up a heavy DGM rebalance run with and without this setting at zero, to note the effect on performance.
Comment by Abhinav Dangeti [ 13/Jun/13 ]
No significant performance regression in rebalance times between swappiness at 0 and swappiness at 10 (on build 2.1.0-715):
swappiness=10 :: http://qa.hq.northscale.net/job/litmuses-graph-loop/246/ .. rebalance time: 437s
swappiness=0 :: http://qa.hq.northscale.net/job/litmuses-graph-loop/245/ .. rebalance time: 461s

So the recommendation would be to set swappiness to zero whenever there is a views scenario, with or without topology changes.
Comment by Aleksey Kondratenko [ 14/Jun/13 ]
Hm. But sysctl or echoing to /proc/sys/vm/swappiness will not persist across a reboot.

On both RHEL 5 and my Debian box, I'm seeing /etc/sysctl.conf being used at boot time to set all sorts of sysctl variables.

We should recommend that users edit this file.

I.e., just appending:

vm.swappiness = 0

(and rebooting) should be enough on all kinds of Ubuntus and all kinds of RHELs.
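
For reference, a minimal sketch of that approach, using only standard sysctl commands (nothing here is Couchbase-specific):

# Set swappiness immediately (does not survive a reboot):
sudo sysctl vm.swappiness=0

# Persist the setting across reboots by appending it to /etc/sysctl.conf:
echo 'vm.swappiness = 0' | sudo tee -a /etc/sysctl.conf

# Re-read /etc/sysctl.conf without rebooting:
sudo sysctl -p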
Comment by kzeller [ 20/Jun/13 ]
<rnentry>

<version ver="2.1.0a"/>

<version ver="2.0.0l"/>

<version ver="2.0.0k"/>

<class id="cluster"/>

<issue type="cb" ref="MB-7737"/>
<issue type="cb" ref="MB-7774"/>

<rntext>

<para>
By default, most Linux systems have swappiness set to 60. This will lead to overuse of disk swap with
Couchbase Server. Please follow our current recommendations on swappiness and swap space; see
<xref linkend="couchbase-bestpractice-cloud-swap"/>.
</para>


</rntext>

</rnentry>