[MB-4514] ns_server crash because Linux can run out of open file descriptors if user defines more than 10 views for one bucket Created: 07/Dec/11 Updated: 10/Jan/13 Resolved: 21/Dec/11 |
|
| Status: | Closed |
| Project: | Couchbase Server |
| Component/s: | ns_server, view-engine |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Security Level: | Public |
| Type: | Bug | Priority: | Major |
| Reporter: | Farshid Ghods | Assignee: | Unassigned |
| Resolution: | Fixed | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Description |
|
Saw this during testing:
[error_logger:error] [2011-12-07 9:19:40] [ns_1@127.0.0.1:error_logger:ale_error_logger_handler:log_msg:76]
** Generic server mb_mnesia terminating
** Last message in was {mnesia_system_event, {mnesia_fatal,"Cannot open log file ~p: ~p~n", ["/opt/couchbase/var/lib/couchbase/mnesia/PREVIOUS.LOG", {file_error, "/opt/couchbase/var/lib/couchbase/mnesia/PREVIOUS.LOG", system_limit}], <<131,108,0,0,0,24,104,2,100,0,9,99,114,97, 115,104,105,110,102,111,104,2,107,0,29,67, 97,110,110,111,116,32,111,112,101,110,32, 108,111,103,32,102,105,108,101,32,126,112, 58,32,126,112,126,110,108,0,0,0,2,107,0,52,
Will have to reproduce and gather more information. |
| Comments |
| Comment by Farshid Ghods [ 07/Dec/11 ] |
|
Both root and the couchbase user have ulimit -n set to 1024:
[root@centos-farshid-2 bin]# sudo su - couchbase
-sh-3.2$ ulimit -n
1024 |
| Comment by Farshid Ghods [ 07/Dec/11 ] |
|
and there are 10 views
10 views :
[root@centos-farshid-2 set_view_default_design]# ls -l
total 604
-rw-r--r-- 1 couchbase couchbase 67841 Dec 7 10:07 055b0b9ffa6ae90b26ab5ca9f42ca137.view
-rw-r--r-- 1 couchbase couchbase 67841 Dec 7 10:07 0dd7d7cca8281d948d19d8f367465567.view
-rw-r--r-- 1 couchbase couchbase 71937 Dec 7 10:09 154e29e9314762a746af4efa7139651e.view
-rw-r--r-- 1 couchbase couchbase 47361 Dec 7 10:06 2578f65f866cdb8d05197b677f94f42e.view
-rw-r--r-- 1 couchbase couchbase 59649 Dec 7 10:08 75d1311b41d4d5bca8de5856d7f2aff0.view
-rw-r--r-- 1 couchbase couchbase 51457 Dec 7 10:06 7b6e6cc5a368c6d0b1f3cf6365a91523.view
-rw-r--r-- 1 couchbase couchbase 43262 Dec 7 10:03 e2201450b32bd5f742a7160c64dce5c6.view
-rw-r--r-- 1 couchbase couchbase 43265 Dec 7 10:05 e9ebf51f2ac6dd9f84b8afaf55da8296.view
-rw-r--r-- 1 couchbase couchbase 47361 Dec 7 10:05 ec16f016e0a4c8d7ff4ae39dad7969bc.view
-rw-r--r-- 1 couchbase couchbase 39164 Dec 7 10:02 eeef5d510738ed05362b3f55ba97df89.view
errors :
[couchdb:info] [2011-12-07 10:09:58] [ns_1@127.0.0.1:<0.380.0>:couch_log:error:42] Error opening view group `dev_test_view_on_10k_docs-21b8b9d` from database `default/9`: {error, system_limit} |
| Comment by Farshid Ghods [ 07/Dec/11 ] |
|
I see this line in couchbase-server. Does that mean we are trying to set the ulimit to 10240? Seems like an installer bug?
./couchbase-server: if [ `ulimit -n` -lt 10240 ]
[root@centos-farshid-2 bin]# ulimit -n
1024 |
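The check quoted above compares `ulimit -n` with 10240. A minimal Python sketch of the same comparison, useful for inspecting the limit from inside a running process. The function names and the default `threshold` parameter are illustrative assumptions and are not taken from the couchbase-server script:

```python
import resource


def fd_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def below_threshold(soft, threshold=10240):
    """Return True when the soft limit is lower than the threshold.

    10240 is the value from the quoted script line; an unlimited
    soft limit (RLIM_INFINITY) is never treated as too low.
    """
    if soft == resource.RLIM_INFINITY:
        return False
    return soft < threshold


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("soft=%s hard=%s" % (soft, hard))
    if below_threshold(soft):
        print("warning: open file limit %s is below 10240" % soft)
```

With the 1024 shown in the shell output above, `below_threshold(1024)` is true, so a check of this kind would warn.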
| Comment by Farshid Ghods [ 07/Dec/11 ] |
|
Reproducible by running this test ten times:
python testrunner -i resources/jenkins/single-node-centos-64.ini -t viewtests.ViewTests.test_count_reduce_100_docs -p skip_cleanup=True
Replace resources/jenkins/single-node-centos-64.ini with an ini file that has the cluster_run server information. |
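The ten repeated runs can be scripted. A rough sketch, assuming it is launched from the testrunner checkout; `build_command` and `run_repeatedly` are hypothetical helper names, and the ini path is whatever file describes your cluster_run node:

```python
import subprocess

# Test name and flags are copied from the command in this comment.
TEST_NAME = "viewtests.ViewTests.test_count_reduce_100_docs"


def build_command(ini_path):
    """Build the testrunner command line for the given ini file."""
    return ["python", "testrunner", "-i", ini_path,
            "-t", TEST_NAME, "-p", "skip_cleanup=True"]


def run_repeatedly(ini_path, times=10, runner=subprocess.call):
    """Run the test `times` times and return the list of exit codes.

    `runner` is injectable so the loop can be exercised without testrunner.
    """
    return [runner(build_command(ini_path)) for _ in range(times)]
```

Calling `run_repeatedly("my-cluster-run.ini")` (file name hypothetical) runs the test ten times back to back; skip_cleanup=True leaves the views from earlier runs in place.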
| Comment by Farshid Ghods [ 07/Dec/11 ] |
|
From Aliaksey:
"Bug? Where? Our _initscript_ sets ulimit just before changing user to couchbase and starting daemon. And script that starts daemon checks limit and warns if it's too small. So we have evidence that raising limit works."
So it seems we are running out of the 10240 file descriptors after creating 10 views. Damien was also talking a few days ago about an fd leak. |
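One way to tell a low limit from a descriptor leak is to watch how many descriptors a process holds over time. A minimal Linux-only sketch that counts the entries under /proc/<pid>/fd. The helper name is hypothetical, and which pid to inspect (for example the Erlang VM running ns_server) is an assumption left to the reader:

```python
import os


def count_open_fds(pid="self"):
    """Count open file descriptors of a process via /proc/<pid>/fd.

    Linux only. Returns None when the directory does not exist
    (non-Linux system or no such process). Reading another user's
    process may require root.
    """
    fd_dir = "/proc/%s/fd" % pid
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))
```

Sampling this before and after each view is created would show whether the count keeps growing toward the limit.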
| Comment by Aliaksey Artamonau [ 21/Dec/11 ] |
| Fixed by http://review.couchbase.org/10386. |