[MB-4514] ns_server crashes because Linux can run out of open file descriptors if a user defines more than 10 views for one bucket Created: 07/Dec/11  Updated: 10/Jan/13  Resolved: 21/Dec/11

Status: Closed
Project: Couchbase Server
Component/s: ns_server, view-engine
Affects Version/s: None
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Major
Reporter: Farshid Ghods (Inactive) Assignee: Unassigned
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: GZip Archive 10.1.6.104.log.gz     File 10.1.6.149.log.1    

 Description   
Saw this during testing:



[error_logger:error] [2011-12-07 9:19:40] [ns_1@127.0.0.1:error_logger:ale_error_logger_handler:log_msg:76] ** Generic server mb_mnesia terminating
** Last message in was {mnesia_system_event,
                          {mnesia_fatal,"Cannot open log file ~p: ~p~n",
                              ["/opt/couchbase/var/lib/couchbase/mnesia/PREVIOUS.LOG",
                               {file_error,
                                   "/opt/couchbase/var/lib/couchbase/mnesia/PREVIOUS.LOG",
                                   system_limit}],
                              <<131,108,0,0,0,24,104,2,100,0,9,99,114,97,
                                115,104,105,110,102,111,104,2,107,0,29,67,
                                97,110,110,111,116,32,111,112,101,110,32,
                                108,111,103,32,102,105,108,101,32,126,112,
                                58,32,126,112,126,110,108,0,0,0,2,107,0,52,


Will have to reproduce this and gather more information.

 Comments   
Comment by Farshid Ghods (Inactive) [ 07/Dec/11 ]
Both the root and couchbase users have ulimit -n set to 1024:
[root@centos-farshid-2 bin]# sudo su - couchbase
-sh-3.2$ ulimit -n
1024
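
For reference, the same limit can be read from inside a running process rather than from a shell. The following is a minimal Python sketch, not part of the original report, that uses the standard `resource` module; the helper name `nofile_limits` is hypothetical.

```python
import resource

def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    # RLIMIT_NOFILE is the limit that `ulimit -n` reports in a shell.
    return resource.getrlimit(resource.RLIMIT_NOFILE)

soft, hard = nofile_limits()
print(f"soft={soft} hard={hard}")
```

The soft value is the one that applies to new `open()` calls; a value of `resource.RLIM_INFINITY` means no limit is set.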
Comment by Farshid Ghods (Inactive) [ 07/Dec/11 ]
There are 10 views:
[root@centos-farshid-2 set_view_default_design]# ls -l
total 604
-rw-r--r-- 1 couchbase couchbase 67841 Dec 7 10:07 055b0b9ffa6ae90b26ab5ca9f42ca137.view
-rw-r--r-- 1 couchbase couchbase 67841 Dec 7 10:07 0dd7d7cca8281d948d19d8f367465567.view
-rw-r--r-- 1 couchbase couchbase 71937 Dec 7 10:09 154e29e9314762a746af4efa7139651e.view
-rw-r--r-- 1 couchbase couchbase 47361 Dec 7 10:06 2578f65f866cdb8d05197b677f94f42e.view
-rw-r--r-- 1 couchbase couchbase 59649 Dec 7 10:08 75d1311b41d4d5bca8de5856d7f2aff0.view
-rw-r--r-- 1 couchbase couchbase 51457 Dec 7 10:06 7b6e6cc5a368c6d0b1f3cf6365a91523.view
-rw-r--r-- 1 couchbase couchbase 43262 Dec 7 10:03 e2201450b32bd5f742a7160c64dce5c6.view
-rw-r--r-- 1 couchbase couchbase 43265 Dec 7 10:05 e9ebf51f2ac6dd9f84b8afaf55da8296.view
-rw-r--r-- 1 couchbase couchbase 47361 Dec 7 10:05 ec16f016e0a4c8d7ff4ae39dad7969bc.view
-rw-r--r-- 1 couchbase couchbase 39164 Dec 7 10:02 eeef5d510738ed05362b3f55ba97df89.view


Errors:
[couchdb:info] [2011-12-07 10:09:58] [ns_1@127.0.0.1:<0.380.0>:couch_log:error:42] Error opening view group `dev_test_view_on_10k_docs-21b8b9d` from database `default/9`: {error,system_limit}
Comment by Farshid Ghods (Inactive) [ 07/Dec/11 ]
I see this line in the couchbase-server script. Does that mean we are trying to set the ulimit to 10240? Seems like an installer bug?


./couchbase-server: if [ `ulimit -n` -lt 10240 ]
[root@centos-farshid-2 bin]# ulimit -n
1024
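
One reading of the quoted line is that it only compares the current limit with 10240 and does not set anything. A rough Python equivalent of such a comparison is sketched below; this is an illustration under that assumption, not the actual couchbase-server script, and `fd_limit_warning` and `REQUIRED_NOFILE` are hypothetical names.

```python
import resource

# Threshold taken from the quoted shell line: `ulimit -n` -lt 10240
REQUIRED_NOFILE = 10240

def fd_limit_warning(soft_limit, required=REQUIRED_NOFILE):
    """Return a warning string if soft_limit is below required, else None."""
    # RLIM_INFINITY means "no limit", so it can never be too small.
    if soft_limit != resource.RLIM_INFINITY and soft_limit < required:
        return f"open-file limit {soft_limit} is below {required}"
    return None

soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
message = fd_limit_warning(soft)
if message:
    print("warning:", message)
```

With the value seen in the root shell above, `fd_limit_warning(1024)` returns a warning, while a limit of 10240 or more returns `None`.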
Comment by Farshid Ghods (Inactive) [ 07/Dec/11 ]
Reproducible by running this test ten times:


python testrunner -i resources/jenkins/single-node-centos-64.ini -t viewtests.ViewTests.test_count_reduce_100_docs -p skip_cleanup=True

Replace resources/jenkins/single-node-centos-64.ini with an ini file that has the cluster_run server information.
Comment by Farshid Ghods (Inactive) [ 07/Dec/11 ]
From Aliaksey:

"Bug? Where? Our _initscript_ sets ulimit just before changing user to couchbase and starting the daemon. And the script that starts the daemon checks the limit and warns if it's too small. So we have evidence that raising the limit works."

So it seems we are running out of the 10240 file descriptors after creating 10 views.

Damien was also talking about an fd leak a few days ago.
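
To tell a leak apart from a limit that is simply too low, it can help to sample the number of open descriptors between test runs and see whether it keeps growing. The Python sketch below is one way to take such a sample, assuming a Linux `/proc` filesystem when another process is inspected; the helper `open_fd_count` is hypothetical and is not part of ns_server or testrunner.

```python
import os
import resource

def open_fd_count(pid=None):
    """Count open file descriptors for `pid` (default: this process)."""
    proc_dir = f"/proc/{pid if pid is not None else 'self'}/fd"
    if os.path.isdir(proc_dir):
        # Linux: one entry per open descriptor. For the current process the
        # listing itself briefly holds one extra descriptor, so compare
        # counts taken the same way rather than reading them as absolutes.
        return len(os.listdir(proc_dir))
    if pid is not None:
        raise NotImplementedError("no /proc entry available for this pid")
    # Portable fallback for the current process: probe each candidate fd.
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    upper = 65536 if soft == resource.RLIM_INFINITY else min(soft, 65536)
    count = 0
    for fd in range(upper):
        try:
            os.fstat(fd)
        except OSError:
            continue
        count += 1
    return count

if __name__ == "__main__":
    print("open fds:", open_fd_count())
```

Calling `open_fd_count(pid)` with the server's process id before and after each test run would show whether the count returns to its baseline or climbs toward the limit.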
Comment by Aliaksey Artamonau [ 21/Dec/11 ]
Fixed by http://review.couchbase.org/10386.