[NCBC-111] Hangs after threads are aborted Created: 04/Sep/12  Updated: 26/Jun/13  Resolved: 26/Jun/13

Status: Resolved
Project: Couchbase .NET client library
Component/s: library
Affects Version/s: 1.1.6
Fix Version/s: 1.2.7

Type: Bug Priority: Major
Reporter: roy.jacobs Assignee: Saakshi Manocha
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Zip Archive HangRepro-NCBC-111.zip     Zip Archive HangRepro.zip    

 Description   
In our production system, threads occasionally get aborted to prevent long-running tasks from using resources when the user is no longer interested in the results. We are seeing these kinds of errors in the log:

ERROR Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Could not init pool.
and
ERROR Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Failed to reset an acquired socket.

Also, we are seeing hangs in MessageStreamListener.cs, line 400:
while ((line = reader.ReadLine()) != null)

This will occasionally hang indefinitely waiting for data to come in to the stream when there is no more data coming in.
I have created a reproduction application that creates a bunch of threads in an "abortable" thread pool (which just allows you to abort the threads, instead of being a black box). Starting the application will print a "." for every thread started, a "!" for every thread stopped and an "x" for every thread aborted. The thread itself just does a simple set/get. It typically hangs after about 15-20 seconds, on this ReadLine statement.

You can find the repro attached to this ticket.

This happens with Couchbase server 1.8.1 and the latest client code from Github.
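
To illustrate the kind of hang described above, here is a minimal, self-contained sketch. It is not code from the client library or from the attached repro. It assumes a plain TCP connection whose peer stays connected but never sends anything; the class name ReadLineTimeoutSketch and the 500 ms timeout are made up for the example. It shows that a blocking ReadLine on such a stream waits indefinitely by default, and that setting a read timeout on the underlying stream turns the wait into an IOException that the caller can handle.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

class ReadLineTimeoutSketch
{
    static void Main()
    {
        // Hypothetical stand-in for a server that keeps the connection open
        // but never sends any data.
        var listener = new TcpListener(IPAddress.Loopback, 0);
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using (var client = new TcpClient())
        {
            client.Connect(IPAddress.Loopback, port);

            using (var silentPeer = listener.AcceptTcpClient())
            using (var stream = client.GetStream())
            using (var reader = new StreamReader(stream))
            {
                // The default read timeout is infinite, so without this line
                // the ReadLine() call below would block forever.
                stream.ReadTimeout = 500; // milliseconds (arbitrary value)

                try
                {
                    string line;
                    while ((line = reader.ReadLine()) != null)
                    {
                        Console.WriteLine(line);
                    }
                }
                catch (IOException)
                {
                    // Raised when no data arrives within ReadTimeout.
                    Console.WriteLine("read timed out");
                }
            }
        }

        listener.Stop();
    }
}
```

Whether a read timeout is an appropriate fix for MessageStreamListener is a separate question; the sketch only demonstrates the blocking behaviour of ReadLine when no more data arrives.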

 Comments   
Comment by Matt Ingenthron [ 17/May/13 ]
Can you see if this is reproducible with current server/client?
Comment by Saakshi Manocha [ 21/May/13 ]
Cluster version - 2.0.1-170
Client - latest from Github

Please find the attached program HangRepro-NCBC-111.zip that I used to reproduce the scenario.
The output is "output-with-using-statement.txt" (attached herewith)
The program hangs after 15-20 seconds.

Now, if we change this portion of the code to remove the using statement and avoid recreating the client every time a thread is created, it looks like this:

while (true)
{
    Console.Write(".");
    var client = new CouchbaseClient(section);
    // Start a thread that just does a simple add/get and then stops
    var wi = AbortableThreadPool.QueueUserWorkItem(_ =>
    {
        client.Store(StoreMode.Add, "somekey", "somevalue");
        var someValue = client.Get<string>("somekey");
        if (someValue != "somevalue") throw new InvalidOperationException();
        Console.Write("!");
    });

    // Maybe kill the thread
    if (rnd.NextDouble() < 0.75) AbortableThreadPool.Cancel(wi, true);

    // Wait a bit
    Thread.Sleep((int)Math.Floor(rnd.NextDouble() * 10.0));

    // Maybe kill the thread
    if (rnd.NextDouble() < 0.75) AbortableThreadPool.Cancel(wi, true);
}

With the above change in the code, the program no longer hangs when I run it again. I kept it running for 2-3 minutes and it worked fine.
The output is "output-without-using-statement.txt" (attached herewith)
Comment by Saakshi Manocha [ 21/May/13 ]
@Roy: Could you please try again without the "using" statement, and let me know if it helps!
Comment by Saakshi Manocha [ 07/Jun/13 ]
@Roy: Could you please confirm whether the issue is resolved with the above-mentioned solution? Thanks!
Comment by Saakshi Manocha [ 26/Jun/13 ]
Closing this issue for now, as there has been no response from the customer.
The customer may reopen it or create a new issue if the problem recurs.
Generated at Sat Apr 19 19:17:39 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.