Details
Description
In our production system, threads are occasionally aborted to prevent long-running tasks from consuming resources once the user is no longer interested in the results. We are seeing these kinds of errors in the log:
ERROR Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Could not init pool.
and
ERROR Enyim.Caching.Memcached.MemcachedNode.InternalPoolImpl - Failed to reset an acquired socket.
Also, we are seeing hangs in MessageStreamListener.cs, line 400:
while ((line = reader.ReadLine()) != null)
This call occasionally hangs indefinitely, waiting for data to arrive on the stream when no more data is coming.
I have created a reproduction application that creates a bunch of threads in an "abortable" thread pool (which just allows you to abort the threads, instead of being a black box). Starting the application will print a "." for every thread started, a "!" for every thread stopped and an "x" for every thread aborted. The thread itself just does a simple set/get. It typically hangs after about 15-20 seconds, on this ReadLine statement.
You can find the repro attached to this ticket.
This happens with Couchbase Server 1.8.1 and the latest client code from GitHub.
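The symptom described above, a ReadLine call that waits for data that never arrives, can be reproduced in isolation. The following is a minimal sketch, not the client's code: the class and method names are hypothetical, and it uses a local TcpListener that accepts a connection and then stays silent. It only illustrates that setting NetworkStream.ReadTimeout turns an indefinite block into an IOException; whether the stream read in MessageStreamListener.cs can be bounded this way is an assumption that would need checking against the client source.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;

// Hypothetical demo class; not part of Enyim.Caching or the Couchbase client.
public static class ReadTimeoutSketch
{
    // Connects to a local listener that accepts the connection but never sends
    // anything, then tries to read one line. Returns "timeout" when the read
    // gives up after readTimeoutMs instead of blocking forever.
    public static string ReadLineFromSilentServer(int readTimeoutMs)
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: the OS picks a free port
        listener.Start();
        try
        {
            int port = ((IPEndPoint)listener.LocalEndpoint).Port;
            using (var client = new TcpClient())
            {
                client.Connect(IPAddress.Loopback, port);
                using (var serverSide = listener.AcceptTcpClient()) // held open, never written to
                using (var stream = client.GetStream())
                using (var reader = new StreamReader(stream))
                {
                    // Without this line, ReadLine() below would block indefinitely.
                    stream.ReadTimeout = readTimeoutMs;
                    try
                    {
                        return reader.ReadLine();
                    }
                    catch (IOException)
                    {
                        // NetworkStream reports an expired read timeout as an IOException.
                        return "timeout";
                    }
                }
            }
        }
        finally
        {
            listener.Stop();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReadLineFromSilentServer(500));
    }
}
```

In this sketch the read gives up after about half a second and the method returns "timeout". Removing the ReadTimeout assignment makes the same call block for as long as the silent peer keeps the connection open, which matches the hang described in this ticket.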
- HangRepro-NCBC-111.zip
- 21/May/13 3:00 AM
- 1.97 MB
- Saakshi Manocha
- HangRepro-NCBC-111/.../NuGet.Config 0.2 kB
- HangRepro-NCBC-111/HangRepro/.../NuGet.exe 616 kB
- HangRepro-NCBC-111/.../NuGet.targets 7 kB
- HangRepro-NCBC-111/.../AbortableThreadPool.cs 3 kB
- HangRepro-NCBC-111/HangRepro/App.config 2 kB
- HangRepro-NCBC-111/.../Couchbase.dll 174 kB
- HangRepro-NCBC-111/.../Couchbase.pdb 436 kB
- HangRepro-NCBC-111/.../Enyim.Caching.dll 164 kB
- HangRepro-NCBC-111/.../Enyim.Caching.Log4NetAdapter.dll 8 kB
- HangRepro-NCBC-111/.../Enyim.Caching.Log4NetAdapter.pdb 28 kB
- HangRepro-NCBC-111/.../Enyim.Caching.pdb 662 kB
- HangRepro-NCBC-111/HangRepro/.../log.txt 7.20 MB
- HangRepro-NCBC-111/HangRepro/.../log4net.dll 264 kB
- HangRepro-NCBC-111/HangRepro/.../log4net.xml 1.30 MB
- HangRepro-NCBC-111/.../Newtonsoft.Json.dll 383 kB
- HangRepro-NCBC-111/.../Newtonsoft.Json.xml 430 kB
- HangRepro-NCBC-111/HangRepro/.../Test.exe 9 kB
- HangRepro-NCBC-111/.../Test.exe.config 2 kB
- HangRepro-NCBC-111/HangRepro/.../Test.pdb 22 kB
- HangRepro-NCBC-111/.../Test.vshost.exe 11 kB
- HangRepro-NCBC-111/.../Test.vshost.exe.config 2 kB
- HangRepro-NCBC-111/.../Test.vshost.exe.manifest 0.5 kB
- HangRepro-NCBC-111/.../DesignTimeResolveAssemblyReferencesInput.cache 6 kB
- HangRepro-NCBC-111/.../ResolveAssemblyReference.cache 26 kB
- HangRepro-NCBC-111/.../Test.csproj.FileListAbsolute.txt 1 kB
- HangRepro-NCBC-111/HangRepro/.../Test.exe 9 kB
- HangRepro-NCBC-111/HangRepro/.../Test.pdb 22 kB
- HangRepro-NCBC-111/.../packages.config 0.1 kB
- HangRepro-NCBC-111/.../CouchbaseNetClient.1.1.6.nupkg 332 kB
- HangRepro-NCBC-111/.../Couchbase.dll 79 kB
- HangRepro.zip
- 04/Sep/12 10:26 AM
- 227 kB
- roy.jacobs
- HangRepro/.nuget/NuGet.Config 0.2 kB
- HangRepro/.nuget/NuGet.exe 616 kB
- HangRepro/.nuget/NuGet.targets 7 kB
- HangRepro/AbortableThreadPool.cs 3 kB
- HangRepro/App.config 0.9 kB
- HangRepro/packages.config 0.1 kB
- HangRepro/Program.cs 2 kB
- HangRepro/Properties/AssemblyInfo.cs 1 kB
- HangRepro/Test.csproj 5 kB
- HangRepro/Test.csproj.user 0.5 kB
- HangRepro/Test.sln 1 kB
- HangRepro/Test.suo 85 kB
Activity
Matt Ingenthron
added a comment -
Can you see if this is reproducible with current server/client?
Saakshi Manocha
added a comment -
Cluster version - 2.0.1-170
Client - latest from Github
Please find the attached program HangRepro-NCBC-111.zip that I used to reproduce the scenario.
The output is "output-with-using-statement.txt" (attached herewith)
The program hangs after 15-20 seconds.
If we change this portion of the code to remove the using statement and avoid recreating the client every time a thread is created, it looks like this:
    while (true)
    {
        Console.Write(".");
        var client = new CouchbaseClient(section);

        // Start a thread that just does a simple add/get and then stops
        var wi = AbortableThreadPool.QueueUserWorkItem(_ =>
        {
            client.Store(StoreMode.Add, "somekey", "somevalue");
            var someValue = client.Get<string>("somekey");
            if (someValue != "somevalue") throw new InvalidOperationException();
            Console.Write("!");
        });

        // Maybe kill the thread
        if (rnd.NextDouble() < 0.75) AbortableThreadPool.Cancel(wi, true);

        // Wait a bit
        Thread.Sleep((int)Math.Floor(rnd.NextDouble() * 10.0));

        // Maybe kill the thread
        if (rnd.NextDouble() < 0.75) AbortableThreadPool.Cancel(wi, true);
    }
With the above change, the program did not hang when I ran it again; I kept it running for 2-3 minutes and it worked fine.
The output is "output-without-using-statement.txt" (attached herewith)
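One possible reading of why the using statement matters is object lifetime: a using block disposes the client when the enclosing scope ends, which can happen while a queued work item still holds a reference to it. This is an assumption, not something confirmed in this ticket. The sketch below illustrates the general effect with a hypothetical StubClient class (it is not the Couchbase API) and the standard ThreadPool instead of the abortable pool from the repro.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for a client object; it is NOT the Couchbase API.
public sealed class StubClient : IDisposable
{
    private volatile bool _disposed;

    public string Get(string key)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(StubClient));
        return "somevalue";
    }

    public void Dispose() { _disposed = true; }
}

public static class ClientLifetimeSketch
{
    // Queues one work item that uses the client and returns what the work item saw.
    // The gate holds the work item back until the caller has decided whether to
    // dispose the client, which makes the ordering deterministic for this demo.
    public static string Run(bool disposeBeforeWorkRuns)
    {
        string outcome = null;
        using (var gate = new ManualResetEventSlim(false))
        using (var done = new ManualResetEventSlim(false))
        {
            var client = new StubClient();
            ThreadPool.QueueUserWorkItem(_ =>
            {
                gate.Wait();
                try { outcome = client.Get("somekey"); }
                catch (ObjectDisposedException) { outcome = "disposed"; }
                finally { done.Set(); }
            });

            // Mimics what leaving a using block does while the work item is still pending.
            if (disposeBeforeWorkRuns) client.Dispose();

            gate.Set();
            done.Wait();
            return outcome;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run(true));   // client disposed before the work item touches it
        Console.WriteLine(Run(false));  // client left alive for the work item
    }
}
```

Here the stub fails fast with ObjectDisposedException; a real client that is torn down while another thread is mid-operation may behave differently, so the sketch shows only the ordering problem, not the hang itself.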
Saakshi Manocha
added a comment -
@Roy: Could you please try again without the "using" statement and let me know if it helps?