Memcache failing

Thu, 04/08/2010 - 11:23
jim

Hi,
I have successfully tested NSMS on my local machine and then in a QA environment. I deployed it to production today, and the memcached server seems to fail. I am not sure how to open the log file to read it; if you can give me some guidance on that, I will check what it says. The console application that comes with the server says the server is up.

When I restart the server from the services section, it starts and runs for about 10 seconds, and then both the console and the server stop responding. I am getting "Memcache Protocol, Unknown opcode (33) Request", but I think that is just the sites trying to determine whether the cache is working.

Config:
NSMS is on W2k8 with 32 GB RAM,
running 2 web servers with roughly 85 sites, all configured to use the same bucket,
using the Enyim ASP.NET client with
minPoolSize="10"
maxPoolSize="100"
connectionTimeout="00:00:10"

Thanks for all the help you guys have given so far.
Jim

Thu, 04/08/2010 - 11:48
rod

Hi Jim,

Can you telnet to the servers on port 11211 and successfully connect to the cache? What does the web console log say, if anything? It sounds like the console has stopped responding.

Thanks.
Rod
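
For anyone who would rather script the telnet check than run it by hand, here is a minimal Python sketch. It assumes the cache answers the standard memcached text-protocol `version` command on the port mentioned above; the function name `memcached_version` and the host you pass in are placeholders, not part of NSMS.

```python
import socket
from typing import Optional


def memcached_version(host: str, port: int = 11211, timeout: float = 3.0) -> Optional[str]:
    """Send the text-protocol `version` command and return the reply line.

    Returns None if the connection or the read fails within `timeout` seconds,
    or if the server closes the connection without answering.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"version\r\n")
            reply = b""
            # Read until the terminating CRLF or until the peer closes.
            while not reply.endswith(b"\r\n"):
                chunk = sock.recv(1024)
                if not chunk:
                    break
                reply += chunk
    except OSError:
        # Covers refused connections, timeouts, and resets.
        return None
    text = reply.decode("ascii", errors="replace").strip()
    return text or None
```

Calling `memcached_version("your-cache-host")` returns a line beginning with `VERSION` when the server is reachable and responsive, and `None` when it is not, which is roughly what the manual telnet test tells you.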

Thu, 04/08/2010 - 11:48
jim

I have just found some documentation that says the default number of connections for a memcached server is 1024. If this is correct, then having up to 8000 connections of load would definitely cause problems. Would this cause the NSMS to stop responding? Is there a way to increase the number of connections for the NSMS?
Thanks,
Jim
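
The "up to 8000" estimate can be checked with a quick back-of-the-envelope calculation. The Python sketch below assumes that each of the roughly 85 sites keeps its own connection pool against a single cache node, using the minPoolSize and maxPoolSize values from the first post; the function name and the one-pool-per-site model are assumptions for illustration, not something the thread confirms.

```python
def pool_connection_bounds(sites: int, min_pool: int, max_pool: int, cache_nodes: int = 1):
    """Return (idle_floor, worst_case) connection counts across all sites.

    Assumes every site holds one pool per cache node (an illustrative model).
    """
    idle_floor = sites * min_pool * cache_nodes
    worst_case = sites * max_pool * cache_nodes
    return idle_floor, worst_case


floor, ceiling = pool_connection_bounds(sites=85, min_pool=10, max_pool=100)
print(floor, ceiling)   # 850 8500
print(ceiling > 1024)   # True: the worst case is far above the default limit
```

Under these assumptions, even the idle floor of 850 connections sits close to the 1024 default, and any burst of traffic that grows the pools would push past it.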

Thu, 04/08/2010 - 11:55
jim

I am able to telnet into the cache, and the web console log shows the last time that I restarted the service.

Thu, 04/08/2010 - 12:06
rod

What other differences are there between your QA and production environments? If you private message me your phone number, we can call you directly and help you resolve this more quickly.

Thu, 04/08/2010 - 12:59
jim

The NSMS instance is the same for both environments; they just use different buckets. The code is currently the same between QA and production. All servers are running W2k8; the QA server has less RAM. Everything else is the same. All servers are hosted on the same network switch.

Thu, 04/08/2010 - 13:52
sean

Hi, Jim. You're correct, the default limit is 1024. You can increase this in the config file.

In Program Files\Northscale\Memcached Server\priv\config, look for the following section:

{port_servers,
 [{'_ver', {0, 0, 0}},
  {memcached, "./priv/memcached",
   ["-p", "11212",
    "-E", "./priv/engines/bucket_engine.so",
    "-e", "admin=_admin;engine=./priv/engines/default_engine.so;default_bucket_name=default;auto_create=false"
   ],
   [{env, [{"MEMCACHED_CHECK_STDIN", "event"},
           {"MEMCACHED_TOP_KEYS", "100"},
           {"ISASL_PWFILE", "./priv/isasl.pw"}, % Also isasl path above.
           {"ISASL_DB_CHECK_TIME", "1"}
          ]},
    use_stdio,
    stderr_to_stdout,
    stream]
  }
 ]
}.

Change this to:

{port_servers,
 [{'_ver', {0, 0, 0}},
  {memcached, "./priv/memcached",
   ["-p", "11212",
    "-c", "10000",
    "-E", "./priv/engines/bucket_engine.so",
    "-e", "admin=_admin;engine=./priv/engines/default_engine.so;default_bucket_name=default;auto_create=false"
   ],
   [{env, [{"MEMCACHED_CHECK_STDIN", "event"},
           {"MEMCACHED_TOP_KEYS", "100"},
           {"ISASL_PWFILE", "./priv/isasl.pw"}, % Also isasl path above.
           {"ISASL_DB_CHECK_TIME", "1"},
           {"EVENT_NOSELECT", "1"}
          ]},
    use_stdio,
    stderr_to_stdout,
    stream]
  }
 ]
}.

The relevant changes are the addition of -c 10000 to the command line and the EVENT_NOSELECT environment variable.

You'll also need to remove the Program Files\Northscale\Memcached Server\config\ns_1 directory, as this has a cached version of the config. You'll need to re-do any other configuration changes you've made and re-create your cluster if you've joined a cluster.

Thu, 04/08/2010 - 15:36
jim

If the number of incoming connections exceeded that limit of 1024, would that crash the server? Although I do feel that the limit needs to be changed, I am not sure a crash makes sense to me. Shouldn't memcached just ignore those connections and move on?

Fri, 04/09/2010 - 08:50
jim

Changing the number of connections has done the trick. I am still a little confused about why exceeding the connection limit, even by as much as I was, would cause the server to crash.

I did notice that the hit ratio view on the bucket seems to be a bit off, or I am just thinking about it wrong. The ratio is sitting at 0 - 0.001, while my gets/sec average 35-39 and my hits/sec are sitting at 32-36. It seems my hit ratio should be between 80% and 90%.

Also, it would be nice to have a bucket column on the cluster analytics page.

Otherwise I am very impressed with your quick and incredibly helpful support via the forum. I can't wait to see what additional features come out in your next release!

Thanks again,
Jim
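
The expected hit ratio can be worked out directly from the rates quoted above. This Python sketch assumes the conventional definition of hit ratio as hits divided by gets; the function name `hit_ratio` is illustrative, and how the console actually computes its figure is not stated in the thread.

```python
def hit_ratio(hits_per_sec: float, gets_per_sec: float) -> float:
    """Fraction of get requests served from the cache (0.0 if there are no gets)."""
    if gets_per_sec <= 0:
        return 0.0
    return hits_per_sec / gets_per_sec


# Rates quoted in the post: gets/sec 35-39, hits/sec 32-36.
print(f"{hit_ratio(32, 35):.1%}")  # 91.4%  (low hits with low gets)
print(f"{hit_ratio(36, 39):.1%}")  # 92.3%  (high hits with high gets)
print(f"{hit_ratio(32, 39):.1%}")  # 82.1%  (most pessimistic pairing)
```

Whichever way the samples are paired, the result is nowhere near the 0 - 0.001 shown in the console, which supports the suspicion that the displayed figure is off.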

Fri, 04/09/2010 - 09:12
sean

There are several things the cluster manager needs to connect to memcached for, and we haven't really tested the situation where it can't connect for a long period of time. When Erlang encounters a situation it doesn't expect, the normal thing to do is crash the specific Erlang process (there are hundreds of these) that hit the unexpected condition, with the expectation that its supervisor will simply restart it. If a process crashes repeatedly, the failure propagates up the supervisor chain. This eventually results in memcached being restarted, which is usually the right thing to do, since an inability to connect for a long period of time would usually indicate that something was wrong with memcached.

We could make it so that an inability to connect to memcached wouldn't cause the cluster manager to restart memcached (or crash), but hopefully, once we've increased the default connection limit beyond what anyone could reasonably use, this situation won't occur.
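
The escalation described above can be illustrated with a toy model. The Python sketch below is not the cluster manager's actual supervision tree: the class, the supervisor names, and the restart limits are all made up for illustration. It only mimics the general Erlang/OTP idea that a supervisor tolerates a limited number of child restarts within a time window and, once that limit is exceeded, fails itself so that its own parent has to deal with the problem.

```python
from collections import deque


class Supervisor:
    """Toy restart-intensity model: more than `max_restarts` child crashes
    within `period` seconds makes this supervisor fail and escalate."""

    def __init__(self, name, max_restarts, period, parent=None):
        self.name = name
        self.max_restarts = max_restarts
        self.period = period
        self.parent = parent
        self.restart_times = deque()

    def child_crashed(self, now):
        """Handle a child crash at time `now`; return a description of who absorbed it."""
        # Forget restarts that fall outside the sliding window.
        while self.restart_times and now - self.restart_times[0] > self.period:
            self.restart_times.popleft()
        self.restart_times.append(now)
        if len(self.restart_times) > self.max_restarts:
            # Too many restarts: this supervisor gives up and its parent takes over.
            self.restart_times.clear()
            if self.parent is None:
                return f"{self.name} gave up"
            return self.parent.child_crashed(now)
        return f"{self.name} restarted child"


# Hypothetical two-level tree with arbitrary limits.
root = Supervisor("root", max_restarts=1, period=60)
worker_sup = Supervisor("worker_sup", max_restarts=3, period=10, parent=root)

for t in range(8):
    print(t, worker_sup.child_crashed(t))
```

In this run the first three crashes are absorbed by `worker_sup`, the fourth escalates to `root`, and a second escalation at t=7 exhausts `root` as well. A worker that keeps failing to connect would climb the chain in the same way until something higher up, such as the memcached port itself, gets restarted.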

Fri, 04/09/2010 - 09:54
rod

Hi Jim,

Very glad to hear everything is working as expected now. We will be including this fix in our next release (1.0.2). We are also looking into the hit ratio inconsistency you are seeing. Please send me your contact information in a direct message (or email me directly at rod [at] northscale.com) so we can follow up sometime early next week. Thanks.

Fri, 09/17/2010 - 17:19
Ravi

Hello Sean, Thanks for the detailed explanation.

We have a clustered server setup with memcached server version 1.0.3 (Win 64) and one bucket configured. I tried to increase the allowed connections on the server as per your instructions above: I modified the config, removed the ns_1 folder, restarted the server, joined the cluster, and recreated the bucket.

After that, when I increase the maxPoolSize in the client to anything above 1000, I get an error. The client is .NET C# with Enyim client version 3.5.

I am guessing there needs to be additional config for the bucket. Attaching the diag herewith. Please clarify.

Thanks
Ravi

Sun, 09/19/2010 - 08:04
Perry Krug

Thanks, Ravi. What is the error that you are receiving from the client? Also, just to confirm, you are using Enyim 2.5, correct?

Sun, 09/19/2010 - 21:43
Ravi

Hello Perry,
Yes, it is Enyim client version 2.5.
Error details:

Message "The type initializer for 'NorthScale.Store.NorthScaleClient' threw an exception."

StackTrace
   at System.Configuration.BaseConfigurationRecord.EvaluateOne(String[] keys, SectionInput input, Boolean isTrusted, FactoryRecord factoryRecord, SectionRecord sectionRecord, Object parentResult)
   at System.Configuration.BaseConfigurationRecord.Evaluate(FactoryRecord factoryRecord, SectionRecord sectionRecord, Object parentResult, Boolean getLkg, Boolean getRuntimeObject, Object& result, Object& resultRuntimeObject)
   at System.Configuration.BaseConfigurationRecord.GetSectionRecursive(String configKey, Boolean getLkg, Boolean checkPermission, Boolean getRuntimeObject, Boolean requestIsHere, Object& result, Object& resultRuntimeObject)
   at System.Configuration.BaseConfigurationRecord.GetSectionRecursive(String configKey, Boolean getLkg, Boolean checkPermission, Boolean getRuntimeObject, Boolean requestIsHere, Object& result, Object& resultRuntimeObject)
   at System.Configuration.BaseConfigurationRecord.GetSectionRecursive(String configKey, Boolean getLkg, Boolean checkPermission, Boolean getRuntimeObject, Boolean requestIsHere, Object& result, Object& resultRuntimeObject)
   at System.Configuration.BaseConfigurationRecord.GetSection(String configKey, Boolean getLkg, Boolean checkPermission)
   at System.Configuration.BaseConfigurationRecord.GetSection(String configKey)
   at System.Configuration.ClientConfigurationSystem.System.Configuration.Internal.IInternalConfigSystem.GetSection(String sectionName)
   at System.Configuration.ConfigurationManager.GetSection(String sectionName)
   at NorthScale.Store.NorthScaleClient..cctor() in d:\EnyimMemcached\Northscale.Store\NorthScaleClient.cs:line 14
