Details
- Type: Bug
- Status: Resolved
- Priority: Major
- Resolution: Cannot Reproduce
- Affects Version/s: 2.0-beta-2
- Fix Version/s: 2.0.1
- Component/s: couchbase-bucket
- Security Level: Public
- Labels: None
Description
Need some diagnosis of a beta user's issue with .NET clients seeing lost connections.
cbcollect-info output is coming soon.
- chunk1_to_chunk2.diff.txt, 30/Nov/12 12:46 AM, 2 kB, Matt Ingenthron
- chunk2_to_chunk3.diff.txt, 30/Nov/12 12:46 AM, 2 kB, Matt Ingenthron
- TCP Conversation from Streaming HTTP.txt, 30/Nov/12 12:46 AM, 24 kB, Matt Ingenthron
Activity
Matt Ingenthron
added a comment -
Note: I think we're looking for two things here. #1: any evidence of connections being dropped by the server, especially unexpectedly. #2: any evidence that the cluster is sending reconfiguration commands, again unexpectedly.
Chiyoung Seo
added a comment -
From the memcached log, I found the following error messages:
Mon Nov 19 09:08:55.664956 CET 3: 144 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.664980 CET 3: 141 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665027 CET 3: 146 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665035 CET 3: 140 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665049 CET 3: 142 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665065 CET 3: 150 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665077 CET 3: 148 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665087 CET 3: 149 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665103 CET 3: 145 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665104 CET 3: 143 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665153 CET 3: 147 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665171 CET 3: 151 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.665914 CET 3: 152 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.666347 CET 3: 153 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.667156 CET 3: 154 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.667506 CET 3: 155 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.668280 CET 3: 156 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.669499 CET 3: 157 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.673889 CET 3: 162 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.673909 CET 3: 160 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.673930 CET 3: 158 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.673961 CET 3: 161 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.673957 CET 3: 159 Closing connection due to read error: Connection reset by peer
Mon Nov 19 09:08:55.674022 CET 3: 163 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.364189 CET 3: 155 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.364795 CET 3: 115 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.365830 CET 3: 159 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.367275 CET 3: 117 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.368510 CET 3: 124 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.368988 CET 3: 102 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.369705 CET 3: 160 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.373880 CET 3: 114 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.374817 CET 3: 152 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.375379 CET 3: 119 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.378715 CET 3: 163 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.379256 CET 3: 154 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.380043 CET 3: 118 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.467618 CET 3: 122 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.468735 CET 3: 161 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.469922 CET 3: 151 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.470651 CET 3: 156 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.472597 CET 3: 157 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.473077 CET 3: 121 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.474281 CET 3: 120 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.474747 CET 3: 153 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.475528 CET 3: 113 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.476080 CET 3: 125 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.476732 CET 3: 158 Closing connection due to read error: Connection reset by peer
Mon Nov 19 10:54:38.479657 CET 3: 162 Closing connection due to read error: Connection reset by peer
Mon Nov 19 11:31:44.688615 CET 3: 114 Closing connection due to read error: Connection reset by peer
Mon Nov 19 11:31:44.691949 CET 3: 163 Closing connection due to read error: Connection reset by peer
Mon Nov 19 11:31:44.692678 CET 3: 127 Closing connection due to read error: Connection reset by peer
Mon Nov 19 11:31:44.700778 CET 3: 117 Closing connection due to read error: Connection reset by peer
Mon Nov 19 11:31:44.703451 CET 3: 120 Closing connection due to read error: Connection reset by peer
Mon Nov 19 11:31:44.704603 CET 3: 157 Closing connection due to read error: Connection reset by peer
...
It seems to me that many connections were suddenly reset by the client side, after which memcached closed them.
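To make the bursts in the excerpt above easier to see, the reset messages can be counted per second. This is only a minimal sketch: the regular expression is inferred from the lines quoted in this comment, and the function name `resets_per_second` is made up for illustration. It is not part of any Couchbase tooling.

```python
import re
from collections import Counter

# Pattern inferred from the quoted lines, e.g.
# "Mon Nov 19 09:08:55.664956 CET 3: 144 Closing connection due to read error: Connection reset by peer"
# The "conn" group is simply the number printed before "Closing".
RESET_LINE = re.compile(
    r"^(?P<second>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2})\.\d+ \S+ \d+: "
    r"(?P<conn>\d+) Closing connection due to read error: Connection reset by peer"
)


def resets_per_second(lines):
    """Count 'Connection reset by peer' messages per wall-clock second."""
    counts = Counter()
    for line in lines:
        match = RESET_LINE.match(line.strip())
        if match:
            counts[match.group("second")] += 1
    return counts


sample = [
    "Mon Nov 19 09:08:55.664956 CET 3: 144 Closing connection due to read error: Connection reset by peer",
    "Mon Nov 19 09:08:55.664980 CET 3: 141 Closing connection due to read error: Connection reset by peer",
    "Mon Nov 19 10:54:38.364189 CET 3: 155 Closing connection due to read error: Connection reset by peer",
]

for second, count in resets_per_second(sample).items():
    print(second, count)
```

Run against the full memcached log, a handful of seconds with many resets each would be consistent with whole groups of client connections going away at once.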
Chiyoung Seo
added a comment -
Please see my comments. You may want to talk to Trond about this, but I'm sure that this is not a server-side issue.
Matt Ingenthron
added a comment -
Thanks for the analysis, Chiyoung. Agreed.
Matt Ingenthron
added a comment -
Steve: can you have someone look at what is happening at a configuration level?
Matt Ingenthron
added a comment -
Yes. I meant the JSON doc we serve up from bucketsStreaming. What we're seeing on the client side (still trying to get more info) is regular configuration updates that are causing the client to dispose of some connections. We've not seen this with any other deployments, so it could be environmental. It's not likely something happening directly on the client, but we'll be looking for ways that could possibly happen too.
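One way to check whether successive configuration documents really differ is to compare each JSON chunk with the one before it. The sketch below assumes the chunks have already been captured as separate JSON strings. The helper names and the keys in the sample (`name`, `nodes`) are placeholders for illustration, not the actual layout of the streamed document.

```python
import json

_MISSING = object()  # sentinel so an absent key differs from an explicit null


def changed_keys(old, new):
    """Return the sorted top-level keys whose values differ between two dicts."""
    keys = old.keys() | new.keys()
    return sorted(k for k in keys if old.get(k, _MISSING) != new.get(k, _MISSING))


def report_changes(chunks):
    """Yield (index, changed_keys) for each chunk that differs from its predecessor."""
    previous = None
    for index, raw in enumerate(chunks):
        current = json.loads(raw)
        if previous is not None:
            diff = changed_keys(previous, current)
            if diff:
                yield index, diff
        previous = current


# Placeholder documents; the real chunks would come from the captured stream.
chunks = [
    '{"name": "default", "nodes": ["a", "b"]}',
    '{"name": "default", "nodes": ["a", "b"]}',
    '{"name": "default", "nodes": ["a"]}',
]

for index, keys in report_changes(chunks):
    print(f"chunk {index} changed: {keys}")
```

If chunks arrive regularly but nothing is reported as changed, the updates are repeats of the same configuration, which would point the investigation at why they are being resent rather than at what changed.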
Xiaoqin Ma
added a comment -
I played with the log file a little. Here are the earliest errors in the system, sorted by timestamp:
2012-11-18T18:03:31.904:[couchdb:error,2012-11-18T18:03:31.904,ns_1@192.168.0.35:<0.22511.3>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, writer error
2012-11-18T18:03:31.919:[couchdb:error,2012-11-18T18:03:31.919,ns_1@192.168.0.35:<0.4347.0>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, received error from updater: {badmatch,
2012-11-18T18:03:31.983:[couchdb:error,2012-11-18T18:03:31.983,ns_1@192.168.0.35:<0.22516.3>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, writer error
2012-11-18T18:03:32.003:[couchdb:error,2012-11-18T18:03:32.003,ns_1@192.168.0.35:<0.4356.0>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, received error from updater: {badmatch,
2012-11-18T18:03:33.166:[couchdb:error,2012-11-18T18:03:33.166,ns_1@192.168.0.35:<0.22550.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:03:33.168:[couchdb:error,2012-11-18T18:03:33.168,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:03:33.965:[couchdb:error,2012-11-18T18:03:33.965,ns_1@192.168.0.35:<0.22689.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:03:33.966:[couchdb:error,2012-11-18T18:03:33.966,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:03:48.288:[couchdb:error,2012-11-18T18:03:48.288,ns_1@192.168.0.35:<0.23545.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:03:48.289:[couchdb:error,2012-11-18T18:03:48.289,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:03:48.712:[couchdb:error,2012-11-18T18:03:48.712,ns_1@192.168.0.35:<0.23563.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:03:48.713:[couchdb:error,2012-11-18T18:03:48.713,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:06:01.352:[couchdb:error,2012-11-18T18:06:01.352,ns_1@192.168.0.35:<0.26858.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:06:01.382:[couchdb:error,2012-11-18T18:06:01.382,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:06:01.820:[couchdb:error,2012-11-18T18:06:01.820,ns_1@192.168.0.35:<0.26871.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:06:01.821:[couchdb:error,2012-11-18T18:06:01.821,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:06:06.254:[couchdb:error,2012-11-18T18:06:06.254,ns_1@192.168.0.35:<0.27203.3>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, writer error
2012-11-18T18:06:06.266:[couchdb:error,2012-11-18T18:06:06.266,ns_1@192.168.0.35:<0.4356.0>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, received error from updater: {badmatch,
2012-11-18T18:06:21.392:[couchdb:error,2012-11-18T18:06:21.392,ns_1@192.168.0.35:<0.28655.3>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, writer error
2012-11-18T18:06:21.396:[couchdb:error,2012-11-18T18:06:21.396,ns_1@192.168.0.35:<0.4347.0>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, received error from updater: {badmatch,
2012-11-18T18:06:21.435:[couchdb:error,2012-11-18T18:06:21.435,ns_1@192.168.0.35:<0.28662.3>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, writer error
2012-11-18T18:06:21.438:[couchdb:error,2012-11-18T18:06:21.438,ns_1@192.168.0.35:<0.4392.0>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, received error from updater: {badmatch,
2012-11-18T18:06:26.713:[couchdb:error,2012-11-18T18:06:26.713,ns_1@192.168.0.35:<0.29007.3>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, writer error
2012-11-18T18:06:26.732:[couchdb:error,2012-11-18T18:06:26.732,ns_1@192.168.0.35:<0.4392.0>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, received error from updater: {badmatch,
2012-11-18T18:06:26.968:[couchdb:error,2012-11-18T18:06:26.968,ns_1@192.168.0.35:<0.29024.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:06:26.971:[couchdb:error,2012-11-18T18:06:26.971,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:06:27.132:[couchdb:error,2012-11-18T18:06:27.132,ns_1@192.168.0.35:<0.29036.3>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, writer error
2012-11-18T18:06:27.160:[couchdb:error,2012-11-18T18:06:27.160,ns_1@192.168.0.35:<0.4410.0>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, received error from updater: {badmatch,
2012-11-18T18:06:32.899:[couchdb:error,2012-11-18T18:06:32.899,ns_1@192.168.0.35:<0.29402.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:06:32.922:[couchdb:error,2012-11-18T18:06:32.922,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:06:33.016:[couchdb:error,2012-11-18T18:06:33.016,ns_1@192.168.0.35:<0.29411.3>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, writer error
2012-11-18T18:06:33.051:[couchdb:error,2012-11-18T18:06:33.051,ns_1@192.168.0.35:<0.4410.0>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, received error from updater: {badmatch,
2012-11-18T18:06:48.084:[couchdb:error,2012-11-18T18:06:48.084,ns_1@192.168.0.35:<0.30153.3>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, writer error
2012-11-18T18:06:48.114:[couchdb:error,2012-11-18T18:06:48.114,ns_1@192.168.0.35:<0.4347.0>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, received error from updater: {badmatch,
2012-11-18T18:06:48.391:[couchdb:error,2012-11-18T18:06:48.391,ns_1@192.168.0.35:<0.30160.3>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, writer error
2012-11-18T18:06:48.392:[couchdb:error,2012-11-18T18:06:48.392,ns_1@192.168.0.35:<0.4356.0>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, received er
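For reference, a listing like the one above can be produced by pulling the timestamp out of each bracketed header and sorting on it. This is a rough sketch, not the script actually used for this comment: the header layout `[component:level,timestamp,...]` is inferred from the quoted lines, and `sorted_errors` is a made-up name.

```python
import re
from datetime import datetime

# Header layout inferred from the quoted lines: "[couchdb:error,2012-11-18T18:03:31.904,ns_1@...]"
ENTRY = re.compile(r"\[(?P<component>\w+):(?P<level>\w+),(?P<ts>[^,]+),")


def sorted_errors(lines):
    """Return error-level entries sorted by timestamp, each prefixed with 'timestamp:'."""
    keyed = []
    for line in lines:
        match = ENTRY.search(line)
        if match and match.group("level") == "error":
            ts = datetime.strptime(match.group("ts").strip(), "%Y-%m-%dT%H:%M:%S.%f")
            keyed.append((ts, line))
    keyed.sort(key=lambda pair: pair[0])  # stable: equal timestamps keep file order
    return [f"{ts.isoformat(timespec='milliseconds')}:{line}" for ts, line in keyed]


sample = [
    "[couchdb:error,2012-11-18T18:03:31.983,ns_1@192.168.0.35:<0.22516.3>:couch_log:error:42]"
    "Set view `quizmo`, main group `_design/questions`, writer error",
    "a line without a bracketed header",
    "[couchdb:error,2012-11-18T18:03:31.904,ns_1@192.168.0.35:<0.22511.3>:couch_log:error:42]"
    "Set view `quizmo`, main group `_design/search`, writer error",
]

for entry in sorted_errors(sample):
    print(entry)
```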
2012-11-18T18:03:31.904:[couchdb:error,2012-11-18T18:03:31.904,ns_1@192.168.0.35:<0.22511.3>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, writer error
2012-11-18T18:03:31.919:[couchdb:error,2012-11-18T18:03:31.919,ns_1@192.168.0.35:<0.4347.0>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, received error from updater: {badmatch,
2012-11-18T18:03:31.983:[couchdb:error,2012-11-18T18:03:31.983,ns_1@192.168.0.35:<0.22516.3>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, writer error
2012-11-18T18:03:32.003:[couchdb:error,2012-11-18T18:03:32.003,ns_1@192.168.0.35:<0.4356.0>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, received error from updater: {badmatch,
2012-11-18T18:03:33.166:[couchdb:error,2012-11-18T18:03:33.166,ns_1@192.168.0.35:<0.22550.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:03:33.168:[couchdb:error,2012-11-18T18:03:33.168,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:03:33.965:[couchdb:error,2012-11-18T18:03:33.965,ns_1@192.168.0.35:<0.22689.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:03:33.966:[couchdb:error,2012-11-18T18:03:33.966,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:03:48.288:[couchdb:error,2012-11-18T18:03:48.288,ns_1@192.168.0.35:<0.23545.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:03:48.289:[couchdb:error,2012-11-18T18:03:48.289,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:03:48.712:[couchdb:error,2012-11-18T18:03:48.712,ns_1@192.168.0.35:<0.23563.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:03:48.713:[couchdb:error,2012-11-18T18:03:48.713,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:06:01.352:[couchdb:error,2012-11-18T18:06:01.352,ns_1@192.168.0.35:<0.26858.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:06:01.382:[couchdb:error,2012-11-18T18:06:01.382,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:06:01.820:[couchdb:error,2012-11-18T18:06:01.820,ns_1@192.168.0.35:<0.26871.3>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, writer error
2012-11-18T18:06:01.821:[couchdb:error,2012-11-18T18:06:01.821,ns_1@192.168.0.35:<0.4437.0>:couch_log:error:42]Set view `quizmo`, main group `_design/all`, received error from updater: {badmatch,
2012-11-18T18:06:06.254:[couchdb:error,2012-11-18T18:06:06.254,ns_1@192.168.0.35:<0.27203.3>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, writer error
2012-11-18T18:06:06.266:[couchdb:error,2012-11-18T18:06:06.266,ns_1@192.168.0.35:<0.4356.0>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, received error from updater: {badmatch,
2012-11-18T18:06:21.392:[couchdb:error,2012-11-18T18:06:21.392,ns_1@192.168.0.35:<0.28655.3>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, writer error
2012-11-18T18:06:21.396:[couchdb:error,2012-11-18T18:06:21.396,ns_1@192.168.0.35:<0.4347.0>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, received error from updater: {badmatch,
2012-11-18T18:06:21.435:[couchdb:error,2012-11-18T18:06:21.435,ns_1@192.168.0.35:<0.28662.3>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, writer error
2012-11-18T18:06:21.438:[couchdb:error,2012-11-18T18:06:21.438,ns_1@192.168.0.35:<0.4392.0>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, received error from updater: {badmatch,
2012-11-18T18:06:26.713:[couchdb:error,2012-11-18T18:06:26.713,ns_1@192.168.0.35:<0.29007.3>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, writer error
2012-11-18T18:06:26.732:[couchdb:error,2012-11-18T18:06:26.732,ns_1@192.168.0.35:<0.4392.0>:couch_log:error:42]Set view `quizmo`, main group `_design/games`, received error from updater: {badmatch,
2012-11-18T18:06:26.968:[couchdb:error,2012-11-18T18:06:26.968,ns_1@192.168.0.35:<0.29024.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:06:26.971:[couchdb:error,2012-11-18T18:06:26.971,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:06:27.132:[couchdb:error,2012-11-18T18:06:27.132,ns_1@192.168.0.35:<0.29036.3>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, writer error
2012-11-18T18:06:27.160:[couchdb:error,2012-11-18T18:06:27.160,ns_1@192.168.0.35:<0.4410.0>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, received error from updater: {badmatch,
2012-11-18T18:06:32.899:[couchdb:error,2012-11-18T18:06:32.899,ns_1@192.168.0.35:<0.29402.3>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, writer error
2012-11-18T18:06:32.922:[couchdb:error,2012-11-18T18:06:32.922,ns_1@192.168.0.35:<0.4455.0>:couch_log:error:42]Set view `quizmo`, main group `_design/userstats`, received error from updater: {badmatch,
2012-11-18T18:06:33.016:[couchdb:error,2012-11-18T18:06:33.016,ns_1@192.168.0.35:<0.29411.3>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, writer error
2012-11-18T18:06:33.051:[couchdb:error,2012-11-18T18:06:33.051,ns_1@192.168.0.35:<0.4410.0>:couch_log:error:42]Set view `quizmo`, main group `_design/chat`, received error from updater: {badmatch,
2012-11-18T18:06:48.084:[couchdb:error,2012-11-18T18:06:48.084,ns_1@192.168.0.35:<0.30153.3>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, writer error
2012-11-18T18:06:48.114:[couchdb:error,2012-11-18T18:06:48.114,ns_1@192.168.0.35:<0.4347.0>:couch_log:error:42]Set view `quizmo`, main group `_design/search`, received error from updater: {badmatch,
2012-11-18T18:06:48.391:[couchdb:error,2012-11-18T18:06:48.391,ns_1@192.168.0.35:<0.30160.3>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, writer error
2012-11-18T18:06:48.392:[couchdb:error,2012-11-18T18:06:48.392,ns_1@192.168.0.35:<0.4356.0>:couch_log:error:42]Set view `quizmo`, main group `_design/questions`, received er
Matt Ingenthron
added a comment -
That's from views. Are there any logs related to configuration changes?
Is any of that an indication that the views are not being written owing to errors?
Matt Ingenthron
added a comment -
From a packet trace, we determined that the server is sending config updates even when there are no changes. This may be happening all of the time, not just under load but it's causing problems at the client under load. That's probably just a client issue, but it's still unexpected. I'll attach the full connection from bucketStream and the diff of the various chunks sent.
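One way a client could tolerate repeated configuration chunks is to fingerprint each chunk and act only when the fingerprint changes. The following is a minimal sketch of that idea, not the .NET client's actual logic. It assumes the streaming response carries JSON chunks separated by four newlines; the helper names `split_chunks`, `config_fingerprint`, and `effective_updates` are hypothetical.

```python
import hashlib
import json

# Assumption: chunks on the streaming connection are separated by four newlines.
CHUNK_DELIMITER = "\n\n\n\n"


def split_chunks(raw, delimiter=CHUNK_DELIMITER):
    """Split a captured streaming body into non-empty JSON chunk strings."""
    return [part.strip() for part in raw.split(delimiter) if part.strip()]


def config_fingerprint(chunk):
    """Hash a chunk after canonicalizing it, so key order and whitespace are ignored."""
    parsed = json.loads(chunk)
    canonical = json.dumps(parsed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def effective_updates(chunks):
    """Return only the chunks whose content differs from the previous chunk."""
    last = None
    updates = []
    for chunk in chunks:
        fingerprint = config_fingerprint(chunk)
        if fingerprint != last:
            updates.append(chunk)
            last = fingerprint
    return updates
```

If consecutive chunks differ only in fields the client does not care about, those fields would have to be dropped before hashing. Which fields those are is not established in this ticket.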
Xiaoqin Ma
added a comment - edited
The diag log shows that after the replication from bucket "quizmo" to bucket "quizmo" on cluster "tony mac" was set up, the system slowly started to go wrong. Is there any information from the "tony mac" cluster (192.168.0.72)?
2012-11-22 15:52:08.359 ns_orchestrator:4:info:message(ns_1@192.168.0.35) - Starting rebalance, KeepNodes = ['ns_1@192.168.0.35','ns_1@192.168.0.36',
'ns_1@192.168.0.37'], EjectNodes = []
2012-11-22 15:52:08.408 ns_storage_conf:0:info:message(ns_1@192.168.0.36) - Deleting old data files of bucket "quizmo"
2012-11-22 15:52:08.410 ns_storage_conf:0:info:message(ns_1@192.168.0.37) - Deleting old data files of bucket "quizmo"
2012-11-22 15:52:08.499 ns_rebalancer:0:info:message(ns_1@192.168.0.35) - Started rebalancing bucket quizmo
2012-11-22 15:52:09.092 ns_memcached:1:info:message(ns_1@192.168.0.37) - Bucket "quizmo" loaded on node 'ns_1@192.168.0.37' in 0 seconds.
2012-11-22 15:52:09.105 ns_memcached:1:info:message(ns_1@192.168.0.36) - Bucket "quizmo" loaded on node 'ns_1@192.168.0.36' in 0 seconds.
2012-11-22 15:52:11.653 ns_vbucket_mover:0:info:message(ns_1@192.168.0.35) - Bucket "quizmo" rebalance does not seem to be swap rebalance
2012-11-22 16:37:43.223 ns_orchestrator:1:info:message(ns_1@192.168.0.35) - Rebalance completed successfully.
2012-11-26 15:13:01.784 menelaus_web_remote_clusters:0:info:message(ns_1@192.168.0.35) - Created remote cluster reference "tony mac" via 192.168.0.72.
2012-11-26 15:13:13.383 menelaus_web_create_replication:0:info:message(ns_1@192.168.0.35) - Replication from bucket "quizmo" to bucket "quizmo" on cluster "tony mac" created.
Matt Ingenthron
added a comment -
The user working with this system tried to replicate to a Mac, and found it wouldn't work. That's been abandoned, so I think there is no info from "tony mac" any more.
Question though: do things continue to have odd log messages from then on?
Also, is this possibly the cause of the configuration updates?
Xiaoqin Ma
added a comment -
Question: do things continue to have odd log messages from then on?
Answer: Yes. XDCR keeps trying to replicate data to "tony mac" all the time; see ns_server.xdcr.error.log.
I am not sure whether it causes the configuration updates. I will ask the XDCR guy to have a look.
Xiaoqin Ma
added a comment -
You may try deleting the XDCR relationship completely to see if the problem goes away.
Xiaoqin Ma
added a comment -
Hi Junji, can you take a look to see whether this relates to XDCR?
Xiaoqin Ma
added a comment -
From the attached streaming data, these do not look like reconfiguration updates. My hypothesis (which needs to be verified by the ns_server guys) is:
somehow the client and server got disconnected;
when the server gets connected with the client again, it sends start_link information with the bucket streaming data.
I will look further to see how the server and clients get disconnected.
Xiaoqin Ma
added a comment - edited
I saw a lot of errors caused by lost connections in the memcached log, but the error messages carry no details. To help analyze the logs and find the root cause of such issues, I would like connection-related log entries to include the following information in future:
initiator node's process id, process name, socket id, calling function.
dormant node's process id, process name, socket id.
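As an illustration of the fields requested above, a connection-close record could be modelled roughly as follows. This is only a sketch of the proposed content, not memcached's logging code: the class names, field names, output format, and sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One side of a connection, with the identifiers requested in the comment."""
    process_id: int
    process_name: str
    socket_id: int


def _format_endpoint(endpoint):
    """Render an endpoint as name[pid=...,sock=...] (hypothetical format)."""
    return (
        f"{endpoint.process_name}"
        f"[pid={endpoint.process_id},sock={endpoint.socket_id}]"
    )


@dataclass(frozen=True)
class ConnectionClosed:
    """A connection-close event carrying both endpoints and the calling function."""
    reason: str
    calling_function: str
    initiator: Endpoint
    dormant: Endpoint

    def to_log_line(self):
        """Build a single log line containing every requested field."""
        return (
            f"Closing connection: reason={self.reason!r} "
            f"caller={self.calling_function} "
            f"initiator={_format_endpoint(self.initiator)} "
            f"dormant={_format_endpoint(self.dormant)}"
        )


# Made-up values, for illustration only.
event = ConnectionClosed(
    reason="Connection reset by peer",
    calling_function="read_from_socket",
    initiator=Endpoint(process_id=4242, process_name="client_app", socket_id=17),
    dormant=Endpoint(process_id=1234, process_name="memcached", socket_id=144),
)
print(event.to_log_line())
```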
Xiaoqin Ma
added a comment -
Even before XDCR was set up, the system had already generated a lot of connection-lost errors. So this particular bug is not caused by XDCR.
XDCR issue: "The user working with this system tried to replicate to a Mac, and found it wouldn't work. That's been abandoned." After the replication was abandoned, the system still keeps trying to replicate data to the abandoned cluster, which generates a lot of XDCR-related error messages. We may need some way to completely delete an abandoned XDCR relationship when the remote cluster is not reachable.
Xiaoqin Ma
added a comment - edited
1. Polish the connection-related error messages written to the memcached log.
2. Write the year in the memcached log. The current memcached log has no year information.
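Until the year is logged, it can sometimes be narrowed down from the weekday that memcached does log. The sketch below assumes log lines begin with weekday, month, and day, as in the memcached excerpts quoted earlier in this ticket (for example `Mon Nov 19 09:08:55.664956 CET ...`). The function names are hypothetical.

```python
import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def candidate_years(weekday, month, day, first_year, last_year):
    """Return the years in [first_year, last_year] where month/day falls on weekday."""
    target = WEEKDAYS.index(weekday)       # datetime uses Monday == 0
    month_number = MONTHS.index(month) + 1
    years = []
    for year in range(first_year, last_year + 1):
        try:
            date = datetime.date(year, month_number, day)
        except ValueError:
            continue                       # e.g. Feb 29 in a non-leap year
        if date.weekday() == target:
            years.append(year)
    return years


def years_for_log_line(line, first_year, last_year):
    """Read the weekday, month, and day from the start of a year-less log line."""
    weekday, month, day = line.split()[:3]
    return candidate_years(weekday, month, int(day), first_year, last_year)
```

For the sample line above, 2012 is the only match between 2008 and 2016. The weekday pattern repeats every few years, so this narrows the year down but does not replace logging it.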
Chiyoung Seo
added a comment -
Please assign it to memcached folks.
Farshid Ghods
added a comment -
Per bug scrub:
Chiyoung is going to work with Xiaoqin to resolve this issue.
Chiyoung Seo
added a comment -
Matt,
Can you update this bug if you have any further updates? I've gone through all the comments here. I don't think it is a server side issue.
Please reassign it to me if you think it's still a server side problem.
Matt Ingenthron
added a comment -
In the end, the issue ended up being MB-5406, though there were claims of other problems that we couldn't quite work out. I think this is good to be closed.