[CCBC-458] Allow to disable refresh-on-views-error Created: 25/Jun/14  Updated: 10/Jul/14  Resolved: 10/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.3.0, 2.3.1, 2.3.2
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Improvement Priority: Major
Reporter: Mark Nunberg Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Currently, when an HTTP error is received, we automatically take it as a cue to refresh the configuration. While the side effects are generally harmless (because of throttling), they are troublesome to analyze in the logs and waste network resources. Since this behavior exists primarily for views-only workloads, we should look into adding an option to disable it and/or disabling it by default.

We may also implement the proper "Heuristics Checking" specified in a proposal somewhere.
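
A minimal sketch of what such a switch could look like, modelled outside the library. Every name here (`refresh_settings`, `on_http_response`, the field names) is hypothetical and not libcouchbase API, and treating any status of 400 or above as "an HTTP error" is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical settings; illustrative names, not libcouchbase API. */
struct refresh_settings {
    bool refresh_on_http_error;  /* the proposed on/off switch */
    uint32_t throttle_ms;        /* minimum gap between two refreshes */
};

struct refresh_state {
    bool ever_refreshed;
    uint64_t last_refresh_ms;
    unsigned refresh_count;
};

/* Returns true if a configuration refresh was triggered for this response.
 * Assumption: any status >= 400 counts as an HTTP error. */
static bool on_http_response(const struct refresh_settings *cfg,
                             struct refresh_state *st,
                             int http_status, uint64_t now_ms)
{
    assert(cfg != NULL && st != NULL);
    if (http_status < 400) {
        return false;                       /* not an error */
    }
    if (!cfg->refresh_on_http_error) {
        return false;                       /* behavior disabled by the user */
    }
    if (st->ever_refreshed &&
        now_ms - st->last_refresh_ms < cfg->throttle_ms) {
        return false;                       /* throttled */
    }
    st->ever_refreshed = true;
    st->last_refresh_ms = now_ms;
    st->refresh_count++;
    return true;
}
```

With the switch off, an error response leaves `refresh_count` untouched, so nothing shows up in the logs or on the network; with it on, the throttle still bounds how often a refresh can happen.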

 Comments   
Comment by Mark Nunberg [ 10/Jul/14 ]
http://review.couchbase.org/39288




[CCBC-405] Throw appropriate error on empty key Created: 02/May/14  Updated: 10/Jul/14  Resolved: 10/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: None
Affects Version/s: None
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Bug Priority: Major
Reporter: Jon Moses Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: OSX 10.9, LCB 2.3.0

$ cbc version
cbc built from: libcouchbase 2.3.0_15_g0b04b36 (rev. 0b04b36c1ef8d568271ea402b582e4339b81c8f3)
    using libcouchbase: 2.3.0 (libevent)
    using libyajl: 2.0.4

Attachments: File minimal.c    

 Description   
If I issue a query that returns 0x07, the next two queries, regardless of validity, will return 0x10 instead of the expected results.

See 'minimal.c' attached

 Comments   
Comment by Sergey Avseyev [ 02/May/14 ]
Expected output:
$ example/minimal/minimal
STORED "foo" CAS: 17097673194045177856
GOT "foo" CAS: 17097673194045177856 FLAGS:0x0 SIZE:3
bar
GET ERROR: Invalid input/arguments (0x7)
GOT "foo" CAS: 17097673194045177856 FLAGS:0x0 SIZE:3
bar


Actual output:
$ example/minimal/minimal
STORED "foo" CAS: 580788420889542656
GOT "foo" CAS: 580788420889542656 FLAGS:0x0 SIZE:3
bar
GET ERROR: Invalid input/arguments (0x7)
ERROR: Network failure (0x10), (null)


Matt, do you know whether the server defines behaviour for an empty (zero-length) key?
Comment by Mark Nunberg [ 02/May/14 ]
The error callback is not and never has been a reliable way to programmatically extract the error from an operation. For what it's worth, 2.2.0 exhibits the same behavior:

+ LD_LIBRARY_PATH=/sources/libcouchbase-2.0.7/inst/lib ./minimal
Using libcouchbase 2.0.7
STORED "foo" CAS: 858236688785346304
GOT "foo" CAS: 858236688785346304 FLAGS:0x0 SIZE:3
bar
GET ERROR: Invalid arguments (0x7)
GOT "foo" CAS: 858236688785346304 FLAGS:0x0 SIZE:3
bar
+ LD_LIBRARY_PATH=/sources/libcouchbase-2.1.3/inst/lib ./minimal
Using libcouchbase 2.1.3
STORED "foo" CAS: 5565267957526957824
GOT "foo" CAS: 5565267957526957824 FLAGS:0x0 SIZE:3
bar
GET ERROR: Invalid arguments (0x7)
ERROR: Network error (0x10), (null)
+ LD_LIBRARY_PATH=/sources/libcouchbase-2.2.0/inst/lib ./minimal
Using libcouchbase 2.2.0
STORED "foo" CAS: 8308748484727672576
GOT "foo" CAS: 8308748484727672576 FLAGS:0x0 SIZE:3
bar
GET ERROR: Invalid arguments (0x7)
ERROR: Network error (0x10), (null)
+ LD_LIBRARY_PATH=/sources/libcouchbase-2.3.0/inst/lib ./minimal
Using libcouchbase 2.3.0
STORED "foo" CAS: 3755695023363067648
GOT "foo" CAS: 3755695023363067648 FLAGS:0x0 SIZE:3
bar
GET ERROR: Invalid input/arguments (0x7)
GET ERROR: Network failure (0x10)
ERROR: Network failure (0x10), (null)


The issue seems to be in the client assuming a specific behavior of error_callback which is not guaranteed.
Comment by Jon Moses [ 02/May/14 ]
I'm not sure whether the behavior also occurs with an initial error other than 0x07; that's just the one that I reproduced from my data.
Comment by Mark Nunberg [ 02/May/14 ]
If you're scheduling commands asynchronously, then the failed commands in the pipeline will end up being invoked with the error received. This is actually correct behavior since the server is closing the socket.

Observe:

recvmsg(6, {msg_name(0)=NULL, msg_iov(1)=[{"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 65495}], msg_controllen=0, msg_flags=0}, 0) = 0
write(2, "GET ERROR: Invalid input/argumen"..., 41GET ERROR: Invalid input/arguments (0x7)
) = 41
epoll_ctl(3, EPOLL_CTL_DEL, 6, {EPOLLIN, {u32=6, u64=6}}) = 0
close(6) = 0
write(2, "ERROR: Network failure (0x10), ("..., 38ERROR: Network failure (0x10), (null)
) = 38


The first command gets EINVAL, and the error callback is invoked from it. The subsequent read from the socket returns 0, indicating that the server has closed the connection, which causes the remaining commands to fail with LCB_NETWORK_ERROR.
Comment by Mark Nunberg [ 02/May/14 ]
The proper solution is to return an error (either at the lcb level or at the SDK level) if the key is empty.
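
A rough sketch of that idea, written as a standalone model rather than against the real library. The names `schedule_get`, `sched_status` and `pipeline` are hypothetical; the point is only that an empty key is rejected before anything is queued, so the server never sees the request and never closes the socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes, chosen for illustration only. */
enum sched_status { SCHED_OK = 0, SCHED_EMPTY_KEY = 1 };

struct get_request {
    const void *key;
    size_t nkey;
};

/* Counts commands that were accepted and would be written to the socket. */
struct pipeline {
    unsigned queued;
};

/* Validates the key locally before the command enters the pipeline. */
static enum sched_status schedule_get(struct pipeline *pl,
                                      const struct get_request *req)
{
    assert(pl != NULL && req != NULL);
    if (req->key == NULL || req->nkey == 0) {
        return SCHED_EMPTY_KEY;  /* rejected locally; nothing is sent */
    }
    pl->queued++;
    return SCHED_OK;
}
```

Because the bad command is never queued, commands scheduled before and after it are unaffected, which matches the expected output posted earlier in this thread.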
Comment by Mark Nunberg [ 10/Jul/14 ]
http://review.couchbase.org/39284




[CCBC-233] Provide new version of arguments with common ABI header Created: 26/Jul/13  Updated: 09/Jul/14  Resolved: 09/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.0.7
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Task Priority: Critical
Reporter: Mark Nunberg Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Our current argument structure doesn't have a simple ABI for common fields. This results in duplicate code for almost all implementations using libcouchbase because they must make sure that common fields such as "key" and "nkey" (and possibly also "cas") are dereferenced correctly in each structure.

A new version of arguments should be added which would unify the common fields of all commands, and allow code to "cast" the command structure to a common base type.

This also ties in to the C++ wrappers, in which I am currently forced to use either templates or virtual functions to deal with fields properly.
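
To illustrate the "cast to a common base type" idea, here is a small sketch. The struct names and field sets are hypothetical and the real API3 structures may look different; the sketch relies only on the C guarantee that a pointer to a struct can be converted to a pointer to its first member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts; the real command structures may differ. */
struct cmd_base {
    const void *key;
    size_t nkey;
    uint64_t cas;
};

struct cmd_get {
    struct cmd_base base;  /* must be the first member */
    uint32_t exptime;
};

struct cmd_store {
    struct cmd_base base;  /* must be the first member */
    const void *value;
    size_t nvalue;
};

/* Works for any command whose first member is struct cmd_base,
 * so callers need no per-command code to reach the key length. */
static size_t cmd_key_length(const void *cmd)
{
    assert(cmd != NULL);
    return ((const struct cmd_base *)cmd)->nkey;
}
```

A wrapper can then handle "key", "nkey" and "cas" in one place instead of repeating the same field access for each command type.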

 Comments   
Comment by Mark Nunberg [ 08/Jul/14 ]
This isn't tied to a specific commit, but is related to the ongoing API3 stuff.




[CCBC-226] Support HTTP keepalive semantics for views Created: 15/Jul/13  Updated: 09/Jul/14  Resolved: 09/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: None
Affects Version/s: None
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Task Priority: Major
Reporter: Mark Nunberg Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Dependency
blocks PYCBC-171 Too many open connections when runnin... Resolved

 Description   
Keep-Alive

 Comments   
Comment by Matt Ingenthron [ 20/Aug/13 ]
[2:30] mordy__: keepalive is hard btw
[2:31] ingenthr: no doubt
[2:31] avsej: yes, probably we will start it as the latest task
[2:31] ingenthr: though, we have only one server implementation, so we may be able to cut some corners w.r.t. the actual HTTP spec
[2:31] ingenthr: if that's where the weird stuff is
[2:31] mordy__: ingenthr: the most difficult is stale connection tracking
[2:31] ingenthr: ah
[2:32] ingenthr: that makes sense
[2:32] mordy__: since you have no way to know if a connection is closed or not unless you try to use it
[2:32] ingenthr: that sounds familiar
[2:33] mordy__: so then you need to keep state in knowing whether a "cached" connection was used and apply retry logic
Comment by Mark Nunberg [ 07/Jul/14 ]
Now that we have generic and well-tested connection pool functionality within the library, this is simple to do. I decided to add this after seeing the immense number of HTTP connections opened while running the tests. This change should keep that number down and perhaps, as a side effect, reduce some of the bugginess witnessed in the tests.
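
The stale-connection problem from the chat above can be sketched as follows. This is a toy model with hypothetical names (`conn`, `send_with_retry`), not the library's connection pool: a failure on a reused connection is treated as "stale, try again on a fresh one", while a failure on a fresh connection is reported as a real error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical connection model; not the library's connection pool. */
struct conn {
    bool peer_closed;  /* only discovered when we try to write */
};

struct send_stats {
    unsigned attempts;
    unsigned retries;
};

static bool try_send(const struct conn *c)
{
    return !c->peer_closed;
}

/* Sends on `pooled` if one is available, falling back to `fresh` exactly
 * once when the pooled connection turns out to be stale. A failure on the
 * fresh connection is final. Returns true on success. */
static bool send_with_retry(const struct conn *pooled,
                            const struct conn *fresh,
                            struct send_stats *st)
{
    assert(fresh != NULL && st != NULL);
    if (pooled != NULL) {
        st->attempts++;
        if (try_send(pooled)) {
            return true;
        }
        st->retries++;  /* cached connection was stale */
    }
    st->attempts++;
    return try_send(fresh);
}
```

Passing `NULL` for `pooled` models a request that never used a cached connection, in which case no retry is attempted.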
Comment by Mark Nunberg [ 07/Jul/14 ]
http://review.couchbase.org/#/c/39146/




[CCBC-471] Retried HTTP request may hang client Created: 06/Jul/14  Updated: 09/Jul/14  Resolved: 09/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.4.0-dp1
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Task Priority: Major
Reporter: Mark Nunberg Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
If a request is retried as a result of an HTTP redirect, it will increment the pendops counter (which indicates how many pending operations remain until the event loop should stop and control be returned back to the user); however the counter is decremented only once -- when the request finishes.
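
One possible way to keep such a counter balanced is to count each request once rather than once per attempt. The sketch below is a model with hypothetical names (`loop_state`, `request_start`, `request_finish`); the fix in the linked review may take a different approach.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the pending-operations counter described above. */
struct loop_state {
    int pendops;
};

struct http_request {
    bool counted;  /* true while this request contributes to pendops */
};

/* Called each time the request is (re)sent, including after a redirect. */
static void request_start(struct loop_state *ls, struct http_request *req)
{
    assert(ls != NULL && req != NULL);
    if (!req->counted) {  /* count the request once, not once per attempt */
        ls->pendops++;
        req->counted = true;
    }
}

/* Called once, when the request finally completes. */
static void request_finish(struct loop_state *ls, struct http_request *req)
{
    assert(ls != NULL && req != NULL);
    if (req->counted) {
        ls->pendops--;
        req->counted = false;
    }
}

/* The event loop may stop only when nothing is pending. */
static bool loop_should_stop(const struct loop_state *ls)
{
    return ls->pendops == 0;
}
```

Without the `counted` guard, a redirected request would leave `pendops` at 1 after finishing, and the loop would never return control to the user.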

 Comments   
Comment by Mark Nunberg [ 06/Jul/14 ]
http://review.couchbase.org/39143




[CCBC-430] Ensure SSL is properly setup in threaded environments Created: 25/May/14  Updated: 09/Jul/14  Resolved: 09/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.4.0-dp1
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Task Priority: Critical
Reporter: Mark Nunberg Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Also ensure that SSL_library_init() etc. are called correctly.

 Comments   
Comment by Mark Nunberg [ 07/Jul/14 ]
http://review.couchbase.org/39202




[CCBC-456] IOCP plugin aborts on init failure Created: 23/Jun/14  Updated: 09/Jul/14  Resolved: 09/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.1.2
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Task Priority: Major
Reporter: sdeigm42 Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows 7 / 64bit


 Description   
In some of the C client library functions (at least in lcb_iocp_new_iops in libcouchbase/plugins/io/iocp/iocp_iops.c), abort() is called in case of a failure. This causes my entire application to terminate instead of giving me the chance to print an appropriate failure message.

I would suggest replacing the abort() call with a return of an appropriate error code.
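
As a sketch of the suggested pattern (not the actual IOCP plugin code), a constructor can hand the failure back to the caller through a status code. All names below are hypothetical; `create_port` stands in for a resource call that can fail at runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names; this is not the IOCP plugin's code. */
enum io_status { IO_OK = 0, IO_ERR_NOMEM = 1, IO_ERR_PORT = 2 };

struct io_handle {
    int port_ok;
};

/* Stand-in for a resource call that can fail; returns non-zero on success. */
typedef int (*create_port_fn)(void);

/* On failure, *out is set to NULL and a status code is returned
 * so that the application can log, clean up, and carry on. */
static enum io_status io_create(struct io_handle **out,
                                create_port_fn create_port)
{
    struct io_handle *h;

    assert(out != NULL && create_port != NULL);
    *out = NULL;
    h = calloc(1, sizeof(*h));
    if (h == NULL) {
        return IO_ERR_NOMEM;
    }
    if (!create_port()) {
        free(h);
        return IO_ERR_PORT;  /* report the failure instead of abort() */
    }
    h->port_ok = 1;
    *out = h;
    return IO_OK;
}

/* Two stubs for exercising both paths. */
static int port_always_fails(void) { return 0; }
static int port_always_works(void) { return 1; }
```

The caller checks the returned status and decides what to do; the process is never terminated from inside the constructor.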

 Comments   
Comment by Mark Nunberg [ 07/Jul/14 ]
I've written up a fix for this. Indeed, in that specific location you may do better with just an error message; in general, however, it might be better to leave the abort()s where they are. It's better to have a program crash predictably and possibly leave a core dump than to have it hang unpredictably.
Comment by sdeigm42 [ 07/Jul/14 ]
For us and our customers, an abort and the termination of our process on a failure that can happen during normal processing is unacceptable, since we then have no chance to report errors, do cleanup operations, and so on. Of course our software is not 100% error free and could also crash for other reasons, but hopefully only because of programming errors and not in situations such as too few sockets being available or other missing resources. In my opinion a basic client library should never terminate a process.

The main reason why we are evaluating couchbase is the high availability of the database due to the cluster approach. Since our software is working in a 24/7 environment with minimal downtime any possible termination through libcouchbase would be a show stopper for us to use the library/database.
Comment by Mark Nunberg [ 07/Jul/14 ]
I'm more curious to know under which circumstances the original abort was seen in the first place. Are you operating in a memory constrained environment?
Comment by sdeigm42 [ 07/Jul/14 ]
We were doing stress tests from a Windows 7 client with 8GB memory against a Red Hat Couchbase server. At some point we got a lot of network errors and then our software terminated. Later on we used a debugger to figure out that libcouchbase called abort() after the Windows function CreateIoCompletionPort() failed. So we had no chance to figure out why the Windows function failed, but I assume the system was running out of open socket descriptors.
Comment by Mark Nunberg [ 07/Jul/14 ]
http://review.couchbase.org/#/c/38817/
Comment by Mark Nunberg [ 07/Jul/14 ]
I'd be interested to know if the library was leaking sockets. Are you running views as part of your stress test?

I'd also recommend testing the 2.4 developer preview (and the beta, which will be released soon) if possible. It should probably fix any socket leaks.

Lacking that, please try the 2.3.2 version (our latest stable release). 2.1.2 is pretty old.
Comment by sdeigm42 [ 07/Jul/14 ]
We don't think that your library is leaking sockets, since most of the time the stress tests complete without any problems. When the test cases failed, some lcb_get and some lcb_store commands failed, lcb_strerror returned "Network error", and shortly after that our process aborted.

We are just running one view - and that just at the end of the test - to collect all keys that have to be deleted.

We will update to a newer version of libcouchbase as you suggested.

Just to summarize again: we have no problems with the library itself. Today we weren't even able to reproduce the problem at all. But we can not live with the theoretical problem that a limited resource is not available and our software terminates due to this fact. Since our software runs in a realtime environment where we get multiple inbound and outbound socket connections for various protocols (SQL databases, SOAP connections, HTTP connections ...) we are dependent on the underlying client libraries to report an error if some resource is not available. Otherwise we would have to control the overall socket usage of all these subsystems which is not feasible.
Comment by Mark Nunberg [ 07/Jul/14 ]
In your particular case, be aware that the library creates a new TCP connection each time you query a view. This will be fixed in 2.4.

I can't guarantee the code is free of other places where it may abort or crash due to allocation failure, though I believe the most common cases have been caught. If you see a certain place where this may happen, please report it.
Comment by sdeigm42 [ 08/Jul/14 ]
Ok, I will update to version 2.4 as soon as possible and if I encounter any new aborts I will report them.

Thanks for your great support!!




[CCBC-472] server_nodes returns only nodes specified via "host" argument Created: 03/Jul/14  Updated: 09/Jul/14  Resolved: 09/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: None
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Bug Priority: Major
Reporter: Pavel Paulau Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Python 2.7.6
lcb 2.3.2
couchbase 1.2.2


 Description   
https://gist.github.com/pavel-paulau/5f02459f73ee207a18aa

I'm pretty sure it used to work. And it obviously has nothing to do with the Python SDK...

 Comments   
Comment by Mark Nunberg [ 03/Jul/14 ]
This is going to be fixed in libcouchbase 2.4. Unfortunately the way 'nodes' was implemented was rather confusing and would be dependent on HTTP bootstrap (which we only do selectively)
Comment by Pavel Paulau [ 03/Jul/14 ]
Thanks for clarification.

Two questions:
1. When do you plan to release 2.4?
2. Is there any temporary workaround? I need a list of servers for view queries via a 3rd party HTTP library...
Comment by Mark Nunberg [ 07/Jul/14 ]
2.4 should have a beta out sometime this week which should hopefully fix this issue.

As for a temporary workaround - I'll see what I can come up with. Are you OK with a patch? :)
Comment by Pavel Paulau [ 07/Jul/14 ]
I have temporarily worked around the problem but look forward to a permanent fix.
Comment by Mark Nunberg [ 07/Jul/14 ]
http://review.couchbase.org/#/c/39158/




[CCBC-360] date mutated in libcouchbase's send buffer Created: 08/Apr/14  Updated: 08/Jul/14  Resolved: 08/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.2.0
Fix Version/s: 2.4.0-dp1
Security Level: Public

Type: Bug Priority: Blocker
Reporter: killgxlin Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: usability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: libcouchbase-2.2.0
couchbase-server 2.2.0
ubuntu 13.04 g++ 4.8.1

Attachments: File key_3423.base64     File key_3423.hexdump     File key_3423.json    

 Description   
while true
  set 1000 json documents to couchbase-server
  get 1000 json documents from couchbase-server
  some of them may become invalid json document

The test code is here: https://github.com/killgxlin/libcouchbase_testcode
I think it may be a bug in libcouchbase. If it is instead a bug in my own code, please point it out and mail me at
killgxlin@hotmail.com.

The broken document is attached:
key_*.base64 is the broken document as saved in couchbase-server,
key_*.json is the broken JSON decoded from key_*.base64,
key_*.hexdump is a hex dump of the broken key_*.json.

Looking at key_*.hexdump, I can see that part of the document is overwritten by a Couchbase network message whose header is 80 01 (PROTOCOL_BINARY_REQ, PROTOCOL_BINARY_CMD_SET).

 Comments   
Comment by Sergey Avseyev [ 13/Apr/14 ]
https://github.com/couchbase/libcouchbase/pull/13
Comment by Mark Nunberg [ 14/May/14 ]
This is dependent on having a 2.3.2 come out (or perhaps be supplied as a 2.3.x main release in the future).




[CCBC-459] Provide error category indicating server is loaded Created: 26/Jun/14  Updated: 07/Jul/14  Resolved: 07/Jul/14

Status: Resolved
Project: Couchbase C client library libcouchbase
Component/s: library
Affects Version/s: 2.3.1, 2.4.0-dp1
Fix Version/s: 2.4.0-beta
Security Level: Public

Type: Task Priority: Major
Reporter: Mark Nunberg Assignee: Mark Nunberg
Resolution: Fixed Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
We should make a new error classifier (LCB_ERRTYPE_SRVLOAD) and LCB_EIFSRVLOAD to indicate that the error reply would be solved by a backoff.

Currently we have EIFTMP which includes these errors, but also includes some errors which are not strictly related to having a backoff interval.
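
A sketch of how a separate "server is loaded" category could sit next to the broader transient one. The flag values, the error codes, and the assignment of codes to categories are all assumptions made for illustration; the real classification table lives in the library.

```c
#include <assert.h>

/* Hypothetical flag values; illustrative only. */
enum err_category {
    ERRTYPE_TRANSIENT = 1 << 0,  /* may succeed if retried */
    ERRTYPE_SRVLOAD   = 1 << 1   /* server is loaded; back off before retrying */
};

/* Hypothetical error codes and an assumed mapping to categories. */
enum err_code {
    ERR_SUCCESS = 0,
    ERR_BUSY,
    ERR_TMPFAIL,
    ERR_TIMEOUT,
    ERR_KEY_MISSING
};

static int err_categories(enum err_code rc)
{
    switch (rc) {
    case ERR_BUSY:
    case ERR_TMPFAIL:
        return ERRTYPE_TRANSIENT | ERRTYPE_SRVLOAD;
    case ERR_TIMEOUT:
        return ERRTYPE_TRANSIENT;  /* transient, but backoff is not the cure */
    default:
        return 0;
    }
}

#define ERR_IS_TRANSIENT(rc) ((err_categories(rc) & ERRTYPE_TRANSIENT) != 0)
#define ERR_IS_SRVLOAD(rc)   ((err_categories(rc) & ERRTYPE_SRVLOAD) != 0)

/* Doubles the delay for server-load errors (starting at 10 ms), retries
 * other transient errors immediately, and returns -1 when the error is
 * not retryable at all. */
static long retry_delay_ms(enum err_code rc, long prev_delay_ms)
{
    assert(prev_delay_ms >= 0);
    if (ERR_IS_SRVLOAD(rc)) {
        return prev_delay_ms == 0 ? 10 : prev_delay_ms * 2;
    }
    if (ERR_IS_TRANSIENT(rc)) {
        return 0;
    }
    return -1;
}
```

The narrower category lets a caller apply a backoff interval only where it helps, instead of treating every transient error the same way.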




Generated at Sat Jul 12 04:14:55 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.