Details
-
Type:
Improvement
-
Status:
Open
-
Priority:
Major
-
Resolution: Unresolved
-
Affects Version/s: 2.0.1
-
Fix Version/s: None
-
Component/s: documentation
-
Security Level: Public
-
Labels:
Description
I've been thinking about restructuring the Java Guides because I find it
rather confusing. So I propose the following new structure, which takes
enhancements and new features into account as well. Any inputs would be
greatly appreciated!
Here are the main changes
- Moving the "Reference" into its own subsection, clearing out the main
level
- Providing more information on using the APIs, also with 2.0 content.
- Getting started & tutorial mainly unchanged.
1. Getting Started
1.1 Preparations
1.2 Hello Couchbase
1.3 Working with Documents
1.4 Advanced Topics
1.5 Next Steps
2. Tutorial
2.1 Preparations
2.2 Quickstart
2.3 Connection Management
2.4 The Welcome Page
2.5 Managing Beers
2.6 Wrapping Up
3. Using the APIs
3.1 Connection Management
3.2 Retreiving Data
3.3 Mutating Data
3.4 Working with Views
3.5 Applying Persistence Constraints
3.6 Error Handling
3.7 JSON & Object Serialization
3.8 Design Document Management
4. Advanced Usage
4.1 Bulk Loading
4.2 Logging & Debugging
5. API Reference
5.1 Method Summary
5.2 Connecting & Disconnecting
5.3 Retreiving Data
5.4 Mutating Data
5.5 Management Operations
5.6 Other Useful Operations
A. Release Notes
B. Contributing
B.1 General Information
B.2 Source Code Styleguide
rather confusing. So I propose the following new structure, which takes
enhancements and new features into account as well. Any inputs would be
greatly appreciated!
Here are the main changes
- Moving the "Reference" into its own subsection, clearing out the main
level
- Providing more information on using the APIs, also with 2.0 content.
- Getting started & tutorial mainly unchanged.
1. Getting Started
1.1 Preparations
1.2 Hello Couchbase
1.3 Working with Documents
1.4 Advanced Topics
1.5 Next Steps
2. Tutorial
2.1 Preparations
2.2 Quickstart
2.3 Connection Management
2.4 The Welcome Page
2.5 Managing Beers
2.6 Wrapping Up
3. Using the APIs
3.1 Connection Management
3.2 Retreiving Data
3.3 Mutating Data
3.4 Working with Views
3.5 Applying Persistence Constraints
3.6 Error Handling
3.7 JSON & Object Serialization
3.8 Design Document Management
4. Advanced Usage
4.1 Bulk Loading
4.2 Logging & Debugging
5. API Reference
5.1 Method Summary
5.2 Connecting & Disconnecting
5.3 Retreiving Data
5.4 Mutating Data
5.5 Management Operations
5.6 Other Useful Operations
A. Release Notes
B. Contributing
B.1 General Information
B.2 Source Code Styleguide
Activity
- All
- Comments
- Work Log
- History
- Activity
- Gerrit Reviews
Hide
Karen Zeller
added a comment -
**Dependencies:
-Internal and external customer- validated/reviewed/finalized table of contents from MichaelN delivered to KarenZ
-Raw notes, existing content, new examples and tehnical information from MichaelN
-Internal and external customer- validated/reviewed/finalized table of contents from MichaelN delivered to KarenZ
-Raw notes, existing content, new examples and tehnical information from MichaelN
Show
Karen Zeller
added a comment - **Dependencies:
-Internal and external customer- validated/reviewed/finalized table of contents from MichaelN delivered to KarenZ
-Raw notes, existing content, new examples and tehnical information from MichaelN
Hide
Karen Zeller
added a comment -
On 3/13 Matt assigned you the action to validate your proposed Table of Contents for Java. He had mentioned getting input from technical evangelists, and a couple external customers.
Is this complete?
Is this complete?
Show
Karen Zeller
added a comment - On 3/13 Matt assigned you the action to validate your proposed Table of Contents for Java. He had mentioned getting input from technical evangelists, and a couple external customers.
Is this complete?
Hide
Michael Nitschinger
added a comment -
Not yet - I still need to gather feedback, I'll update this issue with feedback once I got it.
Show
Michael Nitschinger
added a comment - Not yet - I still need to gather feedback, I'll update this issue with feedback once I got it.
Hide
Karen Zeller
added a comment -
Ok. Once you have the reviewed and revised TOC, send to me to get structured as chapters/etc.
Show
Karen Zeller
added a comment - Ok. Once you have the reviewed and revised TOC, send to me to get structured as chapters/etc.
Hide
Karen Zeller
added a comment -
Input from Michael for Advanced Topics Chapter:
Hi,
yes I think its good to go with it!
You can also incorporate the new article in to the "advanced topic" section:
Here is the markdown version from the published one:
## Motivation
This blog post is intended to be a very detailed and informative article for those who already have used the Couchbase Java SDK and want to know how the internals work. This is not a introduction on how to use the Java SDK and we'll cover some fairly advanced topics on the way.
Normally, when talking about the SDK we mean everything that is needed to get you going (Client library, documentation, release notes,...). In this article though, the SDK refers to the Client library (code) unless stated otherwise.
As always, if you have feedback please let me/us know!
## Introduction
First and foremost, it is important to understand that the SDK wraps and extends the functionality of the [spymemcached](https://github.com/couchbase/spymemcached) (called "spy") memcached library. One of the protocols used internally is the memcached protocol, and a lot of functionality can be reused. On the other hand, once you start to peel off the first layers of the SDK you will notice that some components are somewhat more complex because of the fact that spy provides more features than the SDK needs in the first place. The other part is to remember that a lot of the components are interwoven, so you always need to get the dependency right. Most of the time, we release a new spy version at the same date with a new SDK, because new stuff has been added or fixed.
So, aside from reusing the functionality provided by spy, the SDK mainly adds two blocks of functionality: automatic cluster topology management and since 1.1 (and 2.0 server) support for Views. Aside from that it also provides administrative facilities like bucket and design document management.
To understand how the client operates, we'll dissect the whole process in different life cycle phases of the client. After we go through all three phases (bootstrap, operation and shutdown) you should have a clear picture of whats going on under the hood. Note that there is a separate blog post in the making about error handling, so we won't cover that here in greater detail (which will be published a few weeks later on the same blog here).
## Phase 1: Bootstrap
Before we can actually start serving operations like `get()` and `set()`, we need to bootstrap the `CouchbaseClient` object. The important part that we need to accomplish here is to initially get a cluster configuration (which contains the nodes and vBucket map), but also to establish a streaming connection to receive cluster updates in (near) real-time.
We take the list of nodes passing during bootstrap and iterate over it. The first node in the list that can be contacted on port 8091 is used to walk the RESTful interface on the server. If it is not available, the next one will be tried. This means that going from the provided `http://host:port/pools` URI we eventually follow the links to the bucket entity. All this happens inside a `ConfigurationProvider`, which is in this case the `com.couchbase.client.vbucket.ConfigurationProviderHTTP`. If you want to poke around on the internals, look for `getBucketConfiguration` and `readPools` methods.
A (successful) walk can be illustrated like this:
- GET /pools
- look for the "default" pools
- GET /pools/default
- look for the "buckets" hash which contains the bucket list
- GET /pools/default/buckets
- parse the list of buckets and extract the one provided by the application
- GET /pools/default/buckets/<bucketname>
Now we are at the REST endpoint we need. Inside this JSON response, you'll find all useful details that gets also be used by SDK internally (for example `stre
amingUri`, `nodes` and `vBucketServerMap`). The config gets parsed and stored. Before we move on, let's quickly discuss the strange `pools` part inside our REST walk:
The concept of a resource pool to group buckets was designed for Couchbase Server, but is currently not implemented. Still, the REST API is implemented that way and therefore all SDKs support it. That said, while we could theoretically just go directly to `/pools/default/buckets` and skip the first few queries, the current behaviour is future proof so you won't have to change the bootstrap code once the server implements it.
Back to our bootstrap phase. Now that we have a valid cluster config which contains all the nodes (and their hostnames or ip addresses), we can establish connections to them. Aside from establishing the data connections, we also need to instantiate a streaming connection to one of them. For simplicity reasons, we just establish the streaming connection to the node from the list where we got our initial configuration.
This gets us to an important point to keep in mind: if you have lots of CouchbaseClient objects running on many nodes and they all get bootstrapped with the same list, they may end up connecting to the same node for the streaming connection and create a possible bottleneck. Therefore, to distribute the load a little better I recommend shuffling the array before it gets passed in to the CouchbaseClient object. When you only have a few CouchbaseClient objects connected to your cluster, that won't be a problem at all.
The streaming connection URI is taken from the config we got previously, and normally looks like this:
streamingUri: "/pools/default/bucketsStreaming/default?bucket_uuid=88cae4a609eea500d8ad072fe71a7290"
If you point your browser to this address, you will also get the cluster topology updates streamed in real-time. Since the streaming connection needs to be established all the time and potentially blocks a thread, this is done in the background handled by different threads. We are using the NIO framework [Netty](http://netty.io) for this task, which provides a very handy way of dealing with asynchronous operations. If you want to start digging into this part, keep in mind that all read operations are completely separate from write operations, so you need to deal with handlers that take care of what comes back from the server. Aside from some wiring needed for Netty, the business logic can be found in `com.couchbase.client.vbucket.BucketMonitor` and `com.couchbase.client.vbucket.BucketUpdateResponseHandler`. We also try to reestablish this streaming connection if the socket gets closed (for example if this node gets rebalanced out of the cluster).
To actually shuffle data to the cluster nodes, we need to open various sockets to them. Note that there is absolutely no connection pooling needed inside the client, because we manage all sockets proactively. Aside from the special streaming connection to one of the severs (which is opened against port 8091), we need to open the following connections:
- Memcached Socket: Port 11210
- View Socket: Port 8092
Note that port 11211 is not used inside the client SDKs, but used to connect generic memcached clients that are not cluster aware. This means that these generic clients do not get updated cluster topologies.
So as a rule of thumb, if you have a 10 node cluster running, one CouchbaseClient object open about 21 (2*10 + 1) client sockets. These are directly managed, so if a node gets removed or added the numbers will change accordingly.
Now that all sockets have been opened, we are ready to perform regular cluster operations. As you can see, there is a lot of overhead involved when the CouchbaseClient object gets bootstrapped. Because of this fact, we strongly discourage you from either creating a new object on every request or running a lot of CouchbaseClient objects in one application server. This only adds unnecessary overhead and load on the application server and adds on the total sockets opened against the cluster (resulting in a possible performance problem).
As a point of reference, with regular INFO level logging enabled this is how connecting and disconnecting to a 1-node cluster (Couchbase bucket) should look like:
Apr 17, 2013 3:14:49 PM com.couchbase.client.CouchbaseProperties setPropertyFile
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
2013-04-17 15:14:49.656 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:14:49.673 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@2adb1d4
2013-04-17 15:14:49.718 INFO com.couchbase.client.ViewConnection: Added localhost to connect queue
2013-04-17 15:14:49.720 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-04-17 15:14:49.856 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
2013-04-17 15:14:49.861 INFO com.couchbase.client.ViewConnection: Node localhost has no ops in the queue
2013-04-17 15:14:49.861 INFO com.couchbase.client.ViewNode: I/O reactor terminated for localhost
If you are connecting to a Couchbase Server 1.8 or against a Memcache-Bucket you won't see View connections getting established:
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
2013-04-17 15:16:44.295 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.101:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.297 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.102:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.298 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.103:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.298 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.104:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.306 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@38b5dac4
2013-04-17 15:16:44.313 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-04-17 15:16:44.332 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@69945ce
2013-04-17 15:16:44.333 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@6766afb3
2013-04-17 15:16:44.334 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@2b2d96f2
2013-04-17 15:16:44.368 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.103/192.168.56.103:11210
2013-04-17 15:16:44.368 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.102/192.168.56.102:11210
2013-04-17 15:16:44.369 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.101/192.168.56.101:11210
2013-04-17 15:16:44.369 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.104/192.168.56.104:11210
2013-04-17 15:16:44.490 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
## Phase 2: Operations
When the SDK is bootstrapped, it enables your application to run operations against the attached cluster. For the purpose of this blog post, we need to distinguish between operations that get executed against a stable cluster and operations on a cluster that is currently experiencing some form of topology change (be it planned because of adding nodes or unplanned because of a node failure). Let's tackle the regular operations first.
### Operations against a stable cluster
While not directly visible in the first place, inside the SDK we need to distinguish between memcached operations and View operations. All operations that have a unique key in their method signature can be treaded as memcached operations. All of them eventually end up getting funneled through spy. View operations on the other hand are implemented completely inside the SDK itself.
Both View and memcached operations are asynchronous. Inside spy, there is one thread (call the I/O thread) dedicated to deal with IO operations. Note that in high-traffic environments, its not unusual that this thread is always active. It uses the non-blocking Java NIO mechanisms to deal with traffic, and loops around "selectors" that get notified when data can either be written or read. If you profile your application you'll see that this thread spends most of its time waiting on a `select` method, it means that it is idling there waiting to be notified for new traffic. The concepts used inside spy to deal with this are common Java NIO knowledge, so you may want to look into the [NIO internals](https://www.ibm.com/developerworks/java/tutorials/j-nio/) first before digging into that code path. Good starting points are the `net.spy.memcached.MemcachedConnection` and `net.spy.memcached.protocol.TCPMemcachedNodeImpl` classes. Note that inside the SDK, we override the `MemcachedConnection` to hook in our own reconfiguration logic. This class can be found inside the SDK at `com.couchbase.client.CouchbaseConnection` and for memcached-type buckets in `com.couchbase.client.CouchbaseMemcachedConnection`.
So if a memcached operations (like `get()`) gets issued, it gets passed down until it reaches the IO thread. The IO thread will then put it on a write queue towards its target node. It gets written eventually and then the IO thread adds information to a read queue so the responses can be mapped accordingly. This approach is based on futures, so when the result actually arrives, the Future is marked as completed, the result gets parsed and attached as Object.
The SDK only uses the memcached binary protocol, although spy would also support ASCII. The binary format is much more efficient and some of the advanced operations are only implemented there.
You may wonder how the SDK knows where to send the operation? Since we already have the up-to-date cluster map, we can hash the key and then based on the node list and vBucketMap determine which node to access. The vBucketMap not only contains the information for the master node of the array, but also the information for zero to three replica nodes. Look at this (shortened) example:
vBucketServerMap: {
hashAlgorithm: "CRC",
numReplicas: 1,
serverList: [
"192.168.56.101:11210",
"192.168.56.102:11210"
],
vBucketMap: [
[0,1],
[0,1],
[0,1],
[1,0],
[1,0],
[1,0]
//.....
},
The `serverList` contains our nodes, and the `vBucketMap` has pointers to the `serverList` array. We have 1024 vBuckets, so only some of them are shown here. You can see from looking at it that all keys that has into the first vBucket have its master node at index 0 (so the `.101` node) and its replica at index 1 (so the `.102` node). Once the cluster map changes and the vBuckets move around, we just need to update our config and know all the time where to point our operations towards.
View operations are handled differently. Since views can't be sent to a specific node (because we don't have a way to hash a key or something), we round-robin between the connected nodes. The operation gets assigned to a [com.couchbase.client.ViewNode](http://www.couchbase.com/autodocs/couchbase-java-client-1.1.5/com/couchbase/client/ViewNode.html) once it has free connections and then executed. The result is also handled through futures. To implement this functionality, the SDK uses the third party Apache HTTP Commons (NIO) library.
The whole View API hides behind port 8092 on every node and is very similar to [CouchDB](http://couchdb.apache.org/). It also contains a RESTful API, but the structure is a little bit different. For example, you can reach a design document at `/<bucketname>/_design/<designname>`. It contains the View definitions in JSON:
{
language: "javascript",
views: {
all: {
map: "function (doc) { if(doc.type == "city") {emit([doc.continent, doc.country, doc.name], 1)}}",
reduce: "_sum"
}
}
}
You can then reach down one level further like `/<bucketname>/_design/<designname>/_view/<viewname>` to actually query it:
{"total_rows":9,"rows":[
{"id":"city:shanghai","key":["asia","china","shanghai"],"value":1},
{"id":"city:tokyo","key":["asia","japan","tokyo"],"value":1},
{"id":"city:moscow","key":["asia","russia","moscow"],"value":1},
{"id":"city:vienna","key":["europe","austria","vienna"],"value":1},
{"id":"city:paris","key":["europe","france","paris"],"value":1},
{"id":"city:rome","key":["europe","italy","rome"],"value":1},
{"id":"city:amsterdam","key":["europe","netherlands","amsterdam"],"value":1},
{"id":"city:new_york","key":["north_america","usa","new_york"],"value":1},
{"id":"city:san_francisco","key":["north_america","usa","san_francisco"],"value":1}
]
}
Once the request is sent and a response gets back, it depends on the type of View request to determine on how the response gets parsed. It makes a difference, because reduced View queries look different than non-reduced. The SDK also includes support for spatial Views and they need to be handled differently as well.
The whole View response parsing implementation can be found inside the [com.couchbase.client.protocol.views](http://www.couchbase.com/autodocs/couchbase-java-client-1.1.5/com/couchbase/client/protocol/views/package-frame.html) namespace. You'll find abstract classes and interfaces like `ViewResponse` in there, and then their special implementations like `ViewResponseNoDocs`, `ViewResponseWithDocs` or `ViewResponseReduced`. It also makes a different if `setIncludeDocs()` is used on the Query object, because the SDK also needs to load the full documents using the memcached protocol behind the scenes. This is also done while parsing the Views.
Now that you have a basic understanding on how the SDK distributes its operations under stable conditions, we need to cover an important topic: how the SDK deals with cluster topology changes.
### Operations against a rebalancing cluster
Note that there is a separate blog post upcoming dealing with all the scenarios that may come up when something goes wrong on the SDK. Since rebalancing and failover are crucial parts of the SDK, this post deals more with the general process on how this is handled.
As mentioned earlier, the SDK receives topology updates through the streaming connection. Leaving the special case aside where this node actually gets removed or fails, all updates will stream in near real-time (in a eventually consistent architecture, it may take some time until the cluster updates get populated to that node). The chunks that come in over the stream look exactly like the ones we've seen when reading the initial configuration. After those chunks have been parsed, we need to check if the changes really affect the SDK (since there are many more parameters than the SDK needs, it won't make sense to listen to all of them). All changes that affect the topology and/or vBucket map are considered as important. If nodes get added or removed (be it either through failure or planned), we need to open or close the sockets. This process is called "reconfiguration".
Once such a reconfiguration is triggered, lots of actions need to happen in various places. Spymemcached needs to handle its sockets, View nodes need to be managed and new configuration needs to be updated. The SDK makes sure that only one reconfiguration can happen at the same time through locks so we don't have any race conditions going on.
The Netty-based `BucketUpdateResponseHandler` triggers the `CouchbaseClient#reconfigure` method, which then starts to dispatch everything. Depending on the bucket type used (i.e. memcached type buckets don't have Views and therefore no ViewNodes), configs are updated and sockets closed. Once the reconfiguration is done, it can receive new ones. During planned changes, everything should be pretty much controlled and no operations should fail. If a node is actually down and cannot be reached, those operations will be cancelled. Reconfiguration is tricky because the topology changes while operations are flowing through the system.
Finally, let's cover some differences between Couchbase and Memcache type buckets. All the information hat you've been reading previously only applies to Couchbase buckets. Memcache buckets are pretty basic and do not have the concept of vBuckets. Since you don't have vBuckets, all that the Client has to do is to manage the nodes and their corresponding sockets. Also, a different hashing algorithm is used (mostly Ketama) to determine the target node for each key. Also, memcache buckets don't have views, so you can't use the View API and it doesn't make much sense to keep View sockets around. So to clarify the previous statement, if you are running against a memcache bucket, for a 10 node cluster you'll only have 11 open connections.
Phase 3: Shutdown
-----------------
Once the `CouchbaseClient#shutdown()` method is called, no more operations are allowed to be added onto the `CouchbaseConnection`. Until the timeout is reached, the client wants to make sure that all operations went through accordingly. All sockets for both memcached and View connections are shut down once there are no more operations in the queue (or they get dropped). Note that that the `shutdown` methods on those sockets are also used when a node gets removed from the cluster during normal operations, so it's basically the same, but just for all attached nodes at the same time.
Summary
-------
After reading this blog post, you should have a much more clear picture on how the client SDK works and why it is designed the way it is. We have lots of enhancements planned for future releases, mostly enhancing the direct API experience. Note that this blog post didn't cover how errors are handled inside the SDK; this will be published in a separate blog post because there is also lots of information to cover.
Hi,
yes I think its good to go with it!
You can also incorporate the new article in to the "advanced topic" section:
Here is the markdown version from the published one:
## Motivation
This blog post is intended to be a very detailed and informative article for those who already have used the Couchbase Java SDK and want to know how the internals work. This is not a introduction on how to use the Java SDK and we'll cover some fairly advanced topics on the way.
Normally, when talking about the SDK we mean everything that is needed to get you going (Client library, documentation, release notes,...). In this article though, the SDK refers to the Client library (code) unless stated otherwise.
As always, if you have feedback please let me/us know!
## Introduction
First and foremost, it is important to understand that the SDK wraps and extends the functionality of the [spymemcached](https://github.com/couchbase/spymemcached) (called "spy") memcached library. One of the protocols used internally is the memcached protocol, and a lot of functionality can be reused. On the other hand, once you start to peel off the first layers of the SDK you will notice that some components are somewhat more complex because of the fact that spy provides more features than the SDK needs in the first place. The other part is to remember that a lot of the components are interwoven, so you always need to get the dependency right. Most of the time, we release a new spy version at the same date with a new SDK, because new stuff has been added or fixed.
So, aside from reusing the functionality provided by spy, the SDK mainly adds two blocks of functionality: automatic cluster topology management and since 1.1 (and 2.0 server) support for Views. Aside from that it also provides administrative facilities like bucket and design document management.
To understand how the client operates, we'll dissect the whole process in different life cycle phases of the client. After we go through all three phases (bootstrap, operation and shutdown) you should have a clear picture of whats going on under the hood. Note that there is a separate blog post in the making about error handling, so we won't cover that here in greater detail (which will be published a few weeks later on the same blog here).
## Phase 1: Bootstrap
Before we can actually start serving operations like `get()` and `set()`, we need to bootstrap the `CouchbaseClient` object. The important part that we need to accomplish here is to initially get a cluster configuration (which contains the nodes and vBucket map), but also to establish a streaming connection to receive cluster updates in (near) real-time.
We take the list of nodes passing during bootstrap and iterate over it. The first node in the list that can be contacted on port 8091 is used to walk the RESTful interface on the server. If it is not available, the next one will be tried. This means that going from the provided `http://host:port/pools` URI we eventually follow the links to the bucket entity. All this happens inside a `ConfigurationProvider`, which is in this case the `com.couchbase.client.vbucket.ConfigurationProviderHTTP`. If you want to poke around on the internals, look for `getBucketConfiguration` and `readPools` methods.
A (successful) walk can be illustrated like this:
- GET /pools
- look for the "default" pools
- GET /pools/default
- look for the "buckets" hash which contains the bucket list
- GET /pools/default/buckets
- parse the list of buckets and extract the one provided by the application
- GET /pools/default/buckets/<bucketname>
Now we are at the REST endpoint we need. Inside this JSON response, you'll find all useful details that gets also be used by SDK internally (for example `stre
amingUri`, `nodes` and `vBucketServerMap`). The config gets parsed and stored. Before we move on, let's quickly discuss the strange `pools` part inside our REST walk:
The concept of a resource pool to group buckets was designed for Couchbase Server, but is currently not implemented. Still, the REST API is implemented that way and therefore all SDKs support it. That said, while we could theoretically just go directly to `/pools/default/buckets` and skip the first few queries, the current behaviour is future proof so you won't have to change the bootstrap code once the server implements it.
Back to our bootstrap phase. Now that we have a valid cluster config which contains all the nodes (and their hostnames or ip addresses), we can establish connections to them. Aside from establishing the data connections, we also need to instantiate a streaming connection to one of them. For simplicity reasons, we just establish the streaming connection to the node from the list where we got our initial configuration.
This gets us to an important point to keep in mind: if you have lots of CouchbaseClient objects running on many nodes and they all get bootstrapped with the same list, they may end up connecting to the same node for the streaming connection and create a possible bottleneck. Therefore, to distribute the load a little better I recommend shuffling the array before it gets passed in to the CouchbaseClient object. When you only have a few CouchbaseClient objects connected to your cluster, that won't be a problem at all.
The streaming connection URI is taken from the config we got previously, and normally looks like this:
streamingUri: "/pools/default/bucketsStreaming/default?bucket_uuid=88cae4a609eea500d8ad072fe71a7290"
If you point your browser to this address, you will also get the cluster topology updates streamed in real-time. Since the streaming connection needs to be established all the time and potentially blocks a thread, this is done in the background handled by different threads. We are using the NIO framework [Netty](http://netty.io) for this task, which provides a very handy way of dealing with asynchronous operations. If you want to start digging into this part, keep in mind that all read operations are completely separate from write operations, so you need to deal with handlers that take care of what comes back from the server. Aside from some wiring needed for Netty, the business logic can be found in `com.couchbase.client.vbucket.BucketMonitor` and `com.couchbase.client.vbucket.BucketUpdateResponseHandler`. We also try to reestablish this streaming connection if the socket gets closed (for example if this node gets rebalanced out of the cluster).
To actually shuffle data to the cluster nodes, we need to open various sockets to them. Note that there is absolutely no connection pooling needed inside the client, because we manage all sockets proactively. Aside from the special streaming connection to one of the severs (which is opened against port 8091), we need to open the following connections:
- Memcached Socket: Port 11210
- View Socket: Port 8092
Note that port 11211 is not used inside the client SDKs, but used to connect generic memcached clients that are not cluster aware. This means that these generic clients do not get updated cluster topologies.
So as a rule of thumb, if you have a 10 node cluster running, one CouchbaseClient object open about 21 (2*10 + 1) client sockets. These are directly managed, so if a node gets removed or added the numbers will change accordingly.
Now that all sockets have been opened, we are ready to perform regular cluster operations. As you can see, there is a lot of overhead involved when the CouchbaseClient object gets bootstrapped. Because of this fact, we strongly discourage you from either creating a new object on every request or running a lot of CouchbaseClient objects in one application server. This only adds unnecessary overhead and load on the application server and adds on the total sockets opened against the cluster (resulting in a possible performance problem).
As a point of reference, with regular INFO level logging enabled this is how connecting and disconnecting to a 1-node cluster (Couchbase bucket) should look like:
Apr 17, 2013 3:14:49 PM com.couchbase.client.CouchbaseProperties setPropertyFile
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
2013-04-17 15:14:49.656 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:14:49.673 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@2adb1d4
2013-04-17 15:14:49.718 INFO com.couchbase.client.ViewConnection: Added localhost to connect queue
2013-04-17 15:14:49.720 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-04-17 15:14:49.856 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
2013-04-17 15:14:49.861 INFO com.couchbase.client.ViewConnection: Node localhost has no ops in the queue
2013-04-17 15:14:49.861 INFO com.couchbase.client.ViewNode: I/O reactor terminated for localhost
If you are connecting to a Couchbase Server 1.8 or against a Memcache-Bucket you won't see View connections getting established:
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
2013-04-17 15:16:44.295 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.101:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.297 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.102:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.298 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.103:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.298 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.104:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.306 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@38b5dac4
2013-04-17 15:16:44.313 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-04-17 15:16:44.332 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@69945ce
2013-04-17 15:16:44.333 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@6766afb3
2013-04-17 15:16:44.334 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@2b2d96f2
2013-04-17 15:16:44.368 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.103/192.168.56.103:11210
2013-04-17 15:16:44.368 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.102/192.168.56.102:11210
2013-04-17 15:16:44.369 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.101/192.168.56.101:11210
2013-04-17 15:16:44.369 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.104/192.168.56.104:11210
2013-04-17 15:16:44.490 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
## Phase 2: Operations
When the SDK is bootstrapped, it enables your application to run operations against the attached cluster. For the purpose of this blog post, we need to distinguish between operations that get executed against a stable cluster and operations on a cluster that is currently experiencing some form of topology change (be it planned because of adding nodes or unplanned because of a node failure). Let's tackle the regular operations first.
### Operations against a stable cluster
While not directly visible in the first place, inside the SDK we need to distinguish between memcached operations and View operations. All operations that have a unique key in their method signature can be treaded as memcached operations. All of them eventually end up getting funneled through spy. View operations on the other hand are implemented completely inside the SDK itself.
Both View and memcached operations are asynchronous. Inside spy, there is one thread (call the I/O thread) dedicated to deal with IO operations. Note that in high-traffic environments, its not unusual that this thread is always active. It uses the non-blocking Java NIO mechanisms to deal with traffic, and loops around "selectors" that get notified when data can either be written or read. If you profile your application you'll see that this thread spends most of its time waiting on a `select` method, it means that it is idling there waiting to be notified for new traffic. The concepts used inside spy to deal with this are common Java NIO knowledge, so you may want to look into the [NIO internals](https://www.ibm.com/developerworks/java/tutorials/j-nio/) first before digging into that code path. Good starting points are the `net.spy.memcached.MemcachedConnection` and `net.spy.memcached.protocol.TCPMemcachedNodeImpl` classes. Note that inside the SDK, we override the `MemcachedConnection` to hook in our own reconfiguration logic. This class can be found inside the SDK at `com.couchbase.client.CouchbaseConnection` and for memcached-type buckets in `com.couchbase.client.CouchbaseMemcachedConnection`.
So if a memcached operations (like `get()`) gets issued, it gets passed down until it reaches the IO thread. The IO thread will then put it on a write queue towards its target node. It gets written eventually and then the IO thread adds information to a read queue so the responses can be mapped accordingly. This approach is based on futures, so when the result actually arrives, the Future is marked as completed, the result gets parsed and attached as Object.
The SDK only uses the memcached binary protocol, although spy would also support ASCII. The binary format is much more efficient and some of the advanced operations are only implemented there.
You may wonder how the SDK knows where to send the operation? Since we already have the up-to-date cluster map, we can hash the key and then based on the node list and vBucketMap determine which node to access. The vBucketMap not only contains the information for the master node of the array, but also the information for zero to three replica nodes. Look at this (shortened) example:
vBucketServerMap: {
hashAlgorithm: "CRC",
numReplicas: 1,
serverList: [
"192.168.56.101:11210",
"192.168.56.102:11210"
],
vBucketMap: [
[0,1],
[0,1],
[0,1],
[1,0],
[1,0],
[1,0]
//.....
},
The `serverList` contains our nodes, and the `vBucketMap` has pointers to the `serverList` array. We have 1024 vBuckets, so only some of them are shown here. You can see from looking at it that all keys that has into the first vBucket have its master node at index 0 (so the `.101` node) and its replica at index 1 (so the `.102` node). Once the cluster map changes and the vBuckets move around, we just need to update our config and know all the time where to point our operations towards.
View operations are handled differently. Since views can't be sent to a specific node (because we don't have a way to hash a key or something), we round-robin between the connected nodes. The operation gets assigned to a [com.couchbase.client.ViewNode](http://www.couchbase.com/autodocs/couchbase-java-client-1.1.5/com/couchbase/client/ViewNode.html) once it has free connections and then executed. The result is also handled through futures. To implement this functionality, the SDK uses the third party Apache HTTP Commons (NIO) library.
The whole View API hides behind port 8092 on every node and is very similar to [CouchDB](http://couchdb.apache.org/). It also contains a RESTful API, but the structure is a little bit different. For example, you can reach a design document at `/<bucketname>/_design/<designname>`. It contains the View definitions in JSON:
{
language: "javascript",
views: {
all: {
map: "function (doc) { if(doc.type == "city") {emit([doc.continent, doc.country, doc.name], 1)}}",
reduce: "_sum"
}
}
}
You can then reach down one level further like `/<bucketname>/_design/<designname>/_view/<viewname>` to actually query it:
{"total_rows":9,"rows":[
{"id":"city:shanghai","key":["asia","china","shanghai"],"value":1},
{"id":"city:tokyo","key":["asia","japan","tokyo"],"value":1},
{"id":"city:moscow","key":["asia","russia","moscow"],"value":1},
{"id":"city:vienna","key":["europe","austria","vienna"],"value":1},
{"id":"city:paris","key":["europe","france","paris"],"value":1},
{"id":"city:rome","key":["europe","italy","rome"],"value":1},
{"id":"city:amsterdam","key":["europe","netherlands","amsterdam"],"value":1},
{"id":"city:new_york","key":["north_america","usa","new_york"],"value":1},
{"id":"city:san_francisco","key":["north_america","usa","san_francisco"],"value":1}
]
}
Once the request is sent and a response gets back, it depends on the type of View request to determine on how the response gets parsed. It makes a difference, because reduced View queries look different than non-reduced. The SDK also includes support for spatial Views and they need to be handled differently as well.
The whole View response parsing implementation can be found inside the [com.couchbase.client.protocol.views](http://www.couchbase.com/autodocs/couchbase-java-client-1.1.5/com/couchbase/client/protocol/views/package-frame.html) namespace. You'll find abstract classes and interfaces like `ViewResponse` in there, and then their special implementations like `ViewResponseNoDocs`, `ViewResponseWithDocs` or `ViewResponseReduced`. It also makes a different if `setIncludeDocs()` is used on the Query object, because the SDK also needs to load the full documents using the memcached protocol behind the scenes. This is also done while parsing the Views.
Now that you have a basic understanding on how the SDK distributes its operations under stable conditions, we need to cover an important topic: how the SDK deals with cluster topology changes.
### Operations against a rebalancing cluster
Note that there is a separate blog post upcoming dealing with all the scenarios that may come up when something goes wrong on the SDK. Since rebalancing and failover are crucial parts of the SDK, this post deals more with the general process on how this is handled.
As mentioned earlier, the SDK receives topology updates through the streaming connection. Leaving the special case aside where this node actually gets removed or fails, all updates will stream in near real-time (in a eventually consistent architecture, it may take some time until the cluster updates get populated to that node). The chunks that come in over the stream look exactly like the ones we've seen when reading the initial configuration. After those chunks have been parsed, we need to check if the changes really affect the SDK (since there are many more parameters than the SDK needs, it won't make sense to listen to all of them). All changes that affect the topology and/or vBucket map are considered as important. If nodes get added or removed (be it either through failure or planned), we need to open or close the sockets. This process is called "reconfiguration".
Once such a reconfiguration is triggered, lots of actions need to happen in various places. Spymemcached needs to handle its sockets, View nodes need to be managed and new configuration needs to be updated. The SDK makes sure that only one reconfiguration can happen at the same time through locks so we don't have any race conditions going on.
The Netty-based `BucketUpdateResponseHandler` triggers the `CouchbaseClient#reconfigure` method, which then starts to dispatch everything. Depending on the bucket type used (i.e. memcached type buckets don't have Views and therefore no ViewNodes), configs are updated and sockets closed. Once the reconfiguration is done, it can receive new ones. During planned changes, everything should be pretty much controlled and no operations should fail. If a node is actually down and cannot be reached, those operations will be cancelled. Reconfiguration is tricky because the topology changes while operations are flowing through the system.
Finally, let's cover some differences between Couchbase and Memcache type buckets. All the information hat you've been reading previously only applies to Couchbase buckets. Memcache buckets are pretty basic and do not have the concept of vBuckets. Since you don't have vBuckets, all that the Client has to do is to manage the nodes and their corresponding sockets. Also, a different hashing algorithm is used (mostly Ketama) to determine the target node for each key. Also, memcache buckets don't have views, so you can't use the View API and it doesn't make much sense to keep View sockets around. So to clarify the previous statement, if you are running against a memcache bucket, for a 10 node cluster you'll only have 11 open connections.
Phase 3: Shutdown
-----------------
Once the `CouchbaseClient#shutdown()` method is called, no more operations are allowed to be added onto the `CouchbaseConnection`. Until the timeout is reached, the client wants to make sure that all operations went through accordingly. All sockets for both memcached and View connections are shut down once there are no more operations in the queue (or they get dropped). Note that that the `shutdown` methods on those sockets are also used when a node gets removed from the cluster during normal operations, so it's basically the same, but just for all attached nodes at the same time.
Summary
-------
After reading this blog post, you should have a much more clear picture on how the client SDK works and why it is designed the way it is. We have lots of enhancements planned for future releases, mostly enhancing the direct API experience. Note that this blog post didn't cover how errors are handled inside the SDK; this will be published in a separate blog post because there is also lots of information to cover.
Show
Karen Zeller
added a comment - Input from Michael for Advanced Topics Chapter:
Hi,
yes I think its good to go with it!
You can also incorporate the new article in to the "advanced topic" section:
Here is the markdown version from the published one:
## Motivation
This blog post is intended to be a very detailed and informative article for those who already have used the Couchbase Java SDK and want to know how the internals work. This is not a introduction on how to use the Java SDK and we'll cover some fairly advanced topics on the way.
Normally, when talking about the SDK we mean everything that is needed to get you going (Client library, documentation, release notes,...). In this article though, the SDK refers to the Client library (code) unless stated otherwise.
As always, if you have feedback please let me/us know!
## Introduction
First and foremost, it is important to understand that the SDK wraps and extends the functionality of the [spymemcached]( https://github.com/couchbase/spymemcached ) (called "spy") memcached library. One of the protocols used internally is the memcached protocol, and a lot of functionality can be reused. On the other hand, once you start to peel off the first layers of the SDK you will notice that some components are somewhat more complex because of the fact that spy provides more features than the SDK needs in the first place. The other part is to remember that a lot of the components are interwoven, so you always need to get the dependency right. Most of the time, we release a new spy version at the same date with a new SDK, because new stuff has been added or fixed.
So, aside from reusing the functionality provided by spy, the SDK mainly adds two blocks of functionality: automatic cluster topology management and since 1.1 (and 2.0 server) support for Views. Aside from that it also provides administrative facilities like bucket and design document management.
To understand how the client operates, we'll dissect the whole process in different life cycle phases of the client. After we go through all three phases (bootstrap, operation and shutdown) you should have a clear picture of whats going on under the hood. Note that there is a separate blog post in the making about error handling, so we won't cover that here in greater detail (which will be published a few weeks later on the same blog here).
## Phase 1: Bootstrap
Before we can actually start serving operations like `get()` and `set()`, we need to bootstrap the `CouchbaseClient` object. The important part that we need to accomplish here is to initially get a cluster configuration (which contains the nodes and vBucket map), but also to establish a streaming connection to receive cluster updates in (near) real-time.
We take the list of nodes passing during bootstrap and iterate over it. The first node in the list that can be contacted on port 8091 is used to walk the RESTful interface on the server. If it is not available, the next one will be tried. This means that going from the provided ` http://host:port/pools ` URI we eventually follow the links to the bucket entity. All this happens inside a `ConfigurationProvider`, which is in this case the `com.couchbase.client.vbucket.ConfigurationProviderHTTP`. If you want to poke around on the internals, look for `getBucketConfiguration` and `readPools` methods.
A (successful) walk can be illustrated like this:
- GET /pools
- look for the "default" pools
- GET /pools/default
- look for the "buckets" hash which contains the bucket list
- GET /pools/default/buckets
- parse the list of buckets and extract the one provided by the application
- GET /pools/default/buckets/<bucketname>
Now we are at the REST endpoint we need. Inside this JSON response, you'll find all useful details that gets also be used by SDK internally (for example `stre
amingUri`, `nodes` and `vBucketServerMap`). The config gets parsed and stored. Before we move on, let's quickly discuss the strange `pools` part inside our REST walk:
The concept of a resource pool to group buckets was designed for Couchbase Server, but is currently not implemented. Still, the REST API is implemented that way and therefore all SDKs support it. That said, while we could theoretically just go directly to `/pools/default/buckets` and skip the first few queries, the current behaviour is future proof so you won't have to change the bootstrap code once the server implements it.
Back to our bootstrap phase. Now that we have a valid cluster config which contains all the nodes (and their hostnames or ip addresses), we can establish connections to them. Aside from establishing the data connections, we also need to instantiate a streaming connection to one of them. For simplicity reasons, we just establish the streaming connection to the node from the list where we got our initial configuration.
This gets us to an important point to keep in mind: if you have lots of CouchbaseClient objects running on many nodes and they all get bootstrapped with the same list, they may end up connecting to the same node for the streaming connection and create a possible bottleneck. Therefore, to distribute the load a little better I recommend shuffling the array before it gets passed in to the CouchbaseClient object. When you only have a few CouchbaseClient objects connected to your cluster, that won't be a problem at all.
The streaming connection URI is taken from the config we got previously, and normally looks like this:
streamingUri: "/pools/default/bucketsStreaming/default?bucket_uuid=88cae4a609eea500d8ad072fe71a7290"
If you point your browser to this address, you will also get the cluster topology updates streamed in real-time. Since the streaming connection needs to be established all the time and potentially blocks a thread, this is done in the background handled by different threads. We are using the NIO framework [Netty]( http://netty.io ) for this task, which provides a very handy way of dealing with asynchronous operations. If you want to start digging into this part, keep in mind that all read operations are completely separate from write operations, so you need to deal with handlers that take care of what comes back from the server. Aside from some wiring needed for Netty, the business logic can be found in `com.couchbase.client.vbucket.BucketMonitor` and `com.couchbase.client.vbucket.BucketUpdateResponseHandler`. We also try to reestablish this streaming connection if the socket gets closed (for example if this node gets rebalanced out of the cluster).
To actually shuffle data to the cluster nodes, we need to open various sockets to them. Note that there is absolutely no connection pooling needed inside the client, because we manage all sockets proactively. Aside from the special streaming connection to one of the severs (which is opened against port 8091), we need to open the following connections:
- Memcached Socket: Port 11210
- View Socket: Port 8092
Note that port 11211 is not used inside the client SDKs, but used to connect generic memcached clients that are not cluster aware. This means that these generic clients do not get updated cluster topologies.
So as a rule of thumb, if you have a 10 node cluster running, one CouchbaseClient object open about 21 (2*10 + 1) client sockets. These are directly managed, so if a node gets removed or added the numbers will change accordingly.
Now that all sockets have been opened, we are ready to perform regular cluster operations. As you can see, there is a lot of overhead involved when the CouchbaseClient object gets bootstrapped. Because of this fact, we strongly discourage you from either creating a new object on every request or running a lot of CouchbaseClient objects in one application server. This only adds unnecessary overhead and load on the application server and adds on the total sockets opened against the cluster (resulting in a possible performance problem).
As a point of reference, with regular INFO level logging enabled this is how connecting and disconnecting to a 1-node cluster (Couchbase bucket) should look like:
Apr 17, 2013 3:14:49 PM com.couchbase.client.CouchbaseProperties setPropertyFile
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
2013-04-17 15:14:49.656 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:14:49.673 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@2adb1d4
2013-04-17 15:14:49.718 INFO com.couchbase.client.ViewConnection: Added localhost to connect queue
2013-04-17 15:14:49.720 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-04-17 15:14:49.856 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
2013-04-17 15:14:49.861 INFO com.couchbase.client.ViewConnection: Node localhost has no ops in the queue
2013-04-17 15:14:49.861 INFO com.couchbase.client.ViewNode: I/O reactor terminated for localhost
If you are connecting to a Couchbase Server 1.8 or against a Memcache-Bucket you won't see View connections getting established:
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
2013-04-17 15:16:44.295 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.101:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.297 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.102:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.298 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.103:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.298 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/192.168.56.104:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-04-17 15:16:44.306 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@38b5dac4
2013-04-17 15:16:44.313 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-04-17 15:16:44.332 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@69945ce
2013-04-17 15:16:44.333 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@6766afb3
2013-04-17 15:16:44.334 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@2b2d96f2
2013-04-17 15:16:44.368 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.103/192.168.56.103:11210
2013-04-17 15:16:44.368 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.102/192.168.56.102:11210
2013-04-17 15:16:44.369 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.101/192.168.56.101:11210
2013-04-17 15:16:44.369 INFO net.spy.memcached.auth.AuthThread: Authenticated to 192.168.56.104/192.168.56.104:11210
2013-04-17 15:16:44.490 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
## Phase 2: Operations
When the SDK is bootstrapped, it enables your application to run operations against the attached cluster. For the purpose of this blog post, we need to distinguish between operations that get executed against a stable cluster and operations on a cluster that is currently experiencing some form of topology change (be it planned because of adding nodes or unplanned because of a node failure). Let's tackle the regular operations first.
### Operations against a stable cluster
While not directly visible in the first place, inside the SDK we need to distinguish between memcached operations and View operations. All operations that have a unique key in their method signature can be treaded as memcached operations. All of them eventually end up getting funneled through spy. View operations on the other hand are implemented completely inside the SDK itself.
Both View and memcached operations are asynchronous. Inside spy, there is one thread (call the I/O thread) dedicated to deal with IO operations. Note that in high-traffic environments, its not unusual that this thread is always active. It uses the non-blocking Java NIO mechanisms to deal with traffic, and loops around "selectors" that get notified when data can either be written or read. If you profile your application you'll see that this thread spends most of its time waiting on a `select` method, it means that it is idling there waiting to be notified for new traffic. The concepts used inside spy to deal with this are common Java NIO knowledge, so you may want to look into the [NIO internals]( https://www.ibm.com/developerworks/java/tutorials/j-nio/ ) first before digging into that code path. Good starting points are the `net.spy.memcached.MemcachedConnection` and `net.spy.memcached.protocol.TCPMemcachedNodeImpl` classes. Note that inside the SDK, we override the `MemcachedConnection` to hook in our own reconfiguration logic. This class can be found inside the SDK at `com.couchbase.client.CouchbaseConnection` and for memcached-type buckets in `com.couchbase.client.CouchbaseMemcachedConnection`.
So if a memcached operations (like `get()`) gets issued, it gets passed down until it reaches the IO thread. The IO thread will then put it on a write queue towards its target node. It gets written eventually and then the IO thread adds information to a read queue so the responses can be mapped accordingly. This approach is based on futures, so when the result actually arrives, the Future is marked as completed, the result gets parsed and attached as Object.
The SDK only uses the memcached binary protocol, although spy would also support ASCII. The binary format is much more efficient and some of the advanced operations are only implemented there.
You may wonder how the SDK knows where to send the operation? Since we already have the up-to-date cluster map, we can hash the key and then based on the node list and vBucketMap determine which node to access. The vBucketMap not only contains the information for the master node of the array, but also the information for zero to three replica nodes. Look at this (shortened) example:
vBucketServerMap: {
hashAlgorithm: "CRC",
numReplicas: 1,
serverList: [
"192.168.56.101:11210",
"192.168.56.102:11210"
],
vBucketMap: [
[0,1],
[0,1],
[0,1],
[1,0],
[1,0],
[1,0]
//.....
},
The `serverList` contains our nodes, and the `vBucketMap` has pointers to the `serverList` array. We have 1024 vBuckets, so only some of them are shown here. You can see from looking at it that all keys that has into the first vBucket have its master node at index 0 (so the `.101` node) and its replica at index 1 (so the `.102` node). Once the cluster map changes and the vBuckets move around, we just need to update our config and know all the time where to point our operations towards.
View operations are handled differently. Since views can't be sent to a specific node (because we don't have a way to hash a key or something), we round-robin between the connected nodes. The operation gets assigned to a [com.couchbase.client.ViewNode]( http://www.couchbase.com/autodocs/couchbase-java-client-1.1.5/com/couchbase/client/ViewNode.html ) once it has free connections and then executed. The result is also handled through futures. To implement this functionality, the SDK uses the third party Apache HTTP Commons (NIO) library.
The whole View API hides behind port 8092 on every node and is very similar to [CouchDB]( http://couchdb.apache.org/ ). It also contains a RESTful API, but the structure is a little bit different. For example, you can reach a design document at `/<bucketname>/_design/<designname>`. It contains the View definitions in JSON:
{
language: "javascript",
views: {
all: {
map: "function (doc) { if(doc.type == "city") {emit([doc.continent, doc.country, doc.name], 1)}}",
reduce: "_sum"
}
}
}
You can then reach down one level further like `/<bucketname>/_design/<designname>/_view/<viewname>` to actually query it:
{"total_rows":9,"rows":[
{"id":"city:shanghai","key":["asia","china","shanghai"],"value":1},
{"id":"city:tokyo","key":["asia","japan","tokyo"],"value":1},
{"id":"city:moscow","key":["asia","russia","moscow"],"value":1},
{"id":"city:vienna","key":["europe","austria","vienna"],"value":1},
{"id":"city:paris","key":["europe","france","paris"],"value":1},
{"id":"city:rome","key":["europe","italy","rome"],"value":1},
{"id":"city:amsterdam","key":["europe","netherlands","amsterdam"],"value":1},
{"id":"city:new_york","key":["north_america","usa","new_york"],"value":1},
{"id":"city:san_francisco","key":["north_america","usa","san_francisco"],"value":1}
]
}
Once the request is sent and a response gets back, it depends on the type of View request to determine on how the response gets parsed. It makes a difference, because reduced View queries look different than non-reduced. The SDK also includes support for spatial Views and they need to be handled differently as well.
The whole View response parsing implementation can be found inside the [com.couchbase.client.protocol.views]( http://www.couchbase.com/autodocs/couchbase-java-client-1.1.5/com/couchbase/client/protocol/views/package-frame.html ) namespace. You'll find abstract classes and interfaces like `ViewResponse` in there, and then their special implementations like `ViewResponseNoDocs`, `ViewResponseWithDocs` or `ViewResponseReduced`. It also makes a different if `setIncludeDocs()` is used on the Query object, because the SDK also needs to load the full documents using the memcached protocol behind the scenes. This is also done while parsing the Views.
Now that you have a basic understanding on how the SDK distributes its operations under stable conditions, we need to cover an important topic: how the SDK deals with cluster topology changes.
### Operations against a rebalancing cluster
Note that there is a separate blog post upcoming dealing with all the scenarios that may come up when something goes wrong on the SDK. Since rebalancing and failover are crucial parts of the SDK, this post deals more with the general process on how this is handled.
As mentioned earlier, the SDK receives topology updates through the streaming connection. Leaving the special case aside where this node actually gets removed or fails, all updates will stream in near real-time (in a eventually consistent architecture, it may take some time until the cluster updates get populated to that node). The chunks that come in over the stream look exactly like the ones we've seen when reading the initial configuration. After those chunks have been parsed, we need to check if the changes really affect the SDK (since there are many more parameters than the SDK needs, it won't make sense to listen to all of them). All changes that affect the topology and/or vBucket map are considered as important. If nodes get added or removed (be it either through failure or planned), we need to open or close the sockets. This process is called "reconfiguration".
Once such a reconfiguration is triggered, lots of actions need to happen in various places. Spymemcached needs to handle its sockets, View nodes need to be managed and new configuration needs to be updated. The SDK makes sure that only one reconfiguration can happen at the same time through locks so we don't have any race conditions going on.
The Netty-based `BucketUpdateResponseHandler` triggers the `CouchbaseClient#reconfigure` method, which then starts to dispatch everything. Depending on the bucket type used (i.e. memcached type buckets don't have Views and therefore no ViewNodes), configs are updated and sockets closed. Once the reconfiguration is done, it can receive new ones. During planned changes, everything should be pretty much controlled and no operations should fail. If a node is actually down and cannot be reached, those operations will be cancelled. Reconfiguration is tricky because the topology changes while operations are flowing through the system.
Finally, let's cover some differences between Couchbase and Memcache type buckets. All the information hat you've been reading previously only applies to Couchbase buckets. Memcache buckets are pretty basic and do not have the concept of vBuckets. Since you don't have vBuckets, all that the Client has to do is to manage the nodes and their corresponding sockets. Also, a different hashing algorithm is used (mostly Ketama) to determine the target node for each key. Also, memcache buckets don't have views, so you can't use the View API and it doesn't make much sense to keep View sockets around. So to clarify the previous statement, if you are running against a memcache bucket, for a 10 node cluster you'll only have 11 open connections.
Phase 3: Shutdown
-----------------
Once the `CouchbaseClient#shutdown()` method is called, no more operations are allowed to be added onto the `CouchbaseConnection`. Until the timeout is reached, the client wants to make sure that all operations went through accordingly. All sockets for both memcached and View connections are shut down once there are no more operations in the queue (or they get dropped). Note that that the `shutdown` methods on those sockets are also used when a node gets removed from the cluster during normal operations, so it's basically the same, but just for all attached nodes at the same time.
Summary
-------
After reading this blog post, you should have a much more clear picture on how the client SDK works and why it is designed the way it is. We have lots of enhancements planned for future releases, mostly enhancing the direct API experience. Note that this blog post didn't cover how errors are handled inside the SDK; this will be published in a separate blog post because there is also lots of information to cover.
Hide
Karen Zeller
added a comment -
Confirm with MikeN:
Classpath dependencies section is up to date for 1.1
Also get a screen shot from Netbeans from him for this process:
In this example we use the NetBeans IDE, but you can use any other Java-compatible IDE with the Java SDK as well. After you install the IDE and open it:
Select File -> New Project => Maven => Java Application.
Provide a name for your new project "examples" and change the location to the directory you want.
. . We'll use the com.couchbase namespace, but you can use your own if you like (just make sure to change it later in the source files when you copy them).
Classpath dependencies section is up to date for 1.1
Also get a screen shot from Netbeans from him for this process:
In this example we use the NetBeans IDE, but you can use any other Java-compatible IDE with the Java SDK as well. After you install the IDE and open it:
Select File -> New Project => Maven => Java Application.
Provide a name for your new project "examples" and change the location to the directory you want.
. . We'll use the com.couchbase namespace, but you can use your own if you like (just make sure to change it later in the source files when you copy them).
Show
Karen Zeller
added a comment - Confirm with MikeN:
Classpath dependencies section is up to date for 1.1
Also get a screen shot from Netbeans from him for this process:
In this example we use the NetBeans IDE, but you can use any other Java-compatible IDE with the Java SDK as well. After you install the IDE and open it:
Select File -> New Project => Maven => Java Application.
Provide a name for your new project "examples" and change the location to the directory you want.
. . We'll use the com.couchbase namespace, but you can use your own if you like (just make sure to change it later in the source files when you copy them).
Hide
Updates from Perry/ Michael on Complex Key (for advanced topics, cross reference from Views chapters - confirm best place where with JamesM)
Perry asked me about more information on the ComplexKey object and the current docs are little bit thin ;)
http://www.couchbase.com/docs/couchbase-sdk-java-1.1/advanced-complexkey.html
I wrote some stuff last year that can either be incorportated or linked, whatever do you think fits:
http://nitschinger.at/New-Features-in-the-Couchbase-Java-Client-1-1-dp4
This whole thing will be replaced by a full "view usage" doc from me in the next weeks, so you may not put huge effort into rewriting it or so! Just make it findable for people..
*In this specific instance, I'm going to be rolling in and editing the blog from Michael into the Language Reference. I will also cross reference it from the Views chapter.
Perry asked me about more information on the ComplexKey object and the current docs are little bit thin ;)
http://www.couchbase.com/docs/couchbase-sdk-java-1.1/advanced-complexkey.html
I wrote some stuff last year that can either be incorportated or linked, whatever do you think fits:
http://nitschinger.at/New-Features-in-the-Couchbase-Java-Client-1-1-dp4
This whole thing will be replaced by a full "view usage" doc from me in the next weeks, so you may not put huge effort into rewriting it or so! Just make it findable for people..
*In this specific instance, I'm going to be rolling in and editing the blog from Michael into the Language Reference. I will also cross reference it from the Views chapter.
Show
Karen Zeller
added a comment - - edited Updates from Perry/ Michael on Complex Key (for advanced topics, cross reference from Views chapters - confirm best place where with JamesM)
Perry asked me about more information on the ComplexKey object and the current docs are little bit thin ;)
http://www.couchbase.com/docs/couchbase-sdk-java-1.1/advanced-complexkey.html
I wrote some stuff last year that can either be incorportated or linked, whatever do you think fits:
http://nitschinger.at/New-Features-in-the-Couchbase-Java-Client-1-1-dp4
This whole thing will be replaced by a full "view usage" doc from me in the next weeks, so you may not put huge effort into rewriting it or so! Just make it findable for people..
*In this specific instance, I'm going to be rolling in and editing the blog from Michael into the Language Reference. I will also cross reference it from the Views chapter.
Hide
Karen Zeller
added a comment -
Sent to Michael for clarification 4/24:
Hi Michael,
So I finished reworking the content for chapter 1 and I noticed when I actually worked on the content for 1.3 and 1.4 it seems abrupt and out of place for a Getting Started chapter to suddenly go into this detail.
This chapter is typically the most bare minimum, hello world type example we already have. I suggest we move 1.3 right after 3.2 and move 1.4, which is primarily CAS and GETL to right after 3.3.
Let me know if this sounds good to you.
Karen
Hi Michael,
So I finished reworking the content for chapter 1 and I noticed when I actually worked on the content for 1.3 and 1.4 it seems abrupt and out of place for a Getting Started chapter to suddenly go into this detail.
This chapter is typically the most bare minimum, hello world type example we already have. I suggest we move 1.3 right after 3.2 and move 1.4, which is primarily CAS and GETL to right after 3.3.
Let me know if this sounds good to you.
Karen
Show
Karen Zeller
added a comment - Sent to Michael for clarification 4/24:
Hi Michael,
So I finished reworking the content for chapter 1 and I noticed when I actually worked on the content for 1.3 and 1.4 it seems abrupt and out of place for a Getting Started chapter to suddenly go into this detail.
This chapter is typically the most bare minimum, hello world type example we already have. I suggest we move 1.3 right after 3.2 and move 1.4, which is primarily CAS and GETL to right after 3.3.
Let me know if this sounds good to you.
Karen
Hide
Karen Zeller
added a comment -
Request info from Michael:
Hi Michael,
I'm editing/overhauling the tutorial chapter this week now and I need some clarification:
"If you want to get up and running really quickly, here is how to do it with Jetty. Note that this guide assumes you have MacOS or Linux. If you use Windows, you will need to modify the paths accordingly. Also, make sure to have at least Maven installed on your machine.
Download Couchbase Server 2.0 and install it. Make sure you install the beer-sample dataset when you run the wizard, because this tutorial will use it.
Add the following views and design documents to the beer-sample bucket. Later we publish them to production:
The first design document name is beer and view name is by_name:"
-Are you assuming that someone creates these in Web Console? Or as file then put them from Java SDK, or RESt? In the later steps you run the sample app, so I am assuming they need to create/put into production at this point otherwise the beer sample application will not work.
Regards,
Karen
Hi Michael,
I'm editing/overhauling the tutorial chapter this week now and I need some clarification:
"If you want to get up and running really quickly, here is how to do it with Jetty. Note that this guide assumes you have MacOS or Linux. If you use Windows, you will need to modify the paths accordingly. Also, make sure to have at least Maven installed on your machine.
Download Couchbase Server 2.0 and install it. Make sure you install the beer-sample dataset when you run the wizard, because this tutorial will use it.
Add the following views and design documents to the beer-sample bucket. Later we publish them to production:
The first design document name is beer and view name is by_name:"
-Are you assuming that someone creates these in Web Console? Or as file then put them from Java SDK, or RESt? In the later steps you run the sample app, so I am assuming they need to create/put into production at this point otherwise the beer sample application will not work.
Regards,
Karen
Show
Karen Zeller
added a comment - Request info from Michael:
Hi Michael,
I'm editing/overhauling the tutorial chapter this week now and I need some clarification:
"If you want to get up and running really quickly, here is how to do it with Jetty. Note that this guide assumes you have MacOS or Linux. If you use Windows, you will need to modify the paths accordingly. Also, make sure to have at least Maven installed on your machine.
Download Couchbase Server 2.0 and install it. Make sure you install the beer-sample dataset when you run the wizard, because this tutorial will use it.
Add the following views and design documents to the beer-sample bucket. Later we publish them to production:
The first design document name is beer and view name is by_name:"
-Are you assuming that someone creates these in Web Console? Or as file then put them from Java SDK, or RESt? In the later steps you run the sample app, so I am assuming they need to create/put into production at this point otherwise the beer sample application will not work.
Regards,
Karen
Hide
Karen Zeller
added a comment -
Info from Michael:
On 30.04.2013, at 20:19, "Karen Zeller" <karen.zeller@couchbase.com> wrote:
Hi Michael,
I also notice that the "Quickstart" section is named a pretty vague title. It is really previewing the application if people have Jetty. So I think we should call it that: "Previewing the Application". This also distinguishes it from the next section "Preparation" which I am relabeling "Preparing Your Project"
Sounds good!
In general it is better practice to provide titles as actions intend of things. It is more engaging for the audience:
"Couchbase Server Administration" sounds much better as "Managing Couchbase Server".
Do these changes make sense?
Jup please go ahead with it!
Thanks,
Karen
On 30.04.2013, at 20:19, "Karen Zeller" <karen.zeller@couchbase.com> wrote:
Hi Michael,
I also notice that the "Quickstart" section is named a pretty vague title. It is really previewing the application if people have Jetty. So I think we should call it that: "Previewing the Application". This also distinguishes it from the next section "Preparation" which I am relabeling "Preparing Your Project"
Sounds good!
In general it is better practice to provide titles as actions intend of things. It is more engaging for the audience:
"Couchbase Server Administration" sounds much better as "Managing Couchbase Server".
Do these changes make sense?
Jup please go ahead with it!
Thanks,
Karen
Show
Karen Zeller
added a comment - Info from Michael:
On 30.04.2013, at 20:19, "Karen Zeller" < karen.zeller@couchbase.com > wrote:
Hi Michael,
I also notice that the "Quickstart" section is named a pretty vague title. It is really previewing the application if people have Jetty. So I think we should call it that: "Previewing the Application". This also distinguishes it from the next section "Preparation" which I am relabeling "Preparing Your Project"
Sounds good!
In general it is better practice to provide titles as actions intend of things. It is more engaging for the audience:
"Couchbase Server Administration" sounds much better as "Managing Couchbase Server".
Do these changes make sense?
Jup please go ahead with it!
Thanks,
Karen
Hide
Karen Zeller
added a comment -
Yes you need to! Please cross ref! Thanks :)
On 02.05.2013, at 19:13, "Karen Zeller" <karen.zeller@couchbase.com> wrote:
Just to make sure I am 100% clear - even at this previewing the
application phase with Jetty, someone needs to go into web console and add
those views?
If so I cross reference how to add views in Web ConsoleŠ..
Regards,
Karen
On 02.05.2013, at 19:13, "Karen Zeller" <karen.zeller@couchbase.com> wrote:
Just to make sure I am 100% clear - even at this previewing the
application phase with Jetty, someone needs to go into web console and add
those views?
If so I cross reference how to add views in Web ConsoleŠ..
Regards,
Karen
Show
Karen Zeller
added a comment - Yes you need to! Please cross ref! Thanks :)
On 02.05.2013, at 19:13, "Karen Zeller" < karen.zeller@couchbase.com > wrote:
Just to make sure I am 100% clear - even at this previewing the
application phase with Jetty, someone needs to go into web console and add
those views?
If so I cross reference how to add views in Web ConsoleŠ..
Regards,
Karen
Hide
Karen Zeller
added a comment -
From Michael:
Logging with the Couchbase Java Client
This blog post tells you everything that you need to know about logging with the Couchbase Java Client (and Spymemcached).
## Introduction
There is a huge variety in logging frameworks for Java, and its hard to please everyone. To understand how logging is handled in the SDK, we have to go back a few years. As you may know, the SDK depends on the [spymemcached]() library and therefore also inherits its logging mechanisms. Back in the days when [@dustin]() wrote spymemcached, there was no good abstraction available (speak SLF4J), so he wrote his own. Nowadays things have changed, but spymemcached still inherits this legacy.
At the time of writing, the SDK supports logging to a simple default logger (logs to STDERR from INFO level up), Log4J and the SunLogger (java.util.logging). In the upcoming 2.9.0 release of spymemcached, it will also support the SLF4J logging facade where you can plug in your own implementation. The next version of the SDK (1.1.6) will depend on spy 2.9, so you'll also get the benefits there.
Before we dig into the concepts, here are the supported Log Levels (defined by `net.spy.memcached.compat.log.Level`):
- TRACE (since 2.9)
- DEBUG
- INFO
- WARN
- ERROR
- FATAL
Keep in mind that different loggers implement different levels, so for some of them a mapping happens. This will be noted during the description of each implementation.
We'll now look at the different logging mechanisms available and how you can configure and work with them. SLF4J will be covered towards the end.
## Switching Logging
If you don't change anything, the default logger will be used. This mechanism just prints log messages to STDERR output (from INFO level upwards). Chances are that you want to integrate the SDK with the same logging library that you use as well (in your codebase). The LoggerFactory inside spy decides at construction which one to choose, based on a system property. So you can either change this programmatically or with a param to the `java` command.
If you want to use the Log4JLogger programmatically, do it this way (before initializing the `CouchbaseClient` object):
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.Log4JLogger");
System.setProperties(systemProperties);
Of course, you need to add the Log4J JAR to your CLASSPATH to make it work (as we'll see later). Alternatively, you can set it dynamically this way:
java -Dnet.spy.log.LoggerImpl=net.spy.memcached.compat.log.Log4JLogger ...
Now that we are aware of the different implementations, lets look at each of them.
## The Simple Default Logger
If you don't change anything, the SDK will use the DefaultLogger (net.spy.memcached.compat.log.DefaultLogger). This logger has no dependencies and prints every log message that is INFO level or higher (INFO, WARN, ERROR and FATAL) to the systems STDERR. Since the STDERR is covered by most IDEs automatically, you'll also see them in the console output window.
Since its so simple, you can't change the behavior right now. Every log message gets timestamped as well (the format is `yyyy-MM-dd HH:mm:ss.SSS`). Connecting to Couchbase commonly looks like this:
2013-05-07 12:28:41.852 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-05-07 12:28:41.862 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3d9360e2
2013-05-07 12:28:41.887 INFO com.couchbase.client.ViewConnection: Added localhost to connect queue
2013-05-07 12:28:41.888 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-05-07 12:28:41.986 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
2013-05-07 12:28:41.991 INFO com.couchbase.client.ViewConnection: Node localhost has no ops in the queue
2013-05-07 12:28:41.992 INFO com.couchbase.client.ViewNode: I/O reactor terminated for localhost
So the format is always: `<timestamp> <level> <classname> <message>`. Remeber that DEBUG messages or so will not be logged, so you won't see them with the DefaultLogger.
## The SunLogger (java.util.logging)
The SunLogger also doesn't introduce additonal dependencies, since it depends on the `java.util.logging` implementation. The `java.util.logging.Level` enum defines the following levels: ALL, CONFIG, FINEST, FINER, FINE, INFO, WARNING, SEVERE and OFF. Since this does not map well to our defined Levels, here is the mapping that happens:
- TRACE to FINEST (since 2.9)
- DEBUG to FINE
- INFO to INFO
- WARN to WARNING
- ERROR to SEVERE
- FATAL to SEVERE
Without any further changes, the SunLogger also prints from INFO level upwards like this:
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseProperties setPropertyFile
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
May 7, 2013 12:42:16 PM net.spy.memcached.MemcachedConnection createConnections
INFO: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
May 7, 2013 12:42:16 PM net.spy.memcached.MemcachedConnection handleIO
INFO: Connection state changed for sun.nio.ch.SelectionKeyImpl@4ce2cb55
May 7, 2013 12:42:16 PM com.couchbase.client.ViewConnection createConnections
INFO: Added localhost to connect queue
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseClient <init>
INFO: viewmode property isn't defined. Setting viewmode to production mode
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseConnection run
INFO: Shut down Couchbase client
May 7, 2013 12:42:16 PM com.couchbase.client.ViewConnection shutdown
INFO: Node localhost has no ops in the queue
May 7, 2013 12:42:16 PM com.couchbase.client.ViewNode$1 run
INFO: I/O reactor terminated for localhost
If you want to change the log level, do DEBUG and lower, you can do it like this:
Logger.getLogger("com.couchbase.client").setLevel(Level.FINEST);
Now there is one more thing you need to do if you want to print all debug messages to the console. You set the logging level correctly, but the `ConsoleHandler` is not set to debug yet.
for(Handler h : Logger.getLogger("com.couchbase.client").getParent().getHandlers()) {
if(h instanceof ConsoleHandler) {
h.setLevel(Level.FINEST);
}
}
So, here is a full example on how to use the `SunLogger` and get all Debug messages on the console.
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SunLogger");
System.setProperties(systemProperties);
Logger logger = Logger.getLogger("com.couchbase.client");
logger.setLevel(Level.FINEST);
for(Handler h : logger.getParent().getHandlers()) {
if(h instanceof ConsoleHandler){
h.setLevel(Level.FINEST);
}
}
Then just go ahead and create your `CouchbaseClient` object, you will see detailed output like this (trimmed here):
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: Received an update, notifying reconfigurables about a com.couchbase.client.vbucket.config.Bucketcom.couchbase.client.vbucket.config.Bucket@3d77949
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: Received an update, notifying reconfigurables about a com.couchbase.client.vbucket.config.Bucketcom.couchbase.client.vbucket.config.Bucket@4e927aef
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: It says it is default and it's talking to /pools/default/bucketsStreaming/default?bucket_uuid=adfff22b70e09fafaa26ca37b7e05e9d
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: It says it is default and it's talking to /pools/default/bucketsStreaming/default?bucket_uuid=adfff22b70e09fafaa26ca37b7e05e9d
## Log4J
Most people will need more flexibility, and Log4J was (and still is) standard in lots of applications. The SDK provides support for Log4J as well. To make it work, you first need to set the Instance correctly:
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.Log4JLogger");
System.setProperties(systemProperties);
Now if you run this, you'll get an error that some of the Log4J classes can not be found. This is not a surprise, because its not on the classpath. Let's fix this by adding it accordingly. If you use maven, add the `log4j.log4j` dependency (current version is 1.2.17). You can also just download the JAR and add it to the CLASSPATH as needed.
Now if we run it again, we get another error:
log4j:WARN No appenders could be found for logger (com.couchbase.client.vbucket.ConfigurationProviderHTTP).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
One way to fix this is to get a correct `log4j.xml` configuration file into our CLASSPATH, but to make it work quickly Log4J provides a `BasicConfigurator`. Right after the system property configurations, call this:
org.apache.log4j.BasicConfigurator.configure();
Now if you run it again, you will see that we get nicely printed log messages. You can also see that they show up straight from the DEBUG level (and even contain information from which thread they got logged):
69 [main] INFO com.couchbase.client.CouchbaseConnection - Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
70 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Updating nodesMap in VBucketNodeLocator.
73 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Adding node with hostname 127.0.0.1:11210.
74 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Node added is {QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8}.
74 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Done dealing with queue.
74 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Selecting with delay of 0ms
79 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Selected 1, selected 1 keys
79 [Memcached IO over
You can control the logging levels through the usual Log4J mechanisms. I won't go into detail about them here, so please [check out](http://logging.apache.org/log4j/1.2/manual.html) their official documentation (for example on how to use the `PropertyConfigurator` instead).
Speaking of Log4J, [Steffen Larsen](https://twitter.com/zooldk) implemented a [Log4J appender](https://github.com/zooldk/log4j-couchbase) to store logs in Couchbase (instead of a file)!
## The new Facade: SLF4J
Not binding the application to a specific logging library is always a good idea. SLF4J is a facade for various pluggable logging frameworks behind it. So you can choose the logging implementation during runtime, be it [logback](http://logback.qos.ch/), Log4J or others. Since we already tried Log4J, let's make SLF4J work with Logback, one of the other very common log frameworks out there.
Note that SLF4J support will be available in the 1.9.0 release of spymemcached and therefore also in the 1.1.6 release of the SDK.
First, we need to configure it accordingly:
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SLF4JLogger");
System.setProperties(systemProperties);
Now, we need to include two JARs into our classpath. The first one is the SLF4J facade API and the other one is our logging framework of choice. The facade API package is called `slf4j-api` (this package always needs to be in place) and since we want to use logback we need to include the `logback-classic` JAR. Note that this is not specific to the SDK, you can find this information [here](http://logback.qos.ch/manual/introduction.html). If you use maven, you can use this snippet:
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.5</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.0.12</version>
</dependency>
SLF4J will automatically pick up our logback implementation, so the logs will look like this:
13:25:43.692 [main] INFO c.c.client.CouchbaseConnection - Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
13:25:43.694 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Updating nodesMap in VBucketNodeLocator.
13:25:43.697 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Adding node with hostname 127.0.0.1:11210.
13:25:43.697 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Node added is {QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8}.
13:25:43.698 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
13:25:43.699 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 0ms
13:25:43.702 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
13:25:43.703 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Handling IO for: sun.nio.ch.SelectionKeyImpl@48ff2413 (r=false, w=false, c=true, op={QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8})
13:25:43.703 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] INFO c.c.client.CouchbaseConnection - Connection state changed for sun.nio.ch.SelectionKeyImpl@48ff2413
13:25:43.713
As you can see, they also include DEBUG level logging here. If you don't include the logging implementation during runtime, SLF4J will complain at startup:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details
If you want to learn how to configure logback, [look here](http://logback.qos.ch/manual/configuration.html).
## Summary
tba.
Logging with the Couchbase Java Client
This blog post tells you everything that you need to know about logging with the Couchbase Java Client (and Spymemcached).
## Introduction
There is a huge variety in logging frameworks for Java, and its hard to please everyone. To understand how logging is handled in the SDK, we have to go back a few years. As you may know, the SDK depends on the [spymemcached]() library and therefore also inherits its logging mechanisms. Back in the days when [@dustin]() wrote spymemcached, there was no good abstraction available (speak SLF4J), so he wrote his own. Nowadays things have changed, but spymemcached still inherits this legacy.
At the time of writing, the SDK supports logging to a simple default logger (logs to STDERR from INFO level up), Log4J and the SunLogger (java.util.logging). In the upcoming 2.9.0 release of spymemcached, it will also support the SLF4J logging facade where you can plug in your own implementation. The next version of the SDK (1.1.6) will depend on spy 2.9, so you'll also get the benefits there.
Before we dig into the concepts, here are the supported Log Levels (defined by `net.spy.memcached.compat.log.Level`):
- TRACE (since 2.9)
- DEBUG
- INFO
- WARN
- ERROR
- FATAL
Keep in mind that different loggers implement different levels, so for some of them a mapping happens. This will be noted during the description of each implementation.
We'll now look at the different logging mechanisms available and how you can configure and work with them. SLF4J will be covered towards the end.
## Switching Logging
If you don't change anything, the default logger will be used. This mechanism just prints log messages to STDERR output (from INFO level upwards). Chances are that you want to integrate the SDK with the same logging library that you use as well (in your codebase). The LoggerFactory inside spy decides at construction which one to choose, based on a system property. So you can either change this programmatically or with a param to the `java` command.
If you want to use the Log4JLogger programmatically, do it this way (before initializing the `CouchbaseClient` object):
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.Log4JLogger");
System.setProperties(systemProperties);
Of course, you need to add the Log4J JAR to your CLASSPATH to make it work (as we'll see later). Alternatively, you can set it dynamically this way:
java -Dnet.spy.log.LoggerImpl=net.spy.memcached.compat.log.Log4JLogger ...
Now that we are aware of the different implementations, lets look at each of them.
## The Simple Default Logger
If you don't change anything, the SDK will use the DefaultLogger (net.spy.memcached.compat.log.DefaultLogger). This logger has no dependencies and prints every log message that is INFO level or higher (INFO, WARN, ERROR and FATAL) to the systems STDERR. Since the STDERR is covered by most IDEs automatically, you'll also see them in the console output window.
Since its so simple, you can't change the behavior right now. Every log message gets timestamped as well (the format is `yyyy-MM-dd HH:mm:ss.SSS`). Connecting to Couchbase commonly looks like this:
2013-05-07 12:28:41.852 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-05-07 12:28:41.862 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3d9360e2
2013-05-07 12:28:41.887 INFO com.couchbase.client.ViewConnection: Added localhost to connect queue
2013-05-07 12:28:41.888 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-05-07 12:28:41.986 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
2013-05-07 12:28:41.991 INFO com.couchbase.client.ViewConnection: Node localhost has no ops in the queue
2013-05-07 12:28:41.992 INFO com.couchbase.client.ViewNode: I/O reactor terminated for localhost
So the format is always: `<timestamp> <level> <classname> <message>`. Remeber that DEBUG messages or so will not be logged, so you won't see them with the DefaultLogger.
## The SunLogger (java.util.logging)
The SunLogger also doesn't introduce additonal dependencies, since it depends on the `java.util.logging` implementation. The `java.util.logging.Level` enum defines the following levels: ALL, CONFIG, FINEST, FINER, FINE, INFO, WARNING, SEVERE and OFF. Since this does not map well to our defined Levels, here is the mapping that happens:
- TRACE to FINEST (since 2.9)
- DEBUG to FINE
- INFO to INFO
- WARN to WARNING
- ERROR to SEVERE
- FATAL to SEVERE
Without any further changes, the SunLogger also prints from INFO level upwards like this:
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseProperties setPropertyFile
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
May 7, 2013 12:42:16 PM net.spy.memcached.MemcachedConnection createConnections
INFO: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
May 7, 2013 12:42:16 PM net.spy.memcached.MemcachedConnection handleIO
INFO: Connection state changed for sun.nio.ch.SelectionKeyImpl@4ce2cb55
May 7, 2013 12:42:16 PM com.couchbase.client.ViewConnection createConnections
INFO: Added localhost to connect queue
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseClient <init>
INFO: viewmode property isn't defined. Setting viewmode to production mode
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseConnection run
INFO: Shut down Couchbase client
May 7, 2013 12:42:16 PM com.couchbase.client.ViewConnection shutdown
INFO: Node localhost has no ops in the queue
May 7, 2013 12:42:16 PM com.couchbase.client.ViewNode$1 run
INFO: I/O reactor terminated for localhost
If you want to change the log level, do DEBUG and lower, you can do it like this:
Logger.getLogger("com.couchbase.client").setLevel(Level.FINEST);
Now there is one more thing you need to do if you want to print all debug messages to the console. You set the logging level correctly, but the `ConsoleHandler` is not set to debug yet.
for(Handler h : Logger.getLogger("com.couchbase.client").getParent().getHandlers()) {
if(h instanceof ConsoleHandler) {
h.setLevel(Level.FINEST);
}
}
So, here is a full example on how to use the `SunLogger` and get all Debug messages on the console.
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SunLogger");
System.setProperties(systemProperties);
Logger logger = Logger.getLogger("com.couchbase.client");
logger.setLevel(Level.FINEST);
for(Handler h : logger.getParent().getHandlers()) {
if(h instanceof ConsoleHandler){
h.setLevel(Level.FINEST);
}
}
Then just go ahead and create your `CouchbaseClient` object, you will see detailed output like this (trimmed here):
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: Received an update, notifying reconfigurables about a com.couchbase.client.vbucket.config.Bucketcom.couchbase.client.vbucket.config.Bucket@3d77949
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: Received an update, notifying reconfigurables about a com.couchbase.client.vbucket.config.Bucketcom.couchbase.client.vbucket.config.Bucket@4e927aef
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: It says it is default and it's talking to /pools/default/bucketsStreaming/default?bucket_uuid=adfff22b70e09fafaa26ca37b7e05e9d
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: It says it is default and it's talking to /pools/default/bucketsStreaming/default?bucket_uuid=adfff22b70e09fafaa26ca37b7e05e9d
## Log4J
Most people will need more flexibility, and Log4J was (and still is) standard in lots of applications. The SDK provides support for Log4J as well. To make it work, you first need to set the Instance correctly:
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.Log4JLogger");
System.setProperties(systemProperties);
Now if you run this, you'll get an error that some of the Log4J classes can not be found. This is not a surprise, because its not on the classpath. Let's fix this by adding it accordingly. If you use maven, add the `log4j.log4j` dependency (current version is 1.2.17). You can also just download the JAR and add it to the CLASSPATH as needed.
Now if we run it again, we get another error:
log4j:WARN No appenders could be found for logger (com.couchbase.client.vbucket.ConfigurationProviderHTTP).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
One way to fix this is to get a correct `log4j.xml` configuration file into our CLASSPATH, but to make it work quickly Log4J provides a `BasicConfigurator`. Right after the system property configurations, call this:
org.apache.log4j.BasicConfigurator.configure();
Now if you run it again, you will see that we get nicely printed log messages. You can also see that they show up straight from the DEBUG level (and even contain information from which thread they got logged):
69 [main] INFO com.couchbase.client.CouchbaseConnection - Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
70 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Updating nodesMap in VBucketNodeLocator.
73 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Adding node with hostname 127.0.0.1:11210.
74 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Node added is {QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8}.
74 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Done dealing with queue.
74 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Selecting with delay of 0ms
79 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Selected 1, selected 1 keys
79 [Memcached IO over
You can control the logging levels through the usual Log4J mechanisms. I won't go into detail about them here, so please [check out](http://logging.apache.org/log4j/1.2/manual.html) their official documentation (for example on how to use the `PropertyConfigurator` instead).
Speaking of Log4J, [Steffen Larsen](https://twitter.com/zooldk) implemented a [Log4J appender](https://github.com/zooldk/log4j-couchbase) to store logs in Couchbase (instead of a file)!
## The new Facade: SLF4J
Not binding the application to a specific logging library is always a good idea. SLF4J is a facade for various pluggable logging frameworks behind it. So you can choose the logging implementation during runtime, be it [logback](http://logback.qos.ch/), Log4J or others. Since we already tried Log4J, let's make SLF4J work with Logback, one of the other very common log frameworks out there.
Note that SLF4J support will be available in the 1.9.0 release of spymemcached and therefore also in the 1.1.6 release of the SDK.
First, we need to configure it accordingly:
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SLF4JLogger");
System.setProperties(systemProperties);
Now, we need to include two JARs into our classpath. The first one is the SLF4J facade API and the other one is our logging framework of choice. The facade API package is called `slf4j-api` (this package always needs to be in place) and since we want to use logback we need to include the `logback-classic` JAR. Note that this is not specific to the SDK, you can find this information [here](http://logback.qos.ch/manual/introduction.html). If you use maven, you can use this snippet:
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.5</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.0.12</version>
</dependency>
SLF4J will automatically pick up our logback implementation, so the logs will look like this:
13:25:43.692 [main] INFO c.c.client.CouchbaseConnection - Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
13:25:43.694 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Updating nodesMap in VBucketNodeLocator.
13:25:43.697 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Adding node with hostname 127.0.0.1:11210.
13:25:43.697 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Node added is {QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8}.
13:25:43.698 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
13:25:43.699 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 0ms
13:25:43.702 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
13:25:43.703 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Handling IO for: sun.nio.ch.SelectionKeyImpl@48ff2413 (r=false, w=false, c=true, op={QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8})
13:25:43.703 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] INFO c.c.client.CouchbaseConnection - Connection state changed for sun.nio.ch.SelectionKeyImpl@48ff2413
13:25:43.713
As you can see, they also include DEBUG level logging here. If you don't include the logging implementation during runtime, SLF4J will complain at startup:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details
If you want to learn how to configure logback, [look here](http://logback.qos.ch/manual/configuration.html).
## Summary
tba.
Show
Karen Zeller
added a comment - From Michael:
Logging with the Couchbase Java Client
This blog post tells you everything that you need to know about logging with the Couchbase Java Client (and Spymemcached).
## Introduction
There is a huge variety in logging frameworks for Java, and its hard to please everyone. To understand how logging is handled in the SDK, we have to go back a few years. As you may know, the SDK depends on the [spymemcached]() library and therefore also inherits its logging mechanisms. Back in the days when [@dustin]() wrote spymemcached, there was no good abstraction available (speak SLF4J), so he wrote his own. Nowadays things have changed, but spymemcached still inherits this legacy.
At the time of writing, the SDK supports logging to a simple default logger (logs to STDERR from INFO level up), Log4J and the SunLogger (java.util.logging). In the upcoming 2.9.0 release of spymemcached, it will also support the SLF4J logging facade where you can plug in your own implementation. The next version of the SDK (1.1.6) will depend on spy 2.9, so you'll also get the benefits there.
Before we dig into the concepts, here are the supported Log Levels (defined by `net.spy.memcached.compat.log.Level`):
- TRACE (since 2.9)
- DEBUG
- INFO
- WARN
- ERROR
- FATAL
Keep in mind that different loggers implement different levels, so for some of them a mapping happens. This will be noted during the description of each implementation.
We'll now look at the different logging mechanisms available and how you can configure and work with them. SLF4J will be covered towards the end.
## Switching Logging
If you don't change anything, the default logger will be used. This mechanism just prints log messages to STDERR output (from INFO level upwards). Chances are that you want to integrate the SDK with the same logging library that you use as well (in your codebase). The LoggerFactory inside spy decides at construction which one to choose, based on a system property. So you can either change this programmatically or with a param to the `java` command.
If you want to use the Log4JLogger programmatically, do it this way (before initializing the `CouchbaseClient` object):
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.Log4JLogger");
System.setProperties(systemProperties);
Of course, you need to add the Log4J JAR to your CLASSPATH to make it work (as we'll see later). Alternatively, you can set it dynamically this way:
java -Dnet.spy.log.LoggerImpl=net.spy.memcached.compat.log.Log4JLogger ...
Now that we are aware of the different implementations, lets look at each of them.
## The Simple Default Logger
If you don't change anything, the SDK will use the DefaultLogger (net.spy.memcached.compat.log.DefaultLogger). This logger has no dependencies and prints every log message that is INFO level or higher (INFO, WARN, ERROR and FATAL) to the systems STDERR. Since the STDERR is covered by most IDEs automatically, you'll also see them in the console output window.
Since its so simple, you can't change the behavior right now. Every log message gets timestamped as well (the format is `yyyy-MM-dd HH:mm:ss.SSS`). Connecting to Couchbase commonly looks like this:
2013-05-07 12:28:41.852 INFO com.couchbase.client.CouchbaseConnection: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
2013-05-07 12:28:41.862 INFO com.couchbase.client.CouchbaseConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@3d9360e2
2013-05-07 12:28:41.887 INFO com.couchbase.client.ViewConnection: Added localhost to connect queue
2013-05-07 12:28:41.888 INFO com.couchbase.client.CouchbaseClient: viewmode property isn't defined. Setting viewmode to production mode
2013-05-07 12:28:41.986 INFO com.couchbase.client.CouchbaseConnection: Shut down Couchbase client
2013-05-07 12:28:41.991 INFO com.couchbase.client.ViewConnection: Node localhost has no ops in the queue
2013-05-07 12:28:41.992 INFO com.couchbase.client.ViewNode: I/O reactor terminated for localhost
So the format is always: `<timestamp> <level> <classname> <message>`. Remeber that DEBUG messages or so will not be logged, so you won't see them with the DefaultLogger.
## The SunLogger (java.util.logging)
The SunLogger also doesn't introduce additonal dependencies, since it depends on the `java.util.logging` implementation. The `java.util.logging.Level` enum defines the following levels: ALL, CONFIG, FINEST, FINER, FINE, INFO, WARNING, SEVERE and OFF. Since this does not map well to our defined Levels, here is the mapping that happens:
- TRACE to FINEST (since 2.9)
- DEBUG to FINE
- INFO to INFO
- WARN to WARNING
- ERROR to SEVERE
- FATAL to SEVERE
Without any further changes, the SunLogger also prints from INFO level upwards like this:
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseProperties setPropertyFile
INFO: Could not load properties file "cbclient.properties" because: File not found with system classloader.
May 7, 2013 12:42:16 PM net.spy.memcached.MemcachedConnection createConnections
INFO: Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
May 7, 2013 12:42:16 PM net.spy.memcached.MemcachedConnection handleIO
INFO: Connection state changed for sun.nio.ch.SelectionKeyImpl@4ce2cb55
May 7, 2013 12:42:16 PM com.couchbase.client.ViewConnection createConnections
INFO: Added localhost to connect queue
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseClient <init>
INFO: viewmode property isn't defined. Setting viewmode to production mode
May 7, 2013 12:42:16 PM com.couchbase.client.CouchbaseConnection run
INFO: Shut down Couchbase client
May 7, 2013 12:42:16 PM com.couchbase.client.ViewConnection shutdown
INFO: Node localhost has no ops in the queue
May 7, 2013 12:42:16 PM com.couchbase.client.ViewNode$1 run
INFO: I/O reactor terminated for localhost
If you want to change the log level, do DEBUG and lower, you can do it like this:
Logger.getLogger("com.couchbase.client").setLevel(Level.FINEST);
Now there is one more thing you need to do if you want to print all debug messages to the console. You set the logging level correctly, but the `ConsoleHandler` is not set to debug yet.
for(Handler h : Logger.getLogger("com.couchbase.client").getParent().getHandlers()) {
if(h instanceof ConsoleHandler) {
h.setLevel(Level.FINEST);
}
}
So, here is a full example on how to use the `SunLogger` and get all Debug messages on the console.
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SunLogger");
System.setProperties(systemProperties);
Logger logger = Logger.getLogger("com.couchbase.client");
logger.setLevel(Level.FINEST);
for(Handler h : logger.getParent().getHandlers()) {
if(h instanceof ConsoleHandler){
h.setLevel(Level.FINEST);
}
}
Then just go ahead and create your `CouchbaseClient` object, you will see detailed output like this (trimmed here):
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: Received an update, notifying reconfigurables about a com.couchbase.client.vbucket.config.Bucketcom.couchbase.client.vbucket.config.Bucket@3d77949
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: Received an update, notifying reconfigurables about a com.couchbase.client.vbucket.config.Bucketcom.couchbase.client.vbucket.config.Bucket@4e927aef
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: It says it is default and it's talking to /pools/default/bucketsStreaming/default?bucket_uuid=adfff22b70e09fafaa26ca37b7e05e9d
May 7, 2013 12:54:34 PM com.couchbase.client.vbucket.ReconfigurableObserver update
FINEST: It says it is default and it's talking to /pools/default/bucketsStreaming/default?bucket_uuid=adfff22b70e09fafaa26ca37b7e05e9d
## Log4J
Most people will need more flexibility, and Log4J was (and still is) standard in lots of applications. The SDK provides support for Log4J as well. To make it work, you first need to set the Instance correctly:
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.Log4JLogger");
System.setProperties(systemProperties);
Now if you run this, you'll get an error that some of the Log4J classes can not be found. This is not a surprise, because its not on the classpath. Let's fix this by adding it accordingly. If you use maven, add the `log4j.log4j` dependency (current version is 1.2.17). You can also just download the JAR and add it to the CLASSPATH as needed.
Now if we run it again, we get another error:
log4j:WARN No appenders could be found for logger (com.couchbase.client.vbucket.ConfigurationProviderHTTP).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
One way to fix this is to get a correct `log4j.xml` configuration file into our CLASSPATH, but to make it work quickly Log4J provides a `BasicConfigurator`. Right after the system property configurations, call this:
org.apache.log4j.BasicConfigurator.configure();
Now if you run it again, you will see that we get nicely printed log messages. You can also see that they show up straight from the DEBUG level (and even contain information from which thread they got logged):
69 [main] INFO com.couchbase.client.CouchbaseConnection - Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
70 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Updating nodesMap in VBucketNodeLocator.
73 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Adding node with hostname 127.0.0.1:11210.
74 [main] DEBUG com.couchbase.client.vbucket.VBucketNodeLocator - Node added is {QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8}.
74 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Done dealing with queue.
74 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Selecting with delay of 0ms
79 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG com.couchbase.client.CouchbaseConnection - Selected 1, selected 1 keys
79 [Memcached IO over
You can control the logging levels through the usual Log4J mechanisms. I won't go into detail about them here, so please [check out]( http://logging.apache.org/log4j/1.2/manual.html ) their official documentation (for example on how to use the `PropertyConfigurator` instead).
Speaking of Log4J, [Steffen Larsen]( https://twitter.com/zooldk ) implemented a [Log4J appender]( https://github.com/zooldk/log4j-couchbase ) to store logs in Couchbase (instead of a file)!
## The new Facade: SLF4J
Not binding the application to a specific logging library is always a good idea. SLF4J is a facade for various pluggable logging frameworks behind it. So you can choose the logging implementation during runtime, be it [logback]( http://logback.qos.ch/), Log4J or others. Since we already tried Log4J, let's make SLF4J work with Logback, one of the other very common log frameworks out there.
Note that SLF4J support will be available in the 1.9.0 release of spymemcached and therefore also in the 1.1.6 release of the SDK.
First, we need to configure it accordingly:
Properties systemProperties = System.getProperties();
systemProperties.put("net.spy.log.LoggerImpl", "net.spy.memcached.compat.log.SLF4JLogger");
System.setProperties(systemProperties);
Now, we need to include two JARs into our classpath. The first one is the SLF4J facade API and the other one is our logging framework of choice. The facade API package is called `slf4j-api` (this package always needs to be in place) and since we want to use logback we need to include the `logback-classic` JAR. Note that this is not specific to the SDK, you can find this information [here]( http://logback.qos.ch/manual/introduction.html ). If you use maven, you can use this snippet:
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
<version>1.7.5</version>
</dependency>
<dependency>
<groupId>ch.qos.logback</groupId>
<artifactId>logback-classic</artifactId>
<version>1.0.12</version>
</dependency>
SLF4J will automatically pick up our logback implementation, so the logs will look like this:
13:25:43.692 [main] INFO c.c.client.CouchbaseConnection - Added {QA sa=/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0} to connect queue
13:25:43.694 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Updating nodesMap in VBucketNodeLocator.
13:25:43.697 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Adding node with hostname 127.0.0.1:11210.
13:25:43.697 [main] DEBUG c.c.c.vbucket.VBucketNodeLocator - Node added is {QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8}.
13:25:43.698 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Done dealing with queue.
13:25:43.699 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Selecting with delay of 0ms
13:25:43.702 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Selected 1, selected 1 keys
13:25:43.703 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] DEBUG c.c.client.CouchbaseConnection - Handling IO for: sun.nio.ch.SelectionKeyImpl@48ff2413 (r=false, w=false, c=true, op={QA sa=localhost/127.0.0.1:11210, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=8})
13:25:43.703 [Memcached IO over {MemcachedConnection to localhost/127.0.0.1:11210}] INFO c.c.client.CouchbaseConnection - Connection state changed for sun.nio.ch.SelectionKeyImpl@48ff2413
13:25:43.713
As you can see, they also include DEBUG level logging here. If you don't include the logging implementation during runtime, SLF4J will complain at startup:
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details
If you want to learn how to configure logback, [look here]( http://logback.qos.ch/manual/configuration.html ).
## Summary
tba.
-Michael/Matt will validate contents/topics with internal (support, evangelists) and external customers/developers
-Michael to provides descriptions (generally describes what information a chapter should contain)