I am using the Couchbase Hadoop connector plugin to import data from Couchbase into Hadoop. Unfortunately, the imported values appear as byte array references instead of the actual byte arrays I was expecting. So, for example, the data output to Hadoop appears as follows:
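Output like this (references of the form `[B@1b6d3586`) is typically what you get when Java's default `Object.toString()` is called on a `byte[]` instead of the bytes being decoded into text. A minimal sketch of the symptom and the fix, assuming the stored values are UTF-8 text; the class name `ByteArrayDemo` and the sample value are mine, not from the connector:

```java
import java.nio.charset.StandardCharsets;

public class ByteArrayDemo {
    // Decode a raw value (a byte[]) into readable text, assuming UTF-8.
    public static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] value = "{\"city\":\"Austin\"}".getBytes(StandardCharsets.UTF_8);
        // Object.toString() on a byte[] prints a reference like "[B@1b6d3586",
        // which matches what shows up in the output files.
        System.out.println(value.toString());
        // Decoding the bytes explicitly recovers the stored document text.
        System.out.println(decode(value));
    }
}
```

If the connector hands your job raw `byte[]` values, decoding them explicitly at the point where records are written is usually enough to get the actual content into the output.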
I'm seeing a very uneven query-time distribution, split between roughly 100 µs and 1 ms, for the following test command: cbs-pillowfight -h localhost -b default -i 1 -I 10000
It's a two-node cluster running on i7-2600 CPUs...
We are having performance issues in our ongoing project because our views return a large number of values that we cannot group, since we don't use any reduce function.
Here, we want back all gpi values that share the same cell_id. The problem is that many documents share the same gpi and the same cell_id. Grouping is not an option, since we found no reduce function to apply here. Because we get back so many duplicate values, we have to put them into a set in Python to make them unique, and we lose a lot of performance iterating through all the values Couchbase returns.
An example doc and view are in the full description.
Does anyone have any idea how we could solve this problem?
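One pattern worth trying: emit a compound key of `[cell_id, gpi]` and attach the built-in `_count` reduce. Querying with `group=true` then returns each distinct key exactly once, so the deduplication happens in the view engine rather than in a Python set. A sketch of such a design document, using the `cell_id` and `gpi` field names from the question (the view name `gpi_by_cell` is my own):

```json
{
  "views": {
    "gpi_by_cell": {
      "map": "function (doc, meta) { if (doc.cell_id && doc.gpi) { emit([doc.cell_id, doc.gpi], null); } }",
      "reduce": "_count"
    }
  }
}
```

Queried with `?group=true&startkey=["<cell_id>"]&endkey=["<cell_id>",{}]`, each returned row key is a unique (cell_id, gpi) pair, so no client-side deduplication is needed.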
I'm very new to Couchbase, and testing some queries on it these days.
In Couchbase, it seems that we should pre-define a 'view' before executing a query. Every sample article explains how to define views for each condition, such as 'by_country', 'by_language', and 'by_age'...
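For reference, a minimal view definition of the kind those articles describe: a `by_country` view whose map function indexes documents on a `country` field (the field name here is an assumption for illustration):

```json
{
  "views": {
    "by_country": {
      "map": "function (doc, meta) { if (doc.country) { emit(doc.country, null); } }"
    }
  }
}
```

Such a view is then queried by key, e.g. `?key="France"`, which is why the tutorials define one view per lookup condition.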