Get all keys from bucket
Hello,
I've a question about Couchbase 2.0.
I'm using Membase 1.7 and I need to retrieve all keys from a bucket.
I've read that Couchbase 2.0 adds query support.
Will Couchbase 2.0 make it possible to query a bucket and retrieve all keys and/or values? If so, how?
Thanks.
I've downloaded jtap (https://github.com/mikewied/jtap) and I've compiled this example to retrieve all keys:
--- TapRunner.java ---
import com.membase.jtap.*;
import com.membase.jtap.exporter.*;
import com.membase.jtap.ops.*;
public class TapRunner
{
    public static void main(String[] args)
    {
        TapStreamClient client = new TapStreamClient("localhost", 11210, "default", null);
        Exporter exporter = new FileExporter("results.txt");
        CustomStream tapListener = new CustomStream(exporter, "node1");
        tapListener.keysOnly();
        tapListener.doDump();
        client.start(tapListener);
    }
}
--- TapRunner.java ---
I can compile it with no errors with
# javac -cp .:jtap.jar TapRunner.java
But when I run it I receive these errors:
# java -cp .:jtap.jar TapRunner
Exception in thread "main" java.lang.NoClassDefFoundError: org/slf4j/LoggerFactory
    at com.membase.jtap.TapStreamClient.<init>(Unknown Source)
    at TapRunner.main(TapRunner.java:11)
Caused by: java.lang.ClassNotFoundException: org.slf4j.LoggerFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:336)
    ... 2 more
Where is the problem?
Is writing to a text file the only way to retrieve all keys with the TAP protocol?
Can't I iterate over the keys some other way?
Thanks!
Sorry for the confusion, but definitely do not use JTap. All of that functionality has been added to the Couchbase Java Client at couchbase.com/develop/java/current
Again, fair warning, use this at your own risk (have a look at that wiki page).
Is it possible to iterate through all the key/value pairs using the .NET API?
Thanks
No, it's not currently available in the .NET client library and it's only experimental in the Java client library.
Wouldn't it be easiest to just add a view with the following map?
function (doc, meta) {
    emit(meta.id, null);
}
It does add the overhead of an index, but then you can just use the regular query() calls to get all keys.
Yes, create a primary index as adavidson pointed out.
function (doc, meta) {
    emit(meta.id, null);
}
This lets you get all the doc IDs back, search over a range of IDs, and so on. Then fetch the documents themselves with the GET API or with mget. That's the most performant way.
One thing to remember though is that this will give you ONLY the persisted, indexed documents. Given Couchbase's asynchronous architecture, there may be additional documents in the managed cache that haven't been persisted yet.
You can also use "limit" and "skip" to step through the result set.
http://127.0.0.1:8092/beer-sample/_design/dev_primary_key/_view/primary_...
http://127.0.0.1:8092/beer-sample/_design/dev_primary_key/_view/primary_...
{"total_rows":7315,"rows":[
{"id":"110f033e61","key":"110f033e61","value":null},
{"id":"110f03499b","key":"110f03499b","value":null},
{"id":"110f035200","key":"110f035200","value":null},
{"id":"110f035db2","key":"110f035db2","value":null},
{"id":"110f035e84","key":"110f035e84","value":null},
{"id":"110f03622c","key":"110f03622c","value":null},
{"id":"110f036718","key":"110f036718","value":null},
..
..
..
..
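To make the limit/skip paging concrete, here is a minimal Java sketch using only the standard library. It builds one URL per page and pulls the id fields out of a response body like the one above. The class and method names are made up for illustration, the view name my_view is a placeholder (the real one is cut off in the URLs above), and the regex assumes IDs contain no escaped quotes; a real JSON parser is the safer choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ViewPaging {
    // Matches the "id" field at the start of one row, e.g. {"id":"110f033e61","key":...}.
    // Assumption: IDs contain no escaped quotes.
    private static final Pattern ROW_ID = Pattern.compile("\\{\"id\":\"([^\"]*)\"");

    // Appends limit and skip to a view URL that has no query string yet.
    static String pageUrl(String viewUrl, int limit, int skip) {
        return viewUrl + "?limit=" + limit + "&skip=" + skip;
    }

    // Builds one URL per page, enough to cover totalRows rows.
    static List<String> pageUrls(String viewUrl, int totalRows, int pageSize) {
        List<String> urls = new ArrayList<>();
        for (int skip = 0; skip < totalRows; skip += pageSize) {
            urls.add(pageUrl(viewUrl, pageSize, skip));
        }
        return urls;
    }

    // Collects the document IDs from the body of one view response.
    static List<String> extractIds(String responseBody) {
        List<String> ids = new ArrayList<>();
        Matcher m = ROW_ID.matcher(responseBody);
        while (m.find()) {
            ids.add(m.group(1));
        }
        return ids;
    }

    public static void main(String[] args) {
        // "my_view" is a placeholder; substitute your own view name.
        String viewUrl =
            "http://127.0.0.1:8092/beer-sample/_design/dev_primary_key/_view/my_view";
        for (String url : pageUrls(viewUrl, 7315, 1000)) {
            System.out.println(url);
        }
        String sample = "{\"total_rows\":7315,\"rows\":[\n"
            + "{\"id\":\"110f033e61\",\"key\":\"110f033e61\",\"value\":null},\n"
            + "{\"id\":\"110f03499b\",\"key\":\"110f03499b\",\"value\":null}\n"
            + "]}";
        System.out.println(extractIds(sample));
    }
}
```

With total_rows of 7315 and a page size of 1000 this yields eight URLs, the last one with skip=7000. Actually fetching each URL is left out here.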
Hope this helps.
It sounds like what you want is TAP. It is not fully supported, because a careless request can consume a lot of memory or other resources. See: http://www.couchbase.com/wiki/display/couchbase/TAP+Protocol
Note that the Java client (couchbase.com/develop/java/current) has TAP implemented. Use it with caution and at your own risk. As long as you stay away from checkpoint and registration, it is generally okay, but it can cause quite a bit of disk I/O if you have much more data on disk than in memory.