[MB-12530] possible problem reading from snapshot after rollback? Created: 31/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: forestdb
Affects Version/s: .master
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Major
Reporter: Marty Schoch Assignee: Sundar Sridharan
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Tested on Mac

Attachments: File test2.c    
Triage: Untriaged
Is this a Regression?: Unknown

 Description   
I've run into a possible issue when I attempt to read from a snapshot created after a rollback. The C code for this test case has been attached.

It performs the following steps:

1. Open Database
2. Create Key 'a' and Commit
3. Create Key 'b' and Commit
4. Remember this as our rollback point
5. Create Key 'c' and Commit
6. Rollback to rollback point (seq 2)
7. Verify that Key 'c' is not found

Up until this point everything works as expected. The key 'c' is not found after rolling back.

8. Open a snapshot at the same point
9. Verify that Key 'c' is not found

However, if we create a snapshot at seq 2 on the db handle that was rolled back, and then try to read key 'c' through that snapshot, something unexpected happens: we successfully read a value for 'c'.

This was unexpected and I'm wondering if my expectations are wrong or if this is a bug.
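For reference, the attached test2.c follows roughly this shape. This is a paraphrased, C-style pseudocode sketch of the listed steps, not the attachment itself; set_kv/get_kv are hypothetical helpers standing in for the usual fdb_doc plumbing, and error checks and config setup are omitted:

```c
/* Pseudocode sketch only -- not the attached test2.c. */
fdb_handle *db, *snap;

fdb_open(&db, "./test.fdb", &config);

set_kv(db, "a", "val-a"); fdb_commit(db, FDB_COMMIT_NORMAL);   /* seq 1 */
set_kv(db, "b", "val-b"); fdb_commit(db, FDB_COMMIT_NORMAL);   /* seq 2, rollback point */
set_kv(db, "c", "val-c"); fdb_commit(db, FDB_COMMIT_NORMAL);   /* seq 3 */

fdb_rollback(&db, 2);               /* roll back to seq 2 */
get_kv(db, "c");                    /* key not found, as expected */

fdb_snapshot_open(db, &snap, 2);    /* snapshot at the same seqnum, */
                                    /* on the rolled-back handle    */
get_kv(snap, "c");                  /* unexpectedly returns a value */
```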




[MB-12483] [windows] data lost when online upgrade from 2.0.0 to 3.0.1 and reboot cluster Created: 28/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: 3.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Thuan Nguyen Assignee: Thuan Nguyen
Resolution: Unresolved Votes: 0
Labels: windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: windows 2008 R2-64-bit

Triage: Untriaged
Operating System: Windows 64-bit
Is this a Regression?: Yes

 Description   
Install couchbase server 2.0.0 on 2 nodes (248 and 249)
Create a cluster of 2 nodes and create default bucket
Load 1000 keys to default bucket

Install couchbase server 3.0.1-1444 on 2 other nodes (250 and 251)
Add 2 3.0.1 nodes to 2.0.0 cluster and rebalance. Passed
Remove 2 2.0.0 nodes (248 and 249) out of cluster and rebalance. Passed
DCP upgrade rebalance. Passed
Restart couchbase server on all nodes of 3.0.1 cluster (250 and 251)
After warmup, 2 replica keys are lost.

Live cluster available for debug at

ip:172.23.105.248
ip:172.23.105.249

ip:172.23.105.250
ip:172.23.105.251

Because JIRA failed to attach the file, I will put the cbcollect_info files in cbfs and add the link later

 Comments   
Comment by Thuan Nguyen [ 28/Oct/14 ]
This test case does the same as above:
python testrunner.py -i /tmp/windows-4nodes-46.ini -t newupgradetests.MultiNodesUpgradeTests.online_upgrade_rebalance_in_out -p initial_version=2.0.0-1976-rel,reboot_cluster=true,upgrade_version=3.0.1-1444,skip_cleanup=true
Comment by Thuan Nguyen [ 28/Oct/14 ]
Link to cbcollectinfo at cbfs:
http://cbfs.hq.couchbase.com:8484/cbcollect_info/3_0_1/4-nodes-windows-data-lost-301_1444-20141028.tar
Comment by Cihan Biyikoglu [ 28/Oct/14 ]
Does this happen with 2.5.1 or 2.2 as well?
Comment by Anil Kumar [ 28/Oct/14 ]
Tony - In your test steps you have a step to "Restart couchbase server on all nodes of 3.0.1 cluster (250 and 251)". Why did we restart the servers? Is it needed? Can you confirm?
Comment by Mike Wiederhold [ 28/Oct/14 ]
I need access to these machines. Please let me know how I can do that.
Comment by Thuan Nguyen [ 28/Oct/14 ]
Running test from 2.5.1 to 3.0.1-1444 now.
Restarting couchbase server is one of the test cases in the upgrade tests.
Comment by Thuan Nguyen [ 28/Oct/14 ]
Online upgrade with reboot server from 2.5.1 to 3.0.1-1444 passed. All active/replica keys were restored to the bucket.
Comment by Thuan Nguyen [ 28/Oct/14 ]
Running from 2.2.0 to 3.0.1-1444
Comment by Thuan Nguyen [ 28/Oct/14 ]
Online upgrade with reboot server from 2.2.0 to 3.0.1-1444 passed. All active/replica keys were restored to the bucket.
Comment by Cihan Biyikoglu [ 28/Oct/14 ]
thanks!
Comment by Thuan Nguyen [ 28/Oct/14 ]
I could reproduce this bug with an online upgrade from 2.5.0 to 3.0.1-1444.
Loaded 1000 keys to the default bucket. At the end of the upgrade, after reboot, 4 replica keys were lost.
Not Ready: vb_replica_curr_items 996 == 1000 expected on '172.23.106.173:8091''172.23.106.174:8091', default bucket

Another live cluster available for debug with same credentials

2.5.0 nodes
1:172.23.106.171
2:172.23.106.172

3.0.1 nodes
3:172.23.106.173
4:172.23.106.174
Comment by Patrick Varley [ 29/Oct/14 ]
Using (http://cbfs.hq.couchbase.com:8484/cbcollect_info/3_0_1/4-nodes-windows-data-lost-301_1444-20141028.tar)

Using mortimer it looks like the warmup lost the items.

There is an interesting error in the memcached logs just before warmup started:

Tue Oct 28 12:22:20.311400 Pacific Daylight Time 3: (default) Warning: failed to parse the vbstat json doc for vbucket 511: {"state": "replica","checkpoint_id": "0","max_deleted_seqno": "0","failover_table": ,"snap_start": "4","snap_end": "4"}
Tue Oct 28 12:22:21.246400 Pacific Daylight Time 3: (default) metadata loaded in 1012 ms
Tue Oct 28 12:22:21.274400 Pacific Daylight Time 3: (default) warmup completed in 1040 ms
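The vbstat doc in that warning is indeed malformed JSON: the failover_table value is missing entirely ("failover_table": ,). A quick check (illustrative only) confirms a standard parser rejects it:

```python
import json

# The vbstat doc from the memcached warning above; note the missing
# value after "failover_table":
vbstat = ('{"state": "replica","checkpoint_id": "0","max_deleted_seqno": "0",'
          '"failover_table": ,"snap_start": "4","snap_end": "4"}')

try:
    json.loads(vbstat)
    print("parsed")
except ValueError as e:
    # json.JSONDecodeError subclasses ValueError
    print("parse failed:", e)
```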

Warmup stats showing the missing items:

memcached stats warmup
['cbstats', '-a', '127.0.0.1:11210', 'warmup', '-b', '_admin', '-p', '7dc493af28a80fd53a1e5cb9d8d49b27']
==============================================================================
******************************************************************************
default

 ep_warmup: enabled
 ep_warmup_dups: 0
 ep_warmup_estimate_time: 86000
 ep_warmup_estimated_key_count: 998
 ep_warmup_estimated_value_count: 998
 ep_warmup_item_expired: 0
 ep_warmup_key_count: 998
 ep_warmup_keys_time: 1012000
 ep_warmup_min_item_threshold: 100
 ep_warmup_min_memory_threshold: 100
 ep_warmup_oom: 0
 ep_warmup_state: done
 ep_warmup_thread: complete
 ep_warmup_time: 1040000
 ep_warmup_value_count: 998
Comment by Mike Wiederhold [ 29/Oct/14 ]
I've found two separate issues here that need to be fixed. There is a scenario where we might not update the snap start/end seqno if the failover log is not written properly to disk. The second problem is to figure out why the failover log wasn't written to disk.
Comment by Gokul Krishnan [ 31/Oct/14 ]
Team, two of our support engineers ran two separate tests on Linux versions and didn't see this behavior. Could you please retest on Linux? Thanks!
Comment by Mike Wiederhold [ 31/Oct/14 ]
Please retest on a build with this change.

http://review.couchbase.org/#/c/42677/
Comment by Thuan Nguyen [ 31/Oct/14 ]
I will re-test when the build has this fix




[MB-12375] 315% regression in 80th percentile query latency with 1 Bucket, 20M docs, non-DGM scenario with 4x1 views Created: 17/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: view-engine
Affects Version/s: 3.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Venu Uppalapati Assignee: Venu Uppalapati
Resolution: Unresolved Votes: 0
Labels: performance
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: Windows 64-bit
Is this a Regression?: Yes

 Description   
Test description:
80th percentile query latency (ms), 1 bucket x 20M x 2KB, non-DGM, 4 x 1 views, 500 mutations/sec/node, 400 queries/sec

Observation:
80th percentile latency increased from 13ms to 54ms
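For context on the metric itself, the 80th percentile is taken over per-query latency samples collected during the run. A minimal nearest-rank sketch (not necessarily the exact interpolation method the perf harness uses; the sample values below are made up):

```python
def percentile(samples, pct):
    """Nearest-rank percentile of a list of latency samples (ms)."""
    ordered = sorted(samples)
    # rank of the smallest sample covering pct percent of the distribution
    rank = max(1, int(round(pct / 100.0 * len(ordered))))
    return ordered[rank - 1]

# A slow tail pulls the 80th percentile up even when the median is fine.
latencies = [9, 11, 12, 13, 13, 14, 40, 52, 54, 60]
print(percentile(latencies, 80))  # -> 52
```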

links:
http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=zeus_301-1330_2c1_access
http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=zeus_251-1083_eba_access

logs:
http://ci.sc.couchbase.com/job/zeus-64/1221/artifact/172.23.96.25.zip
http://ci.sc.couchbase.com/job/zeus-64/1221/artifact/172.23.96.26.zip
http://ci.sc.couchbase.com/job/zeus-64/1221/artifact/172.23.96.27.zip
http://ci.sc.couchbase.com/job/zeus-64/1221/artifact/172.23.96.28.zip
http://ci.sc.couchbase.com/job/zeus-64/1221/artifact/web_log_172.23.96.25.json


 Comments   
Comment by Volker Mische [ 20/Oct/14 ]
I've added the label "Windows" as it really makes a difference: the performance looks right on Linux.
Comment by Volker Mische [ 20/Oct/14 ]
Removed the "windows" label again as I just found out that there's also a way to set the "Operating System".
Comment by Volker Mische [ 28/Oct/14 ]
I looked into the logs of the 3.0 run on Windows and Linux. I couldn't find anything suspicious. In case anyone wants to have a look, here are the links to all the stuff:

### ShowFast test
80th percentile query latency (ms), 1 bucket x 20M x 2KB, non-DGM, 4 x 1 views, 500 mutations/sec/node, 400 queries/sec

### Leto run on build 1330
http://showfast.sc.couchbase.com/#/runs/query_lat_20M_leto_ssd/3.0.1-1330
http://ci.sc.couchbase.com/job/leto/621/console

### Xeus run on build 1330
http://showfast.sc.couchbase.com/#/runs/query_lat_20M_zeus_ssd/3.0.1-1330
http://ci.sc.couchbase.com/job/zeus-64/1221/

### Comparison between Zeus and Leto run
http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=zeus_301-1330_2c1_access&snapshot=leto_ssd_301-1330_488_access
Comment by Sriram Melkote [ 29/Oct/14 ]
Plan of action:

(a) Nimish to look at 2.5.1 and 3.0.1 runs more closely to characterize the source of slowdown
(b) Ceej to create a R16 2.5.1 build so we can eliminate the Erlang version change variable
(c) Nimish to use timestamps to see if we can narrow down the source of slowdown
Comment by Volker Mische [ 29/Oct/14 ]
The plan of action according to yesterday's meeting should be:

(a) Rerun the test with the 2.5.1 build that uses Erlang R16
(b) Nimish to get it reproduced locally
(c) Nimish to reduce the problem to the smallest possible case (e.g. using a single node and no load)
Comment by Aleksey Kondratenko [ 29/Oct/14 ]
Can somebody post a query latency comparison of 3.0.1 vs 2.5.1 on similar hardware but running GNU/Linux?

Also, maybe I'm looking at the wrong things, but in graphs like this: http://i.imgur.com/1RIqfmq.png I see the 80%-ile difference to be far bigger than 300%.

Such a massive difference is likely to be visible without fancy testrunner tests. But that's just a guess.

Also would be great to know exactly which queries are being sent.
Comment by Volker Mische [ 29/Oct/14 ]
Alk, here's the 2.5.1 run (12ms) [1].
And here the 3.0.0-1330 run (14ms) [2].

You can find them with going to ShowFast [3] and then click on "View Query" and "All". Then search for "80th percentile query latency (ms), 1 bucket x 20M x 2KB, non-DGM, 4 x 1 views, 500 mutations/sec/node, 400 queries/sec". You can then also switch between Linux and Windows builds.

[1]: http://showfast.sc.couchbase.com/#/runs/query_lat_20M_leto_ssd/2.5.1-1083
[2]: http://showfast.sc.couchbase.com/#/runs/query_lat_20M_leto_ssd/3.0.1-1330
[3]: http://showfast.sc.couchbase.com/#/timeline
Comment by Aleksey Kondratenko [ 29/Oct/14 ]
But leto_ssd and zeus are different sets of hardware. Are we sure ssd versus hdd doesn't make a difference in this case?
Comment by Venu Uppalapati [ 29/Oct/14 ]
There are separate cluster config specifications for zeus for KV, Views, and XDCR. All view-related tests on zeus run on SSD.
Comment by Aleksey Kondratenko [ 29/Oct/14 ]
Is that the _exact_ same hardware as leto_ssd for the purpose of the query_lat_20M tests?
Comment by Aleksey Kondratenko [ 29/Oct/14 ]
Also, why then is there a zeus_ssd.spec that differs from zeus.spec in the perfrunner repo?
Comment by Venu Uppalapati [ 29/Oct/14 ]
Yes, they are identical
https://github.com/couchbaselabs/perfrunner/blob/master/clusters/zeus_ssd.spec
https://github.com/couchbaselabs/perfrunner/blob/master/clusters/leto_ssd.spec
The hardware dedicated to Windows is limited, so this is a way of executing different tests (KV with HDD, Views with SSD, XDCR with HDD) using the same HW cluster.
Comment by Aleksey Kondratenko [ 29/Oct/14 ]
The .spec files only tell me that the CPUs are the same. They don't tell me whether the rest of the hardware is indeed the same and configured the same way.
Comment by Aleksey Kondratenko [ 29/Oct/14 ]
Looking at the original cbmonitor reports, I see that 3.x is eating more CPU, which might indicate the CPU being saturated, while in the 2.5 run it might be less saturated. That could cause a huge difference in latency even if the perf difference is not as great.

In order to prove/disprove that theory I propose to run same comparison with same configuration except that with half load (both in kv ops and in view ops).
Comment by Volker Mische [ 31/Oct/14 ]
Nimish pointed out in chat that the throughput increased in 3.0.1 (click on Windows and search for "Query throughput (qps), 1 bucket x 20M x 2KB, non-DGM, 4 x 1 views, 500 mutations/sec/node" [1]); it reaches >800 requests per second. If you compare it to the test mentioned in this bug, you can see that the latency is much better [2] (it takes a while to load). Look at the second "[bucket-1] latency_query" chart: orange is the throughput test, green the one mentioned in this bug.

[1]: http://showfast.sc.couchbase.com/#/timeline
[2]: http://cbmonitor.sc.couchbase.com/reports/html/?snapshot=zeus_301-1330_2c1_access&snapshot=zeus_301-1437_05c_access
Comment by Nimish Gupta [ 31/Oct/14 ]
From the showfast graph, it looks like with new 3.0.1 build (3.0.1-1437) we are seeing better qps. Venu, could you please run this test with the newer 3.0.1 build (currently results are with 3.0.1-1330 in showfast) ?
Comment by Raju Suravarjjala [ 31/Oct/14 ]
Venu, as per our triage meeting today, can you please rerun the test with 3.0.2 build once it is available and report the results?
Comment by Volker Mische [ 31/Oct/14 ]
Venu, a rerun with build 3.0.1-1437 would be better.
Comment by Venu Uppalapati [ 31/Oct/14 ]
The result of this test from the run with 3.0.1-1437 is 23ms (posted to showfast). For comparison, the latency in previous runs ranged from 13ms (2.5.1-1083) to 54ms (3.0.1-1330). I will keep the ticket assigned to me and re-run the test with the 3.0.2 Windows build when available.




[MB-12446] queries against views have changed from 2.x to 3.x Created: 24/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: view-engine
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Matt Ingenthron Assignee: Nimish Gupta
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Yes

 Description   
Several users have reported that their views behave differently in 3.x than they did in 2.x in the forums. They've worked out that changing the inclusive_end gets closer to the older behavior.

Notably, these forum postings:
https://forums.couchbase.com/t/view-returns-nothing-when-a-key-is-specified-couchbase-3-0-nodejs-sdk-2-0/1856/1
https://forums.couchbase.com/t/unable-to-query-a-view-by-key-on-3-0-0-on-macos/1783

 Comments   
Comment by Nimish Gupta [ 27/Oct/14 ]
I was not able to find how inclusive_end is set in the query. I will look more and discuss with Volker.
Comment by Sriram Melkote [ 29/Oct/14 ]
The behavior in the UI is a known change and has been fixed elsewhere. However, the thread speaks of the same problem surfacing in the node.js client.

This can only happen if server behavior changed, so we need to verify the default behavior has not changed when inclusive_end parameter is not specified between 2.5.1 and 3.0.0.
Comment by Nimish Gupta [ 30/Oct/14 ]
By default, inclusive_end is true on the server side in 3.0.0, so it has not changed from 2.5.1.
Comment by Sriram Melkote [ 30/Oct/14 ]
Matt, can you please send a HTTP trace of node.js SDK calls that would have been generated for:
https://forums.couchbase.com/t/view-returns-nothing-when-a-key-is-specified-couchbase-3-0-nodejs-sdk-2-0/1856
Nimish has looked and doesn't see default change on server side, so this is very puzzling.
Comment by Matt Ingenthron [ 31/Oct/14 ]
Brett: Can you test this per what is in the forum postings and see if you see the same behavior? If so, please pass it back to the view-engine team with the requested info.
Comment by Brett Lawson [ 31/Oct/14 ]
I can reproduce this issue with the development console alone. My 2.5.1 cluster properly returns the key as expected, whereas my 3.0.0 cluster fails to return a key. The keys parameter also exhibits the same problem. To reproduce:

1. Add a document to your bucket (let's say with key `broken-keys`).
2. Create a view called `dev_test/test` with the following:
  function (doc, meta) {
    emit(meta.id, null);
  }
3. Execute a query on 2.5.0 such as: `/dev_test/_view/test?stale=false&inclusive_end=false&key="broken-keys"`
  Your key will be returned in the result.
4. Execute a similar query on 3.0.0
  No results will be returned.
5. Execute a query with inclusive_end set, such as: `/dev_test/_view/test?stale=false&inclusive_end=true&key="broken-keys"`
  This succeeds on both tested versions.
6. Using `keys=["broken-keys"]` will also exhibit this behaviour.

Cheers, Brett
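The two console queries in the repro steps above differ only in the inclusive_end flag. For anyone scripting the repro, the query strings can be built like this (the full view path, host, and bucket name are placeholders, not taken from this report):

```python
from urllib.parse import urlencode

# Placeholder host/bucket; the design doc and view names match the repro steps.
base = "http://127.0.0.1:8092/default/_design/dev_test/_view/test"

def view_url(**params):
    # View query values are JSON-encoded, so string keys carry quotes.
    return base + "?" + urlencode(params)

failing = view_url(stale="false", inclusive_end="false", key='"broken-keys"')
working = view_url(stale="false", inclusive_end="true", key='"broken-keys"')
print(failing)  # returns no rows on 3.0.0 per the report
print(working)  # returns the key on both versions
```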




[MB-12529] need copyright 2014 on OSX couchbase server 3.0.x Created: 31/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: installer
Affects Version/s: 3.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Steve Yen Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: MacOSX 64-bit
Is this a Regression?: Unknown

 Description   
I downloaded mac osx 3.0.1 version, and after starting it, using the menubar icon/widget, chose the "About Couchbase Server" option. The popup about box window had copyright 2008-2013 on it, so should be incremented to 2014.




[MB-12492] Create 3.0.1 chef-based rightscale template for EE and CE Created: 28/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: cloud
Affects Version/s: 3.0.1
Fix Version/s: 3.0.1
Security Level: Public

Type: Task Priority: Major
Reporter: Anil Kumar Assignee: Wei-Li Liu
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Need template for 3.0.1 GA

 Comments   
Comment by Anil Kumar [ 28/Oct/14 ]
Build 3.0.1-1444
Comment by Wei-Li Liu [ 31/Oct/14 ]
EE template : RightScript Couchbase Server - Enterprise Edition Installation - 3.0.1 [rev3]
CE template: RightScript Couchbase Server - Community Edition Installation - 3.0.1 [rev2]
Comment by Wei-Li Liu [ 31/Oct/14 ]
@Anil I have not published them yet. Let me know if you want them published to the marketplace.




[MB-12417] cbbackup runs out of resources while backing up large DBs. Created: 23/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: tools
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: adonoho Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: OS X Mavericks, 10.9.5, CBServer Version: 3.0.0 Enterprise Edition (build-1209-rel)

Triage: Untriaged
Operating System: MacOSX 64-bit
Is this a Regression?: Unknown

 Description   
When backing up a large DB from CB server, cbbackup exhausts the resources of the backing up machine. The DB in question has 124 million documents and is 218 GB in size. The machine doing the backing up is only running cbbackup and has 16 GB of RAM and is writing to a large attached drive with roughly three times the available free space as the source DB. I have tried the backup twice and the first time the OS protected itself by pausing the backup process. The second time it just locked up. In both cases, the database had saved only 32.6% and 35.3%, respectively. On the second attempt, I was hoping that the backup would pick up where it left off. Instead it appears to have ignored the versions of the already saved documents and written them again … and crashed at about the same point. The CBServer machine was unaffected by the lockup but I've restarted it to be safe.

It would be quite weird to require that a machine doing the backups be as powerful as the CBServer cluster.

I would very much like to back this data up. It takes days to insert this much data into CBServer. With server 2.5.1, it takes roughly 8 hours to backup and restore this data. Then the view calculations kick in. In other words, this is a critical feature.

 Comments   
Comment by adonoho [ 31/Oct/14 ]
Would it be possible to get a version of cbbackup that doesn't run out of resources?
Comment by Bin Cui [ 31/Oct/14 ]
Please provide some details about how you run cbbackup. What command line arguments do you use? Do you use the default batch_max_size, batch_max_bytes and recv_min_bytes, or do you provide customized values?
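For reference, those transfer knobs are passed to cbbackup through its -x option. An illustrative invocation with deliberately small batches (host, bucket, credentials, and the specific values are examples, not recommendations):

```shell
# Example only: smaller batches trade backup speed for lower memory use.
cbbackup http://cb-node:8091 /backups/mydb \
  -u Administrator -p password \
  -b default \
  -x batch_max_size=100,batch_max_bytes=102400
```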





[MB-12527] Need to mention about a best practise to use VPN to protect access to data ports when using XDCR Created: 31/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Major
Reporter: Don Pinto Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
In http://docs.couchbase.com/admin/admin/Tasks/tasks-manage-xdcr-dataEncryption.html :

After the ports table we should have the following best practice -

"If XDCR is employed in a situation where the XDCR traffic between data centers would normally route over the internet, Couchbase recommends the use of a VPN."




[MB-12526] 3.x Docs, Please include usage examples in the REST Documentation Created: 31/Oct/14  Updated: 31/Oct/14

Status: In Progress
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0.1
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Major
Reporter: Ian McCloy Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Dependency
Triage: Untriaged
Is this a Regression?: Unknown

 Description   
A customer has requested usage examples showing how to use the REST API.

Some examples..
http://docs.couchbase.com/admin/admin/REST/rest-failover-graceful.html

/opt/couchbase/bin/curl -u Administrator:Password -d otpNode=ns_1@192.168.0.1 http://localhost:8091/controller/startGracefulFailover

http://docs.couchbase.com/admin/admin/REST/rest-node-recovery-incremental.html

/opt/couchbase/bin/curl -u Administrator:Password -d otpNode=ns_1@192.168.0.1 -d recoveryType=delta http://localhost:8091/controller/setRecoveryType

Please also explain in the documentation what happens when setting the recovery type. (A recovery type can only be applied to a node which has been failed-over. A rebalance is required after setting the type to apply the recovery)
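To complement the curl examples above, here is a sketch of the same call built programmatically (placeholder host and credentials; only the endpoints and parameters already quoted above are used, and the request is constructed, not sent):

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

def rest_request(path, **form):
    """Build (not send) an authenticated POST like the curl examples above."""
    auth = base64.b64encode(b"Administrator:Password").decode()
    return Request("http://localhost:8091" + path,
                   data=urlencode(form).encode(),
                   headers={"Authorization": "Basic " + auth,
                            "Content-Type": "application/x-www-form-urlencoded"})

# Equivalent of the setRecoveryType curl example.
req = rest_request("/controller/setRecoveryType",
                   otpNode="ns_1@192.168.0.1", recoveryType="delta")
print(req.get_method(), req.full_url)  # POST .../controller/setRecoveryType
```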

 Comments   
Comment by Ruth Harris [ 31/Oct/14 ]
In progress. Currently, I'm finishing up with rewriting the REST content.
Phase 1 - Rewrite the existing content to present consistent info in a structured manner. That is, putting each topic in a template.
Phase 2 - Verify validity of some existing topics; Add short descriptions for each; Add tested curl examples and example responses for each

I'm currently finishing up Phase 1.
This is a long term project, however, extra focus is being given to 3.0 features.

Thanks, Ruth




[MB-12528] Security Best Practices in Documentation Created: 31/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0, sherlock
Fix Version/s: None
Security Level: Public

Type: Task Priority: Major
Reporter: Don Pinto Assignee: marija jovanovic
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
We should plan a section on security best practices and roll it out in milestones -

1. Update 3.0 docs to extract security topics from the docs and bring them under security best practices
2. Update with best practices presented during Connect
3. Once sherlock is out, update with topics from sherlock




[MB-12461] GO-XDCR: No error thrown while deleting non-existing replications Created: 27/Oct/14  Updated: 31/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: cross-datacenter-replication
Affects Version/s: sherlock
Fix Version/s: sherlock
Security Level: Public

Type: Bug Priority: Major
Reporter: Aruna Piravi Assignee: Yu Sui
Resolution: Unresolved Votes: 0
Labels: sprint3_xdcr
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Epic Link: XDCR next release
Is this a Regression?: No

 Description   
Impact
---------
will affect CLI and REST

Arunas-MacBook-Pro:~ apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3
Arunas-MacBook-Pro:~ apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3
Arunas-MacBook-Pro:~ apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3u
Arunas-MacBook-Pro:~ apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3uuygo8gliuyolikb
Arunas-MacBook-Pro:~ apiravi$

Http server log
--------------------
PipelineManager17:22:18.450934 [INFO] Try to stop the pipeline localhost:9000_default_localhost:9000_target_active3
AdminPort17:22:20.220755 [INFO] Request with path, /controller/cancelXDCR/localhost:9000_default_localhost:9000_target_active3, method, POST, and content type
AdminPort17:22:20.220773 [INFO] handleRequest called
AdminPort17:22:20.220809 [INFO] Request: &{POST /controller/cancelXDCR/localhost:9000_default_localhost:9000_target_active3 HTTP/1.1 1 1 map[User-Agent:[curl/7.30.0] Accept:[*/*]] 0x4851600 0 [] false localhost:12100 map[] map[] <nil> map[] 127.0.0.1:58484 /controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3 <nil>}
AdminPort17:22:20.220816 [INFO] Request key decoded: controller/cancelXDCR/dynamic/POST
AdminPort17:22:20.220820 [INFO] doDeleteReplicationRequest
ReplicationManager17:22:20.220832 [INFO] Deleting replication localhost:9000_default_localhost:9000_target_active3
PipelineManager17:22:20.220836 [INFO] Try to stop the pipeline localhost:9000_default_localhost:9000_target_active3
AdminPort17:22:50.948384 [INFO] Request with path, /controller/cancelXDCR/localhost:9000_default_localhost:9000_target_active3u, method, POST, and content type
AdminPort17:22:50.948402 [INFO] handleRequest called
AdminPort17:22:50.948444 [INFO] Request: &{POST /controller/cancelXDCR/localhost:9000_default_localhost:9000_target_active3u HTTP/1.1 1 1 map[User-Agent:[curl/7.30.0] Accept:[*/*]] 0x4851600 0 [] false localhost:12100 map[] map[] <nil> map[] 127.0.0.1:58490 /controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3u <nil>}
AdminPort17:22:50.948452 [INFO] Request key decoded: controller/cancelXDCR/dynamic/POST
AdminPort17:22:50.948455 [INFO] doDeleteReplicationRequest
ReplicationManager17:22:50.948468 [INFO] Deleting replication localhost:9000_default_localhost:9000_target_active3u
PipelineManager17:22:50.948472 [INFO] Try to stop the pipeline localhost:9000_default_localhost:9000_target_active3u
AdminPort17:22:57.315855 [INFO] Request with path, /controller/cancelXDCR/localhost:9000_default_localhost:9000_target_active3uuygo8gliuyolikb, method, POST, and content type
AdminPort17:22:57.315873 [INFO] handleRequest called
AdminPort17:22:57.315910 [INFO] Request: &{POST /controller/cancelXDCR/localhost:9000_default_localhost:9000_target_active3uuygo8gliuyolikb HTTP/1.1 1 1 map[User-Agent:[curl/7.30.0] Accept:[*/*]] 0x4851600 0 [] false localhost:12100 map[] map[] <nil> map[] 127.0.0.1:58494 /controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3uuygo8gliuyolikb <nil>}
AdminPort17:22:57.315919 [INFO] Request key decoded: controller/cancelXDCR/dynamic/POST
AdminPort17:22:57.315922 [INFO] doDeleteReplicationRequest
ReplicationManager17:22:57.315934 [INFO] Deleting replication localhost:9000_default_localhost:9000_target_active3uuygo8gliuyolikb
PipelineManager17:22:57.315939 [INFO] Try to stop the pipeline localhost:9000_default_localhost:9000_target_active3uuygo8gliuyolikb
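The expected behavior can be sketched abstractly (hypothetical Python, not goxdcr's actual Go code): cancelling an unknown replication id should surface an error status rather than silently succeeding:

```python
# Hypothetical sketch of the desired cancelXDCR behavior.
replications = {"localhost:9000_default_localhost:9000_target_active3"}

def cancel_xdcr(replication_id):
    """Return an (http_status, message) pair for a cancel request."""
    if replication_id not in replications:
        # The bug: the real endpoint currently returns success here.
        return 404, "unknown replication: " + replication_id
    replications.discard(replication_id)
    return 200, "ok"

print(cancel_xdcr("localhost:9000_default_localhost:9000_target_active3"))   # status 200
print(cancel_xdcr("localhost:9000_default_localhost:9000_target_active3u"))  # status 404
```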



 Comments   
Comment by Aruna Piravi [ 29/Oct/14 ]
Hi Yu,

Has this been merged yet? I pulled the latest couchbase_goxdcr_impl and couchbase_goxdcr and rebuilt them. I am still seeing the same problem:

Arunas-MacBook-Pro:bin apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_ac
Arunas-MacBook-Pro:bin apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_ac
Arunas-MacBook-Pro:bin apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_active3
Arunas-MacBook-Pro:bin apiravi$ curl -X POST http://localhost:12100/controller/cancelXDCR/localhost%3A9000_default_localhost%3A9000_target_acti
Arunas-MacBook-Pro:bin apiravi$

None of these replications exist. Thanks.

Comment by Yu Sui [ 31/Oct/14 ]
Things are working fine on my machine.

Can you use the following command to double check your xdcr instance and gometa service?

ps -ef | grep xdcr

 501 93910 65795 0 11:46AM ttys000 0:00.01 bin/xdcr
  501 93916 93910 0 11:46AM ttys000 0:00.04 /Users/yu/goprojects/bin/gometa -config /Users/yu/goprojects/src/github.com/Xiaomei-Zhang/goxdcr/services/metadata_svc_config
  501 93992 65795 0 11:48AM ttys000 0:00.00 grep xdcr

There may be more than one instance running if previous tests failed. If so, you will need to kill all of them and re-run the tests.





[MB-12391] Cannot start indexer or projector due to dependency on forestdb:: : libforestdb.so: cannot open shared object file: Created: 21/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: secondary-index
Affects Version/s: sherlock
Fix Version/s: sherlock
Security Level: Public

Type: Bug Priority: Test Blocker
Reporter: Parag Agarwal Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: centos 6x, Version: 3.5.0 Enterprise Edition (build-18)


Issue Links:
Dependency
Triage: Untriaged
Is this a Regression?: Unknown

 Description   
Installed latest sherlock build and tried to run indexer and projector

[root@palm-10307 bin]# ./projector --help
./projector: error while loading shared libraries: libforestdb.so: cannot open shared object file: No such file or directory
[root@palm-10307 bin]# ./indexer --help
./indexer: error while loading shared libraries: libforestdb.so: cannot open shared object file: No such file or directory
[root@palm-10307 bin]# ./indexer
./indexer: error while loading shared libraries: libforestdb.so: cannot open shared object file: No such file or directory
[root@palm-10307 bin]# ./projector -debug localhost:9001 -kvaddrs="127.0.0.1:12002”
> -bash: unexpected EOF while looking for matching `"'
-bash: syntax error: unexpected end of file
[root@palm-10307 bin]# ./projector -debug localhost:9001 -kvaddrs="127.0.0.1:12002"
./projector: error while loading shared libraries: libforestdb.so: cannot open shared object file: No such file or directory
[root@palm-10307 bin]#

 Comments   
Comment by Sarath Lakshman [ 22/Oct/14 ]
We need to provide rpath while building indexer using cgo libraries.

Here is the corresponding change:
https://github.com/t3rm1n4l/indexing-build-scripts/commit/981ac7379f4bc58d5f885427f4a6a5e744be7378
Comment by Sarath Lakshman [ 22/Oct/14 ]
Ceej, it would be great if you could fix this problem in your toy build.
Comment by Chris Hillery [ 23/Oct/14 ]
This is actually a really ugly problem. The fix proposed above will only work on Linux with gcc, I'm betting. We limit ourselves to that compiler by using LDFLAGS that way. On a Mac, "RPATH" is handled quite differently, and of course on Windows there's no equivalent concept at all.

CMake has some pretty extensive support for bending RPATH to your will, but it depends on being able to manage the compile and link steps itself as well as the install step. Since it doesn't have built-in support for Go, it can't work its magic.

I'm searching around for other solutions that will be more cross-compiler and cross-platform. (Why did Google have to pick a name for their language that is so hard to Google for?)
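In the meantime, a quick way to check whether the dynamic loader can resolve the library before launching the binaries (a minimal sketch; the LD_LIBRARY_PATH workaround in the comment is a Linux-only assumption):

```python
import ctypes.util

def loader_can_find(lib_name):
    """Ask the platform's dynamic loader whether a shared library is
    resolvable, e.g. 'forestdb' for libforestdb.so."""
    return ctypes.util.find_library(lib_name) is not None

# Until the binaries carry an RPATH, a Linux-only stop-gap is:
#   export LD_LIBRARY_PATH=/opt/couchbase/lib:$LD_LIBRARY_PATH
```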




[MB-11131] Several python tools don't run on Windows Created: 15/May/14  Updated: 31/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: tools
Affects Version/s: 3.0
Fix Version/s: 3.0
Security Level: Public

Type: Bug Priority: Test Blocker
Reporter: Sriram Melkote Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows 7

Issue Links:
Duplicate
is duplicated by MB-11159 missing cbstats.exe, cbbackup.exe, cb... Resolved
is duplicated by MB-11160 beer-sample and gamesim-sample failed... Resolved
Relates to
relates to MB-11182 cbcollect_info: Fails on Win 7 if cbs... Resolved
Triage: Untriaged
Is this a Regression?: Unknown

 Description   
When I run cbcollect_info on May 16th Windows build, I get:

couchbase logs (access) (cbbrowse_logs access) - OK
memcached stats all (['cbstats', '-a', '127.0.0.1:11210', 'all', '-b', '_admin',
 '-p', 'f3f9dfb1d07205bdbab2d570dd9cf9e0']) - Traceback (most recent call last):

  File "cbcollect_info", line 565, in <module>
    main()
  File "cbcollect_info", line 545, in main
    runner.run(task)
  File "cbcollect_info", line 127, in run
    result = task.execute(fp)
  File "cbcollect_info", line 66, in execute
    shell = use_shell, env = env)
  File "subprocess.pyc", line 679, in __init__
  File "subprocess.pyc", line 896, in _execute_child
WindowsError: [Error 2] The system cannot find the file specified

It appears (to me) that cbstats is relying on the shebang line, which won't work on Windows.
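One portable pattern for callers like cbcollect_info is to launch the tool through the interpreter explicitly rather than executing the script directly, which needs a working shebang (or a compiled .exe) on Windows. A minimal sketch, assuming the tool is a plain .py script:

```python
import subprocess
import sys

def run_tool(script_path, *args):
    """Invoke a Python tool via the current interpreter so it works
    even where shebang lines are ignored (e.g. Windows)."""
    return subprocess.run([sys.executable, script_path, *args],
                          capture_output=True, text=True)
```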

 Comments   
Comment by Sriram Melkote [ 15/May/14 ]
Same problem with bin/tools/cbdocloader -- it won't run on Windows in the current form.

As this used to work in 2.5.1 - perhaps we are missing batch files, or maybe Voltron compiled python files to executables?

Thanks
Comment by Chris Hillery [ 15/May/14 ]
Voltron is still compiling python tools to exes, or at least it's attempting to.

Can you try from this installer to see if the python tools work better?

https://copy.com/MUxbEKExGIaa
Comment by Sriram Melkote [ 16/May/14 ]
No - the installer above has all the same problems as the one I tested.
Comment by Chris Hillery [ 19/May/14 ]
http://review.couchbase.org/#/c/37296/

Siri, could you try out this installer below and see if there are any other bogus shell scripts or missing python binaries?

https://copy.com/MUxbEKExGIaa
Comment by Thuan Nguyen [ 20/May/14 ]
I will install https://copy.com/MUxbEKExGIaa on windows to test it.
Comment by Thuan Nguyen [ 22/May/14 ]
Do we have any new Windows build after the fix from Alk was merged on May 21 at 6:17 PM?
https://www.couchbase.com/issues/browse/MB-11168
Comment by Maria McDuff (Inactive) [ 22/May/14 ]
Ceej,

what is remaining work for you here?
Comment by Chris Hillery [ 22/May/14 ]
I have no known remaining work. I was hoping someone could verify the fix I proposed on Monday before I pushed it through, but at this point I'm just going to publish the change and resolve this issue. If there are other related problems, please either re-open this bug or file a new one.
Comment by Ashvinder Singh [ 27/May/14 ]
I downloaded the latest binary from link: http://factory.hq.couchbase.com:8080/job/cs_300_win6408/152/artifact/voltron/couchbase-server-enterprise-3.0.0-744.setup.exe.
- Installed manually
- cbbackup and cbrestore are not executables ('exe' files).
Comment by Ashvinder Singh [ 27/May/14 ]
Also the build number shown is incorrect.
Comment by Chris Hillery [ 27/May/14 ]
I created https://www.couchbase.com/issues/browse/CBD-1382 for the build number issue.

Ashvinder, could you please test with a "master" build, rather than 3.0.0? Either that, or wait for the next 3.0.0 build and test again? The change I made to fix this originally was only on master builds. It should be in 3.0 builds starting today.
Comment by Wayne Siu [ 28/May/14 ]
Ashvinder,
please try this build http://factory.hq.couchbase.com:8080/job/cs_300_win6408/155/artifact/voltron/couchbase-server-enterprise-3.0.0-747.setup.exe or here
http://factory.couchbase.com/job/cs_300_win6408/155/artifact/voltron/couchbase-server-enterprise-3.0.0-747.setup.exe

Comment by Ashvinder Singh [ 28/May/14 ]
Verified cbbackup and cbrestore are executable
Comment by Kirk Kirkconnell [ 30/Oct/14 ]
The issue with the samples not loading is still present; we tested this in 3.0.1.
Comment by Bin Cui [ 31/Oct/14 ]
http://review.couchbase.org/#/c/42673/




[MB-12466] Query Latency, 80th percentile, for 1 Bucket, 100M doc, 2KB/doc, DGM, 4x1 Views 500 mutations/sec/node, 400 Qops Created: 27/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: view-engine
Affects Version/s: 3.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Thomas Anderson Assignee: Nimish Gupta
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: 4 node cluster, 2xSSD, 48vCPU, 64GBRAM, E5-4610@2.4GHz.

Triage: Untriaged
Link to Log File, atop/blg, CBCollectInfo, Core dump: http://ci.sc.couchbase.com/job/leto/644/artifact/172.23.100.29.zip
http://ci.sc.couchbase.com/job/leto/644/artifact/172.23.100.30.zip
http://ci.sc.couchbase.com/job/leto/644/artifact/172.23.100.31.zip
http://ci.sc.couchbase.com/job/leto/644/artifact/172.23.100.32.zip
http://ci.sc.couchbase.com/job/leto/644/artifact/web_log_172.23.100.29.json
Is this a Regression?: Yes

 Description   
3.0.1-1444 is a release of 3.0.0-1209 plus selected patches.
A regression of 46% was recorded: latency of 19ms vs. 13ms compared with 2.5.1, and vs. 14ms compared with 3.0.0-1209. Multiple runs were made to validate, with latency ranging from 19-30ms.





[MB-12419] stale=false view performance up to 10x slower than stale=ok or stale=update_after with no data changing Created: 23/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: view-engine
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Perry Krug Assignee: Sriram Melkote
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
I have a very basic case where I am trying to understand the performance difference between the various stale options. I've created the beer-sample bucket and have no incoming workload.

Running this command:
for i in `seq 1 1000`; do curl "http://ec2-54-176-27-210.us-west-1.compute.amazonaws.com:8092/beer-sample/_design/beer/_view/brewery_beers?inclusive_end=false&stale=update_after&connection_timeout=60000&limit=1&skip=0" > /dev/null 2> /dev/null; done

Yields about 80 view requests/sec.

Changing that to stale=false gets only about 9 or 10 view requests/sec.

With 2.5.1, they are almost identical.

So two questions:
-Is this difference in performance to be expected?
-Is there any tuning or profiling I can do to improve it?
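For an apples-to-apples comparison it helps to generate the request URLs for the different stale values from one template; a sketch (host, bucket, and view names taken from the command above):

```python
from urllib.parse import urlencode

def view_url(host, bucket, ddoc, view, stale, limit=1):
    """Build a view query URL; stale is 'ok', 'update_after', or 'false'."""
    params = urlencode({
        "inclusive_end": "false",
        "stale": stale,
        "connection_timeout": 60000,
        "limit": limit,
        "skip": 0,
    })
    return "http://%s:8092/%s/_design/%s/_view/%s?%s" % (
        host, bucket, ddoc, view, params)
```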

 Comments   
Comment by Volker Mische [ 30/Oct/14 ]
The difference is expected. With 3.0 the view engine needs to get the information on whether an index is up to date through DCP, which means some overhead (compared to 2.5, where it got it from disk and was even notified about changes).

stale=false is about certain guarantees: you will get the latest state available, and this comes with some overhead. Normally you shouldn't rely on views to be up to date, but rather work with stale views.

Whenever you do a stale=false request, you should expect quite some latency, as indexing might happen. It's unpredictable; hence I would close this bug as won't fix, as it is the expected behavior.
Comment by Perry Krug [ 30/Oct/14 ]
Thanks for the explanation Volker, but I disagree that there's nothing we could or should do with this bug.

Even with 2.5.1, didn't stale=false have to check with "something" whether there were any changes requiring an update?

The reason I'm focused on this case of no data changing is the same reason I was focused on the rebalance speed of an empty bucket many releases ago. Previously it took nearly 10 minutes to rebalance an empty bucket and we were able to bring that down to less than 1 minute which helped immensely in other cases, especially with customers who were rebalancing very small datasets.

For similar reasons, I really believe there is a large amount of improvement we can make here (going from 10x to 2x for example) that would help immensely with our customers who do want to use stale=false whether they have a lot of updates to process or not.
Comment by Volker Mische [ 30/Oct/14 ]
In 2.5.1 we didn't need to check, the updater was notified (and even when we checked, it was probably already in disk cache, hence fast). This slowness is due to using DCP. We don't have direct access to the database anymore, hence more overhead, hence slower. One could say that the slowdown is by design.

There might be ways to make it faster, but I can't think of a way without much effort.




[MB-12474] Ability to completely disable views for a given cluster. Created: 28/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: query, UI, view-engine
Affects Version/s: 3.0
Fix Version/s: feature-backlog
Security Level: Public

Type: Improvement Priority: Major
Reporter: Alex Ma Assignee: Sriram Melkote
Resolution: Unresolved Votes: 0
Labels: customer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Customer would like the ability to completely disable view/index generation for a given cluster.

Cluster may have only a k/v workload and customer is concerned about disk space usage if a view is created/published on a large data set.

 Comments   
Comment by Sriram Melkote [ 29/Oct/14 ]
We will be able to separate view and KV nodes in a cluster with planned changes. UPR was the first step toward allowing this. Node roles are the next one. However, this is a longer-term (i.e., post-Sherlock) deliverable.




[MB-12342] Man page for couchbase-cli Created: 13/Oct/14  Updated: 31/Oct/14

Status: Open
Project: Couchbase Server
Component/s: tools
Affects Version/s: 3.0
Fix Version/s: None
Security Level: Public

Type: Improvement Priority: Major
Reporter: Perry Krug Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: supportability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates to

 Description   
It would be extremely helpful to have a man page for couchbase-cli. Really would be great for all cli tools, but this is the main one that our customers use for automated deployments.






[MB-8669] Doc : Improve the documentation to explain durability options Created: 22/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: clients, documentation
Affects Version/s: 2.0, 2.0.1, 2.1.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Tug Grall (Inactive) Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Doc bug


 Description   
We have been asked from the community to clarify the "durability" options values and impact:
 - differences between PersistTo and ReplicateTo
 - impact of each value on the application code/behavior.

This is raised by our community users here:
http://www.couchbase.com/communities/q-and-a/difference-between-peristto-and-replicateto

Would be good to add description equivalent to the one made by househippo user, coming from:
http://www.couchbase.com/autodocs/couchbase-java-client-1.1.8/net/spy/memcached/PersistTo.html



 Comments   
Comment by Brad Wood [ 22/Jul/13 ]
The specific doc page I'd like to see clarified is the Java API docs for the set commends:
http://www.couchbase.com/docs/couchbase-sdk-java-1.1/couchbase-sdk-java-set-durability.html#table-couchbase-sdk_java_set-persist-replicate

{quote}
enum persistto Specify the number of nodes on which the document must be persisted to before returning.
enum replicateto Specify the number of nodes on which the document must be replicated to before returning
{quote}

Perhaps some verbiage that specifies that persistence doesn't mean just stored, but stored on disk. Also, another thing that had confused me originally was that it seemed replication and persistence were mutually independent actions, but now I believe that you cannot persist without replicating first. In other words, if I require persistence on 3 nodes, that automatically implies that replication to 3 nodes will also occur. If that is true, it might be useful to clarify.

Also, while just now re-reading that page, the word "requirment" is spelled wrong right above the last code block.
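If that reading is right, the two options could be sanity-checked against the bucket's replica count like this (an illustrative sketch, not SDK code; it assumes persist_to counts the active node plus replicas, while replicate_to counts replicas only):

```python
def check_durability(num_replicas, persist_to=0, replicate_to=0):
    """Validate PersistTo/ReplicateTo against a bucket's replica count.

    persist_to may include the active node (max num_replicas + 1);
    replicate_to covers replicas only (max num_replicas). Persisting
    to N nodes implies the data reached N - 1 replicas first.
    """
    if not 0 <= replicate_to <= num_replicas:
        return False
    if not 0 <= persist_to <= num_replicas + 1:
        return False
    return True
```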
Comment by Amy Kurtzman [ 05/Nov/13 ]
See Michael N. for info.
Comment by Brad Wood [ 30/Oct/14 ]
It's been 15 months since this ticket was created for a simple verbiage update. Is there anything I can do to help with it?




[MB-12520] Investigate kernel tunings for SSD/FusionIO performance Created: 30/Oct/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: 3.0
Fix Version/s: sherlock
Security Level: Public

Type: Task Priority: Major
Reporter: Perry Krug Assignee: Thomas Anderson
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Recently at a customer we managed to gain a significant disk write performance boost by setting these two kernel parameters (on RHEL):
kernel.sched_min_granularity_ns = 25000000
kernel.sched_migration_cost = 5000000

Keep in mind that you need to be already saturating the CPU but not the disk subsystem so this will need to be run on Enterprise SSD's and/or FusionIO and pushing >150k writes/sec to disk per node.

Coming from: https://access.redhat.com/sites/default/files/attachments/2012_perf_brief-low_latency_tuning_for_rhel6_0.pdf
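These can be applied at runtime with `sysctl -w`, or persisted via /etc/sysctl.conf. A small sketch that renders the settings in sysctl.conf form (values from the description; as noted above, apply only when CPU, not disk, is the bottleneck):

```python
TUNINGS = {
    "kernel.sched_min_granularity_ns": 25000000,
    "kernel.sched_migration_cost": 5000000,
}

def render_sysctl(settings):
    """Render a dict of kernel parameters as /etc/sysctl.conf lines."""
    return "\n".join("%s = %d" % (k, v) for k, v in sorted(settings.items()))
```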




[MB-12518] Cache Miss Ratio Above 200% Created: 30/Oct/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: 3.0.1
Fix Version/s: sherlock
Security Level: Public

Type: Bug Priority: Minor
Reporter: dbwycl Assignee: Chiyoung Seo
Resolution: Unresolved Votes: 0
Labels: windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows 2012 Hyper-V VM.

Attachments: PNG File screenshot.png    
Triage: Untriaged
Operating System: Windows 64-bit
Is this a Regression?: Unknown

 Description   
When you look at the bucket charts, the cache miss ratio is often at or above 200%, despite the fact that 100% of items are resident.




[MB-12496] CentOS 5 and Ubuntu 10.04 not deprecated, available as 3.0.1 Created: 29/Oct/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Minor
Reporter: Ian McCloy Assignee: Anil Kumar
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
http://docs.couchbase.com/admin/admin/Misc/deprecated.html

As the title says, CentOS 5 and Ubuntu 10.04 haven't been deprecated in 3.0.0. They are still available in 3.0.1; also, Red Hat 5 is missing from this list.



 Comments   
Comment by Ruth Harris [ 29/Oct/14 ]
Raju,

My understanding is that "deprecation" means that customers need to prepare for the operating system not being supported. And that not being supported is planned for some later release. Please clarify.

Please clarify which operating systems are now 1. deprecated and 2. not supported
Please confirm that RHEL 5 is deprecated and not supported. Also, which releases.

I've made this change:

Ubuntu 10.04 Ubuntu 10.04 will not be supported after Couchbase Server version 3.0. 3.0.0 (deprecated) 3.x (not supported)
CentOS 5 CentOS 5 will not be supported after Couchbase Server version 3.0. 3.0.0 (deprecated) 3.x (not supported)

Thanks, Ruth
Comment by Ian McCloy [ 30/Oct/14 ]
It's probably safe to say they are deprecated in 3.0.x not 3.0.0, but PM to confirm.




[MB-12525] Rerun 95% GET latency test for 2.5.1 build, forcing equivalent behavior as STALE=UPDATE_AFTER Created: 30/Oct/14  Updated: 30/Oct/14  Due: 07/Nov/14

Status: Open
Project: Couchbase Server
Component/s: performance
Affects Version/s: 2.5.1
Fix Version/s: None
Security Level: Public

Type: Task Priority: Major
Reporter: Thomas Anderson Assignee: Thomas Anderson
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: 12h
Time Spent: Not Specified
Original Estimate: 12h
Environment: 4x4 cluster, 1 Bucket, 150M x 2K documents, DGM 20%, 6Kops with 20% cache miss rate


 Description   
Modify the kv_latency test to force STALE=UPDATE_AFTER behavior, to allow a performance comparison between 2.5.1 and 3.0.1-1440. Use UPSERT before GET to force a flush of the buffer.




[MB-12504] couchbase-cli cluster-init does not work as documented Created: 29/Oct/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation, tools
Affects Version/s: 3.0
Fix Version/s: None
Security Level: Public

Type: Bug Priority: Critical
Reporter: Kirk Kirkconnell Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: customer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: Centos 64-bit
Is this a Regression?: Unknown

 Description   
When trying to init a new cluster using the command-line tool /opt/couchbase/bin/couchbase-cli, it does not init a cluster per what the utility's help system or our web site documentation says, in either the new or the old style of cluster-init; it seems the syntax changed for 3.0.

If you run the following command as prescribed by couchbase-cli -h or by the web documentation for 3.0, it looks like this:

./couchbase-cli cluster-init -c 127.0.0.1 \
       --cluster-username=Administrator \
       --cluster-password=password \
       --cluster-port=8091 \
       --cluster-ramsize=256

This does not work.

The way it was done in 2.5.1 and earlier was like this:
./couchbase-cli cluster-init -c 127.0.0.1 \
       --cluster-init-username=Administrator \
       --cluster-init-password=password \
       --cluster-init-port=8091 \
       --cluster-init-ramsize=256

This does not work on Couchbase 3.0 or 3.0.1.

Both of the above methods throw an error that you need -u and -p, but since this is an init of a new cluster, there is no existing username or password. If I just add -u and -p with appropriate information, it throws an error that the username and/or password are incorrect; but again, since there is no cluster yet, how can that be true?

The only way I was able to get this to work was using as follows:
# ./couchbase-cli cluster-init -c 127.0.0.1 \
> -u Administrator \
> -p password \
> --cluster-init-port=8091 \
> --cluster-init-ramsize=300
SUCCESS: init 127.0.0.1

This method does not follow either the newly documented method or the old one. IMO, the tool needs to be fixed to actually follow the current documentation. Just changing the document will not be enough as what the utility does today does not follow any convention. So it is the utility that needs to be fixed.

 Comments   
Comment by Ruth Harris [ 29/Oct/14 ]
Bin,

Could you clarify what needs to be changed and where?
1. CLI implementation, 2. CLI help info, and 3. CLI doc info

Thanks, Ruth
Comment by Bin Cui [ 30/Oct/14 ]
http://review.couchbase.org/#/c/42639/

We should not check Admin user/passwd when creating cluster.




[MB-12524] COLLATE function for specifying international sort orders for text data with n1ql Created: 30/Oct/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: None
Affects Version/s: sherlock
Fix Version/s: feature-backlog
Security Level: Public

Type: Improvement Priority: Major
Reporter: Cihan Biyikoglu Assignee: Sriram Melkote
Resolution: Unresolved Votes: 0
Labels: indexer, n1ql
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Post-Sherlock: could not find an item tracking this, so creating one.
We need a COLLATE function that can express sort order for international languages when creating indexes and expressing N1QL queries. Something similar to the following should be possible:

SELECT UPPERCASE(TOSTR(b1.c1) COLLATE fr_FR) FROM bucket as b1 ORDER BY TOSTR(b1.c1) COLLATE fr_FR




[MB-4568] Need detailed sizing information for Couchbase Server Created: 21/Dec/11  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Critical
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: customer, info-request
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Need updated sizing information, both for RAM and disk for 2.0.

For disk, sizing needs to take into account:
-Number of indexes
-Rate of new items / updates
-Compaction timing and thresholds
-Overhead of CouchDB storage
-Difference between JSON and binary data

 Comments   
Comment by kzeller [ 22/Feb/13 ]
For investigation, would defer the research to post 2.0.1
Comment by Anil Kumar [ 24/Jul/13 ]
didn't realize it was for documentation - reopening it.
Comment by kzeller [ 06/Aug/13 ]
Discussion with Anil 8/6/2013:

-Wait for Wayne to return: will do entire "Best Practices" series/workshops and cover this for 2.0, 2.1 and 2.2.
Comment by Anil Kumar [ 25/Mar/14 ]
Ruth to work with Pavel P to get the sizing information for Couchbase Server.




[MB-7395] Need way to document how to stop currently running compaction process per-bucket Created: 12/Dec/12  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: supportability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
When compaction is going to be running for a long time, we need a way to stop it if it is causing problems.

 Comments   
Comment by Dipti Borkar [ 17/Dec/12 ]
Can you add more details of what you mean by causing problems?

Compaction can be stopped if manually started (see image https://www.evernote.com/shard/s161/sh/e16f76d0-91b7-409b-9a40-f205e3dcd6f0/8cf0fd73a7a4c5de79c13f870fdc88fa ). I believe there is a REST API as well. I thought it was documented but may not be.

Assigning to Aliaksey to check on the REST API to stop compacting.
Comment by Aleksey Kondratenko [ 17/Dec/12 ]
Yes, manual compaction can be cancelled.

If it's automatic compaction and you want to stop, you can do it by disabling autocompaction either globally or in bucket details.
Comment by Dipti Borkar [ 17/Dec/12 ]
Aliaksey,

if there is a REST API to cancel, can you please add the details here and assign to Karen for documentation? Thanks

Comment by Dipti Borkar [ 17/Dec/12 ]
I think I found them.

/pools/PoolId/buckets/Id/controller/compactBucket/
/pools/PoolId/buckets/Id/controller/cancelBucketCompaction/
/pools/PoolId/buckets/Id/controller/compactDatabases/
/pools/PoolId/buckets/Id/controller/cancelDatabasesCompaction/
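A sketch of how a client could call the cancel endpoint above (the "default" pool id, port 8091, Basic auth, and the admin credentials are assumptions):

```python
import base64
import urllib.request

def cancel_compaction_request(host, bucket, user, password):
    """Build an authenticated POST request for the bucket's
    cancelBucketCompaction endpoint; send with urllib.request.urlopen()."""
    url = ("http://%s:8091/pools/default/buckets/%s"
           "/controller/cancelBucketCompaction" % (host, bucket))
    req = urllib.request.Request(url, data=b"", method="POST")
    token = base64.b64encode(("%s:%s" % (user, password)).encode()).decode()
    req.add_header("Authorization", "Basic " + token)
    return req
```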
Comment by Perry Krug [ 18/Dec/12 ]
It would be great to have a button on the bucket to stop compaction regardless of whether it was automatically or manually started. The main problem I'm referring to is the fact that compaction adds quite a bit of disk IO and can impact the speed of both the disk writer and background fetches. If an application in production starts experiencing problems because of this, we will want to stop compaction... and as part of usability/supportability, there should be an easy and obvious way for the end user to do that.
Comment by Perry Krug [ 18/Dec/12 ]
Adding documentation...

MC/Karen, could we get some documentation specifically on how to stop compaction until we have a button?
Comment by Perry Krug [ 18/Dec/12 ]
Sorry, just saw Dipti's screenshot. So can we just enable that button (and make it work obviously) when compaction is running automatically as well?
Comment by Aleksey Kondratenko [ 18/Dec/12 ]
There's not much point manually stopping automatic compaction. It'll restart itself within 30 seconds. The _right_ way is to disable autocompaction if that's what you want.

Comment by Perry Krug [ 18/Dec/12 ]
Yeah, that makes enough sense now.

Docs, can we have a writeup on how to stop compaction effectively both the automatic and manual kind?

Comment by kzeller [ 29/Apr/13 ]
Stopping compaction per bucket available here:

http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-rest-compacting-bucket.html
Comment by Perry Krug [ 30/Apr/13 ]
Karen, I think we need a little more work on this one.

-Manual compaction can also be started and stopped through the UI
-How does one stop an automatic compaction?
Comment by Amy Kurtzman [ 14/Jan/14 ]
Assign back to Alk to get information about manual compaction stopping process.




[MB-7718] Docs: Document Couchbase installation file structure Created: 12/Feb/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Critical
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: customer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Please create a section of documentation detailing the on-disk file/directory structure for Linux/Windows/Mac.

What/where each file is, the purpose, which files are important for backup, how much relative IO is expected for certain files and which are expected to grow and/or take up the most space.

 Comments   
Comment by kzeller [ 20/Mar/13 ]
Consolidating MB-7708: Customer request for on disk file formats too.
Comment by Anil Kumar [ 25/Mar/14 ]
Ruth - Work with Bin to gather information on file structures for linux/windows/mac
Comment by Amy Kurtzman [ 23/Jun/14 ]
Verify info is in 3.0 and close.




[MB-7693] Doc request: Document administrative task of re-sizing with additional resources in Couchbase Created: 06/Feb/13  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Critical
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: info-request
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by MB-7790 Docs: Document "administrative task" ... Open

 Description   
As a specific section, document the best practices around changing the RAM size of the whole cluster via rebalance, quota change, etc.

Should be accompanied by a description of when this might be used and the considerations involved. Here is a writeup from support that should be used/adapted: http://support.couchbase.com/entries/21719273-How-to-Reduce-the-RAM-size-on-an-existing-Couchbase-cluster-without-shutting-down-the-cluster

 Comments   
Comment by kzeller [ 18/Mar/13 ]
Hi Perry,

Who would be someone in the organization who knows about this topic and could provide the underlying information/knowledge/guidance?


Thanks,

Karen
Comment by kzeller [ 02/Aug/13 ]
Hi,

I finally have time to integrate this info. Could you please email me access to this content in the knowledge base.


Karen
Comment by Anil Kumar [ 25/Mar/14 ]
This is associated with other ticket MB-7790.
Comment by Perry Krug [ 23/Jun/14 ]
Is this dup'ed by mb-7790? I think those two are somewhat separate...




[MB-7678] [Done-RN 2.0.2] Stats calls through moxi don't always give valid stats Created: 04/Feb/13  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: moxi
Affects Version/s: 2.0, 2.0.1, 2.1.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Story Priority: Major
Reporter: Mike Wiederhold Assignee: Mike Wiederhold
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Forum post: http://www.couchbase.com/forums/comment/reply/1002748/1008656

Some of our users are going through moxi to get stats and this can result in incorrect information being returned. We need to decide how moxi should handle stats calls or if stats should just be disabled in moxi.

 Comments   
Comment by Maria McDuff (Inactive) [ 16/Apr/13 ]
steve contemplating whether to fix or not to fix. deferring out of 2.0.2.
Comment by Maria McDuff (Inactive) [ 19/Apr/13 ]
per PM, closing as won't fix. if this becomes a customer issue, we'll revisit.
moxi is no longer recommended in prod. using the smart client as the recommended.
Comment by Maria McDuff (Inactive) [ 19/Apr/13 ]
as discussed, PM recommends that this needs to be looked at since there are active users of moxi. mike agrees to investigate. if he has bandwidth next week, he will spend time to fix the stats issue.
Comment by kzeller [ 23/Apr/13 ]
Added to 2.0.2 Release notes as:

<rnentry type="knownissue">

<version ver="2.0.0m"/>

<class id="cluster"/>

<issue type="cb" ref="MB-7678"/>


<rntext>

<para>
Stats calls via Moxi are currently enabled; however, if you use the command-line tool <command>cbstats</command> through Moxi on port 11211 to
Couchbase Server, you will receive incorrect server statistics. To avoid this issue you should use port 11210 when you make a
<command>cbstats</command> request.
</para>


</rntext>

</rnentry>




[MB-7513] Couchbase "Runbooks" Created: 09/Jan/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Anonymous Assignee: Gokul Krishnan
Resolution: Unresolved Votes: 0
Labels: info-request
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: File Re- Crowdsourced Runbook.eml    

 Description   
Request from customers via intake from Perry:

-Couchbase "runbooks" are frequently requested by our customers. Things like "what to do when x y or z happens"

 Comments   
Comment by kzeller [ 21/Mar/13 ]
As the series progresses we can discuss how and where to incorporate into the main docs, similar to Tug's and Jasdeep's tutorials.

==============

Very often I get asked questions about best practices and "runbook" tasks from the administrators/operators of Couchbase...mostly from training sessions and existing customer engagements.

I had the idea earlier this week to try a blog series of once-a-week, self-contained answers to such questions. I'd ask that the "community" supply questions after seeding it with an initial set.

I'm comfortable with taking this on myself (famous last words) and was planning on using the existing blog infrastructure. Any feedback/pushback from you all?

Perry
Comment by kzeller [ 25/Mar/13 ]
Dependency:

-Perry to run crowd-sourced info aggregation
-Perry to deliver series of blogs that will be rolled into docs.
Comment by kzeller [ 24/Jul/13 ]
Hi Gokul,

This is another pubs requests that was supposed to turn into a series of blogs on the topic.

Let me know if this can be handled as knowledge base articles as this is more of 'operations'/field-facing content vs. core product features/behavior.


Regards,

Karen
Comment by Amy Kurtzman [ 23/Jun/14 ]
Gokul, is this still an outstanding issue?
Comment by Gokul Krishnan [ 25/Jun/14 ]
Yes




[MB-7790] Docs: Document "administrative task" of regular, planned server maintenance Created: 20/Feb/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Critical
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: customer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
duplicates MB-7693 Doc request: Document administrative ... Reopened

 Description   
From a customer request:
We are obliged to follow a regular OS patching schedule for all our servers and have a maintenance window every Friday night.
How would you recommend we deal with our Couchbase clusters for patching?
 
From reading the Couchbase 2.0 Manual it looks like we have two options, one being a failover, and the other removing the node then re-adding it.
What steps would you recommend we do when taking a node out to do maintenance on it? We plan to do this during our regular maintenance window when load on the servers would be really light.

And the answer:
Our best practice would be a graceful remove and rebalance so I would recommend that first. If you find it takes too long, you could do a failover. The danger with that is that some data would not be replicated and so an unexpected failure during that time would introduce a situation where you need to manually recover data. The graceful remove doesn't introduce that.

Given that these are vms, it would actually be best to spin up one or more new nodes and swap them into the cluster, that way you never reduce capacity.




Add stats for all operations to memcached (MB-7761)

[MB-7807] aggregate all kinds of ops in ops/sec stat (was: Replica Reads don't show up in the UI) Created: 22/Feb/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: UI
Affects Version/s: 1.8.1, 2.0, 2.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Technical task Priority: Major
Reporter: Michael Nitschinger Assignee: Anil Kumar
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by MB-7712 UI: CAS operations not aggregated int... Resolved
is duplicated by MB-10377 getl and cas not reported in the GUI ... Resolved

 Description   
Original description (by Michael):
While working on adding replica read to the java client, I noticed that replica reads don't show up in the UI as operations done.

I don't know if this is okay or not, but we should either fix or document it, because once we expose it to clients on a broader basis, they will run into this as well I guess.

Update:

There are other ticket(s) elsewhere. E.g. I've seen that cas is not reflected either (interesting; how so, given that the binary protocol doesn't have a dedicated cas command?). We should also consider counting the evict command as well as the get/set-meta commands.


 Comments   
Comment by Maria McDuff (Inactive) [ 25/Mar/13 ]
bug scrub: assigning to anil. moving to 2.1
Comment by Mike Wiederhold [ 02/Oct/14 ]
http://review.couchbase.org/#/c/41821
Comment by Mark Nunberg [ 12/Oct/14 ]
I'd like to add that this is even more important now that some SDKs are moving over to a unified 'document' model where mutation operations implicitly employ the CAS number. This can lead users (and maybe developers such as myself :)) to think that something is wrong.




[MB-7773] [Windows] back and next button don't work properly during offline upgrade Created: 19/Feb/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: installer
Affects Version/s: 2.0.1, 2.5.1, 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Thuan Nguyen Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: system-test, windows_pm_triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows 2008 R2 64bit

Attachments: PNG File ss_2013-02-19_at_1.01.00 PM.png     PNG File ss_2013-02-19_at_1.19.59 PM.png    
Triage: Untriaged
Is this a Regression?: Yes

 Description   
During an offline upgrade from Couchbase Server 1.8.1 to Couchbase Server 2.0.1-160, there is a window that lets the user choose between converting sqlite files to couch format manually or letting the upgrade process convert them. When I click the back button on that window, it goes back to the previous screen. Then when I click the next button, the screen with the option to convert the database manually does not display any more.

 Comments   
Comment by Maria McDuff (Inactive) [ 01/Apr/13 ]
per bug scrub: moving to 2.1.
Comment by Maria McDuff (Inactive) [ 20/May/14 ]
Bin,

should be an easy fix?
Comment by Anil Kumar [ 04/Jun/14 ]
Triage - June 04 2014 Bin, Ashivinder, Venu, Tony, Anil




[MB-7929] The installer should check the state for the windows firewall Created: 18/Mar/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: installer
Affects Version/s: 2.0.1, 2.1.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Trond Norbye Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Our installer should check the state of the windows firewall during installation and tell the user if it is blocking the desired ports.

Alternatively we could get warnings in our WEB UI if this is wrong

See http://msdn.microsoft.com/en-us/library/windows/desktop/ff956124(v=vs.85).aspx for information about the API for the Windows Firewall
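One low-effort way a support script could approximate this check is to parse `netsh advfirewall firewall show rule name=all` output rather than driving the COM API from the MSDN link. A rough sketch; the sample output below is illustrative, not captured from a real system, and real rules (port ranges, profiles, locale differences) need more care:

```python
import re

# Illustrative output only; real `netsh advfirewall firewall show rule name=all`
# output varies by Windows version and locale.
SAMPLE = """\
Rule Name: Couchbase Server (admin port)
----------------------------------------------------------------------
Enabled: Yes
Direction: In
Profiles: Domain,Private,Public
LocalPort: 8091
Protocol: TCP
Action: Allow

Rule Name: Block data port
----------------------------------------------------------------------
Enabled: Yes
Direction: In
LocalPort: 11210
Protocol: TCP
Action: Block
"""

def port_allowed(netsh_output, port):
    """Return True if some enabled inbound Allow rule covers the TCP port.

    A simplification: ignores port ranges, profiles, and rule ordering.
    """
    for block in netsh_output.split("\n\n"):
        enabled = re.search(r"Enabled:\s+Yes", block)
        allow = re.search(r"Action:\s+Allow", block)
        covers = re.search(r"LocalPort:.*\b%d\b" % port, block)
        if enabled and allow and covers:
            return True
    return False
```

On a live system the output would come from running netsh via subprocess; the INetFwPolicy2 COM API from the MSDN link above is the more robust route for the installer itself.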

 Comments   
Comment by Maria McDuff (Inactive) [ 22/Apr/13 ]
bin, any update on this bug?
Comment by Steve Yen [ 25/Apr/13 ]
Discussed in sprint planning - this will likely not make 2.0.2 timeframe.

It'd be an improvement/feature to check whether the ports are accessible.




[MB-8248] GetAndLock doesn't prevent a regular "get" from succeeding Created: 13/May/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0.1, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Perry Krug Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
In the dev guide (as well as maybe other places?): http://www.couchbase.com/docs/couchbase-devguide-2.0/get-and-lock.html

The text at the beginning says that getandlock will prevent the retrieval of an item...that will only be the case if getandlock is also used to retrieve it, but a normal get will still succeed.
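A toy in-memory model of the semantics the docs should state (not Couchbase internals): get-and-lock blocks writes and further lock attempts until the lock expires or the matching CAS is supplied, while a plain get still returns the value:

```python
import time

class ToyLockingStore:
    """Illustrative model of getandlock semantics, not the server implementation."""

    def __init__(self):
        self._data = {}   # key -> (value, cas)
        self._locks = {}  # key -> lock expiry timestamp
        self._cas = 0

    def set(self, key, value, cas=None):
        expiry = self._locks.get(key)
        if expiry is not None and time.time() < expiry:
            _, current_cas = self._data[key]
            if cas != current_cas:  # only the lock holder's CAS unlocks
                raise KeyError("LOCKED")
            del self._locks[key]
        self._cas += 1
        self._data[key] = (value, self._cas)
        return self._cas

    def get(self, key):
        # A plain get succeeds even while the key is locked.
        return self._data[key][0]

    def get_and_lock(self, key, lock_time=15):
        expiry = self._locks.get(key)
        if expiry is not None and time.time() < expiry:
            raise KeyError("LOCKED")
        value, cas = self._data[key]
        self._locks[key] = time.time() + lock_time
        return value, cas
```

The point for the dev guide: only writes and further get-and-lock calls observe the lock; retrieval via a normal get is unaffected.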

 Comments   
Comment by Perry Krug [ 13/May/13 ]
Reading further, it seems that a lot of the text here needs to be reworked. Can we get someone from the SDK team engaged to review it?
Comment by Maria McDuff (Inactive) [ 20/May/14 ]
Matt,

is there anyone from your team that can look at this?
Comment by Matt Ingenthron [ 20/May/14 ]
Possibly, though not likely in the next couple of weeks. I'll keep it on myself for now.
Comment by Matt Ingenthron [ 03/Jul/14 ]
I've sat on this too long. I believe Ruth is looking at overall dev guide issues, so assigning there.
Comment by Ruth Harris [ 03/Jul/14 ]
Hi... re-assigning to Amy since she's working on the Dev Guide right now (actually... converting to DITA XML). --ruth




[MB-8186] Docs: Configuring node quota Created: 02/May/13  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 1.8.1, 2.0.1, 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Perry Krug Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: Centos 64-bit

 Description   
The current docs linked to from here:
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-introduction-architecture-quotas.html
and
http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-introduction-architecture-quotas.html

Point to the REST API command for changing the memory quota:
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-restapi-cluster-memory-quota.html
http://www.couchbase.com/docs/couchbase-manual-1.8/restapi-cluster-memory-quota.html

It would be much better to not force the user to use the REST API and instead have them use the couchbase-cli:
http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-cli-initializing-nodes.html
and
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-cli-other-examples.html

 Comments   
Comment by kzeller [ 10/May/13 ]
Will appear in about an hour:

http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-introduction-architecture-quotas.html
and
http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-introduction-architecture-quotas.html
Comment by Perry Krug [ 12/May/13 ]
Karen, I don't see the change on the 1.8 instructions...it still seems to point to the REST API?

Also, for both of these, I don't think it's enough to just point to the general page of the couchbase-cli; it would be much more helpful to provide an example of the command that the user should run in order to change the quota, example output, error codes, etc. I think that goes along with another open issue on improving the documentation of the couchbase-cli, but at the very least this page should point to an anchor for the right command instead of just to the top of the page that lists all the couchbase-cli commands.
Comment by Amy Kurtzman [ 23/Jun/14 ]
Verify and close.




[MB-7955] Docs: cbstats documentation needs cleaning up Created: 21/Mar/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Perry Krug Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
This link: http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-cmdline-cbstats.html

The text: "Where BUCKET_HOST is the hostname and port (HOSTNAME[:PORT]) combination for a Couchbase bucket, and username and password are the authentication for the named bucket. COMMAND(and[options]) are one of the follow options:"

Is quite confusing and doesn't match the examples.

Additionally, this link: http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-monitoring-nodestats.html, doesn't include all of the command descriptions and duplicates some of the information in the previous page. All of this seems to need a bit of reworking.




[MB-8011] cbbackup - should monitor disk space and error before running out of space Created: 27/Nov/12  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: tools
Affects Version/s: 1.8.1, 2.0, 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Steve Yen Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: customer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Suggestion from Sharon / customer situation: a user ran cbbackup on the same machine as Couchbase Server, and cbbackup consumed all disk space, adversely affecting the server.

 Comments   
Comment by Dipti Borkar [ 28/Feb/13 ]
Need to estimate and perform a check even before starting.

and monitor at runtime to check for available space left and output a WARNING message to say "Only 5% of disk is available, backup may not complete successfully".
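A sketch of the kind of check described here, using `shutil.disk_usage` (Python 3.3+), which works on both POSIX and Windows; the 5% threshold and message wording are taken from the comment above:

```python
import shutil

def free_fraction(total, free):
    """Fraction of the volume still free; kept pure so it is easy to test."""
    return free / float(total)

def check_backup_destination(path, warn_below=0.05):
    """Return a warning string if the destination volume is low on space.

    shutil.disk_usage is cross-platform, which addresses the concern
    about needing a portable way to measure disk space.
    """
    usage = shutil.disk_usage(path)
    frac = free_fraction(usage.total, usage.free)
    if frac < warn_below:
        return ("WARNING: only %.1f%% of disk is available, "
                "backup may not complete successfully" % (frac * 100))
    return None
```

cbbackup could call this once at startup and again periodically between vbucket batches during the run.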
Comment by Anil Kumar [ 10/Apr/13 ]
Bin: Any update on this bug will this be fixed before code-freeze on Friday.
Comment by Anil Kumar [ 11/Apr/13 ]
Moving this to 2.1 since don't have enough bandwidth.
Comment by Bin Cui [ 29/Jul/14 ]
Move to backlog for the time being. We need to come up with a cross-platform approach to measure disk space.
Comment by Steve Yen [ 29/Jul/14 ]
One feedback on the "cost / benefit" side of doing something like this...

Any disk space checks we do ahead of time or even along the way become just a polite hint, but might not save the user from out-of-space situations. Imagine, for example, that cbbackup checks disk space at the start of the process, but in the middle, unbeknownst to cbbackup, somebody starts downloading a giant file and sucks up disk space.

A warning at startup, also, won't be seen by automated cron jobs that are kicking off cbbackup.




[MB-7965] bucket-flush takes over 8 seconds to complete on an empty bucket Created: 25/Mar/13  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: couchbase-bucket, ns_server
Affects Version/s: 2.0.1, 2.1.0, 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Perry Krug Assignee: Wayne Siu
Resolution: Unresolved Votes: 0
Labels: PM-PRIORITIZED, devX, ns_server-story, supportability
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
Simply running bucket-flush from the CLI takes over 8 seconds to return when run against a bucket with no items in it.

 Comments   
Comment by Aleksey Kondratenko [ 25/Mar/13 ]
This is duplicated somewhere. Mike currently owns this issue.
Comment by Maria McDuff (Inactive) [ 22/Apr/13 ]
hi mike, do u have the orig bug number? if you can locate it, can u update this bug so I can close this one? thanks.
Comment by Mike Wiederhold [ 22/Apr/13 ]
I didn't know we had another one. Also moving to 2.1.
Comment by Dipti Borkar [ 07/May/13 ]
Can't find any duplicate bug for this issue.
Comment by Dipti Borkar [ 07/May/13 ]
Duplicate of MB-6232
Comment by Aleksey Kondratenko [ 07/May/13 ]
Last customer (and restricted) comment actually looks like some unrelated bug. Please proceed with CBSE and feel free to assign to me.
Comment by Matt Ingenthron [ 11/Jul/13 ]
I think you mean MB-6232, but this is not currently believed to be a duplicate of that issue. That issue is specifically about the ep-engine level and this issue is more about the overall bucket flush feature.

Note that tests indicate even with the storage as ramdisk, it still takes multiple seconds.
Comment by Aleksey Kondratenko [ 11/Jul/13 ]
Currently believed to be caused by "erlang bits". Requires investigation by our team.
Comment by Maria McDuff (Inactive) [ 08/Oct/13 ]
Alk,

any update on this bug?
Comment by Aleksey Kondratenko [ 08/Oct/13 ]
No updates. Simply requires some time on investigation. Given 3.0 scope might or might not happen.
Comment by Maria McDuff (Inactive) [ 19/May/14 ]
Meenakshi,

are we seeing this latency on 3.0?
pls update. if so -- pls assign to Alk, not to me. Thanks.
Comment by Anil Kumar [ 04/Jun/14 ]
Triage - 06/04/2014 Alk, Wayne, Parag, Anil
Comment by Anil Kumar [ 04/Jun/14 ]
Meenakshi - Can you provide update?
Comment by Meenakshi Goel [ 05/Jun/14 ]
Below are the steps tried to reproduce the issue with the latest 3.0 build; please let me know if something else needs to be done or was missed:
OS: CentOS 6.4

#/opt/couchbase/bin/couchbase-cli bucket-create --bucket="default" --bucket-type=couchbase --bucket-port=11211 --bucket-ramsize=200 --bucket-replica=1 --enable-flush=1 --cluster=172.23.107.20:8091 -u Administrator -p password
SUCCESS: bucket-create
#cat /opt/couchbase/VERSION.txt
3.0.0-779-rel
# time /opt/couchbase/bin/couchbase-cli bucket-flush --bucket="default" --cluster=172.23.107.20:8091 -u Administrator -p password
Running this command will totally PURGE database data from disk.Do you really want to do it? (Yes/No)Yes
Database data will be purged from disk ...
SUCCESS: bucket-flush

real 0m9.299s
user 0m0.074s
sys 0m0.034s
Comment by Matt Ingenthron [ 23/Jun/14 ]
Note that flush has always been very long and asynchronous, which makes it impossible to reliably integrate into test cycles. We get significant user feedback on the impact on development.

Delete/create is not a workaround, as it too is long and asynchronous.

The best workaround known to date is to reduce the number of vbuckets to something like 4.

I'm not sure if the issue reported here is common, but that would be a good is/is-not item to check on this issue.
Comment by Aleksey Kondratenko [ 29/Sep/14 ]
Lets finally get this resolved.

My gut feeling is that 8 seconds has nothing to do with ns_server's actions and is likely due to multiple fsyncs as part of flushing everything.

In order to investigate this I'll need VM that has that 9 seconds to flush. Because on my box:

*) with barriers on it takes far longer (and we know it's due to fsyncs) (and I believe there should be ticket for that for ep-engine/storage)

*) without barriers it takes much less than 8 seconds.
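The fsync hypothesis is easy to sanity-check in isolation. A small illustrative benchmark (not the actual flush path): time a run of write+fsync cycles on the volume in question, with and without write barriers, and compare:

```python
import os
import tempfile
import time

def time_fsyncs(n=64, payload=b"x" * 4096):
    """Time n write+fsync cycles on a temp file; returns seconds elapsed.

    With barriers on, each fsync forces data to stable storage, which is
    the per-vbucket cost suspected to dominate the multi-second flush.
    """
    fd, path = tempfile.mkstemp()
    try:
        start = time.time()
        for _ in range(n):
            os.write(fd, payload)
            os.fsync(fd)
        return time.time() - start
    finally:
        os.close(fd)
        os.unlink(path)
```

If 1024 fsyncs (one per vbucket) on the affected VM account for most of the 8-9 seconds, the fix belongs in ep-engine/storage rather than ns_server.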




[MB-8563] Concurrent Cluster Updates Created: 18/Jan/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Anonymous Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Need to clarify with Jin:


Sorry for the delayed response. See my summary below:

* Per QE/Build Eng, we are going to ship 2.0.1 and beyond with the async threads turned on
* And, clearly document that Couchbase recommends vm.swappiness set to 10 (default 60)

Filipe - can you please ensure that all the upstream changes that are required for +A 16 have also been merged to 2.0.1 and beyond?
Karen - we probably need your help on documenting the above recommendation on vm.swappiness setting (if had done already)

Thanks much,
Jin

On Jan 6, 2013, at 11:19 AM, Perry Krug wrote:

Also Jin, are the async threads and swapiness changes being applied to future builds?


Simple, Fast, Elastic NoSQL Database
Perry Krug
Sr. Solutions Architect
Direct: +44 7445 029 287
Email: perry@couchbase.com

On Sun, Jan 6, 2013 at 11:16 AM, Perry Krug <perry@couchbase.com> wrote:
Thanks Jin, I've communicated this to the customer and will follow up with them.

Dipti/Sharon, it seems imperative that we get this new sizing information into the hands of the field and our customers...



On Sat, Jan 5, 2013 at 10:48 PM, Jin Lim <jin@couchbase.com> wrote:
Please see the QE test results below. All has been running OK more than 8 hours with the hw configuration based on Ketaki's one-off capacity planning.

Thanks much,
Jin

Begin forwarded message:

From: Thuan Nguyen <thuan@couchbase.com>
Subject: Re: conc cluster
Date: January 5, 2013 2:30:35 AM PST
To: Aliaksey <alkondratenko@gmail.com>, Ketaki Gangal <Ketaki@couchbase.com>
Cc: Farshid Ghods <farshid@couchbase.com>, Jin Lim <jin@couchbase.com>, Chisheng Hong <chisheng@couchbase.com>

With a more powerful server (more CPU cores, more RAM, swappiness = 10 and erlang set to +A 16), I could not reproduce the Concur issues in our in-house VMs.
None of the errors Concur saw in their cluster appeared in this test run after more than 8 hours of running.

Report is here:
        https://github.com/couchbaselabs/couchbase-qe-docs/blob/master/CONCUR/2013_01_05-6core-4gbRAM-swappiness10-A16.txt
Collect info and atop files from cluster A
        https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_0/conc/3nodes-atop-200GA-cluster-A-6core-20130105.tgz
        https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_0/conc/3nodes-collect-info-200GA-cluster-A-6core-20130105.tgz
Collect info and atop files from cluster B
        https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_0/conc/3nodes-atop-200GA-cluster-B-6core-20130105.tgz
        https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_0/conc/3nodes-collect-info-200GA-cluster-B-6core-20130105.tgz
Stats all, memcached and beam capture
        https://s3.amazonaws.com/packages.couchbase/collect_info/2_0_0/conc/stats-memcached-beam-200A-6core-20130105.tgz


Thanks
Thuan Nguyen

From: Aliaksey <alkondratenko@gmail.com>
Date: Thursday, January 3, 2013 12:13 PM
To: Ketaki Gangal <Ketaki@couchbase.com>
Cc: Farshid Ghods <farshid@couchbase.com>, Thuan Nguyen <thuan@couchbase.com>, Jin Lim <jin@couchbase.com>, Chisheng Hong <chisheng@couchbase.com>
Subject: Re: conc cluster

+1 for everything below from me. Including "swamp"-iness :)


On Thu, Jan 3, 2013 at 12:10 PM, Ketaki Gangal <Ketaki@couchbase.com> wrote:
Thanks Aliaksey.

Based on the run-results and the previous discussions, we are recommending the following capacity planning for this particular use-case.

RAM :
XDCR recommended value : 2G
OS : 0.5G
Bucket size : Concur Bucket is 1.2 G + Memory for Indexes
Data
Indexes - Memory used will vary based on number/type of indexes.

CPU
Memcached (up to 3 buckets) : 1 core
Cluster Management : 1 core
XDCR (1 replication) : 1 core
Per Index : 1 core
[Consider replica indexes in planning as well.]


The Concur use case has one 1.2G bucket with 2 replicas and 3 views in 1 ddoc, so the minimum capacity should be 4G of memory and a 6-core CPU.
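Tallying the figures quoted above (arithmetic only; the inputs are this thread's numbers, not a general sizing rule):

```python
def concur_sizing():
    """Tally the RAM and CPU figures quoted in the thread above."""
    ram_gb = 2.0 + 0.5 + 1.2          # XDCR + OS + bucket (index memory extra)
    cores = (1    # memcached (up to 3 buckets)
             + 1  # cluster management
             + 1  # XDCR, 1 replication stream
             + 3) # per-index: 3 views in 1 design doc
    return ram_gb, cores
```

3.7G of accounted RAM rounds up to the 4G minimum, and the core count comes out at 6, matching the recommendation.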

QE will run test with this updated environment, with +A 16 enabled on all the machines ( This is the recommended value from 2.0.1 onwards) and updating the swampiness to 10.

thanks,
Ketaki

On Jan 3, 2013, at 10:47 AM, Aliaksey Kandratsenka wrote:

Please dont.... See below


On Thu, Jan 3, 2013 at 9:02 AM, Farshid Ghods <farshid@couchbase.com> wrote:
OK now that we saw severe crash on both source and destination cluster ketaki please update the ticket and tony please complete the rest report with diags and other stats we captured

Next test is to restart the test by only switching the +A16 on all nodes on both clusters

It looks like there's some misunderstanding of what we need to do.

My understanding is that it's perfectly clear that their problem is not lack of +A (or at least not their main problem) but massively undersized hardware.

My understanding is that adding +A 16 will not help here, given that they managed to exhaust swap and cause even ns_server to be terminated by oom killer.

IMHO what we should test is adding more ram and "proving" that it's what we should advise them. IMHO
 
Farshid

Thuan Nguyen <thuan@couchbase.com> wrote:

I did change query to run on production view.
I saw bucket shutdown by memcached on 2 nodes 30 and 32 at destination


Shutting down bucket "loginservice" on 'ns_1@10.3.3.32' for server shutdown ns_memcached002 ns_1@10.3.3.32 22:18:08 - Wed Jan 2, 2013

Shutting down bucket "loginservice" on 'ns_1@10.3.3.30' for server shutdown ns_memcached002 ns_1@10.3.3.30 22:10:45 - Wed Jan 2, 2013



Thanks
Thuan Nguyen

From: Farshid Ghods <farshid@couchbase.com>
Date: Wednesday, January 2, 2013 9:47 PM
To: Thuan Nguyen <thuan@couchbase.com>
Cc: Ketaki Gangal <Ketaki@couchbase.com>
Subject: Re: conc cluster

btw this should be a production view ( i pubished it now so you need to change your mcsoda or curl commands to query a production view )

On Jan 2, 2013, at 6:18 PM, Thuan Nguyen <thuan@couchbase.com> wrote:

Load at each cluster 6M items with different keys at source and destination. Due to bi xdcr, both cluster will be 12M at the end.

Thanks
Thuan Nguyen

From: Farshid Ghods <farshid@couchbase.com>
Date: Wednesday, January 2, 2013 6:16 PM
To: Thuan Nguyen <thuan@couchbase.com>
Cc: Chisheng Hong <chisheng@couchbase.com>, Ketaki Gangal <Ketaki@couchbase.com>, Aliaksey <alkondratenko@gmail.com>
Subject: Re: conc cluster

Why 12m ? The test spec says 6m

Thuan Nguyen <thuan@couchbase.com> wrote:

Swap on source cluster reach 30% now. Bucket only has 1.5 M items. Will load to 12 M item. No crash yet.

Thanks
Thuan Nguyen



source cluster 1:
10.3.2.41
10.3.2.42
10.3.2.121

destination cluster 2:
10.3.3.40
10.3.3.42
10.3.3.43


Thanks
Thuan Nguyen






 Comments   
Comment by Amy Kurtzman [ 23/Jun/14 ]
Ruth, please verify that this is covered and close. Swapiness.




[MB-8593] MRW documentation needs to be updated Created: 11/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0, 2.1.1, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Perry Krug Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
While working with Anil, I learned that the initial descriptions of MRW are not as accurate as they could be, which led our manual to get out of date.

He's in the process of updating that, but I wanted to make sure that our public documentation gets updated as well:
http://www.couchbase.com/docs/couchbase-manual-2.1.0/couchbase-introduction-architecture-diskstorage.html
http://www.couchbase.com/docs/couchbase-manual-2.1.0/couchbase-admin-tasks-mrw.html

 Comments   
Comment by kzeller [ 11/Jul/13 ]
I will ask him for his notes/info to revise this when he has them.

Odd; this content did go through 2-3 reviews prior to 2.1, didn't it?
Comment by kzeller [ 31/Jul/13 ]
Hi,

After you have your updated notes of MRW changes for 2.1.1+, or 2.2, please attach and we will review. Then I'll incorporate.


Karen
Comment by Anil Kumar [ 27/Aug/13 ]
Can you talk to Sundar/Chiyoung and they should be able to provide you with accurate information.
Comment by Maria McDuff (Inactive) [ 18/Sep/13 ]
Re-Assigning server doc issues to Amy.




[MB-8564] cbhealthchecker should produce a timestamped, zipped file by default Created: 03/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: tools
Affects Version/s: 2.1.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Perry Krug Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Now that we are shipping cbhealthchecker with the product and asking customers to run it more frequently, it would be great if it produced a timestamped and zipped output file (similar to cbcollect_info) so that customers can upload it much more easily.
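A sketch of the requested behavior using only the standard library; the `healthcheck-` prefix and name format are assumptions modeled on cbcollect_info's output, not an existing convention:

```python
import datetime
import os
import shutil

def archive_report(report_dir, dest_dir="."):
    """Zip a cbhealthchecker-style output directory under a timestamped
    name, e.g. healthcheck-20130703-141502.zip, and return the zip path.
    """
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
    base = os.path.join(dest_dir, "healthcheck-%s" % stamp)
    # shutil.make_archive appends the .zip suffix itself
    return shutil.make_archive(base, "zip", report_dir)
```

The tool would call this on its existing HTML/text output directory as its final step, printing the resulting path for the customer to upload.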

 Comments   
Comment by Maria McDuff (Inactive) [ 08/Oct/13 ]
Bin,

any update on this issue?




[MB-8271] Document usage examples of vbuckettool Created: 14/May/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Perry Krug Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
http://www.couchbase.com/docs/couchbase-manual-2.0/couchbase-admin-cmdline-vbuckettool.html

There are a few other command-line tools that need similar attention.




[MB-8508] installer - windows packages should be signed Created: 26/Nov/12  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: build
Affects Version/s: 2.0, 2.1.0, 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Task Priority: Critical
Reporter: Steve Yen Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: windows_pm_triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates to
relates to MB-5577 print out Couchbase in the warning sc... Open
relates to MB-9165 Windows 8 Smartscreen blocks Couchbas... Resolved

 Description   
see also: http://www.couchbase.com/issues/browse/MB-7250
see also: http://www.couchbase.com/issues/browse/MB-49


 Comments   
Comment by Steve Yen [ 10/Dec/12 ]
Part of the challenge here would be figuring out the key-ownership process. Perhaps PM's should go create, register and own the signing keys/certs.
Comment by Steve Yen [ 31/Jan/13 ]
Reassigning as I think Phil has been tracking down the keys to the company.
Comment by Phil Labee (Inactive) [ 01/May/13 ]
Need more information:

Why do we need to sign windows app?
What problems are we addressing?
Do you want to release through the Windows Store?
What versions of Windows do we need to support?
Comment by Phil Labee (Inactive) [ 01/May/13 ]
need to know what problem we're trying to solve
Comment by Wayne Siu [ 06/Sep/13 ]
No security warning box is the objective.
Comment by Wayne Siu [ 20/Jun/14 ]
Anil,
I assume this is out of 3.0. Please update if it's not.
Comment by Anil Kumar [ 20/Jun/14 ]
we should still consider it for 3.0; unless there is no time to fix it, in which case it is a candidate for punting.
Comment by Wayne Siu [ 30/Jul/14 ]
Moving it out of 3.0.
Comment by Anil Kumar [ 17/Sep/14 ]
we need this for Windows 3.0 GA timeframe




[MB-8668] Separate CPU hungry pieces (xdcr and/or views) out of ns_server erlang VM Created: 19/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 2.2.0
Fix Version/s: sherlock, 3.0.2
Security Level: Public

Type: Task Priority: Critical
Reporter: Aleksey Kondratenko Assignee: Artem Stemkovski
Resolution: Unresolved Votes: 0
Labels: ns_server-story
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
XDCR can eat tons of resources, sometimes hogging cluster-management resources. Views can spend lots of CPU and IO too.

Plus we know that views are significantly faster if we don't run erlang with async IO threads. And we currently have to, because otherwise cluster management will not receive its CPU cycles in a timely manner.


 Comments   
Comment by Dipti Borkar [ 21/Jul/13 ]
Alk, any reason why this is marked for 2.2?
Comment by Aleksey Kondratenko [ 23/Jul/13 ]
Dipti, I was under the impression that we agreed to try to do this as part of 2.2.0. It would be awesome if we could (some people are doing weird things and causing weird timeouts in the management layer because of the way things are configured), but given time constraints this might be impossible.
Comment by Aleksey Kondratenko [ 25/Jul/13 ]
As per Dipti this is least priority




[MB-8670] DOCS: Info on backing up one cluster and restoring to another Created: 22/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: David Haikney Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
The 1.8 version of the manual contains information on how to backup from one cluster and restore to another:
http://www.couchbase.com/docs/couchbase-manual-1.8/couchbase-backup-restore-restore.html#couchbase-backup-restore-prevstate-different
This is particularly useful for customers who want to replicate a cluster to a new cluster on an upgraded version of the software. That referenced section appears to be missing from the 2.1 manual. Also, the paragraph on the following page is misleading and needs rewording:
http://www.couchbase.com/docs/couchbase-manual-2.1.0/couchbase-backup-restore-backup-cbbackup.html
(It is not a requirement for the vbucket maps to be identical in this scenario.)

We should also include a use-case oriented view of when and where to use backup and restore.
Perhaps in a how-to style, e.g.: (i) taking regular backups of your data, (ii) how to perform an offline upgrade, (iii) migrating your existing data to a new cluster.

 

 Comments   
Comment by Ruth Harris [ 18/Nov/13 ]
It looks like the information was not dropped by mistake, but by design.
These sections were revamped as of 2.0.

A use-case scenario is a good idea with your suggested How-to style.

I'm going to change the status from bug to enhancement for the above reasons, and schedule it for the 3.0 release.




[MB-8671] Improper error message in case of stale CAS. Created: 22/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: clients
Affects Version/s: 2.1.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Minor
Reporter: Deepti Dawar Assignee: Michael Nitschinger
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Getting an improper error message in case of a stale CAS. The following is what was tried:

1) Added a new key "testStaleCAS" with a value. This op stores the CAS value as cas1.
2) Invoked a set operation on the same key. This time the CAS changes to cas2.
3) Called an append operation on the key with the CAS value of cas1.

The error message returned from the server is 'Data Exists for the key',
whereas it should be something specific to the wrongly used CAS value, like 'Invalid use of CAS id'.

 Comments   
Comment by Mike Wiederhold [ 30/Sep/13 ]
Trond,

This is the generic error generated by memcached for the EEXISTS error. Do you think there is a way we could make the error message more specific?
Comment by Trond Norbye [ 01/Oct/13 ]
Which client is this?
Comment by Trond Norbye [ 01/Oct/13 ]
It should be fixed in the client. the error text is going to be removed from the server at some point..
Comment by Deepti Dawar [ 01/Oct/13 ]
Hi Trond,

This is the Java client, but it is only delegating the message that comes from the server.
Can you please check and let me know what you are sending back from the server in this scenario.

Thanks !
Comment by Trond Norbye [ 01/Oct/13 ]
It is the text that is currently being sent back from the server, but will be removed rather than changed. The client should provide the text for the various return codes (and possibly localize them).




[MB-9090] Couchbase Web Console "change" JSON numerical values due to JS limitations Created: 09/Sep/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.1.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Tug Grall (Inactive) Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: MacOSX 64-bit
Is this a Regression?: Yes

 Description   
When doing some tests with large numbers in JSON attributes, I realized that the Couchbase Console is 'changing/rounding' some values, probably due to the internal limitations of the JS language.

For example, save this using a client:
--
{
  "bigValue": 12345678901234567890,
  "bigValueAsString": "12345678901234567890"
}
---

if you look at this in the console the value is now:
---
{
  "bigValue": 12345678901234567000,
  "bigValueAsString": "12345678901234567890"
}
---

So the console "changes" the value, only as a view for now, but if the user clicks "Save", the value is changed in the DB.

This is due to JS limits; if I execute this in a JS engine:
console.log(12345678901234567890);
it shows:
12345678901234567000

So I do not have a solution for that, but I guess we should:
- at least document this
- see if we can do something like showing a message and disabling the save when one of the attributes would be rounded
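
The rounding above is a property of IEEE-754 doubles, not of the console itself; a minimal sketch in Python (whose floats are the same 64-bit doubles JS engines use):

```python
big = 12345678901234567890            # the 20-digit value from the example

# A JS engine must store JSON numbers as 64-bit floats, which cannot
# represent this integer exactly; Python's float shows the same loss.
as_float = float(big)

print(as_float == big)                # False: the value is not representable
print(int(as_float))                  # 12345678901234567168, not the original
```

JS consoles print 12345678901234567000 instead because they show the shortest decimal that round-trips to the same double; the stored value is identical.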






 Comments   
Comment by Aleksey Kondratenko [ 09/Sep/13 ]
Confirming the JS limitation here. JS works with 64-bit floats, so integers of about 52-53 bits are the largest it can represent exactly.

But I cannot make any decisions on how we want JSON to relate to JS. Passing to Dipti.
Comment by Maria McDuff (Inactive) [ 19/May/14 ]
Not a must-have for 3.0 (Mac OS X).
Passing to PM for post-3.0 evaluation.
Comment by Anil Kumar [ 23/Jun/14 ]
We need to document the limitation.




[MB-8872] a number of capi REST API endpoints are not secured Created: 19/Aug/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: ns_server, view-engine
Affects Version/s: 2.0, 2.1.0, 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Task Priority: Critical
Reporter: Aleksey Kondratenko Assignee: Nimish Gupta
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
The following APIs are apparently all unprotected.

[httpd_global_handlers]
/ = {couch_httpd_misc_handlers, handle_welcome_req, <<"Welcome">>}
_active_tasks = {couch_httpd_misc_handlers, handle_task_status_req}
_view_merge = {couch_httpd_view_merger, handle_req}
_set_view = {couch_set_view_http, handle_req}

[httpd_db_handlers]
_view_cleanup = {couch_httpd_db, handle_view_cleanup_req}
_compact = {couch_httpd_db, handle_compact_req}
_design = {couch_httpd_db, handle_design_req}
_changes = {couch_httpd_db, handle_changes_req}

[httpd_design_handlers]
_view = {couch_httpd_view, handle_view_req}
_info = {couch_httpd_db, handle_design_info_req}

At least _view above is overridden by capi layer.

I've myself just tried _changes feed and it worked.


 Comments   
Comment by Aleksey Kondratenko [ 19/Aug/13 ]
CC-ed some stakeholders
Comment by Aleksey Kondratenko [ 10/Oct/13 ]
Should be considered for 3.0
Comment by Maria McDuff (Inactive) [ 19/May/14 ]
Alk,

yes, pls fix. required for 3.0 ssl.
Comment by Sriram Melkote [ 23/May/14 ]
Filipe, as it's not clear what the downstream effects of securing these are, I request you to consider and fix this appropriately.
Comment by Sriram Melkote [ 11/Jun/14 ]
Alk, do you mean that these endpoints should need authentication? Or are they bypassing SSL when it's enabled?
Comment by Aleksey Kondratenko [ 11/Jun/14 ]
They lack auth today, exposing user data.
Comment by Sriram Melkote [ 16/Jun/14 ]
I'd like to defer this to 3.0.1, as it has been this way for many releases and I don't want to put non-bugfixes in at this point in the release.




[MB-8715] Append/Prepend commands are not using the EXPIRATION parameter and it should be documented Created: 29/Jul/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.1.0, 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Tug Grall (Inactive) Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

The append/prepend memcached commands take EXP (and FLAGS) as parameters, but these parameters are not used by the server.

We should either:
- document that these parameters are not used by these commands on this page:
http://www.couchbase.com/docs/memcached-api/memcached-api-protocol-text_append.html

- implement them, being careful that our behavior will then differ from vanilla memcached






CBHealthChecker - Fix fetching number of CPU processors (MB-8686)

[MB-8817] REST API support to report number of CPU cores for a specified node Created: 13/Aug/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Technical task Priority: Major
Reporter: Bin Cui Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
One approach is to publish an API to run cbcollect_info and retrieve the result remotely; a scope argument would limit the task to a certain group of tasks.

Another approach is to add the number of CPU cores to the output of the current REST call.

 Comments   
Comment by Aleksey Kondratenko [ 13/Aug/13 ]
I need a really good reason for that. We're not going to add random APIs for random needs.

Also, most likely some escript (not a REST API) is going to be easier to do, given that Erlang has this information.
Comment by Aleksey Kondratenko [ 16/Aug/13 ]
See my comment above




[MB-8693] [Doc] distribute couchbase-server through yum and ubuntu package repositories Created: 24/Jul/13  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: build
Affects Version/s: 2.1.0, 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Task Priority: Critical
Reporter: Anil Kumar Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: 2.0.1-release-notes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Dependency
depends on MB-6972 distribute couchbase-server through y... Resolved
Flagged:
Release Note

 Description   
This helps us in handling dependencies that are needed for Couchbase Server.
The SDK team has already implemented this for various SDK packages.

We might have to make some changes to our packaging metadata to work with this scheme.

 Comments   
Comment by Wayne Siu [ 10/Jul/14 ]
Steps are documented in MB-6972.
Please let us know if you have any questions.
Comment by Anil Kumar [ 11/Jul/14 ]
Ruth - Have we documented this for 3.0 features - we should.
Comment by Ruth Harris [ 17/Jul/14 ]
not yet.
Comment by Wayne Siu [ 30/Jul/14 ]
Any ETA on this?
Comment by Ruth Harris [ 05/Sep/14 ]
I added MB-6972 to the list of fixed bugs in the Release Notes.

The following instructions in MB-6972 are specific to CentOS 5/6 CE & EE and to yum. Also, Phil's last update said he failed to install Centos6-x86.

Basically, for CentOS 5/6 community/enterprise:

1. Log on to the CentOS machine.
2. wget http://packages.couchbase.com/releases/couchbase-server/keys/couchbase-server-public-key
3. gpg --import couchbase-server-public-key
4. sudo wget http://packages.couchbase.com/releases/couchbase-server/yum.repos.d/<5|6>/enterprise/couchbase-server.repo --output-document=/etc/yum.repos.d/couchbase-server.repo
5. vi /etc/yum.repos.d/couchbase-server.repo to verify the information
6. yum install couchbase-server





[MB-8774] healthchecker should warn or fail with Transparent Huge Pages enabled Created: 08/Aug/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: tools
Affects Version/s: 2.1.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Minor
Reporter: James Mauss Assignee: Bin Cui
Resolution: Unresolved Votes: 0
Labels: customer
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
We have seen Transparent Huge Pages being enabled cause problems and even crashes on nodes.

Is there a way to be able to warn or fail on install to make sure that these are not enabled?

 Comments   
Comment by Dipti Borkar [ 08/Aug/13 ]
Other NoSQL databases have the same issue. Not sure if we should build this into the installer; it's way too specific. We should certainly document it as a prerequisite and in the best-practices section.

James, can you please open a separate doc bug for this.
Comment by Perry Krug [ 09/Aug/13 ]
How about building it into the healthchecker and/or log analyser?
Comment by Dipti Borkar [ 09/Aug/13 ]
healthchecker is a good option.
Comment by Anil Kumar [ 27/Aug/13 ]
We need to add an ALERT to the healthchecker report if the tools find that 'Transparent Huge Pages' are enabled on RHEL6 servers.

ALERT: Disable 'Transparent HugePages' on RHEL6 Kernels

-------------

You can check the current setting for Transparent HugePages; the bracketed value is the active mode, e.g. "enabled=[always]":

    # cat /sys/kernel/mm/transparent_hugepage/enabled
    [always] madvise never
    #
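
The check described above boils down to reading that sysfs file and looking for the bracketed mode; a minimal sketch in Python (a hypothetical helper, not part of healthchecker):

```python
def thp_always_enabled(path="/sys/kernel/mm/transparent_hugepage/enabled"):
    """Return True when Transparent Huge Pages are set to [always].

    The kernel file reads e.g. "[always] madvise never"; the word in
    brackets is the active mode.
    """
    try:
        with open(path) as f:
            return "[always]" in f.read()
    except OSError:
        return False  # not Linux, or THP unsupported on this kernel
```

A healthchecker-style report would emit the ALERT above whenever this returns True.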


Comment by Bin Cui [ 27/Aug/13 ]
Healthchecker is an agentless monitoring/management tool. Unless ns_server provides such monitoring capability, healthchecker won't be able to run any script remotely.




[MB-9584] icu-config is shipped in our bin directory Created: 18/Nov/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: build
Affects Version/s: 2.1.1, 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Marty Schoch Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Operating System: MacOSX 64-bit

 Description   
Today I ran:

$ icu-config --ldconfig
### icu-config: Can't find /opt/couchbase/lib/libicuuc.dylib - ICU prefix is wrong.
### Try the --prefix= option
### or --detect-prefix
### (If you want to disable this check, use the --noverify option)
### icu-config: Exitting.

I was surprised that icu-config was in the couchbase bin dir, and therefore in my path.

I asked in IRC and no one seemed to think it was actually useful for anything, and it doesn't even appear to output valid paths anyway. Recommend we remove it to avoid any user confusion.

 Comments   
Comment by Aleksey Kondratenko [ 31/Jul/14 ]
It is actually useful on non-OSX Unix boxes. OSX is unique in this regard due to its prefix changing/rewriting. On GNU/Linux the ICU we ship is actually usable to compile and link against.
Comment by Chris Hillery [ 31/Jul/14 ]
So, I could either figure out how to make icu-config work on a MacOS build, or eliminate it on MacOS. The latter is certainly quicker so I'll probably do that.




[MB-9509] cbtransfer documentation to include details about memory snapshot limitation and python regex patterns Created: 11/Nov/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Aruna Piravi Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
cbtransfer documentation does not include the following details:

1. The search patterns supported by the -k or --key option.
The pattern matching syntax is at ...

    http://docs.python.org/2/howto/regex.html#regex-howto
    http://docs.python.org/2/library/re.html

2. cbtransfer works by snapshotting memory at the time it is invoked. All keys extracted pertain to the snapshot, not to data that is added/edited/deleted while cbtransfer is running.

Bin can add more info if needed. Thanks.
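
For item 1, the -k/--key option accepts a standard Python regular expression; a small sketch of the kind of filtering involved (the key names and anchoring here are illustrative, not cbtransfer's exact internals):

```python
import re

# Hypothetical key names; -k/--key takes a pattern in this re syntax.
keys = ["user::1001", "user::1002", "session::abc"]
pattern = re.compile(r"^user::\d+$")   # keys with a user:: prefix and numeric id
matched = [k for k in keys if pattern.match(k)]
# matched == ["user::1001", "user::1002"]
```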




[MB-9222] standalone moxi-server -- no core on RHEL5 Created: 04/Oct/13  Updated: 30/Oct/14  Due: 20/Jun/14

Status: Open
Project: Couchbase Server
Component/s: build
Affects Version/s: 1.8.1, 2.1.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Alexander Petrossian (PAF) Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: moxi
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
moxi init.d script contains
{code}
ulimit -c unlimited
{code}
line, which is supposed to allow core-dumps.

But then it uses the OS /etc/.../functions function "daemon",
which overrides this ulimit.

One needs to use the
{code}
DAEMON_COREFILE_LIMIT=unlimited
{code}

environment variable, which will be handled by the "daemon" function to do "ulimit -c unlimited".

 Comments   
Comment by Alexander Petrossian (PAF) [ 04/Oct/13 ]
Once we did that, we found out that moxi does chdir("/").
We found in the sources that one can use the "-r" command line switch to prevent the chdir("/") from happening.
Plus, the "/var/run" folder, which is chdired to prior to the "daemon" command, is no good anyway; it cannot be written to by the "moxi" user.

I feel that being able to write cores is very important.
I agree that it may not be a good idea to enable this by default.

But now this is broken in 3 places, which is not good.

We suggest:
# cd /tmp (instead of cd /var/run) -- usually a safe place for any user to write, and it exists on all systems
# document the -r command line switch (currently not documented in "moxi -h")
# add DAEMON_COREFILE_LIMIT before calling the "daemon" function
Comment by Alexander Petrossian (PAF) [ 04/Oct/13 ]
regarding the default... we see there is core (and .dump) here:
[root@spms-lbas ~]# ls -la /opt/membase/var/lib/membase
-rw------- 1 membase membase 851615744 Feb 19 2013 core.12674
-rw-r----- 1 membase membase 12285899 Oct 4 17:45 erl_crash.dump

so maybe it is a good idea to enable it by default?

[root@spms-lbas ~]# file /opt/membase/var/lib/membase/core.12674
/opt/membase/var/lib/membase/core.12674: ELF 64-bit LSB core file AMD x86-64, version 1 (SYSV), SVR4-style, from 'memcached'
[root@spms-lbas ~]#
Comment by Matt Ingenthron [ 20/Dec/13 ]
Steve: who is the right person to look at this these days?
Comment by Maria McDuff (Inactive) [ 19/May/14 ]
Iryna,

can you confirm if this is still happening in 3.0?
If it is, pls assign to Steve Y. Otherwise, resolve and close.
Thanks.
Comment by Steve Yen [ 25/Jul/14 ]
(scrubbing through ancient moxi bugs)

Hi Chris,
Looks like Alexander Petrossian has found the issue and the fix with the DAEMON_COREFILE_LIMIT env variable.

Can you incorporate this into moxi-init.d ?

Thanks,
Steve
Comment by Chris Hillery [ 25/Jul/14 ]
For prioritization purposes: Are we actually producing a standalone moxi product anymore? I'm unaware of any builds for it, so does it make sense to tag this bug "3.0" or indeed fix it at all?
Comment by Steve Yen [ 25/Jul/14 ]
Hi Chris,
We are indeed still supposed to provide a standalone moxi build (unless I'm out of date on news).

Priority-wise, IMHO, it's not the highest (but that's just my opinion), as I believe folks can still get by with the standalone moxi from 2.5.1. That is, moxi hasn't changed very much functionally -- although Trond & team did a bunch of rewriting / refactoring to make it easier to build and develop (cmake, etc).

Cheers,
Steve
Comment by Wayne Siu [ 01/Aug/14 ]
Per PM (Anil), standalone moxi is still supported in 3.0.
Raising the priority to Critical to be included in the release.




[MB-9426] Docs: AMI Instructions should indicate the need for external internet access when starting up Created: 28/Oct/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Perry Krug Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
When the Couchbase AMI (in Amazon's marketplace) starts up, it must download Couchbase from the internet. If being deployed inside a VPC and/or with blocked firewall access, this is not possible and no warning or sign of distress is noted.

Please update the installation instructions to indicate this need.




[MB-9553] Changes in testrunner should not trigger a new build Created: 14/Nov/13  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: build
Affects Version/s: 2.5.0, 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Minor
Reporter: Pavel Paulau Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Comments   
Comment by Phil Labee (Inactive) [ 14/Nov/13 ]
Changed from "bug" to "enhancement". Since testrunner is in the manifest, it is a product component.

The only way to accomplish what you want is to remove testrunner from the manifest.
Comment by Phil Labee (Inactive) [ 14/Nov/13 ]
How would it affect QE processes to remove testrunner from the manifests?
Comment by Pavel Paulau [ 14/Nov/13 ]
I'd not recommend excluding testrunner from the manifest. That is a very bad idea.
Comment by Pavel Paulau [ 14/Nov/13 ]
"Won't fix" is another bad idea.
Comment by Pavel Paulau [ 24/Mar/14 ]
Due to the new policy we create lots of new builds after testrunner changes.
And there are many Jenkins jobs that automatically verify new builds. A complete waste of time.

I'm pretty sure there is a programmatic way to filter out testrunner changes.
Comment by Wayne Siu [ 27/Mar/14 ]
Ceej,
If we could skip creating new builds for testrunner changes, it will free up builders for core product changes. Can you take a look?
Comment by Chris Hillery [ 27/Mar/14 ]
This is moot while the known-good builds are no longer in service. We only do builds every N hours, regardless of changes.

In the known-good process, I already relegated testrunner changes to last (ie, changes in any other project would be run first). I agree with Phil that we should not actually skip the builds for testrunner changes, because they are part of what determines whether the product is "correct".

I am lowering the priority of this to Minor for the time being. When/if the known-good process is restarted, we will revisit what the actual problem is and how best to solve it. IMHO, if we can lower our build time (which we want to do for many reasons), then this problem becomes moot for build-team. Perhaps we should not trigger QE runs for testrunner changes, however.




[MB-9143] Allow replica count to be edited Created: 17/Sep/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0
Fix Version/s: 2.5.0, 3.0.2

Type: Task Priority: Critical
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates to
relates to MB-2512 Allow replica count to be edited Closed

 Description   
Currently the replication factor cannot be edited after a bucket has been created. It would be nice to have this functionality.

 Comments   
Comment by Ruth Harris [ 06/Nov/13 ]
Currently, it's added to the 3.0 Eng branch by Alk. See MB-2512. This would be a 3.0 doc enhancement.
Comment by Perry Krug [ 25/Mar/14 ]
FYI, this is already in as-of 2.5 and probably needs to be documented there as well...if possible before 3.0
Comment by Amy Kurtzman [ 16/May/14 ]
Anil, Can you verify whether this was added in 2.5 or 3.0?
Comment by Anil Kumar [ 28/May/14 ]
Verified - as Perry mentioned, this was added in the 2.5 release. We need to document this soon for the 2.5 docs.




[MB-9825] Rebalance exited with reason bad_replicas Created: 06/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: couchbase-bucket
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Pavel Paulau Assignee: Venu Uppalapati
Resolution: Unresolved Votes: 0
Labels: performance, windows_pm_triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: 2.5.0 enterprise edition (build-1015)

Platform = Physical
OS = Windows Server 2012
CPU = Intel Xeon E5-2630
Memory = 64 GB
Disk = 2 x HDD

Triage: Triaged
Operating System: Windows 64-bit
Link to Log File, atop/blg, CBCollectInfo, Core dump: http://ci.sc.couchbase.com/job/zeus-64/564/artifact/

 Description   
Rebalance-out, 4 -> 3, 1 bucket x 50M x 2KB, DGM, 1 x 1 views

Bad replicators after rebalance:
Missing = [{'ns_1@172.23.96.27','ns_1@172.23.96.26',597},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',598},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',599},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',600},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',601},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',602},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',603},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',604},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',605},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',606},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',607},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',608},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',609},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',610},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',611},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',612},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',613},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',614},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',615},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',616},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',617},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',618},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',619},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',620},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',621},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',622},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',623},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',624},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',625},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',626},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',627},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',628},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',629},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',630},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',631},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',632},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',633},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',634},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',635},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',636},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',637},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',638},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',639},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',640},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',641},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',642},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',643},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',644},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',645},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',646},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',647},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',648},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',649},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',650},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',651},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',652},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',653},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',654},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',655},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',656},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',657},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',658},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',659},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',660},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',661},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',662},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',663},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',664},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',665},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',666},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',667},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',668},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',669},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',670},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',671},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',672},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',673},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',674},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',675},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',676},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',677},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',678},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',679},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',680},
{'ns_1@172.23.96.27','ns_1@172.23.96.26',681}]
Extras = []

 Comments   
Comment by Aleksey Kondratenko [ 06/Jan/14 ]
Looks like the producer node simply closed the socket.

Most likely a duplicate of the old issue where both socket sides suddenly see the connection as closed.

Relevant log messages:

[error_logger:info,2014-01-06T10:30:00.231,ns_1@172.23.96.26:error_logger<0.6.0>:ale_error_logger_handler:log_report:72]
=========================PROGRESS REPORT=========================
          supervisor: {local,'ns_vbm_new_sup-bucket-1'}
             started: [{pid,<0.1169.0>},
                       {name,
                           {new_child_id,
                               [597,598,599,600,601,602,603,604,605,606,607,
                                608,609,610,611,612,613,614,615,616,617,618,
                                619,620,621,622,623,624,625,626,627,628,629,
                                630,631,632,633,634,635,636,637,638,639,640,
                                641,642,643,644,645,646,647,648,649,650,651,
                                652,653,654,655,656,657,658,659,660,661,662,
                                663,664,665,666,667,668,669,670,671,672,673,
                                674,675,676,677,678,679,680,681],
                               'ns_1@172.23.96.27'}},
                       {mfargs,
                           {ebucketmigrator_srv,start_link,
                               [{"172.23.96.27",11209},
                                {"172.23.96.26",11209},
                                [{on_not_ready_vbuckets,
                                     #Fun<tap_replication_manager.2.133536719>},
                                 {username,"bucket-1"},
                                 {password,get_from_config},
                                 {vbuckets,
                                     [597,598,599,600,601,602,603,604,605,606,
                                      607,608,609,610,611,612,613,614,615,616,
                                      617,618,619,620,621,622,623,624,625,626,
                                      627,628,629,630,631,632,633,634,635,636,
                                      637,638,639,640,641,642,643,644,645,646,
                                      647,648,649,650,651,652,653,654,655,656,
                                      657,658,659,660,661,662,663,664,665,666,
                                      667,668,669,670,671,672,673,674,675,676,
                                      677,678,679,680,681]},
                                 {set_to_pending_state,false},
                                 {takeover,false},
                                 {suffix,"ns_1@172.23.96.26"}]]}},
                       {restart_type,temporary},
                       {shutdown,60000},
                       {child_type,worker}]



[rebalance:debug,2014-01-06T12:12:33.870,ns_1@172.23.96.26:<0.1169.0>:ebucketmigrator_srv:terminate:737]Dying with reason: normal

Mon Jan 06 12:12:44.371917 Pacific Standard Time 3: (bucket-1) TAP (Producer) eq_tapq:replication_ns_1@172.23.96.26 - disconnected, keep alive for 300 seconds
Comment by Maria McDuff (Inactive) [ 10/Jan/14 ]
Looks like a dupe of the memcached connection issue.
Will close this as a dupe.
Comment by Wayne Siu [ 15/Jan/14 ]
Chiyoung to add more debug logging to 2.5.1.
Comment by Chiyoung Seo [ 17/Jan/14 ]
I added more warning-level logs for disconnection events in the memcached layer. We will continue to investigate this issue for 2.5.1 or 3.0 release.

http://review.couchbase.org/#/c/32567/

merged.
Comment by Cihan Biyikoglu [ 08/Apr/14 ]
Given we have more verbose logging, can we reproduce the issue again and see if we can get a better idea on where the problem is?
thanks
Comment by Pavel Paulau [ 08/Apr/14 ]
This issue happened only on Windows so far.
I wasn't able to reproduce it in 2.5.1 and obviously we haven't tested 3.0 yet.
Comment by Cihan Biyikoglu [ 25/Jun/14 ]
Pavel, do you have the repro with the detailed logs now? if yes, could we assign to a dev for fixing?
Comment by Pavel Paulau [ 25/Jun/14 ]
This is Windows specific bug. We are not testing Windows yet.
Comment by Pavel Paulau [ 27/Jun/14 ]
Just FYI.

I have finally tried Windows build. It's absolutely unstable and not ready for performance testing yet.
Please don't expect news any time soon.
Comment by Anil Kumar [ 23/Sep/14 ]
Triage - Venu, any update on this ticket? When can you give us an update on this?





[MB-9863] Specify in Documentation: Physical vs Virtual RAM Created: 08/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.1.0, 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Gwen Leong (Inactive) Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Is this a Regression?: Yes

 Description   
A customer asked this question regarding the documentation:

Context: http://docs.couchbase.com/couchbase-manual-2.2/#using-multi--readers-and-writers
Sentence: The recommended hardware requirements are quad-core processes on 64-bit CPU and 3GHz, 16GB RAM physical storage.
Terminology in question: 16GB RAM physical storage

=>Question: Sounds like it's confusing 16GB physical RAM with disk storage, recommend to reword as "16GB physical RAM" or just "16GB RAM" as it may be a 16GB Virtual Machine and doesn't need to be physical.

 Comments   
Comment by Gwen Leong (Inactive) [ 08/Jan/14 ]
Hi Anil - let me know which is the best way to word it. After that, I'll be working with the writers to rewrite this first section.
Comment by Anil Kumar [ 20/Jun/14 ]
Minor fix




[MB-9805] moxi/win32: Reduce usage of low TCP/IP listen ports for internal moxi pipes. Created: 27/Dec/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: moxi
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Dave Rigby Assignee: Steve Yen
Resolution: Unresolved Votes: 0
Labels: windows
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Win32 (Windows 7 in local testing)


 Description   
A customer has noted that moxi listens on a number of low numbered ports in addition to the configured listen port(s) - default 11211. These are causing problems for the customer as they can sometimes block their other applications which expect to be able to listen on these ports.

The customer would like to be able to configure the ports used here, essentially to 'blacklist' certain ports from being used by moxi, or if possible remove these additional listening ports altogether.

Example of the ports used (windows netstat; note this *only* occurs on Win32, OS X / Linux netstat is clean):

$ less netstat_win.txt |grep -A1 LISTENING|grep -B1 moxi|grep -v "\-\-"
  TCP 0.0.0.0:1121 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 0.0.0.0:11211 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 127.0.0.1:1051 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 127.0.0.1:1053 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 127.0.0.1:1096 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 127.0.0.1:1103 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 127.0.0.1:1110 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP 127.0.0.1:1117 0.0.0.0:0 LISTENING
 [moxi.exe]
  TCP [::]:1120 [::]:0 LISTENING
 [moxi.exe]
  TCP [::]:11211 [::]:0 LISTENING
 [moxi.exe]



 Comments   
Comment by Dave Rigby [ 27/Dec/13 ]
From a brief look at the code it appears that on WIN32 moxi uses TCP listen sockets to create the send/receive notify FDs for inter-thread communication - see https://github.com/membase/moxi/blob/master/win32/win32.h#L257 and https://github.com/membase/moxi/blob/master/thread.c#L678 (added by commit: https://github.com/membase/moxi/commit/19bc663).

I'm no expert on Win32 socket programming, but why can we not use _pipe() from Windows' CRT (http://msdn.microsoft.com/en-us/library/edze9h7e(v=vs.80).aspx) instead of the pair of TCP sockets?
Comment by Trond Norbye [ 27/Dec/13 ]
Do they really _WANT_ to use moxi? Please note that they are _WAY_ better off using a "smart client" rather than going through moxi...




[MB-9759] Document the minimal set of steps to configure a node and cluster from scratch using the CLI Created: 17/Dec/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
We have all of the details by using couchbase-cli, but they are not listed out in one easy set of steps for us to point to via the docs.

It should include:
-Installing the node from scratch
-Configuring the necessary bits of one node, including setting the RAM size, optionally changing the data/index directories, and optionally setting the hostname
-Adding a node to an already existing cluster and rebalance
-Creating a bucket (not port-based, supply a password)

 Comments   
Comment by Ruth Harris [ 04/Apr/14 ]
Improvement to documentation. Scheduled for 3.0 release. --Ruth




[MB-9623] Capability table comparing couchbase and memcached buckets should include XDCR Created: 20/Nov/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Patrick Varley Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: http://docs.couchbase.com/couchbase-manual-2.2/#data-storage


 Description   
The third table under Data Storage which compares memcached Buckets to couchbase Buckets should have a row for XDCR.




[MB-9603] investigate why we need to sleep 1 sec on cluster leave Created: 19/Nov/13  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: ns_server
Affects Version/s: 2.1.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Task Priority: Major
Reporter: Artem Stemkovski Assignee: Artem Stemkovski
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
... and if possible get rid of this sleep




[MB-10039] [Multi-instance testing]CBworkloadgen crashed while running server re-add rebalance during 20Million item insert run. Created: 27/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Venu Uppalapati Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: CentOS 6.4 64-bit
RAM:256GB
Dual 2.9GHz 8-core Xeon E5-2690 for 32 total cores (16 + hyperthreading)

Attachments: File splitcbworkcrash4.z01     Zip Archive splitcbworkcrash4.zip    
Triage: Triaged
Operating System: Centos 64-bit
Is this a Regression?: Yes

 Description   
The following message is seen when an instance is failed over and rebalance is clicked. The instance is not the instance that cbworkloadgen is connected to.
2014-01-28 11:42:16,919: s1 refreshing sink map: http://localhost:9000
2014-01-28 11:42:16,927: s0 refreshing sink map: http://localhost:9000
2014-01-28 11:42:16,927: s2 refreshing sink map: http://localhost:9000
2014-01-28 11:45:10,348: s2 error: async operation: error: conn.send() exception: [Errno 32] Broken pipe on sink: http://localhost:9000(default@N/A-0)
2014-01-28 11:45:10,349: s0 error: async operation: error: conn.send() exception: [Errno 32] Broken pipe on sink: http://localhost:9000(default@N/A-2)
2014-01-28 11:45:10,351: s1 error: async operation: error: conn.send() exception: [Errno 32] Broken pipe on sink: http://localhost:9000(default@N/A-1)
error: conn.send() exception: [Errno 32] Broken pipe

Steps to reproduce:
0) The following steps were done to simulate system-level tests.
1) Run the following command to start inserting 20M items into the cluster:
user1@xxxx-1111 bin]$ ./cbworkloadgen -n localhost:9000 -i 200000000 -t 3 -u Administrator -p password
2) Create groups and hit rebalance.
3) While rebalance is in progress, fail over one node and then add it back. Click rebalance again.
4) At some point during the above steps, cbworkloadgen crashed.

2014-01-27 18:52:59,610: s0 warning: received NOT_MY_VBUCKET; perhaps the cluster is/was rebalancing; vbucket_id: 436, key: pymc77480993, spec: http://localhost:9000, host:port: 172.23.100.18:12000
2014-01-27 18:52:59,610: s0 warning: received NOT_MY_VBUCKET; perhaps the cluster is/was rebalancing; vbucket_id: 439, key: pymc77480108, spec: http://localhost:9000, host:port: 172.23.100.18:12000
2014-01-27 18:52:59,613: s0 refreshing sink map: http://localhost:9000
2014-01-27 18:53:48,938: s0 error: recv exception: [Errno 104] Connection reset by peer
2014-01-27 18:53:48,938: s2 error: recv exception: [Errno 104] Connection reset by peer
2014-01-27 18:53:48,938: s1 error: recv exception: [Errno 104] Connection reset by peer
2014-01-27 18:53:48,938: s0 MCSink exception:
2014-01-27 18:53:48,938: s2 MCSink exception:
2014-01-27 18:53:48,939: s1 MCSink exception:
2014-01-27 18:53:48,939: s0 error: async operation: error: MCSink exception: on sink: http://localhost:9000(default@N/A-0)
2014-01-27 18:53:48,939: s2 error: async operation: error: MCSink exception: on sink: http://localhost:9000(default@N/A-2)
2014-01-27 18:53:48,940: s1 error: async operation: error: MCSink exception: on sink: http://localhost:9000(default@N/A-1)
error: MCSink exception:


 Comments   
Comment by Venu Uppalapati [ 27/Jan/14 ]
attached cbcollect_info from node of cbworkloadgen crash
Comment by Anil Kumar [ 04/Jun/14 ]
Triage - June 04 2014 Bin, Ashivinder, Venu, Tony
Comment by Anil Kumar [ 17/Jun/14 ]
Triage : June 17 2014 Anil, Bin, Wayne, Ashvinder, Tony
Comment by Anil Kumar [ 17/Jul/14 ]
Triage - Bin, Ashvinder, Wayne .. July 17th

Bin is going to provide us the upper limit for the max items cbworkloadgen can accept. What's the default?

Assign this to documentation.

Comment by Steve Yen [ 29/Jul/14 ]
Spoke with Bin; the notes:

A while back, he wasn't able to reproduce this.

Also, back in July, Bin made a rebalance-related fix to cbworkloadgen (MB-7981), although that was just to smooth out pauses. But it might be related.

Also, there should be no limits on the "-i" param to cbworkloadgen, other than Python integer limits (2^63; MAX_INT). (Also, note that the cmd-line above has a 200 million item insert instead of 20 million items.)
Comment by Anil Kumar [ 29/Jul/14 ]
Ruth - Need to document the limitation or limit.




[MB-9931] "Using Reference Documents for Lookups" has code errors / confusing comments Created: 16/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Alex McFadyen Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
http://docs.couchbase.com/couchbase-devguide-2.2/#using-reference-documents-for-lookups

The documentation confuses the value for user::uid with user::count in several places, which makes it hard to follow and could (and did) really confuse people.

user::count = 101
user::uid = 12f1

Some examples :

{code}
# using same variables from above for the user's data

# add reference document for username
c.add("username::#{user_username.downcase}", new_id) # => save lookup document, with document key = "username::johnsmith" => 101

# add reference document for email
c.add("email::#{user_email.downcase}", new_id) # => save lookup document, with document key = "email::jsmith@domain.com" => 101

# add reference document for Facebook ID
c.add("fb::#{user_fb}", new_id) # => save lookup document, with document key = "fb::12393" => 101
{code}

It should have used 12f1 instead in that instance.

It's confused further in the next code block:

{code}
#retrieve input from a web form
user_username = params["username"]

# retrieve by user_id value using username provided in web form
user_id = c.get("username::#{user_username.downcase}") # => get the user_id # => 12f1
user_hash = c.get("user::#{user_id}") # => get the primary User document (key = user::12f1)

# puts user_hash
# => { "uid" => 101, "type" => "user", "name" => "John Smith", "email" => "jsmith@domain.com", "fbid" => "12393" }

#get additional web form parameter, email
user_email = params["email"]
{code}
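The intended pattern is easier to follow restated as a minimal in-memory sketch (Python here purely for illustration; the key names follow the examples above, and the dict-backed `add` helper is a stand-in for the SDK's add call):

```python
# In-memory sketch of the reference-document lookup pattern: the primary
# document lives at "user::<uid>", and the lookup documents map alternate
# keys (username, email, Facebook ID) to the uid "12f1" - not to the
# user::count value 101, which is where the docs go wrong.
store = {}

def add(key, value):
    """Stand-in for the SDK's add operation."""
    store[key] = value

uid = "12f1"
add("user::" + uid, {"uid": uid, "name": "John Smith",
                     "email": "jsmith@domain.com", "fbid": "12393"})

# Reference documents point at the uid:
add("username::johnsmith", uid)
add("email::jsmith@domain.com", uid)
add("fb::12393", uid)

# Two-step lookup by username, as in the web-form example:
user_id = store["username::johnsmith"]   # -> "12f1"
user = store["user::" + user_id]         # -> the primary User document
print(user["email"])
```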

 Comments   
Comment by Ruth Harris [ 04/Apr/14 ]
Could you indicate (specifically) which ones should be changed?

For example,
CHANGE TO:
# add reference document for Facebook ID
c.add("fb::#{user_fb}", new_id) # => save lookup document, with document key = "fb::12393" => 12f1





[MB-9874] [Windows] Couchstore drop and reopen of file handle fails Created: 09/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: storage-engine
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Trond Norbye Assignee: Chiyoung Seo
Resolution: Unresolved Votes: 0
Labels: windows, windows_pm_triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: Windows


 Description   
The unit test doing couchstore_drop_file and couchstore_reopen_file fails due to COUCHSTORE_READ_ERROR when it tries to reopen the file.

The commit http://review.couchbase.org/#/c/31767/ disabled the test to allow the rest of the unit tests to be executed.

 Comments   
Comment by Anil Kumar [ 17/Jul/14 ]
Triage - Chiyoung, Anil, Venu, Wayne .. July 17th




[MB-10050] Document - Configuring Couchbase to run on user-defined ports + multiple instance on single server Created: 29/Jan/14  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0
Fix Version/s: 2.5.1, 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Anil Kumar Assignee: Aruna Piravi
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Document - Configuring Couchbase to run on user-defined Firewall Port

 Comments   
Comment by Anil Kumar [ 26/Feb/14 ]
perry - I think the documentation needs fixing here as it currently doesn't mention anything about using multiple paths. Given that we require multiple paths, this means that it is impossible to use the 'rpm' installation to accomplish this, so the docs need to be updated to reflect that.
Comment by Perry Krug [ 05/May/14 ]
FYI, this is marked as "fix" for 2.5.1, but still has not been finished. There was a support ticket today with a user confused on how to accomplish this...can it be prioritized as our public documentation is currently very misleading.

Thank you
Comment by Ruth Harris [ 19/May/14 ]
Anil, Perry,

Can you better define what needs to be updated, changed, or added in the current documentation, and who the main contact person is for developing this material?
Comment by Perry Krug [ 19/May/14 ]
From what I can tell, there are two issues here:
-The docs need to be updated to reflect that you must install Couchbase with different paths for each instance
-The docs need to be updated to reflect that you cannot use the standard RPM installation since that does not allow for specifying separate paths

I believe Steve Yen or Aruna are the best points of contact.

Thanks!
Comment by Amy Kurtzman [ 16/Jun/14 ]
Please provide the information that needs to be changed or added to the documentation.




[MB-9917] DOC - memcached should dynamically adjust the number of worker threads Created: 14/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Blocker
Reporter: Trond Norbye Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
4 threads is probably not ideal for a 24 core system ;)

 Comments   
Comment by Anil Kumar [ 25/Mar/14 ]
Trond - Can you explain is this new feature in 3.0 or fixing documentation on older docs?
Comment by Ruth Harris [ 17/Jul/14 ]
Trond, Could you provide more information here and then reassign to me? --ruth
Comment by Trond Norbye [ 24/Jul/14 ]
New in 3.0 is that memcached no longer defaults to 4 threads for the frontend, but uses 75% of the number of cores reported by the system (with a minimum of 4).

There are 3 ways to tune this:

* Export MEMCACHED_NUM_CPUS=<number of threads you want> before starting Couchbase Server

* Use the -t <number> command line argument (this will go away in the future)

* Specify it in the configuration file read during startup (but when started from the full server this file is regenerated every time, so you'll lose the modifications)
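The default described above can be sketched as follows (only the 75%-with-a-floor-of-4 rule comes from the comment; the use of truncation is an assumption, and memcached's actual rounding may differ):

```python
# Approximation of the 3.0 default front-end thread count: 75% of the
# cores reported by the system, never fewer than 4.
def default_worker_threads(num_cores):
    return max(4, int(num_cores * 0.75))

print(default_worker_threads(24))  # 18 on a 24-core box
print(default_worker_threads(2))   # floor of 4 on small machines
```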




[MB-10183] Edit "include_doc" to "include_docs" for correct usage Created: 11/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Trivial
Reporter: Ketaki Gangal Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
In a couple of places I see "include_docs=true" referred to as "include_doc"; it is missing the 's'.
The resulting usage will not yield the expected result of including the docs in the query result.

Reference : http://www.couchbase.com/docs//couchbase-manual-2.0/couchbase-views-expiration.html

For example :
http://127.0.0.1:9500/beer-sample/_design/beer/_view/brewery_beers?stale=update_after&connection_timeout=60000&limit=10&skip=0&include_doc=true

{"total_rows":7302,"rows":[
{"id":"21st_amendment_brewery_cafe","key":["21st_amendment_brewery_cafe"],"value":null},
{"id":"21st_amendment_brewery_cafe-21a_ipa","key":["21st_amendment_brewery_cafe","21st_amendment_brewery_cafe-21a_ipa"],"value":null},

Whereas the correct usage and expected result for this query should be

http://127.0.0.1:9500/beer-sample/_design/beer/_view/brewery_beers?stale=update_after&connection_timeout=60000&limit=10&skip=0&include_docs=true

{"total_rows":7302,"rows":[
{"id":"21st_amendment_brewery_cafe","key":["21st_amendment_brewery_cafe"],"value":null,"doc":{"meta":{"id":"21st_amendment_brewery_cafe","rev":"1-00000001b0f188800000000000000000","expiration":0,"flags":0},"json":{"name":"21st Amendment Brewery Cafe","city":"San Francisco","state":"California","code":"94107","country":"United States","phone":"1-415-369-0900","website":"http://www.21st-amendment.com/","type":"brewery","updated":"2010-10-24 13:54:07","description":"The 21st Amendment Brewery offers a variety of award winning house made brews and American grilled cuisine in a comfortable loft like setting. Join us before and after Giants baseball games in our outdoor beer garden. A great location for functions and parties in our semi-private Brewers Loft. See you soon at the 21A!","address":["563 Second Street"],"geo":{"accuracy":"ROOFTOP","lat":37.7825,"lon":-122.393}}}},
{"id":"21st_amendment_brewery_cafe-21a_ipa","key":["21st_amendment_brewery_cafe","21st_amendment_brewery_cafe-21a_ipa"],"value":null,"doc":{"meta":{"id":"21st_amendment_brewery_cafe-21a_ipa","rev":"1-00000001ae132e250000000000000000","expiration":0,"flags":0},"json":{"name":"21A IPA","abv":7.2,"ibu":0.0,"srm":0.0,"upc":0,"type":"beer","brewery_id":"21st_amendment_brewery_cafe","updated":"2010-07-22 20:00:20","description":"Deep golden color. Citrus and piney hop aromas. Assertive malt backbone supporting the overwhelming bitterness. Dry hopped in the fermenter with four types of hops giving an explosive hop aroma. Many refer to this IPA as Nectar of the Gods. Judge for yourself. Now Available in Cans!","style":"American-Style India Pale Ale","category":"North American

 Comments   
Comment by Ruth Harris [ 06/Mar/14 ]
Being deprecated in 3.0.
Comment by Ruth Harris [ 01/May/14 ]
We have a whole section on Detecting Expired Documents in Result Sets that talk about using include_docs.
What's the replacement?
Comment by Volker Mische [ 17/Jul/14 ]
Ruth, there's no replacement on the HTTP API level. You would need to use the SDKs for it, they still support include_docs.




[MB-10180] Server Quota: Inconsistency between documentation and CB behaviour Created: 11/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Dave Rigby Assignee: Ruth Harris
Resolution: Unresolved Votes: 1
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File MB-10180_max_quota.png    
Issue Links:
Relates to
relates to MB-2762 Default node quota is still too high Resolved
relates to MB-8832 Allow for some back-end setting to ov... Open
Triage: Untriaged
Operating System: Ubuntu 64-bit
Is this a Regression?: Yes

 Description   
In the documentation for the product (and general sizing advice) we tell people to allocate no more than 80% of their memory for the Server Quota, to leave headroom for the views, disk write queues and general OS usage.

However on larger[1] nodes we don't appear to enforce this, and instead allow people to allocate up to 1GB less than the total RAM.

This is inconsistent, as we document and tell people one thing and let them do another.

This appears to be something inherited from MB-2762, the intent of which appeared to be to relax this only when joining a cluster; however, that doesn't appear to be how it works - I can successfully change the existing cluster quota from the CLI to a "large" value:

    $ /opt/couchbase/bin/couchbase-cli cluster-edit -c localhost:8091 -u Administrator -p dynam1te --cluster-ramsize=127872
    ERROR: unable to init localhost (400) Bad Request
    [u'The RAM Quota value is too large. Quota must be between 256 MB and 127871 MB (memory size minus 1024 MB).']

While I can see some logic to relax the 80% constraint on big machines, with the advent of 2.X features 1024MB seems far too small an amount of headroom.

Suggestions to resolve:

A) Revert to a straightforward 80% max, with a --force option or similar to allow specific customers to go higher if they know what they are doing
B) Leave current behaviour, but document it.
C) Increase minimum headroom to something more reasonable for 2.X, *and* document the behaviour.

([1] On a machine with 128,895MB of RAM I get the "total-1024" behaviour, on a 1GB VM I get 80%. I didn't check in the code what the cutoff for 80% / total-1024 is).


 Comments   
Comment by Dave Rigby [ 11/Feb/14 ]
Screenshot of initial cluster config: maximum quota is total_RAM-1024
Comment by Aleksey Kondratenko [ 11/Feb/14 ]
Do not agree with that logic.

There's IMHO quite a bit of difference between default settings, the recommended settings limit, and the allowed settings limit. The latter can be wider for folks who really know what they're doing.
Comment by Aleksey Kondratenko [ 11/Feb/14 ]
Passed to Anil, because that's not my decision to change limits
Comment by Dave Rigby [ 11/Feb/14 ]
@Aleksey: I'm happy to resolve as something other than my (A,B,C), but the problem here is that many people haven't even been aware of this "extended" limit in the system - and moreover on a large system we actually advertise it in the GUI when specifying the allowed limit (see attached screenshot).

Furthermore, I *suspect* that this was originally only intended for upgrades for 1.6.X (see http://review.membase.org/#/c/4051/), but somehow is now being permitted for new clusters.

Ultimately I don't mind what our actual max quota value is, but the app behaviour should be consistent with the documentation (and the sizing advice we give people).
Comment by Maria McDuff (Inactive) [ 19/May/14 ]
raising to product blocker.
this inconsistency has to be resolved - PM to re-align.
Comment by Anil Kumar [ 28/May/14 ]
Going with option B - Leave current behaviour, but document it.
Comment by Ruth Harris [ 17/Jul/14 ]
I only see the 80% number coming up as an example of setting the high water mark (85% suggested). The Server Quota section doesn't mention anything. The working set management & ejection section(s) and item pager sub-section also mention the high water mark.

Can you be more specific about where this information is? Anyway, the best solution is to add a "note" in the applicable section(s).

--ruth

Comment by Dave Rigby [ 21/Jul/14 ]
@Ruth: So the current product behaviour is that the ServerQuota limit depends on the maximum memory available:

* For machines with <= X MB of memory, the maximum server quota is 80% of total physical memory
* For machines with > X MB of memory, the maximum Server Quota is Total Physical Memory - 1024.

The value of 'X' is fixed in the code, but it wasn't obvious what it actually is (it's derived from a few different things). I suggest you ask Alk, who should be able to provide the value of it.
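The two-regime rule described here can be sketched as follows (the cutoff X is not known from this ticket, so CUTOFF_MB below is purely a placeholder):

```python
# Sketch of the maximum Server Quota rule described in this ticket.
# CUTOFF_MB is a hypothetical stand-in for the unknown 'X'.
CUTOFF_MB = 4096

def max_server_quota_mb(total_ram_mb):
    if total_ram_mb <= CUTOFF_MB:
        return int(total_ram_mb * 0.80)   # small machines: 80% of RAM
    return total_ram_mb - 1024            # large machines: RAM - 1024 MB

# Matches the CLI error in the description: a 128,895 MB machine is
# capped at 127,871 MB ("memory size minus 1024 MB").
print(max_server_quota_mb(128895))
```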




[MB-10144] couchbase-cli does not seem to allow for setting of hostname Created: 06/Feb/14  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by MB-12308 couchbase-cli should support a --host... Resolved
Triage: Untriaged

 Description   
I might just be missing it from the docs or help text, but I couldn't find a way of explicitly applying a hostname to a new node that was being started using couchbase-cli.

This is important for customers trying to automate the entire process of setting up a node.

I do see a REST API (http://docs.couchbase.com/couchbase-manual-2.2/#using-hostnames-with-couchbase-server), but we're trying to give users a single point of entry via couchbase-cli.

 Comments   
Comment by Steve Yen [ 06/Feb/14 ]
Hi Bin,

I marked this as 2.5.1 so it shows up on your radar sooner rather than later (3.0), but probably need to check PM really wants it in 2.5.1 or 2.5.2 or etc

-- steve
Comment by Perry Krug [ 19/Mar/14 ]
Bin, we actually do have both a UI and a REST API facility for setting the hostname of a node. I agree that we do not have the ability to change it afterwards, but I'm just asking for a similar CLI functionality of setting it in the first place so that users do not have to use a combination of multiple tools to setup their cluster:
http://docs.couchbase.com/couchbase-manual-2.5/cb-install/#using-hostnames
Comment by Bin Cui [ 15/Apr/14 ]
http://review.couchbase.org/#/c/35740/
Comment by Perry Krug [ 15/Apr/14 ]
Re-opening for docs...




[MB-10146] Document editor overwrites precision of long numbers Created: 06/Feb/14  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Triaged

 Description   
Just tested this out, not sure what diagnostics to capture so please let me know.

Simple test case:
-Create new document via document editor in UI
-Document contents are:
{"id": 18446744072866779556}
-As soon as you save, the above number is rewritten to:
{
  "id": 18446744072866780000
}
-The same effect occurs if you edit a document that was inserted with the above "long" number

 Comments   
Comment by Aaron Miller (Inactive) [ 06/Feb/14 ]
It's worth noting views will always suffer from this, as it is a limitation of Javascript in general. Many JSON libraries have this behavior as well (even though they don't *have* to).
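The rounding Aaron describes is easy to demonstrate; this Python sketch uses `float` as a stand-in for the IEEE-754 double in which JavaScript stores all numbers:

```python
# JavaScript numbers are IEEE-754 doubles with a 53-bit mantissa, so
# integers above 2**53 cannot be represented exactly and get rounded.
original = 18446744072866779556
as_double = int(float(original))  # value after a double round-trip

print(original > 2**53)        # True: beyond the exact-integer range
print(original != as_double)   # True: precision was lost on round-trip
```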
Comment by Aleksey Kondratenko [ 11/Apr/14 ]
cannot fix it. Just closing. If you want to reopen, please pass it to somebody responsible for overall design.
Comment by Perry Krug [ 11/Apr/14 ]
Reopening and assigning to docs, we need this to be release noted IMO.
Comment by Ruth Harris [ 14/Apr/14 ]
Reassigning to Anil. He makes the call on what we put in the release notes for known and fixed issues.
Comment by Anil Kumar [ 09/May/14 ]
Ruth - Lets release note this for 3.0.




[MB-10096] cbepctl doesn't support setting certain thresholds by percentage Created: 31/Jan/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0, 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Perry Krug Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Duplicate
is duplicated by MB-10092 Changing watermark thresholds doesn't... Closed
Triage: Untriaged

 Description   
I'm not sure if this is a bug in ep-engine or just that the docs need to be updated, but there are a few places where we state that cbepctl can be used to set certain values as a percentage, and the provided commands don't work (they take an absolute value instead of a percentage).

Please check with the engineering team on what the right documentation and support capabilities should be.

http://docs.couchbase.com/couchbase-manual-2.2/#changing-thresholds-for-ejection
http://docs.couchbase.com/couchbase-manual-2.2/#changing-disk-write-queue-quotas (the examples and the table of descriptions don't match)


 Comments   
Comment by Mike Wiederhold [ 19/Feb/14 ]
This is a documentation issue. You must use the percentage sign in order to set the value by percentage.




[MB-10126] It would be nice if make simple-test didn't flood the console Created: 05/Feb/14  Updated: 30/Oct/14  Due: 20/Jun/14

Status: Open
Project: Couchbase Server
Component/s: test-execution
Affects Version/s: feature-backlog
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Trond Norbye Assignee: Tommie McAfee
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Running make simple-test emits tons of output. I would prefer it to just print a line when it starts each test, then print the result of that test; all of the other output could go into a "logfile" I could "tail" if I want to.

With the current model it's pretty hard to know where in the test batch we're at, whether one of the tests has already failed, etc.

 Comments   
Comment by Maria McDuff (Inactive) [ 12/Feb/14 ]
Phil,

if this is not for you to fix, please assign accordingly.
Comment by Trond Norbye [ 20/Feb/14 ]
Running the test right now printed out 5099 lines...
Comment by Trond Norbye [ 20/Feb/14 ]
Iryna: could you reassign this to the right person within the team fixing stuff in testrunner?
Comment by Iryna Mironava [ 21/Feb/14 ]
http://review.couchbase.org/#/c/33798/
added target simple-test-no-logs
functional test console output looks like:
----------------------------------------------------------------------
Ran 1 test in 6.850s

OK
do_warmup_100k (memcapable.WarmUpMemcachedTest) ... ok

----------------------------------------------------------------------
Ran 1 test in 56.089s

OK
test_view_ops (view.createdeleteview.CreateDeleteViewTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 95.749s

OK
test_employee_dataset_startkey_endkey_queries_rebalance_in (viewquerytests.ViewQueryTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 180.524s

OK
test_simple_dataset_stale_queries_data_modification (viewquerytests.ViewQueryTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 72.432s

OK
For performance test output, please check with Pavel.
Comment by Trond Norbye [ 21/Feb/14 ]
That looks a lot better :) It shouldn't be its own target, though; flooding the console should instead be enabled by running:

make simple-test VERBOSE=1

(that's how we enable verbose output for the normal compilation; the VERBOSE variable only needs to be set, its value doesn't matter)
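The convention Trond proposes can be sketched as a small wrapper: per-test summary lines always go to the console, while the detailed output is redirected to a log file unless VERBOSE is set (its value doesn't matter). `run_one_test` here is a hypothetical stand-in for a single ./testrunner invocation:

```shell
#!/bin/sh
# Sketch only: summary on the console, detail in a log file you can "tail",
# unless VERBOSE is set in the environment.
LOGFILE=simple-test.log
: > "$LOGFILE"

if [ -n "$VERBOSE" ]; then
    exec 3>&1               # VERBOSE set: detail goes to the console too
else
    exec 3>>"$LOGFILE"      # default: detail only lands in the log file
fi

# Hypothetical stand-in for invoking one testrunner case.
run_one_test() {
    echo "starting $1"                            # always shown
    echo "... lots of detailed output for $1" >&3 # verbose detail
    echo "$1 ... ok"                              # always shown
}

run_one_test rebalance_in_with_ops
run_one_test do_warmup_100k
```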
Comment by Maria McDuff (Inactive) [ 14/Mar/14 ]
Iryna,

this can be marked as 'resolved - won't fix', as the verbosity parameter is configurable.
Comment by Trond Norbye [ 17/Mar/14 ]
I disagree. Try running make simple-test in the testrunner directory. It starts off nice, but then ends up with:

filename: conf/simple.conf
Global Test input params:
{'cluster_name': 'dev-4-nodes-xdcr',
 'conf_file': 'conf/simple.conf',
 'ini': 'b/resources/dev-4-nodes-xdcr.ini',
 'log_level': 'CRITICAL',
 'makefile': 'True',
 'num_nodes': 4,
 'spec': 'simple'}
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops,nodes_in=3,replicas=1,items=50000,doc_ops=create;update;delete.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops,nodes_in=3,replicas=1,items=50000,doc_ops=create;update;delete

Test Input params:
{'log_level': 'CRITICAL', 'replicas': '1', 'doc_ops': 'create;update;delete', 'items': '50000', 'nodes_in': '3', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 1, 'spec': 'simple'}
Run before suite setup for rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/performance.eperf.EVPerfClient.test_minimal,stats=0,items=1000,hot_init_items=1000.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t performance.eperf.EVPerfClient.test_minimal,stats=0,items=1000,hot_init_items=1000

Test Input params:
{'hot_init_items': '1000', 'stats': '0', 'items': '1000', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 2, 'log_level': 'CRITICAL', 'spec': 'simple'}
unable to import cbtop: see http://pypi.python.org/pypi/cbtop
unable to import seriesly: see http://pypi.python.org/pypi/seriesly
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/do_warmup_100k.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t memcapable.WarmUpMemcachedTest.do_warmup_100k

Test Input params:
{'cluster_name': 'dev-4-nodes-xdcr', 'log_level': 'CRITICAL', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 3, 'makefile': 'True', 'spec': 'simple', 'conf_file': 'conf/simple.conf', 'num_nodes': 4}
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/view.createdeleteview.CreateDeleteViewTests.test_view_ops,ddoc_ops=create,test_with_view=True,num_ddocs=1,num_views_per_ddoc=10,items=1000,skip_cleanup=False.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t view.createdeleteview.CreateDeleteViewTests.test_view_ops,ddoc_ops=create,test_with_view=True,num_ddocs=1,num_views_per_ddoc=10,items=1000,skip_cleanup=False

Test Input params:
{'cluster_name': 'dev-4-nodes-xdcr', 'ddoc_ops': 'create', 'log_level': 'CRITICAL', 'num_ddocs': '1', 'items': '1000', 'test_with_view': 'True', 'makefile': 'True', 'num_views_per_ddoc': '10', 'conf_file': 'conf/simple.conf', 'skip_cleanup': 'False', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 4, 'num_nodes': 4, 'spec': 'simple'}
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/view.viewquerytests.ViewQueryTests.test_employee_dataset_startkey_endkey_queries_rebalance_in,num_nodes_to_add=1,skip_rebalance=true,docs-per-day=1.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t view.viewquerytests.ViewQueryTests.test_employee_dataset_startkey_endkey_queries_rebalance_in,num_nodes_to_add=1,skip_rebalance=true,docs-per-day=1

Test Input params:
{'docs-per-day': '1', 'log_level': 'CRITICAL', 'num_nodes_to_add': '1', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'skip_rebalance': 'true', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 5, 'spec': 'simple'}
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 1 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/view.viewquerytests.ViewQueryTests.test_simple_dataset_stale_queries_data_modification,num-docs=1000,skip_rebalance=true.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t view.viewquerytests.ViewQueryTests.test_simple_dataset_stale_queries_data_modification,num-docs=1000,skip_rebalance=true

Test Input params:
{'log_level': 'CRITICAL', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'skip_rebalance': 'true', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 6, 'num-docs': '1000', 'spec': 'simple'}
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 2 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/xdcr.uniXDCR.unidirectional.load_with_ops,replicas=1,items=10000,value_size=128,ctopology=chain,rdirection=unidirection,doc-ops=update-delete.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t xdcr.uniXDCR.unidirectional.load_with_ops,replicas=1,items=10000,value_size=128,ctopology=chain,rdirection=unidirection,doc-ops=update-delete

Test Input params:
{'doc-ops': 'update-delete', 'log_level': 'CRITICAL', 'replicas': '1', 'items': '10000', 'value_size': '128', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ctopology': 'chain', 'rdirection': 'unidirection', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 7, 'spec': 'simple'}
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 2 , fail 0
summary so far suite xdcr.uniXDCR.unidirectional , pass 1 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Logs will be stored at /Users/trond/compile/couchbase/clean/testrunner/logs/testrunner-14-Mar-17_21-14-52/xdcr.uniXDCR.unidirectional.load_with_failover,replicas=1,items=10000,ctopology=chain,rdirection=unidirection,doc-ops=update-delete,failover=source.logging.conf

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t xdcr.uniXDCR.unidirectional.load_with_failover,replicas=1,items=10000,ctopology=chain,rdirection=unidirection,doc-ops=update-delete,failover=source

Test Input params:
{'doc-ops': 'update-delete', 'log_level': 'CRITICAL', 'replicas': '1', 'items': '10000', 'failover': 'source', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ctopology': 'chain', 'rdirection': 'unidirection', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 8, 'spec': 'simple'}
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 2 , fail 0
summary so far suite xdcr.uniXDCR.unidirectional , pass 2 , fail 0
testrunner logs, diags and results are available under logs/testrunner-14-Mar-17_21-14-52
Run after suite setup for xdcr.uniXDCR.unidirectional.load_with_failover
rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops pass
performance.eperf.EVPerfClient.test_minimal pass
memcapable.WarmUpMemcachedTest.do_warmup_100k pass
view.createdeleteview.CreateDeleteViewTests.test_view_ops pass
view.viewquerytests.ViewQueryTests.test_employee_dataset_startkey_endkey_queries_rebalance_in pass
view.viewquerytests.ViewQueryTests.test_simple_dataset_stale_queries_data_modification pass
xdcr.uniXDCR.unidirectional.load_with_ops pass
xdcr.uniXDCR.unidirectional.load_with_failover pass
scripts/start_cluster_and_run_tests.sh: line 81: 30701 Terminated: 15 COUCHBASE_NUM_VBUCKETS=64 python ./cluster_run --nodes=$servers_count >&$wd/cluster_run.log (wd: ~/compile/couchbase/clean/ns_server)


Comment by Trond Norbye [ 30/Apr/14 ]
I just ran this again, and it _starts off_ nice, but then dumps a ton of output (see below). This is a problem for me since it makes it hard to see whether things failed or not (so I have to either search through all of the text, or do the echo $? thing...).

trond@ok:1026> gmake run-mats
cd testrunner && /Applications/Xcode.app/Contents/Developer/usr/bin/make simple-test
scripts/start_cluster_and_run_tests.sh b/resources/dev-4-nodes-xdcr.ini conf/simple.conf 0
~/compile/couchbase/ptrace/testrunner ~/compile/couchbase/ptrace/testrunner
~/compile/couchbase/ptrace/testrunner
rebalance_in_with_ops (rebalance.rebalancein.RebalanceInTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 86.373s

OK
test_minimal (performance.eperf.EVPerfClient)
Minimal performance test which covers load and access phases ... ok

----------------------------------------------------------------------
Ran 1 test in 4.081s

OK
do_warmup_100k (memcapable.WarmUpMemcachedTest) ... ok

----------------------------------------------------------------------
Ran 1 test in 45.663s

OK
test_view_ops (view.createdeleteview.CreateDeleteViewTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 80.916s

OK
test_employee_dataset_startkey_endkey_queries_rebalance_in (view.viewquerytests.ViewQueryTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 114.800s

OK
test_simple_dataset_stale_queries_data_modification (view.viewquerytests.ViewQueryTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 27.321s

OK
load_with_ops (xdcr.uniXDCR.unidirectional) ... ok

----------------------------------------------------------------------
Ran 1 test in 146.671s

OK
load_with_failover (xdcr.uniXDCR.unidirectional) ... ok

----------------------------------------------------------------------
Ran 1 test in 202.815s

OK
filename: conf/simple.conf
Global Test input params:
{'cluster_name': 'dev-4-nodes-xdcr',
 'conf_file': 'conf/simple.conf',
 'ini': 'b/resources/dev-4-nodes-xdcr.ini',
 'log_level': 'CRITICAL',
 'makefile': 'True',
 'num_nodes': 4,
 'spec': 'simple'}
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_1

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops,nodes_in=3,replicas=1,items=50000,doc_ops=create;update;delete

Test Input params:
{'log_level': 'CRITICAL', 'replicas': '1', 'doc_ops': 'create;update;delete', 'items': '50000', 'nodes_in': '3', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 1, 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_1', 'spec': 'simple'}
Run before suite setup for rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops
Cluster instance shutdown with force
Cluster instance shutdown with force
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_1
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_2

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t performance.eperf.EVPerfClient.test_minimal,stats=0,items=1000,hot_init_items=1000

Test Input params:
{'hot_init_items': '1000', 'stats': '0', 'items': '1000', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 2, 'log_level': 'CRITICAL', 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_2', 'spec': 'simple'}
unable to import cbtop: see http://pypi.python.org/pypi/cbtop
unable to import seriesly: see http://pypi.python.org/pypi/seriesly
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_2
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_3

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t memcapable.WarmUpMemcachedTest.do_warmup_100k

Test Input params:
{'cluster_name': 'dev-4-nodes-xdcr', 'log_level': 'CRITICAL', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 3, 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_3', 'makefile': 'True', 'spec': 'simple', 'conf_file': 'conf/simple.conf', 'num_nodes': 4}
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_3
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_4

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t view.createdeleteview.CreateDeleteViewTests.test_view_ops,ddoc_ops=create,test_with_view=True,num_ddocs=1,num_views_per_ddoc=10,items=1000,skip_cleanup=False

Test Input params:
{'cluster_name': 'dev-4-nodes-xdcr', 'ddoc_ops': 'create', 'log_level': 'CRITICAL', 'num_ddocs': '1', 'items': '1000', 'test_with_view': 'True', 'makefile': 'True', 'num_views_per_ddoc': '10', 'conf_file': 'conf/simple.conf', 'skip_cleanup': 'False', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 4, 'num_nodes': 4, 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_4', 'spec': 'simple'}
Cluster instance shutdown with force
Cluster instance shutdown with force
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_4
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_5

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t view.viewquerytests.ViewQueryTests.test_employee_dataset_startkey_endkey_queries_rebalance_in,num_nodes_to_add=1,skip_rebalance=true,docs-per-day=1

Test Input params:
{'docs-per-day': '1', 'log_level': 'CRITICAL', 'num_nodes_to_add': '1', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'skip_rebalance': 'true', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 5, 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_5', 'spec': 'simple'}
Cluster instance shutdown with force
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 1 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_5
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_6

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t view.viewquerytests.ViewQueryTests.test_simple_dataset_stale_queries_data_modification,num-docs=1000,skip_rebalance=true

Test Input params:
{'log_level': 'CRITICAL', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'skip_rebalance': 'true', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 6, 'num-docs': '1000', 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_6', 'spec': 'simple'}
Cluster instance shutdown with force
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 2 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_6
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_7

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t xdcr.uniXDCR.unidirectional.load_with_ops,replicas=1,items=10000,value_size=128,ctopology=chain,rdirection=unidirection,doc-ops=update-delete

Test Input params:
{'doc-ops': 'update-delete', 'log_level': 'CRITICAL', 'replicas': '1', 'items': '10000', 'value_size': '128', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ctopology': 'chain', 'rdirection': 'unidirection', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 7, 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_7', 'spec': 'simple'}
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
Cluster instance shutdown with force
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 2 , fail 0
summary so far suite xdcr.uniXDCR.unidirectional , pass 1 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_7
Logs will be stored at /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_8

./testrunner -i b/resources/dev-4-nodes-xdcr.ini makefile=True -t xdcr.uniXDCR.unidirectional.load_with_failover,replicas=1,items=10000,ctopology=chain,rdirection=unidirection,doc-ops=update-delete,failover=source

Test Input params:
{'doc-ops': 'update-delete', 'log_level': 'CRITICAL', 'replicas': '1', 'items': '10000', 'failover': 'source', 'makefile': 'True', 'conf_file': 'conf/simple.conf', 'num_nodes': 4, 'cluster_name': 'dev-4-nodes-xdcr', 'ctopology': 'chain', 'rdirection': 'unidirection', 'ini': 'b/resources/dev-4-nodes-xdcr.ini', 'case_number': 8, 'logs_folder': '/Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_8', 'spec': 'simple'}
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
setting param: exp_pager_stime 10
Cluster instance shutdown with force
summary so far suite rebalance.rebalancein.RebalanceInTests , pass 1 , fail 0
summary so far suite performance.eperf.EVPerfClient , pass 1 , fail 0
summary so far suite memcapable.WarmUpMemcachedTest , pass 1 , fail 0
summary so far suite view.createdeleteview.CreateDeleteViewTests , pass 1 , fail 0
summary so far suite view.viewquerytests.ViewQueryTests , pass 2 , fail 0
summary so far suite xdcr.uniXDCR.unidirectional , pass 2 , fail 0
testrunner logs, diags and results are available under /Users/trond/compile/couchbase/ptrace/testrunner/logs/testrunner-14-Apr-30_12-07-38/test_8
Run after suite setup for xdcr.uniXDCR.unidirectional.load_with_failover
rebalance.rebalancein.RebalanceInTests.rebalance_in_with_ops pass
performance.eperf.EVPerfClient.test_minimal pass
memcapable.WarmUpMemcachedTest.do_warmup_100k pass
view.createdeleteview.CreateDeleteViewTests.test_view_ops pass
view.viewquerytests.ViewQueryTests.test_employee_dataset_startkey_endkey_queries_rebalance_in pass
view.viewquerytests.ViewQueryTests.test_simple_dataset_stale_queries_data_modification pass
xdcr.uniXDCR.unidirectional.load_with_ops pass
xdcr.uniXDCR.unidirectional.load_with_failover pass
scripts/start_cluster_and_run_tests.sh: line 81: 63983 Terminated: 15 COUCHBASE_NUM_VBUCKETS=64 python ./cluster_run --nodes=$servers_count >&$wd/cluster_run.log (wd: ~/compile/couchbase/ptrace/ns_server)
trond@ok:1027> echo $? ~/compile/couchbase/ptrace
0

Comment by Wayne Siu [ 30/Jul/14 ]
Tommie,
Can you please take a look at this request? Thanks.




[MB-10232] Docs: Inconsistent user key usage between text and graphics Created: 17/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Don Stacy Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   

Section: http://docs.couchbase.com/couchbase-devguide-2.5/#using-reference-documents-for-lookups
Area: First paragraph and then an issue between the text and the graphics
Issues:
(1) The string "can still provide better performance if you are the lookup frequently" needs a change. The word "are" is not correct; not sure what it should be, though.
(2) Change "associated with the main the document" to "associated with the main document" (remove the second "the").
(3) The later description of the code does not appear to match the code. We talk about user::count, but then we have a uid value of 12f1, from the UUID call in the code. The screenshots look right, but there is a text reference to user::101 in the string "key for the user record user::101. The first document". This does not appear to be accurate, since our key is user:12f1. We also show pointers to => 101, which are not valid. I personally like the user::101 syntax as an example, because it shows use of a counter instead of a somewhat random value from the UUID call; that's what is shown on couchbasemodels.com. Whichever approach we prefer to show, it just needs to be consistent between the code snippets and the graphics.




[MB-10184] Views rest api change Created: 11/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Ruth Harris Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
From Volker:

there's a little change in the views API that might lead to the need to
update the docs for 3.0. It's a change that introduces a warning to
prevent confusing results [1]. I saw in some docs that the "group"
parameter and "group_level" got specified within the same query.

I've just quoted the commit message, as I think it explains things well:

Prior to this change `group` and `group_level` were overriding
each other. `group` was parsed as `group_level=exact`. This means
that the value of `group_level` was determined by the ordering
when both `group` and `group_level` were given.
This leads to confusion. With this change the following error will
be thrown in case both `group` and a `group_level` are specified
in the query:
Query parameter `group_level` is not compatible with `group`

[1] http://review.couchbase.org/31251

Cheers,
  Volker

PS: It's not yet merged but will be soon when some internal issue is
resolved.
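For the docs update, a hedged sketch of the queries affected by the change described above. The bucket, design document, and view names are hypothetical, and 8092 is assumed to be the views/CAPI port; both commands need a live cluster:

```shell
# Fine: group_level on its own.
curl 'http://localhost:8092/default/_design/dev_beer/_view/by_type?group_level=2'

# Previously, combining the two let one parameter silently override the
# other depending on ordering; with the change, this query returns an error:
curl 'http://localhost:8092/default/_design/dev_beer/_view/by_type?group=true&group_level=2'
```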





[MB-10228] Docs: Non-functioning curl command Created: 17/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Don Stacy Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
Section: http://docs.couchbase.com/couchbase-manual-2.5/cb-admin/#what-to-include-in-good-issue-reports-jira
Area: Area that starts with 'If you suspect the indexer is stuck'
Issues: I think there are a few issues with this section.
(1) There is a run-on sentence of sorts where we say "to confirm that, the goal again is to isolate"; it feels like two sentences to me.
(2) The curl command as written has special quotes around it. Copying and pasting this command will not work; it needs regular quotes, not the fancy kind that Word adds in.
(3) I can only get the curl command to work against port 8092. I have not come across port 9500 at Couchbase.
(4) json_xs is a separate program that may not be on the server. You may want to mention that someone needs to install it for this command to work as shown.

 Comments   
Comment by Ruth Harris [ 14/Oct/14 ]
Troubleshooting section.




[MB-10229] Docs: SASL authentication version notes incorrect? Created: 17/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Trivial
Reporter: Don Stacy Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   
Section: http://docs.couchbase.com/couchbase-devguide-2.5/#providing-sasl-authentication
Area: Third paragraph
Issues: I think this is meant to say that our approach changed in version 2.2 from PLAIN to CRAM-MD5, as it states two statuses for encrypting. Yet we say both are ‘as of Couchbase Server 2.2’.




[MB-10214] Mac version update check is incorrectly identifying newest version Created: 14/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: build
Affects Version/s: 2.0.1, 2.2.0, 2.1.1
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Blocker
Reporter: David Haikney Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: None
Σ Remaining Estimate: Not Specified Remaining Estimate: Not Specified
Σ Time Spent: Not Specified Time Spent: Not Specified
Σ Original Estimate: Not Specified Original Estimate: Not Specified
Environment: Mac OS X

Attachments: PNG File upgrade_check.png    
Issue Links:
Duplicate
is duplicated by MB-12345 Version 3.0.0-1209-rel prompts for up... Closed
Sub-Tasks:
Key
Summary
Type
Status
Assignee
MB-12051 Update the Release_Server job on Jenk... Technical task Open Chris Hillery  
Is this a Regression?: Yes

 Description   
Running the 2.1.1 version of Couchbase on a Mac, "check for latest version" reports that the latest version is already running (e.g. see attached screenshot)


 Comments   
Comment by Aleksey Kondratenko [ 14/Feb/14 ]
Definitely not a UI bug. It's using phone home to find out about upgrades. And I have no idea who owns that now.
Comment by Steve Yen [ 12/Jun/14 ]
got an email from ravi to look into this
Comment by Steve Yen [ 12/Jun/14 ]
Not sure if this is the correct analysis, but I did a quick scan of what I think is the Mac installer, which I think is...

  https://github.com/couchbase/couchdbx-app

It gets its version string by running a "git describe", in the Makefile here...

  https://github.com/couchbase/couchdbx-app/blob/master/Makefile#L1

Currently, a "git describe" on master branch returns...

  $ git describe
  2.1.1r-35-gf6646fa

...which is *kinda* close to the reported version string in the screenshot ("2.1.1-764-rel").

So, I'm thinking one fix needed would be a tagging (e.g., "git tag -a FOO -m FOO") of the couchdbx-app repository.

So, reassigning to Phil to do that appropriately.

Also, it looks like our Mac installer is using an open-source packaging / installer / runtime library called "Sparkle" (which might be a little under-maintained -- not sure).

  https://github.com/andymatuschak/Sparkle/wiki

The sparkle library seems to check for version updates by looking at the URL here...

  https://github.com/couchbase/couchdbx-app/blob/master/cb.plist.tmpl#L42

Which seems to either be...

  http://appcast.couchbase.com/membasex.xml

Or, perhaps...

  http://appcast.couchbase.com/couchbasex.xml

appcast.couchbase.com appears to actually be an S3 bucket, off of our production Couchbase AWS account. So those *.xml files need to be updated, as their content currently references older versions. For example, http://appcast.couchbase.com/couchbase.xml currently looks like...

    <rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sparkle="http://www.andymatuschak.org/xml-namespaces/sparkle" version="2.0">
    <channel>
    <title>Updates for Couchbase Server</title>
    <link>http://appcast.couchbase.com/couchbase.xml</link>
    <description>Recent changes to Couchbase Server.</description>
    <language>en</language>
    <item>
    <title>Version 1.8.0</title>
    <sparkle:releaseNotesLink>
    http://www.couchbase.org/wiki/display/membase/Couchbase+Server+1.8.0
    </sparkle:releaseNotesLink>
    <!-- date -u +"%a, %d %b %Y %H:%M:%S GMT" -->
    <pubDate>Fri, 06 Jan 2012 16:11:17 GMT</pubDate>
    <enclosure url="http://packages.couchbase.com/1.8.0/Couchbase-Server-Community.dmg" sparkle:version="1.8.0" sparkle:dsaSignature="MCwCFAK8uknVT3WOjPw/3LkQpLBadi2EAhQxivxe2yj6EU6hBlg9YK/5WfPa5Q==" length="33085691" type="application/octet-stream"/>
    </item>
    </channel>
    </rss>

Not updating the XML files, though, probably causes no harm; it just means our OS X users won't be pushed news of updates.
Comment by Phil Labee (Inactive) [ 12/Jun/14 ]
This has nothing to do with "git describe". There should be no place in the product where "git describe" is used to determine version info. See:

    http://hub.internal.couchbase.com/confluence/display/CR/Branching+and+Tagging

so there's definitely a bug in the Makefile.

The version update check seems to be out of date. The phone-home file is generated during:

    http://factory.hq.couchbase.com:8080/job/Product_Staging_Server/

but the process of uploading it is not automated.
Comment by Steve Yen [ 12/Jun/14 ]
Thanks for the links.

> This has nothing to do with "git describe".

My read of the Makefile makes me think, instead, that "git describe" is the default behavior unless it's overridden by the invoker of the make.

> There should be no place in the product that "git describe" should be used to determine version info. See:
> http://hub.internal.couchbase.com/confluence/display/CR/Branching+and+Tagging

It appears all this couchdbx-app / sparkle stuff predates that wiki page by a few years, so I guess it's inherited legacy.

Perhaps voltron / buildbot are not setting PRODUCT_VERSION correctly before invoking the couchdbx-app make, which makes the Makefile default to 'git describe'?
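A minimal Python sketch of the fallback behavior being described (the function name is hypothetical; the real logic lives in the couchdbx-app Makefile):

```python
import os
import subprocess

def product_version():
    """Return $PRODUCT_VERSION if buildbot/voltron set it; otherwise fall
    back to `git describe`, as the Makefile does by default."""
    version = os.environ.get("PRODUCT_VERSION")
    if version:
        return version
    return subprocess.check_output(["git", "describe"]).decode().strip()
```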

    commit 85710d16b1c52497d9f12e424a22f3efaeed61e4
    Date: Mon Jun 4 14:38:58 2012 -0700

    Apply correct product version number
    
    Get version number from $PRODUCT_VERSION if it's set.
    (Buildbot and/or voltron will set this.)
    If not set, default to `git describe` as before.
    
> The version update check seems to be out of date.

Yes, that's right. The appcast files are out of date.

> The phone-home file is generated during:
> http://factory.hq.couchbase.com:8080/job/Product_Staging_Server/

I think appcast files for OSX / sparkle are a _different_ mechanism than the phone-home file, and an appcast XML file does not appear to be generated/updated by the Product_Staging_Server job.

But I'm not an expert or really qualified on the details here -- these are just my opinions from a quick code scan, not from actually doing/knowing.

Comment by Wayne Siu [ 01/Aug/14 ]
Per PM (Anil), we should get this fixed by 3.0 RC1.
Raising the priority to Critical.
Comment by Wayne Siu [ 07/Aug/14 ]
Phil,
Please provide update.
Comment by Anil Kumar [ 12/Aug/14 ]
Triage - Upgrading to 3.0 Blocker

Comment by Wayne Siu [ 20/Aug/14 ]
Looks like we may have a short term "fix" for this ticket which Ceej and I have tested.
@Ceej, can you put in the details here?
Comment by Chris Hillery [ 20/Aug/14 ]
The file is hosted in S3, and we proved tonight that overwriting that file (membasex.xml) with a version containing updated version information and download URLs works as expected. We updated it to point to 2.2 for now, since that is the latest version with a freely-available download URL.

We can update the Release_Server job on Jenkins to create an updated version of this XML file from a template, and upload it to S3.

Assigning back to Wayne for a quick question: Do we support Enterprise edition for MacOS? If we do, then this solution won't be sufficient without more effort, because the two editions will need different Sparkle configurations for updates. Also, Enterprise edition won't be able to directly download the newer release, unless we provide a "hidden" URL for that (the download link on the website goes to a form).
Comment by Chris Hillery [ 14/Oct/14 ]
We manually uploaded a new version of membasex.xml when 3.0.0 was released, but as MB-12345 shows, it doesn't work correctly (it still thinks there's a new download even if you're running the released 3.0.0).

I do not anticipate being able to put more time into this issue in the near future.




[MB-10285] document DNS SRV record options Created: 24/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0, 2.5.1
Fix Version/s: 2.5.1, 3.0.2
Security Level: Public

Type: Improvement Priority: Major
Reporter: Matt Ingenthron Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PDF File DNS SRV record writeup and format.pdf    

 Description   
For the appropriate section of any development and deployment docs, please add a section about DNS SRV records. Optionally, an administrator can publish a set of records listing a few or all of the hosts in their cluster so that applications, mainly client libraries, can look up hosts to bootstrap.

I'll attach a bit more info from Michael.

While this isn't a feature of 2.x including 2.5, it should be covered there since it's part of cluster administration. The feature is really something to be added to all of the client libraries.

Brett commented (which we agreed to):
Also, just to clarify the logic we have determined so far:
We would sort through the list of nodes by priority, where lowest priority number means highest priority. Then for all nodes of an identical priority, we would perform a weighted randomized picking of a node from that group. If the first choice fails, we will again weighted randomly pick a new node, while staying within the existing priority group, and ignoring the just-failed node. Once we have exhausted checking all servers of a specific priority, we would then go to the next priority group of servers and do the same thing.
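Brett's selection logic could be sketched as follows (the record shape and function name are assumptions; a real client would read these fields from the DNS SRV answer):

```python
import random

def srv_try_order(records, rng=random):
    """Return hosts in the order they should be tried: lowest priority
    number first; within each priority group, weighted random picking
    without replacement, so a failed node is not retried."""
    groups = {}
    for rec in records:
        groups.setdefault(rec["priority"], []).append(rec)
    order = []
    for priority in sorted(groups):           # lowest number = highest priority
        group = list(groups[priority])
        while group:                          # exhaust this group before the next
            total = sum(rec["weight"] for rec in group)
            point = rng.uniform(0, total)
            for i, rec in enumerate(group):
                point -= rec["weight"]
                if point <= 0:
                    order.append(group.pop(i)["host"])
                    break
            else:
                # Floating-point edge case: fall back to the last record.
                order.append(group.pop()["host"])
    return order
```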


 Comments   
Comment by Matt Ingenthron [ 24/Feb/14 ]
Writeup from Michael




[MB-10280] couchstore commit() may be incorrectly padding file size prior to first fsync, causing second fsync to do more work Created: 21/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: storage-engine
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Marty Schoch Assignee: Chiyoung Seo
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: May also affect earlier versions I observed this issue while reading couchstore codebase.

Attachments: Text File couchstore_header.patch    
Triage: Untriaged

 Description   
I discussed this with Aaron in irc and he agreed there may be a problem.

The code here:

https://github.com/couchbase/couchstore/blob/master/src/couch_db.cc#L189-L193

is attempting to extend the file size to account for the header, which will subsequently be written after the first fsync. (This allows the second fsync to be an fdatasync, which avoids writing metadata.)

But the headers need to be aligned on 4096-byte block boundaries, and this calculation does not account for that. Further, the db_write_buf() method does not account for it either.

This means that in most cases, the file size will change again when we actually write the header.
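The missing step amounts to rounding the end-of-data position up to the next 4096-byte boundary before padding. A Python stand-in for the C calculation (names are mine; the 4096-byte block size is as stated above):

```python
COUCH_BLOCK_SIZE = 4096  # couchstore headers must start on a block boundary

def aligned_header_pos(end_pos):
    """Round a file position up to the next 4096-byte block boundary."""
    remainder = end_pos % COUCH_BLOCK_SIZE
    if remainder == 0:
        return end_pos
    return end_pos + (COUCH_BLOCK_SIZE - remainder)

# E.g. data ending at byte 224 means the header starts at byte 4096, so
# padding only to byte 224 + header-size leaves the file too short:
print(aligned_header_pos(224))  # 4096
```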

 Comments   
Comment by Aleksey Kondratenko [ 13/Mar/14 ]
Interestingly, I actually remember testing this stuff and seeing no metadata commits in blktrace of second fdatasync. Weird.
Comment by Marty Schoch [ 26/Mar/14 ]
I did some more testing on this. Here is a couchstore file after 1 document was added and the changes were committed.

$ hexdump -C go-couchstore.couch
00000000 01 00 00 00 1d f0 29 2b 6b 0b 00 00 00 00 00 00 |......)+k.......|
00000010 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000020 00 00 80 00 00 11 b2 db 45 1d 0f 38 7b 22 63 6f |........E..8{"co|
00000030 6e 74 65 6e 74 22 3a 31 32 33 7d 80 00 00 23 4a |ntent":123}...#J|
00000040 71 35 c4 22 2c 01 00 50 00 00 17 64 6f 63 2d 30 |q5.",..P...doc-0|
00000050 00 01 01 44 01 00 00 00 19 00 00 00 00 00 22 00 |...D..........".|
00000060 00 00 00 00 01 80 80 00 00 23 5d a9 ba 13 23 18 |.........#]...#.|
00000070 01 00 60 00 00 17 00 01 01 14 01 00 50 00 00 19 |..`.........P...|
00000080 01 0a 34 00 22 00 00 00 00 00 01 80 64 6f 63 2d |..4.".......doc-|
00000090 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |0...............|
000000a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000000d0 00 00 00 00 00 00 00 80 00 00 01 d2 02 ef 8d 00 |................|
000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000 01 00 00 00 4a 9c 85 5e c1 0b 00 00 00 00 00 01 |....J..^........|
00001010 00 00 00 00 00 00 00 00 00 00 00 00 00 11 00 1c |................|
00001020 00 00 00 00 00 00 00 66 00 00 00 00 00 2b 00 00 |.......f.....+..|
00001030 00 00 01 00 00 00 00 00 3b 00 00 00 00 00 2b 00 |........;.....+.|
00001040 00 00 00 01 00 00 00 00 00 00 00 00 00 00 19 |...............|
0000104f

The header is at 0x1000, the bySeq tree is at 0x66, the byId tree is at 0x3b, the document body is at 0x22.

There is a chunk written at 0xd7, but nothing points to it. Its structure is that of a 1-byte chunk, consistent with what we write when trying to extend the file. Only it's at the wrong spot, so it failed to extend the file to the correct length.

This was done with an older version of couchstore than we currently use, so to see if it still happens I looked at a file created by Couchbase Server 2.5.

Here is the end of one of the beer-sample vbucket files:

0000b100 00 00 00 00 00 00 00 00 00 00 80 00 00 01 d2 02 |................|
0000b110 ef 8d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
0000b120 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
0000c000 01 00 00 00 56 9e 73 93 26 0b 00 00 00 00 00 09 |....V.s.&.......|
0000c010 00 00 00 00 00 00 00 00 00 00 00 00 00 11 00 1c |................|
0000c020 00 0c 00 00 00 00 a3 dc 00 00 00 00 02 65 00 00 |.............e..|
0000c030 00 00 09 00 00 00 00 a1 7c 00 00 00 00 02 60 00 |........|.....`.|
0000c040 00 00 00 09 00 00 00 00 00 00 00 00 00 14 ca 00 |................|
0000c050 00 00 00 b0 5b 00 00 00 00 00 5d |....[.....]|
0000c05b

Without studying it too carefully, we see at 0xb10a the start of what appears to be one of these 1-byte chunks. This is a pretty strong indication that this behavior is still happening.

I modified the code to print the end file position before and after writing the correct header. If the code is working correctly, we would expect the same file position in both. Here we see:

done doing file extend pos: 224
done writing actual file header: 4175

I then tested again with my patch.

done doing file extend pos: 4175
done writing actual file header: 4175

Patch attached.
Comment by Anil Kumar [ 17/Jul/14 ]
Triage - Chiyoung, Anil, Venu, Wayne .. July 17th
Comment by Anil Kumar [ 29/Jul/14 ]
Triage : Anil, Wayne .. July 29th

Chiyoung - Let us know if you're planning to fix it by 3.0 RC or should we move this out to 3.0.1.




[MB-10279] Update Docs to reflect rehash=1 syntax for cbrestore for Linux -> Mac Created: 21/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.2.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Jeff Dillon Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: MacOSX 64-bit

 Description   
Currently the docs say to use the memcached protocol when restoring from Linux to Mac. This is no longer necessary, and it generates an error:

MCSink MC error: 32 on sink

The proper syntax now uses -x rehash=1. The docs should be updated to reflect this, possibly similar to:

The proper syntax for Linux -> Mac cbrestore is as follows:

    * The memcached protocol is no longer needed
    * Use port 8091
    * Add -x rehash=1

Example for Linux -> Mac:

./cbrestore backup -u username -p password -x rehash=1 couchbase://host:8091 --bucket-source my_bucket --bucket-destination my_bucket


Related issues:

https://www.couchbase.com/issues/browse/MB-8988

https://www.couchbase.com/issues/browse/MB-7265






[MB-10251] Documentation for cbcollect_info does not contain info about xdcr logs Created: 18/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Aruna Piravi Assignee: Amy Kurtzman
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged

 Description   

http://www.couchbase.com/docs//couchbase-manual-2.0/couchbase-admin-cmdline-cbcollect_info.html
does not contain details about xdcr logs.


 Comments   
Comment by Dipti Borkar [ 18/Feb/14 ]
Can you provide what details you are looking for?
If you don't know, please assign to dev so that they can provide proper feedback to docs to know what to include. Thanks much.
Comment by Amy Kurtzman [ 23/Jun/14 ]
Please provide information for documentation.
Comment by Aleksey Kondratenko [ 23/Jun/14 ]
Link above doesn't work for me. So I have no idea what I'm supposed to do.




[MB-10292] [windows] assertion failure in test_file_sort Created: 24/Feb/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: storage-engine
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Trond Norbye Assignee: Sundar Sridharan
Resolution: Unresolved Votes: 0
Labels: windows_pm_triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Operating System: Windows 64-bit
Is this a Regression?: Unknown

 Description   
assertion on line 263 fails: assert(ret == FILE_SORTER_SUCCESS);

ret == FILE_SORTER_ERROR_DELETE_FILE

 Comments   
Comment by Trond Norbye [ 27/Feb/14 ]
I've disabled the test for win32 with http://review.couchbase.org/#/c/33985/ to allow us to find other regressions..
Comment by Anil Kumar [ 17/Jul/14 ]
Triage - Chiyoung, Anil, Venu, Wayne .. July 17th
Comment by Don Pinto [ 23/Sep/14 ]
Trond, Chiyoung, any update here?
Quick question: does the unit test need to be updated, or is this a bug in the code?

Comment by Chiyoung Seo [ 23/Sep/14 ]
We didn't look at this windows issue yet, but will update it soon.




[MB-10381] CB: Views include_docs parameter is deprecated for 3.0 Created: 06/Mar/14  Updated: 30/Oct/14

Status: In Progress
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Ruth Harris Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
The include_docs parameter for views will go away in 3.0.

 Comments   
Comment by Ruth Harris [ 06/Mar/14 ]
Put in release notes.

See Ilam for the changes in the behavior

From Dipti: 1. include_docs is going away from the REST API for the query. It still stays in the SDK.
Comment by Ruth Harris [ 14/Oct/14 ]
See Misc/Trbl-wrongdocs.dita, Views/views-operation.dita to be updated. It's not mentioned in the REST/Views api
Comment by Ruth Harris [ 14/Oct/14 ]
After confirmation, list include_docs in the deprecated section.
Comment by Ruth Harris [ 24/Oct/14 ]
Fixed. Added to deprecated section in 3.0.
Fixed. In 3.0, removed Misc/Trbl-wrongdocs.dita

TBD: re-write OR remove Views/views-operation.dita > How expiration impacts views > Detecting Expired Documents in Result Sets

See also, https://www.couchbase.com/issues/browse/MB-10183




[MB-10479] able to start graceful failover when node is unhealthy -> Rebalance exited with reason {pre_rebalance_config_synchronization_failed Created: 17/Mar/14  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Andrei Baranouski Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: 3.0.0-443

Triage: Triaged
Operating System: Centos 64-bit
Is this a Regression?: No

 Description   
steps:

1. 3 nodes in cluster
2. turn on firewall on one node
3. wait while node become "unhealthy"
4. trigger graceful failover



[2014-03-17 18:36:26,188] - [remote_util:1450] INFO - running command.raw on 10.3.4.145: /sbin/iptables -A INPUT -p tcp -i eth0 --dport 1000:60000 -j REJECT
[2014-03-17 18:36:27,731] - [remote_util:1479] INFO - command executed successfully
[2014-03-17 18:36:27,731] - [remote_util:2263] INFO - enabled firewall on ip:10.3.4.145 port:8091 ssh_username:root
[2014-03-17 18:36:27,731] - [remote_util:1450] INFO - running command.raw on 10.3.4.145: /sbin/iptables --list
[2014-03-17 18:36:29,304] - [remote_util:1479] INFO - command executed successfully
[2014-03-17 18:36:29,304] - [remote_util:1401] INFO - Chain INPUT (policy ACCEPT)
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO - REJECT tcp -- anywhere anywhere tcp dpts:cadlock2:60000 reject-with icmp-port-unreachable
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO -
[2014-03-17 18:36:29,305] - [remote_util:1401] INFO - Chain FORWARD (policy ACCEPT)
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO -
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - Chain OUTPUT (policy ACCEPT)
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO -
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - Chain RH-Firewall-1-INPUT (0 references)
[2014-03-17 18:36:29,306] - [remote_util:1401] INFO - target prot opt source destination
[2014-03-17 18:36:32,503] - [rest_client:125] INFO - node ns_1@10.3.4.145 status : unhealthy
[2014-03-17 18:36:32,503] - [rest_client:132] INFO - node ns_1@10.3.4.145 status_reached : True
[2014-03-17 18:36:32,503] - [failovertests:72] INFO - node 10.3.4.145:8091 is 'unhealthy' as expected
[2014-03-17 18:36:33,727] - [rest_client:942] INFO - fail_over node ns_1@10.3.4.145 successful
[2014-03-17 18:36:34,905] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:37,482] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:40,441] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:43,481] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:46,512] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:49,559] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:52,518] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:55,456] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:36:58,390] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:37:01,530] - [rest_client:1075] INFO - rebalance percentage : 0 %
[2014-03-17 18:37:04,490] - [rest_client:1059] ERROR - {u'status': u'none', u'errorMessage': u'Rebalance failed. See logs for detailed reason. You can try rebalance again.'} - rebalance failed
[2014-03-17 18:37:09,870] - [rest_client:1838] INFO - Latest logs from UI:
[2014-03-17 18:37:09,870] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 2, u'text': u"Rebalance exited with reason {pre_rebalance_config_synchronization_failed,\n ['ns_1@10.3.4.145']}\n", u'shortText': u'message', u'serverTime': u'2014-03-17T08:09:14.238Z', u'module': u'ns_orchestrator', u'tstamp': 1395068954238, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u"Starting vbucket moves for graceful failover of 'ns_1@10.3.4.145'", u'shortText': u'message', u'serverTime': u'2014-03-17T08:08:44.227Z', u'module': u'ns_rebalancer', u'tstamp': 1395068924227, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 1, u'text': u'Rebalance completed successfully.\n', u'shortText': u'message', u'serverTime': u'2014-03-17T08:06:08.669Z', u'module': u'ns_orchestrator', u'tstamp': 1395068768669, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u'Bucket "default" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:05.908Z', u'module': u'ns_vbucket_mover', u'tstamp': 1395068525908, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.145', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@10.3.4.145\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:05.205Z', u'module': u'ns_memcached', u'tstamp': 1395068525205, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.147', u'code': 0, u'text': u'Bucket "default" loaded on node \'ns_1@10.3.4.147\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:05.023Z', u'module': u'ns_memcached', u'tstamp': 1395068525023, u'type': u'info'}
[2014-03-17 18:37:09,871] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u'Started rebalancing bucket default', u'shortText': u'message', u'serverTime': u'2014-03-17T08:02:04.681Z', u'module': u'ns_rebalancer', u'tstamp': 1395068524681, u'type': u'info'}
[2014-03-17 18:37:09,872] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.144', u'code': 0, u'text': u'Bucket "bucket0" rebalance does not seem to be swap rebalance', u'shortText': u'message', u'serverTime': u'2014-03-17T07:57:48.975Z', u'module': u'ns_vbucket_mover', u'tstamp': 1395068268975, u'type': u'info'}
[2014-03-17 18:37:09,872] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.145', u'code': 0, u'text': u'Bucket "bucket0" loaded on node \'ns_1@10.3.4.145\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T07:57:48.621Z', u'module': u'ns_memcached', u'tstamp': 1395068268621, u'type': u'info'}
[2014-03-17 18:37:09,872] - [rest_client:1839] ERROR - {u'node': u'ns_1@10.3.4.147', u'code': 0, u'text': u'Bucket "bucket0" loaded on node \'ns_1@10.3.4.147\' in 0 seconds.', u'shortText': u'message', u'serverTime': u'2014-03-17T07:57:48.296Z', u'module': u'ns_memcached', u'tstamp': 1395068268296, u'type': u'info'}


Please note that the node became "unhealthy" almost immediately and we then started a graceful failover.
It would be better for the response to say that we cannot perform a graceful failover because the node is unreachable.

 Comments   
Comment by Andrei Baranouski [ 17/Mar/14 ]
https://s3.amazonaws.com/bugdb/jira/MB-10479/1040460b/10.3.4.144-3172014-844-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-10479/1040460b/10.3.4.145-3172014-849-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-10479/1040460b/10.3.4.146-3172014-848-diag.zip
https://s3.amazonaws.com/bugdb/jira/MB-10479/1040460b/10.3.4.147-3172014-852-diag.zip
Comment by Aleksey Kondratenko [ 17/Mar/14 ]
So all that is through REST API? Or UI ?
Comment by Andrei Baranouski [ 18/Mar/14 ]
through Rest API ( automation tests)
Comment by Andrei Baranouski [ 18/Mar/14 ]
please see MB-10495
Comment by Andrei Baranouski [ 18/Mar/14 ]
Even after I waited for the node to become "unhealthy", I was able to run a graceful failover with status "OK": I opened the confirmation dialog, checked graceful failover, and started the failover.


Failed over 'ns_1@10.3.4.146': ok ns_rebalancer000 ns_1@10.3.4.144 07:18:23 - Tue Mar 18, 2014
Starting failing over 'ns_1@10.3.4.146' ns_rebalancer000 ns_1@10.3.4.144 07:18:23 - Tue Mar 18, 2014
Rebalance exited with reason {pre_rebalance_config_synchronization_failed,
['ns_1@10.3.4.146']}
ns_orchestrator002 ns_1@10.3.4.144 07:18:03 - Tue Mar 18, 2014
Starting vbucket moves for graceful failover of 'ns_1@10.3.4.146' ns_rebalancer000 ns_1@10.3.4.144 07:18:03 - Tue Mar 18, 2014
Node 'ns_1@10.3.4.145' saw that node 'ns_1@10.3.4.146' went down. Details: [{nodedown_reason,
net_tick_timeout}] ns_node_disco005 ns_1@10.3.4.145 07:07:56 - Tue Mar 18, 2014
Comment by Parag Agarwal [ 28/May/14 ]
Alk, this bug will have an impact on CLI tools. Still failing for us in our runs. Please take a look when you get a chance.
Comment by Aleksey Kondratenko [ 05/Jun/14 ]
Well, you should understand that it's supposed to fail. The only thing to fix here is a better error message.
Comment by Aleksey Kondratenko [ 11/Jun/14 ]
Graceful failover is expected to fail in this case. It's only supposed to work if the node is healthy and able to hand over its vbuckets.

I was thinking about making error in this case nicer, but decided against it because:

a) there are lots of ways an ill node may fail graceful failover

b) we plan to overhaul our REST API anyways and as part of that we can consider more general work of exposing rebalance/graceful-failover errors in nicer and more readable way.
Comment by Parag Agarwal [ 25/Jun/14 ]
Please keep the issue open till you resolve it via your overhaul of API. We would like to track and close it till we have a solution in place.
Comment by Aleksey Kondratenko [ 25/Jun/14 ]
Disagree. It's not a bug that graceful failover of unhealthy node fails.
Comment by Anil Kumar [ 25/Jun/14 ]
- Firstly, the behavior in the Admin UI is correct: for nodes which are unreachable, the only failover option is "Hard Failover".

- The issue is a limitation of the REST API, which doesn't provide a proper error message to the user when the rebalance fails.

Assigning this to documentation to document the REST API limitation for 3.0.




[MB-10432] Removed ep_max_txn_size stat/engine_parameter Created: 11/Mar/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Critical
Reporter: Mike Wiederhold Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
This value is no longer used in the server. Please note that you need to update the documentation for cbepctl, since this stat could be set with that script.




[MB-10512] Update documentation to convey we don't support rolling downgrades Created: 19/Mar/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Task Priority: Critical
Reporter: Abhishek Singh Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
Update documentation to convey we don't support rolling downgrades to 2.2 once all nodes are running on 2.5




[MB-10436] Installer should throw a warning if it detects swappiness is not 0 Created: 11/Mar/14  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: installer
Affects Version/s: 2.5.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Improvement Priority: Critical
Reporter: Patrick Varley Assignee: Thuan Nguyen
Resolution: Unresolved Votes: 0
Labels: customer, install
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: linux

Issue Links:
Relates to
relates to MB-9193 Installer should throw a warning if i... Resolved

 Comments   
Comment by Matt Ingenthron [ 11/Mar/14 ]
While this would be great at install time, shouldn't we run this regularly and alert if it changes as well? This is true for THP as well. As you know, OS upgrades or migrations can easily disable this kind of thing.
Comment by Bin Cui [ 14/Mar/14 ]
http://review.couchbase.org/#/c/34518/
Comment by Kirk Kirkconnell [ 04/Aug/14 ]
See my comment above as to why it is being reopened. The main reason is that the installer text only offers users a temporary fix, not a permanent change to the OS settings that will survive a reboot.




[MB-10384] DOCS: stale=false and other options need to change Created: 06/Mar/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: documentation
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Major
Reporter: Ruth Harris Assignee: Ruth Harris
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
From Dipti:

the flow of the data etc for stale=false and other options also needs to change

2. stale parameter still stays, but the semantics of the way it works have changed. The data no longer needs to be persisted to disk before the index picks it up. So if you want to use stale=false (to make the query results consistent with the data), you no longer need to first use observe (as stated in the text above) and then use stale=false.

-----
[3/6/14, 10:41:49 AM] Dipti Borkar: here's the docs. http://docs.couchbase.com/couchbase-manual-2.5/cb-admin/#index-updates-and-the-stale-parameter
[3/6/14, 10:42:31 AM] Dipti Borkar: so all text that says something like this "Irrespective of the stale parameter, documents can only be indexed by the system once the document has been persisted to disk. If the document has not been persisted to disk, use of the stale will not force this process. You can use the observe operation to monitor when documents are persisted to disk and/or updated in the index."

------
 Check "diagram" dataflow


 Comments   
Comment by Ruth Harris [ 06/Mar/14 ]
Keywords to search on:

hmmmmm.... index, memory, stale, stale-false, data flow, observe,




[MB-10428] Not obvious what triggers builds with known-good process Created: 11/Mar/14  Updated: 30/Oct/14

Status: Open
Project: Couchbase Server
Component/s: build
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Task Priority: Minor
Reporter: Pavel Paulau Assignee: Chris Hillery
Resolution: Unresolved Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   
More than 10 3.0 builds were created in the past 24 hours.

Some of them were reasonably triggered by product changes.
Some of them were triggered by changes in testrunner (known issue, sadly cannot be addressed by build team).
However, I didn't find an explanation for other builds (e.g., 3.0.0-425, 3.0.0-428, 3.0.0-429, 3.0.0-432).

Why were they triggered? Can we make it more visible?

 Comments   
Comment by Pavel Paulau [ 12/Mar/14 ]
Any feedback?
Comment by Pavel Paulau [ 15/Mar/14 ]
It's happening again.

Wayne, can you comment on this please?
Comment by Chris Hillery [ 15/Mar/14 ]
It's the backlog being run after the known-good process hung for a day. I already have an action item to make it clearer what changes are being built.
Comment by Pavel Paulau [ 17/Mar/14 ]
What about build 3.0.0-445, for instance?
No code changes, no testrunner changes, less than two hours after build 3.0.0-444.
Comment by Chris Hillery [ 17/Mar/14 ]
The known-good build process will fire off a build for every commit to github (not every 4 hours as previously). At times that will result in a greater number of builds, and they will "stack up".

You cannot look at the timestamps of the build and the timestamps of commits to github to determine what changes are in a build anymore.

The problem in this particular case was that the known-good process hung, leading to no builds at all for some time. When I restarted it, it had to work through all the intervening commits, leading to a large number of builds.

As I said, I have an action item to make it clearer what is happening and what commit(s) are in a build. I will reassign this bug back to me to track that issue since I don't appear to have another one for it already.
Comment by Pavel Paulau [ 17/Mar/14 ]
Ok, makes sense. But I still have several questions.

Does your explanation apply to build 3.0.0-445, which was created today? And why does the known-good process hang so often?
Comment by Chris Hillery [ 17/Mar/14 ]
Actually, no, it appears that 3.0.0-445 was triggered by hand. I need to figure out how to prevent that.

The main reason that the known-good process hangs is because our connectivity to github is very flaky. When developing it, I needed a way to lock the buildbot builds to prevent overlapping builds, and I used a lockfile in github for that. But when the connectivity goes out, it can create a deadlock. Then when I manually break the lock, the backed-up commits get fired off one by one, leading to an unexplained spate of builds. Again, this is something I'm working on; I'm testing a solution that should remove the need for explicit locks, which will make the deadlock impossible and significantly lower the chance of the system falling down.
Comment by Pavel Paulau [ 17/Mar/14 ]
Thanks, everything is absolutely clear now.
Comment by Pavel Paulau [ 24/Mar/14 ]
Probably it happened again with 3.0.0-505/506.
Comment by Chris Hillery [ 01/May/14 ]
Lowering priority since known-good builds have been disabled for a while.




[MB-10694] Eliminate cygwin requirement for testing on Windows Created: 31/Mar/14  Updated: 30/Oct/14

Status: Reopened
Project: Couchbase Server
Component/s: test-execution
Affects Version/s: 3.0
Fix Version/s: 3.0.2
Security Level: Public

Type: Bug Priority: Minor
Reporter: Trond Norbye Assignee: Tommie McAfee
Resolution: Unresolved Votes: 0
Labels: windows_pm_triaged
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Triage: Untriaged
Is this a Regression?: Unknown

 Description   
We have made great strides in eliminating the cygwin/mingw requirements from the main build for Couchbase Server. However, there are many parts of the environment which still are dependent on GNU make in particular, and on a Unix-like (cygwin) environment in general - voltron, the buildbot scripts, and testrunner being the most obvious. We hope to eliminate those over time as well, and this bug will track that effort.

Original description from Trond:

The script to start / stop the test is implemented in bash, which is unavailable on our windows machines (after the move to cmake). Move to Python?
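A Python replacement along the lines Trond suggests could look something like the sketch below. This is a minimal, hypothetical illustration (function names and the taskkill-vs-SIGTERM split are my assumptions about what "cross-platform" would require here), not the actual testrunner script.

```python
# Minimal sketch of a cross-platform start/stop helper in Python,
# as a possible replacement for the bash script (names illustrative).
import os
import signal
import subprocess
import sys

def start(cmd, logfile="server.log"):
    """Launch the process under test and return its pid."""
    with open(logfile, "ab") as log:
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
    return proc.pid

def stop(pid):
    """Terminate the process; works on both Windows and Unix."""
    if sys.platform == "win32":
        subprocess.call(["taskkill", "/PID", str(pid), "/F"])
    else:
        os.kill(pid, signal.SIGTERM)
```

Because it uses only the standard library, such a script would run unchanged on the Windows build machines without cygwin or bash.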

 Comments   
Comment by Chris Hillery [ 02/Apr/14 ]
I'm lowering the priority of this one, as it is not going to happen immediately and is of less urgency than making the main product build work. I'll assign it to myself, as it is a larger issue than just commit validation.
Comment by