[MB-8427] Filter out non-UTF-8 keys and log them Created: 06/Jun/13  Updated: 28/Aug/13  Resolved: 16/Aug/13

Status: Closed
Project: Couchbase Server
Component/s: cross-datacenter-replication, storage-engine
Affects Version/s: 2.1.0
Fix Version/s: 2.2.0
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Junyi Xie (Inactive) Assignee: Dipti Borkar
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
blocks MB-8747 [Doc'd 2.2] Filter out non-UTF-8 keys... Closed

Copied and pasted discussion with Damien:

Junyi, can you write a JIRA ticket noting that the documentation and the implementation in this area don't agree? Ideally also fix it and get Filipe to review it, but either way we need the ticket to track the discrepancy, so that even once it is fixed we have a record, for support and historical purposes, that some versions don't have the behavior.


Damien Katz
Couchbase CTO | http://damienkatz.com | 510-421-8914

Hi Damien,

As you said, there was a discussion about it, and it was determined that an upper layer (ep_engine, couchstore, or couchdb) will check for non-UTF-8 ids and log them before these ids reach XDCR, so XDCR won't see any non-UTF-8 ids. Thus XDCR itself does not validate or log them. It looks like the validation does happen in the CouchDB layer (look at couchdb/src/couchdb/couch_doc:json_id()), but I cannot find where the non-UTF-8 ids are logged.

If this design does change, it seems to me we should log non-UTF-8 ids in the CouchDB layer: the couch_db:changes_since() function, which XDCR calls to read changes from CouchDB, should filter out all non-UTF-8 ids and log them in the CouchDB logs. Is that correct?
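
For illustration only, the filter-and-log behavior proposed above could look roughly like the following Python sketch. It is not the CouchDB or XDCR code (which is Erlang); the names is_valid_utf8 and filter_changes, the logger name, and the (key, value) shape of a change are all assumptions made for this example.

```python
import logging

# Hypothetical logger name; stands in for "the CouchDB logs" in the proposal.
logger = logging.getLogger("changes_filter")


def is_valid_utf8(key: bytes) -> bool:
    """Return True if the raw key bytes decode cleanly as UTF-8."""
    try:
        key.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def filter_changes(changes):
    """Yield (key, value) pairs whose key is valid UTF-8.

    Pairs with a non-UTF-8 key are logged and skipped, so a downstream
    consumer (XDCR, in the ticket's terms) never sees them.
    """
    for key, value in changes:
        if is_valid_utf8(key):
            yield key, value
        else:
            # %r prints the raw bytes safely, even though they are not text.
            logger.warning("skipping non-UTF-8 doc id: %r", key)
```

Under these assumptions, the consumer iterates over filter_changes(...) instead of the raw change list, and the skipped ids remain available in the log for support purposes.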



Comment by Junyi Xie (Inactive) [ 01/Aug/13 ]

Fix is on Gerrit, pending review.
Comment by Junyi Xie (Inactive) [ 02/Aug/13 ]
Please see discussion at MB-8747

Put the Gerrit commit on hold, waiting for a further decision from PM.

Comment by Maria McDuff (Inactive) [ 28/Aug/13 ]
Already done.
Generated at Fri Nov 28 05:22:53 CST 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.