Details
-
Type:
Bug
-
Status:
Resolved
-
Priority:
Critical
-
Resolution: Fixed
-
Affects Version/s: 2.0
-
Fix Version/s: 2.0.1
-
Component/s: storage-engine
-
Security Level: Public
-
Labels:None
-
Environment:Observed on Mac and Linux platforms
Description
Steps to reproduce:
1. Set up XDCR (observed when sending to ElasticSearch but not believe to be related to this)
2. Create JSON document
3. Observe that document is replicated as expected to remote side
[{meta={id=2, rev=2-000015818d7d3c0a0000000000000000, expiration=0, flags=0}, base64=eyJjbGljayI6InRvIGVkaXQiLCJuZXcgaW4gMi4wIjoidGhlcmUgYXJlIG5vIHJlc2VydmVkIGZpZWxkIG5hbWVzIn0=}]
in the _bulk_docs request the document appears to be JSON and the base64 data correctly represents the document
4. Delete the document
5. Observe that the document deletion is replicated across to remote side, but with some unusual settings:
[{meta={id=2, rev=3-000015818d7d3c0b0000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true}, base64=V1QBAOAAAENfbG9jYWwvdmJzdGF0ZXsiBQdoIjogImFjdGl2ZSIsICJjaGVja3BvaW50X2lkARsAMQEWXG1heF9kZWxldGVkX3NlcW5vIjogIjAifQ==}]
Note that the document is indicated as binary (non-JSON mode) and that the the base64 encoded data appears to be some random memory on the server:
echo "V1QBAOAAAENfbG9jYWwvdmJzdGF0ZXsiBQdoIjogImFjdGl2ZSIsICJjaGVja3BvaW50X2lkARsAMQEWXG1heF9kZWxldGVkX3NlcW5vIjogIjAifQ==" | base64 -D
WT?C_local/vbstate{"h": "active", "checkpoint_id\max_deleted_seqno": "0"}
1. Set up XDCR (observed when sending to ElasticSearch but not believe to be related to this)
2. Create JSON document
3. Observe that document is replicated as expected to remote side
[{meta={id=2, rev=2-000015818d7d3c0a0000000000000000, expiration=0, flags=0}, base64=eyJjbGljayI6InRvIGVkaXQiLCJuZXcgaW4gMi4wIjoidGhlcmUgYXJlIG5vIHJlc2VydmVkIGZpZWxkIG5hbWVzIn0=}]
in the _bulk_docs request the document appears to be JSON and the base64 data correctly represents the document
4. Delete the document
5. Observe that the document deletion is replicated across to remote side, but with some unusual settings:
[{meta={id=2, rev=3-000015818d7d3c0b0000000000000000, att_reason=non-JSON mode, expiration=0, flags=0, deleted=true}, base64=V1QBAOAAAENfbG9jYWwvdmJzdGF0ZXsiBQdoIjogImFjdGl2ZSIsICJjaGVja3BvaW50X2lkARsAMQEWXG1heF9kZWxldGVkX3NlcW5vIjogIjAifQ==}]
Note that the document is indicated as binary (non-JSON mode) and that the the base64 encoded data appears to be some random memory on the server:
echo "V1QBAOAAAENfbG9jYWwvdmJzdGF0ZXsiBQdoIjogImFjdGl2ZSIsICJjaGVja3BvaW50X2lkARsAMQEWXG1heF9kZWxldGVkX3NlcW5vIjogIjAifQ==" | base64 -D
WT?C_local/vbstate{"h": "active", "checkpoint_id\max_deleted_seqno": "0"}
Activity
- All
- Comments
- Work Log
- History
- Activity
- Gerrit Reviews
Hide
Permalink
Jin Lim
added a comment -
http://review.couchbase.org/#/c/24586/ merged.
Show
Jin Lim
added a comment - http://review.couchbase.org/#/c/24586/ merged.
Hide
Thuan Nguyen
added a comment -
Integrated in github-couchdb-preview #564 (See [http://qa.hq.northscale.net/job/github-couchdb-preview/564/])
MB-7631 Interpret bp of 0 as empty body (Revision 8e961880696e0398229e7a837344b73053beed84)
Result = SUCCESS
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Result = SUCCESS
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Show
Thuan Nguyen
added a comment - Integrated in github-couchdb-preview #564 (See [ http://qa.hq.northscale.net/job/github-couchdb-preview/564/ ])
MB-7631 Interpret bp of 0 as empty body (Revision 8e961880696e0398229e7a837344b73053beed84)
Result = SUCCESS
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Hide
Thuan Nguyen
added a comment -
Integrated in github-couchdb-preview #563 (See [http://qa.hq.northscale.net/job/github-couchdb-preview/563/])
MB-7631 Interpret bp of 0 as empty body (Revision c086371a81e43021966620c469134e08530fe38a)
Revert "MB-7631 Interpret bp of 0 as empty body" (Revision bfd853e859ab961f8ee6c8fb8b38e667b3dd0a9c)
Result = SUCCESS
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Revert "
Result = SUCCESS
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Show
Thuan Nguyen
added a comment - Integrated in github-couchdb-preview #563 (See [ http://qa.hq.northscale.net/job/github-couchdb-preview/563/ ])
MB-7631 Interpret bp of 0 as empty body (Revision c086371a81e43021966620c469134e08530fe38a)
Revert " MB-7631 Interpret bp of 0 as empty body" (Revision bfd853e859ab961f8ee6c8fb8b38e667b3dd0a9c)
Result = SUCCESS
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Filipe David Borba Manana :
Files :
* src/couchdb/couch_db.erl
Show
Filipe Manana
added a comment - Yes, fine for me.
Hide
Junyi Xie
added a comment -
It is good news that we do not actually store garbage. If the couchdb read some garbage (as Aaron pointed out), then couchdb should fix it.
Replicator is just one consumer of this API and relies on couchdb to fetch all mutations, and IMHO it should not look into doc body. Put some code in replicator does not solve the problem, other caller of couchdb:open_doc() will still read garbage.
Replicator is just one consumer of this API and relies on couchdb to fetch all mutations, and IMHO it should not look into doc body. Put some code in replicator does not solve the problem, other caller of couchdb:open_doc() will still read garbage.
Show
Junyi Xie
added a comment - It is good news that we do not actually store garbage. If the couchdb read some garbage (as Aaron pointed out), then couchdb should fix it.
Replicator is just one consumer of this API and relies on couchdb to fetch all mutations, and IMHO it should not look into doc body. Put some code in replicator does not solve the problem, other caller of couchdb:open_doc() will still read garbage.
Hide
Jin Lim
added a comment -
Aaron/Filipe/Junyi, thanks for your input and quick help here. It looks like we can address this issue fairly simple, wether from the replicator or couchdb sides. If I may I will leave this up to your decision. However, please let me know since it is fairly simple, good fix wether we should consider push it to 2.0.1 or not. Thanks.
Show
Jin Lim
added a comment - Aaron/Filipe/Junyi, thanks for your input and quick help here. It looks like we can address this issue fairly simple, wether from the replicator or couchdb sides. If I may I will leave this up to your decision. However, please let me know since it is fairly simple, good fix wether we should consider push it to 2.0.1 or not. Thanks.
Hide
Filipe Manana
added a comment -
From what I understood from Jin's comments, ep-engine/couchstore side are doing the right thing, that is, not persisting garbage to disk for deleted documents.
It's only a problem of couchdb not knowing that it shouldn't read bodies for deleted documents (unlike in Apache CouchDB).
Issues like this could well be solved by people who work on the replicator imho.
It's only a problem of couchdb not knowing that it shouldn't read bodies for deleted documents (unlike in Apache CouchDB).
Issues like this could well be solved by people who work on the replicator imho.
Show
Filipe Manana
added a comment - From what I understood from Jin's comments, ep-engine/couchstore side are doing the right thing, that is, not persisting garbage to disk for deleted documents.
It's only a problem of couchdb not knowing that it shouldn't read bodies for deleted documents (unlike in Apache CouchDB).
Issues like this could well be solved by people who work on the replicator imho.
Hide
We aren't storing garbage, I think. We've only been _reading_ garbage.
Show
Aaron Miller
added a comment - - edited We aren't storing garbage, I think. We've only been _reading_ garbage.
Show
Aaron Miller
added a comment - http://review.couchbase.org/#/c/24586/ Here.
Hide
Filipe Manana
added a comment -
Is it that hard to add to the replicator code that doesn't read bodies for deleted items, or make couch_db:open_doc/2 not read bodies for deleted items?
Show
Filipe Manana
added a comment - Is it that hard to add to the replicator code that doesn't read bodies for deleted items, or make couch_db:open_doc/2 not read bodies for deleted items?
Hide
I finally did some debugging this issue and found that ep engine is handling deleted items correctly. It never passes doc body but NULL to couchstore for deleting docs.
However, with Aaron's help, it appears to be that open_doc_int() in couch_db.erl still tries to read the doc body for a deleted item instead of trasting it as a empty doc. Assign it to couchdb team (Filipe) for further debugging and a fix. Thanks.
However, with Aaron's help, it appears to be that open_doc_int() in couch_db.erl still tries to read the doc body for a deleted item instead of trasting it as a empty doc. Assign it to couchdb team (Filipe) for further debugging and a fix. Thanks.
Show
Jin Lim
added a comment - - edited I finally did some debugging this issue and found that ep engine is handling deleted items correctly. It never passes doc body but NULL to couchstore for deleting docs.
However, with Aaron's help, it appears to be that open_doc_int() in couch_db.erl still tries to read the doc body for a deleted item instead of trasting it as a empty doc. Assign it to couchdb team (Filipe) for further debugging and a fix. Thanks.
Hide
Junyi Xie
added a comment -
Jin,
Thanks a lot for looking into this.
Dipti,
Here is my take on your questions
- do we store extra data for every deleted key?
It is fairly easy to reproduce, but I am not sure if it is for "every" deleted key
- is this also the case for expired items?
Expired time will end up a del op from ep_engine, if the code path is the same, probably it will be a victim of this bug too.
- is it a fixed amount of additional data that gets stored? how much per document?
The garbage seems from a random segment of memory (possibly due to some dangling pointer), if it is the case, we store a random chunk of data in doc_body, the size could be variant.
Thanks a lot for looking into this.
Dipti,
Here is my take on your questions
- do we store extra data for every deleted key?
It is fairly easy to reproduce, but I am not sure if it is for "every" deleted key
- is this also the case for expired items?
Expired time will end up a del op from ep_engine, if the code path is the same, probably it will be a victim of this bug too.
- is it a fixed amount of additional data that gets stored? how much per document?
The garbage seems from a random segment of memory (possibly due to some dangling pointer), if it is the case, we store a random chunk of data in doc_body, the size could be variant.
Show
Junyi Xie
added a comment - Jin,
Thanks a lot for looking into this.
Dipti,
Here is my take on your questions
- do we store extra data for every deleted key?
It is fairly easy to reproduce, but I am not sure if it is for "every" deleted key
- is this also the case for expired items?
Expired time will end up a del op from ep_engine, if the code path is the same, probably it will be a victim of this bug too.
- is it a fixed amount of additional data that gets stored? how much per document?
The garbage seems from a random segment of memory (possibly due to some dangling pointer), if it is the case, we store a random chunk of data in doc_body, the size could be variant.
Hide
Jin Lim
added a comment -
I agree we will still continue to look into this if there is a small fix. We just didn't have enough engineering cycles for debugging this, but will do so this week.
Please see my comments below for Dipti's questions above;
- do we store extra data for every deleted key? yes on src node but not dst node (Junyi please correct me if wrong)
- is this also the case for expired items? yes since ep engine handles both explicitly deleted and expired items with the same disk write operation (del).
- is it a fixed amount of additional data that gets stored? it appears to be not much but random according to Junyi's latest debugging
Again, will do a little bit of more engineering debugging and make a final call. Thanks for practical suggestion from QE though.
Please see my comments below for Dipti's questions above;
- do we store extra data for every deleted key? yes on src node but not dst node (Junyi please correct me if wrong)
- is this also the case for expired items? yes since ep engine handles both explicitly deleted and expired items with the same disk write operation (del).
- is it a fixed amount of additional data that gets stored? it appears to be not much but random according to Junyi's latest debugging
Again, will do a little bit of more engineering debugging and make a final call. Thanks for practical suggestion from QE though.
Show
Jin Lim
added a comment - I agree we will still continue to look into this if there is a small fix. We just didn't have enough engineering cycles for debugging this, but will do so this week.
Please see my comments below for Dipti's questions above;
- do we store extra data for every deleted key? yes on src node but not dst node (Junyi please correct me if wrong)
- is this also the case for expired items? yes since ep engine handles both explicitly deleted and expired items with the same disk write operation (del).
- is it a fixed amount of additional data that gets stored? it appears to be not much but random according to Junyi's latest debugging
Again, will do a little bit of more engineering debugging and make a final call. Thanks for practical suggestion from QE though.
Hide
Dipti Borkar
added a comment -
Forum post just for reference: http://www.couchbase.com/forums/thread/persistence-becoming-issue-can-it-be-disabled
Show
Dipti Borkar
added a comment - Forum post just for reference: http://www.couchbase.com/forums/thread/persistence-becoming-issue-can-it-be-disabled
Hide
Dipti Borkar
added a comment -
I think it does cause a huge bandwidth issue for large documents and expirations. This is particularly the case for ad targeting. Also, many people have complained about unusual disk growth with TTL, I suspect this may be the issue. In addition to storing tombstones, we are also storing data in couchstore. questions that will help us decide are:
- do we store extra data for every deleted key?
- is this also the case for expired items?
- is it a fixed amount of additional data that gets stored? how much per document?
I'm not so concerned about deletes since typically very small percentage of items are deleted <1%. But more concerned about expired items.
- do we store extra data for every deleted key?
- is this also the case for expired items?
- is it a fixed amount of additional data that gets stored? how much per document?
I'm not so concerned about deletes since typically very small percentage of items are deleted <1%. But more concerned about expired items.
Show
Dipti Borkar
added a comment - I think it does cause a huge bandwidth issue for large documents and expirations. This is particularly the case for ad targeting. Also, many people have complained about unusual disk growth with TTL, I suspect this may be the issue. In addition to storing tombstones, we are also storing data in couchstore. questions that will help us decide are:
- do we store extra data for every deleted key?
- is this also the case for expired items?
- is it a fixed amount of additional data that gets stored? how much per document?
I'm not so concerned about deletes since typically very small percentage of items are deleted <1%. But more concerned about expired items.
Hide
Farshid Ghods
added a comment -
Jin/Dipti/Yaseen,
given that this is not a regression from 2.0 and it does not impact xdcr functionality and does not cause huge bandwidth issue i recommend moving this to 2.0.2
given that this is not a regression from 2.0 and it does not impact xdcr functionality and does not cause huge bandwidth issue i recommend moving this to 2.0.2
Show
Farshid Ghods
added a comment - Jin/Dipti/Yaseen,
given that this is not a regression from 2.0 and it does not impact xdcr functionality and does not cause huge bandwidth issue i recommend moving this to 2.0.2
Hide
Junyi Xie
added a comment -
Dipti,
My feeling is if we can find the culprit, it may be a small fix. But we need to figure out where the problem is. As I mentioned above, we incorrectly store some garbage of deleted item. XDCR just exposed that.
IMHO, ep_engine should first look at this issue to see if there is any dangling pointer of doc_body. If not, couchdb and couchstore will need to investigate. Thanks!
My feeling is if we can find the culprit, it may be a small fix. But we need to figure out where the problem is. As I mentioned above, we incorrectly store some garbage of deleted item. XDCR just exposed that.
IMHO, ep_engine should first look at this issue to see if there is any dangling pointer of doc_body. If not, couchdb and couchstore will need to investigate. Thanks!
Show
Junyi Xie
added a comment - Dipti,
My feeling is if we can find the culprit, it may be a small fix. But we need to figure out where the problem is. As I mentioned above, we incorrectly store some garbage of deleted item. XDCR just exposed that.
IMHO, ep_engine should first look at this issue to see if there is any dangling pointer of doc_body. If not, couchdb and couchstore will need to investigate. Thanks!
Show
Dipti Borkar
added a comment - is this a small change that we can fix in 2.0.1?
Hide
Junyi Xie
added a comment -
Talked to Damien and Jin briefly.
XDCR just read all mutations from storage and try to send to remote side. XDCR itself does not create or parse the document body. All info XDCR need is the key and metadata, not the body of document. The content in doc boy is totally beyond XDCR's control.
Looks like the "garbage in doc body" is persisted in storage because by Marty's testcase, the garbage is something created in memory. So even without XDCR, we may waste potentially a lot of storage space by persisting a bunch of garbage for deleted items. XDCR just expose this issue.
On XDCR side, ofcourse we can parse the doc body before sending it to avoid the garbage, but IHMO it is NOT a good fix, because it just "hides" the issue instead of fixing it. Parsing doc body is not supposed to happen in XDCR layer.
Per Aaron's comment, "ep-engine correctly sets the body to empty when doing deletes", therefore I would suggest to CouchDB/CouchStore team look at this issue first why we read garbage for deleted item.
XDCR just read all mutations from storage and try to send to remote side. XDCR itself does not create or parse the document body. All info XDCR need is the key and metadata, not the body of document. The content in doc boy is totally beyond XDCR's control.
Looks like the "garbage in doc body" is persisted in storage because by Marty's testcase, the garbage is something created in memory. So even without XDCR, we may waste potentially a lot of storage space by persisting a bunch of garbage for deleted items. XDCR just expose this issue.
On XDCR side, ofcourse we can parse the doc body before sending it to avoid the garbage, but IHMO it is NOT a good fix, because it just "hides" the issue instead of fixing it. Parsing doc body is not supposed to happen in XDCR layer.
Per Aaron's comment, "ep-engine correctly sets the body to empty when doing deletes", therefore I would suggest to CouchDB/CouchStore team look at this issue first why we read garbage for deleted item.
Show
Junyi Xie
added a comment - Talked to Damien and Jin briefly.
XDCR just read all mutations from storage and try to send to remote side. XDCR itself does not create or parse the document body. All info XDCR need is the key and metadata, not the body of document. The content in doc boy is totally beyond XDCR's control.
Looks like the "garbage in doc body" is persisted in storage because by Marty's testcase, the garbage is something created in memory. So even without XDCR, we may waste potentially a lot of storage space by persisting a bunch of garbage for deleted items. XDCR just expose this issue.
On XDCR side, ofcourse we can parse the doc body before sending it to avoid the garbage, but IHMO it is NOT a good fix, because it just "hides" the issue instead of fixing it. Parsing doc body is not supposed to happen in XDCR layer.
Per Aaron's comment, "ep-engine correctly sets the body to empty when doing deletes", therefore I would suggest to CouchDB/CouchStore team look at this issue first why we read garbage for deleted item.
Hide
Jin Lim
added a comment -
Based on the above communication thread, there are two obvious things we can discuss here:
* overhead of replicating expired items (especially with the race condition mentioned above) - yes waste of bandwidth but can this affect maintaining consistent XDCR performance including latency when the number of expired items are huge?
* Aaron's suggestion - XDCR to consider a deleted/expired item to be empty thus send no body over wire.
We will discuss above two points and make a decision for feasible steps. Thanks.
* overhead of replicating expired items (especially with the race condition mentioned above) - yes waste of bandwidth but can this affect maintaining consistent XDCR performance including latency when the number of expired items are huge?
* Aaron's suggestion - XDCR to consider a deleted/expired item to be empty thus send no body over wire.
We will discuss above two points and make a decision for feasible steps. Thanks.
Show
Jin Lim
added a comment - Based on the above communication thread, there are two obvious things we can discuss here:
* overhead of replicating expired items (especially with the race condition mentioned above) - yes waste of bandwidth but can this affect maintaining consistent XDCR performance including latency when the number of expired items are huge?
* Aaron's suggestion - XDCR to consider a deleted/expired item to be empty thus send no body over wire.
We will discuss above two points and make a decision for feasible steps. Thanks.
Hide
Marty Schoch
added a comment -
@Dipti I'm not sure I understand the part about latency. If you've configured TTL on the Elasticsearch cluster, the expired document will be removed from the index within the "indices.ttl.interval" (default 60s). The only other thing is that at some later time the document expiration will also come across the wire via XDCR. This will involve both a _revs_diff and a _bulk_docs request. The net effect of these is nothing (the document has already been removed from the index) So what latency would be affected?
Show
Marty Schoch
added a comment - @Dipti I'm not sure I understand the part about latency. If you've configured TTL on the Elasticsearch cluster, the expired document will be removed from the index within the "indices.ttl.interval" (default 60s). The only other thing is that at some later time the document expiration will also come across the wire via XDCR. This will involve both a _revs_diff and a _bulk_docs request. The net effect of these is nothing (the document has already been removed from the index) So what latency would be affected?
Hide
Dipti Borkar
added a comment -
Bandwidth for sure, but also latency. We need to optimize for latency.
How hard is the fix?
KV + XDCR (particularly for caching use case is very important)
How hard is the fix?
KV + XDCR (particularly for caching use case is very important)
Show
Dipti Borkar
added a comment - Bandwidth for sure, but also latency. We need to optimize for latency.
How hard is the fix?
KV + XDCR (particularly for caching use case is very important)
Hide
Junyi Xie
added a comment -
@Dipti, XDCR does replicate expired item as a deletion. But since the item can expire on its own on remote cluster, there is a race condition that no guarantee the deletion will be replicated to destination, depending on on which cluster, the expiration happens first. For ES, it does not impact the correctness, but may cause some network bandwidth overhead.
Show
Junyi Xie
added a comment - @Dipti, XDCR does replicate expired item as a deletion. But since the item can expire on its own on remote cluster, there is a race condition that no guarantee the deletion will be replicated to destination, depending on on which cluster, the expiration happens first. For ES, it does not impact the correctness, but may cause some network bandwidth overhead.
Hide
Marty Schoch
added a comment -
@aaron after we discussed the issue and theorized what was happening, the first doc I put through these steps exhibited this behavior. i can't say it always happens, but it has been very easy to reproduce.
@dipti regarding "do we propagate expired documents?" a good question the more i think about it. while the remote cluster should expire the item on its own, when the expiration hits the disk on the source cluster, i would expect that docid/revision to go across in a _revs_diff request. in the case of the elasticsearch plugin this would seem result in the doc/rev being considered missing (we already deleted it), since its missing, the _bulk_docs would then send the delete across the wire (introducing the overhead in your question) but, i'm not sure thats actually how it works in practice
@dipti regarding "do we propagate expired documents?" a good question the more i think about it. while the remote cluster should expire the item on its own, when the expiration hits the disk on the source cluster, i would expect that docid/revision to go across in a _revs_diff request. in the case of the elasticsearch plugin this would seem result in the doc/rev being considered missing (we already deleted it), since its missing, the _bulk_docs would then send the delete across the wire (introducing the overhead in your question) but, i'm not sure thats actually how it works in practice
Show
Marty Schoch
added a comment - @aaron after we discussed the issue and theorized what was happening, the first doc I put through these steps exhibited this behavior. i can't say it always happens, but it has been very easy to reproduce.
@dipti regarding "do we propagate expired documents?" a good question the more i think about it. while the remote cluster should expire the item on its own, when the expiration hits the disk on the source cluster, i would expect that docid/revision to go across in a _revs_diff request. in the case of the elasticsearch plugin this would seem result in the doc/rev being considered missing (we already deleted it), since its missing, the _bulk_docs would then send the delete across the wire (introducing the overhead in your question) but, i'm not sure thats actually how it works in practice
Hide
Dipti Borkar
added a comment -
BTW, I forget, do we propagate expired documents? if yes, then for use cases in which we have TTL, this may be a huge overhead ?
I don't know if this is something we want to fix in 2.0.1 given the timing, perhaps 2.0.2?
I don't know if this is something we want to fix in 2.0.1 given the timing, perhaps 2.0.2?
Show
Dipti Borkar
added a comment - BTW, I forget, do we propagate expired documents? if yes, then for use cases in which we have TTL, this may be a huge overhead ?
I don't know if this is something we want to fix in 2.0.1 given the timing, perhaps 2.0.2?
Hide
Aaron Miller
added a comment -
Couchstore considers it "valid" for deleted items to have bodies, and will use whatever content the upstream user provided. It should be fine for XDCR to consider deleted items to be "empty" and send no body, though, as from the perspective of the rest of the components in the system, a deleted item is effectively empty.
There may be some code in CouchDB/XDCR misinterpreting the empty body and reading some garbage from the beginning of the file erroneously though (Location 0 is considered to be "empty"), as I believe ep-engine correctly sets the body to empty when doing deletes. I will investigate. Do we know if this happens with *every* deleted item, or just some?
There may be some code in CouchDB/XDCR misinterpreting the empty body and reading some garbage from the beginning of the file erroneously though (Location 0 is considered to be "empty"), as I believe ep-engine correctly sets the body to empty when doing deletes. I will investigate. Do we know if this happens with *every* deleted item, or just some?
Show
Aaron Miller
added a comment - Couchstore considers it "valid" for deleted items to have bodies, and will use whatever content the upstream user provided. It should be fine for XDCR to consider deleted items to be "empty" and send no body, though, as from the perspective of the rest of the components in the system, a deleted item is effectively empty.
There may be some code in CouchDB/XDCR misinterpreting the empty body and reading some garbage from the beginning of the file erroneously though (Location 0 is considered to be "empty"), as I believe ep-engine correctly sets the body to empty when doing deletes. I will investigate. Do we know if this happens with *every* deleted item, or just some?
Hide
Marty Schoch
added a comment -
This does not cause any problem for Elasticsearch, when processing a delete we do not look at the binary flag or the body of the base64 data. As Junyi mentioned, it may waste bandwidth transferring bytes thats are not used for anything.
Show
Marty Schoch
added a comment - This does not cause any problem for Elasticsearch, when processing a delete we do not look at the binary flag or the body of the base64 data. As Junyi mentioned, it may waste bandwidth transferring bytes thats are not used for anything.
Hide
Dipti Borkar
added a comment -
is this something that we need to fix in 2.0.1? Do we run into the elasticsearch problems because of this?
Show
Dipti Borkar
added a comment - is this something that we need to fix in 2.0.1? Do we run into the elasticsearch problems because of this?
Hide
Junyi Xie
added a comment -
Hi Aaron,
When XDCR fetched deletion mutation from storage, XDCR would see some garbage in document body as Marty described and it is marked as NON-JSON. This seems waster of bandwidth because destination side (either a regular XDCR receiver or ES cluster) just need the key and metadata, it does not need anything encoded in base64.
Please be free assign to the right person if necessary. Thanks.
When XDCR fetched deletion mutation from storage, XDCR would see some garbage in document body as Marty described and it is marked as NON-JSON. This seems waster of bandwidth because destination side (either a regular XDCR receiver or ES cluster) just need the key and metadata, it does not need anything encoded in base64.
Please be free assign to the right person if necessary. Thanks.
Show
Junyi Xie
added a comment - Hi Aaron,
When XDCR fetched deletion mutation from storage, XDCR would see some garbage in document body as Marty described and it is marked as NON-JSON. This seems waster of bandwidth because destination side (either a regular XDCR receiver or ES cluster) just need the key and metadata, it does not need anything encoded in base64.
Please be free assign to the right person if necessary. Thanks.
Hide
Marty Schoch
added a comment -
Also, although the decoded data here is very text-like, in other tests the base64 data observed was just garbage.
Show
Marty Schoch
added a comment - Also, although the decoded data here is very text-like, in other tests the base64 data observed was just garbage.