Details
- Type: Task
- Status: Reopened
- Priority: Blocker
- Resolution: Unresolved
- Affects Version/s: 2.0
- Fix Version/s: 2.0.2
- Component/s: documentation
- Security Level: Public
- Labels:
- Flagged: Release Note
Description
The Replica Read API implementation is ready, but testing and the related documentation have not yet reached a product-ready state.
Activity
Maria McDuff
added a comment -
Jin Lim to confirm with PM that this feature is needed for the 2.0.2 release.
Maria McDuff
added a comment - Anil to update the business justification for this feature.
Maria McDuff
added a comment - Ready for QE testing. Now in the 2.0.2 build.
Maria McDuff
added a comment - Iryna,
please update with your test progress.
Thanks.
Iryna Mironava
added a comment - CBQE-1107 opened to track progress.
Iryna Mironava
added a comment -
Tested P0 cases on CentOS 64-bit and 32-bit, Windows 64-bit, and Ubuntu 64-bit and 32-bit. Tests passed.
Maria McDuff
added a comment -
Testing done.
Iryna, please do another round of testing when we get the RC build at the end of April. Thanks.
Iryna Mironava
added a comment - Build 2.0.2-772-rel tested.
Anil Kumar
added a comment -
Please follow up with Jin and Matt for server and client documentation.
Jin Lim
added a comment -
Please review the following information.
============================
Replica Read
Binary opcode: CMD_GET_REPLICA - 0x83
Description: A new ep_engine-specific binary retrieval command that retrieves the data corresponding to a given key. The command behaves exactly like the existing binary get command, except that it returns data from a vbucket that is in replica state (versus active state for the normal get).
Request & Response:
* Both the request and response header structures for this command are identical to those of the regular get.
* Request example: Replica Get("Hello")
Field (offset) : value
Magic (0) : 0x80
Opcode (1) : 0x83
Key length (2,3) : 0x0005
Extra length (4) : 0x00
Data type (5) : 0x00
VBucket (6,7) : 0x0000
Total body (8-11) : 0x00000005
Opaque (12-15) : 0x00000000
CAS (16-23) : 0x0000000000000000
Extras : None
Key (24-28) : The textual string "Hello"
Value : None
* Response example: response to Replica Get("Hello")
Field (offset) : value
Magic (0) : 0x81
Opcode (1) : 0x83
Key length (2,3) : 0x0000
Extra length (4) : 0x04
Data type (5) : 0x00
Status (6,7) : 0x0000
Total body (8-11) : 0x00000009
Opaque (12-15) : 0x00000000
CAS (16-23) : 0x0000000000000001
Extras (24-27) : Flags (4 bytes, as in a regular get response)
Key : None (key length is 0)
Value (28-32) : The textual string "World"
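The request layout above can be sketched in Python as follows. This is only a minimal sketch, assuming the usual 24-byte binary protocol header in network byte order; the helper name build_get_replica_request and its parameters are hypothetical and not part of any SDK.

```python
import struct

# Header fields in table order: magic, opcode, key length, extras length,
# data type, vbucket, total body length, opaque, CAS.
HEADER_FORMAT = ">BBHBBHIIQ"  # big-endian, 24 bytes in total
REQUEST_MAGIC = 0x80
CMD_GET_REPLICA = 0x83


def build_get_replica_request(key: bytes, vbucket: int = 0, opaque: int = 0) -> bytes:
    """Hypothetical helper: encode a Replica Get request for `key`."""
    header = struct.pack(
        HEADER_FORMAT,
        REQUEST_MAGIC,
        CMD_GET_REPLICA,
        len(key),  # key length
        0,         # extras length: the request carries no extras
        0,         # data type
        vbucket,
        len(key),  # total body = extras + key + value, here the key only
        opaque,
        0,         # CAS is zero in the request
    )
    return header + key


packet = build_get_replica_request(b"Hello")
print(len(packet), packet.hex())
```

For the key "Hello" the packet is 29 bytes long: the 24-byte header followed by the 5 key bytes at offsets 24-28.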
Response Status:
* ENGINE_NOT_MY_VBUCKET = 0x0c
The vbucket for the key cannot be found, or
the vbucket is not in replica state.
* ENGINE_EWOULDBLOCK = 0x07
EP Engine would block: the vbucket is in pending state.
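A client or test harness might turn the codes listed above into readable messages along these lines. This is a sketch only: the numeric values are copied from the list above, the function name describe_status is hypothetical, and whether a client sees exactly these values in the status field of the response header is an assumption to confirm against the server.

```python
# Codes exactly as listed above (assumption: they are compared against
# whatever status value the caller extracted from the response).
REPLICA_READ_ERRORS = {
    0x0C: ("ENGINE_NOT_MY_VBUCKET", "vbucket not found or not in replica state"),
    0x07: ("ENGINE_EWOULDBLOCK", "vbucket is in pending state"),
}


def describe_status(code: int) -> str:
    """Hypothetical helper: map a status code to a readable message."""
    if code == 0x00:
        return "success"
    name, reason = REPLICA_READ_ERRORS.get(
        code, ("UNKNOWN", "not listed in this document")
    )
    return f"{name} (0x{code:02x}): {reason}"


print(describe_status(0x0C))
print(describe_status(0x07))
```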
Unit Tests:
* test_get_replica - returns data for a vbucket that is in replica state
* test_get_replica_active_state - returns error for a vbucket that is in active state (ENGINE_NOT_MY_VBUCKET)
* test_get_replica_pending_state - returns error for a vbucket that is in pending state (ENGINE_EWOULDBLOCK)
* test_get_replica_dead_state - returns error for a vbucket that is in dead state (ENGINE_NOT_MY_VBUCKET)
Karen Zeller
added a comment -
Sent to support:
Hi,
I'm tasked with documenting the newly "productized" replica read API. The current and only information to document is below. I am meeting with Jin on Thursday this week so I can get additional information that people need to know about this.
Please let me know if there are specific things a developer wants to know that are not already provided by him. I will get them from him.
Thanks
Karen
Karen Zeller
added a comment - edited
Questions from Tim:
Karen,
One thing to document explicitly, although I assume it is the only logical way for this to be implemented: a replica read for an item that is not resident in the caching layer will cause it to be fetched from disk and its value cached. So the resident % for replica vbuckets can have a significant impact on replica read performance.
Questions for Jin:
Is there a stat to track the cache miss ratio for replica reads?
Answer from Jin: ep_bg_fetch is the closest thing; it tells you the total number of background fetches. One thing you can check is how it changes between a scenario without replica reads and a scenario with replica reads on the same data set. The underlying ep engine makes no distinction between fetching from an active or a replica vbucket.
Is there a separate bg_fetch for replica reads vs. active reads, or are those combined in a single stat?
Answer: See above. A single stat covers both active and replica reads.
Are any other stats exposed to understand the performance of replica reads?
Answer: No
I think the most important information for developers depends on the client SDK, which I guess Matt will have to answer. For example, if the bucket is configured with more than one replica copy, which replica server will be queried?
Answer: All true
If the first attempt at a replica read fails, will it try a 2nd (and 3rd) before returning an error? How can that behavior be configured?
Answer: It all depends on the client.
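The before/after comparison Jin describes could be sketched as below. This assumes two stat snapshots taken for the same data set, each available as a dict of stat name to value; the stat name is used as written in this thread, the function name bg_fetch_delta is hypothetical, and the sample numbers are made up purely for illustration.

```python
def bg_fetch_delta(baseline: dict, with_replica_reads: dict,
                   stat: str = "ep_bg_fetch") -> int:
    """Hypothetical helper: extra background fetches in the replica-read run.

    Values are converted with int() because stats are often reported as text
    (an assumption about how the snapshots were collected).
    """
    return int(with_replica_reads[stat]) - int(baseline[stat])


# Made-up snapshots, not measured data.
baseline = {"ep_bg_fetch": "1200"}
with_replica_reads = {"ep_bg_fetch": "1850"}

extra = bg_fetch_delta(baseline, with_replica_reads)
print(extra)
```

Because the stat is shared by active and replica reads, the difference is only attributable to replica reads if the active workload is the same in both runs.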
Karen Zeller
added a comment - edited
From Frank:
I think the key for developers will be to talk about the potential staleness of the data and to be explicit about when to use it (the active node is unreachable and your app is okay with stale data; mentioning observe for replication as a counter may be worth considering, though I believe the performance impact is quite meaningful) and when not to (to try to spread read load, as you often do with other databases).
Input from Jin: The best use case for replica read from the server perspective is when, during the 30 seconds of...
It takes 30 seconds for the server to detect that a node is unavailable and initiate auto-failover, if available. During that time, clients may experience failed gets, in which case clients attempt a replica read. E.g. if 5 get attempts fail, try the replica read.
Imagine a mutation has not yet replicated to the other node and you do a replica read. You may get the data that last replicated to that node, not the current value set on the active node. It may not be the version that was on the active node.
We should recommend to users that they can use the CAS value to determine the integrity of replicated data. Basically, do a set with CAS and compare the CAS number from the active node with your replica read CAS. For each set, keep the CAS and compare it with the CAS from the replica read.
Still, in the case of multiple concurrent sets and gets from multiple clients, there will always be some risk that the data is "stale".
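The fallback and CAS comparison Jin describes could look roughly like this on the client side. This is a sketch under stated assumptions, not how any SDK implements it: get_active and get_replica are hypothetical callables returning (value, cas), a failed get is assumed to raise ConnectionError, and the limit of 5 attempts is simply the figure from the example above.

```python
from typing import Callable, Optional, Tuple

Result = Tuple[bytes, int]  # (value, cas)


def get_with_replica_fallback(
    key: str,
    get_active: Callable[[str], Result],
    get_replica: Callable[[str], Result],
    max_attempts: int = 5,
) -> Tuple[bytes, int, bool]:
    """Try the active node up to max_attempts times, then read the replica.

    Returns (value, cas, from_replica).
    """
    for _ in range(max_attempts):
        try:
            value, cas = get_active(key)
            return value, cas, False
        except ConnectionError:
            continue  # active node unreachable, try again
    value, cas = get_replica(key)
    return value, cas, True


def is_possibly_stale(last_set_cas: Optional[int], replica_cas: int) -> bool:
    """Compare the CAS kept from our own last set with the replica's CAS."""
    return last_set_cas is not None and last_set_cas != replica_cas


# Stand-ins for a real client, for illustration only.
def failing_active(key: str) -> Result:
    raise ConnectionError("active node unreachable")


def fake_replica(key: str) -> Result:
    return b"World", 1


value, cas, from_replica = get_with_replica_fallback(
    "Hello", failing_active, fake_replica
)
print(from_replica, is_possibly_stale(last_set_cas=2, replica_cas=cas))
```

A CAS mismatch only shows that the replica does not hold the version this client last wrote; with concurrent writers, a matching CAS still does not guarantee the value is current.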
Karen Zeller
added a comment - edited
From Perry:
Agree that the most critical information will be around how, when, and when not to use this feature.
- When: you need the data and multiple gets continuously fail; then attempt a replica read. Also if you have an SLA to meet by a certain time.
- When not: if you cannot afford to return stale data, do not use it, or definitely use CAS to mitigate staleness. Also if you don't care about availability of the data, e.g. user profile.
- How: see binary....
We will also have to bring in how this is different from a failover (which turns a replica vbucket into an active one)
- Failover: the application will continue doing a get, because the replicated data becomes available and the client will still get it. Functioning nodes with replicated data can still serve it. Clients will also automatically know to go to the healthy nodes.
- Replica read is really for the scenario where applications cannot tolerate 30 seconds of downtime and must get data within that timeframe.
and what happens to this functionality after a failover happens.
- If you attempt a replica read on a node that has been promoted to 'active', the server will send a failure message (because the replica is no longer a replica): ENGINE_NOT_MY_VBUCKET. It is up to the SDKs how they handle this error.
Point on SDKs: how they handle errors if the replica is no longer a replica, and whether they reroute the request, is up to the SDKs.
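One possible client-side policy for the post-failover case is sketched below: if a replica read is rejected because the vbucket is no longer a replica, retry as a regular get. The exception class ReplicaReadError, the callables get_replica and get_active, and the function name are all hypothetical; real SDKs may handle this differently, as noted above.

```python
ENGINE_NOT_MY_VBUCKET = 0x0C  # value as listed in this thread


class ReplicaReadError(Exception):
    """Hypothetical error raised when a replica read returns a failure status."""

    def __init__(self, status: int):
        super().__init__(f"replica read failed with status 0x{status:02x}")
        self.status = status


def read_after_possible_failover(key, get_replica, get_active):
    """Try a replica read; if the vbucket is no longer a replica, do a normal get."""
    try:
        return get_replica(key)
    except ReplicaReadError as err:
        if err.status == ENGINE_NOT_MY_VBUCKET:
            # The replica was probably promoted to active by a failover.
            return get_active(key)
        raise


# Stand-ins for a real client, for illustration only.
def promoted_replica(key):
    raise ReplicaReadError(ENGINE_NOT_MY_VBUCKET)


def new_active(key):
    return b"World"


print(read_after_possible_failover("Hello", promoted_replica, new_active))
```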
Karen Zeller
added a comment -
Send this over a
Request & Response:
* Both the request and response header structures for this command are identical to those of the regular get.
* Request example: Replica Get("Hello")
Field (offset) : value
Magic (0) : 0x80
Opcode (1) : 0x83
Key length (2,3) : 0x0005
Extra length (4) : 0x00
Data type (5) : 0x00
VBucket (6,7) : 0x0000
Total body (8-11) : 0x00000005
Opaque (12-15) : 0x00000000
CAS (16-23) : 0x0000000000000000
Extras : None
Key (24-28) : The textual string "Hello"
Value : None
Everything in the request is the same as in a binary protocol get request except the opcode, which must be 0x83.
* Response example: response to Replica Get("Hello")
Field (offset) : value
Magic (0) : 0x81
Opcode (1) : 0x83
Key length (2,3) : 0x0000
Extra length (4) : 0x04
Data type (5) : 0x00
Status (6,7) : 0x0000
Total body (8-11) : 0x00000009
Opaque (12-15) : 0x00000000
CAS (16-23) : 0x0000000000000001
Extras (24-27) : Flags (4 bytes, as in a regular get response)
Key : None (key length is 0)
Value (28-32) : The textual string "World"
You get the same type of return value as you have with GET(); no flag or similar field indicates that this was a replica read.
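Decoding such a response can be sketched as follows, relying only on the length fields in the header to split the body. This is a minimal sketch assuming the 24-byte header in network byte order; the function name parse_get_replica_response is hypothetical, and the sample packet uses four zero bytes for the extras because the example above does not give their content.

```python
import struct

HEADER_FORMAT = ">BBHBBHIIQ"  # same field order as the table above
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 24 bytes


def parse_get_replica_response(packet: bytes) -> dict:
    """Hypothetical helper: split a Replica Get response into its parts."""
    (magic, opcode, key_len, extras_len, _data_type,
     status, body_len, _opaque, cas) = struct.unpack(
        HEADER_FORMAT, packet[:HEADER_SIZE]
    )
    if magic != 0x81 or opcode != 0x83:
        raise ValueError("not a Replica Get response")
    body = packet[HEADER_SIZE:HEADER_SIZE + body_len]
    extras = body[:extras_len]
    key = body[extras_len:extras_len + key_len]
    value = body[extras_len + key_len:]
    return {"status": status, "cas": cas, "extras": extras,
            "key": key, "value": value}


# Sample packet built from the example header values; extras content assumed zero.
sample = struct.pack(HEADER_FORMAT, 0x81, 0x83, 0, 4, 0, 0, 9, 0, 1)
sample += bytes(4) + b"World"

result = parse_get_replica_response(sample)
print(result["status"], result["cas"], result["value"])
```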
Response Status:
* ENGINE_NOT_MY_VBUCKET = 0x0c
The vbucket for the key cannot be found, or
the vbucket is not in replica state.
Occurs if the replica node has been elevated to active.
* ENGINE_EWOULDBLOCK = 0x07
EP Engine would block: the vbucket is in pending state.
Can happen if the node is undergoing a rebalance.
Unit Tests:
(internal: informational for support)
* test_get_replica - returns data for a vbucket that is in replica state
* test_get_replica_active_state - returns error for a vbucket that is in active state (ENGINE_NOT_MY_VBUCKET)
* test_get_replica_pending_state - returns error for a vbucket that is in pending state (ENGINE_EWOULDBLOCK)
* test_get_replica_dead_state - returns error for a vbucket that is in dead state (ENGINE_NOT_MY_VBUCKET)