(UNIBROw comes from UNIform BootstRap Operation)

See also: [http://www.youtube.com/watch?v=mdZo_keUoEs&list=TLiOobwz8YsEI]

h2. Name of Document Author/Supplier:
h2. Project Summary:

Change the bootstrapping of all clients so that they behave consistently. Allow both deterministic and random bootstrapping from a cluster, and update the bootstrap list to include all nodes considered valid in the cluster-supplied configuration.

h2. Project Description:

Current bootstrapping can vary from client to client. [Project CCCP|couchbase:Cluster Configuration Carrier Publication] will get to a much more regular and long-term-supportable approach, but it would be good to have a short-term approach to cover a few problem areas.

First, all client libraries should handle bootstrapping in the same way. Second, there are currently some situations where random bootstrapping, by which we mean selecting a bootstrap node at random, is preferred to deterministic bootstrapping. Some clients already support this, whereas other clients do only deterministic bootstrapping. Third, we want the ability to update the bootstrap list based on cluster topology changes without restarting the application. A complete cluster swap is currently somewhat complicated from a maintenance perspective, as the application needs to be reconfigured/restarted first. Notably, the Java client can't currently handle a complete cluster swap and has no built-in method of configuration that can be changed dynamically.

h2. Risks and Assumptions:

The changes and feature tests will be integrated in each client library and tested.

Situational tests around full cluster swaps will be carried out, replacing two nodes with two other nodes.

h1. Technical Description
# A configuration file approach, idiomatic to the platform (e.g., YAML for Ruby, properties files for Java), which allows URIs to be reconfigured without restarting the application.
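
As a rough illustration of file-based reconfiguration without a restart, the sketch below re-reads a properties-style file whenever its modification time changes. The key name bootstrap.uris, the file layout, and the class and function names are all assumptions made for this example; they are not part of any client library.

```python
import os


def parse_bootstrap_uris(text):
    """Parse a 'bootstrap.uris=a,b,c' entry from properties-style text.

    The key name 'bootstrap.uris' is hypothetical.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "bootstrap.uris":
            return [u.strip() for u in value.split(",") if u.strip()]
    return []


class ReloadingBootstrapConfig:
    """Re-reads the file only when its modification time has changed."""

    def __init__(self, path):
        self._path = path
        self._mtime = None
        self._uris = []

    def uris(self):
        mtime = os.stat(self._path).st_mtime_ns
        if mtime != self._mtime:
            with open(self._path, encoding="utf-8") as f:
                self._uris = parse_bootstrap_uris(f.read())
            self._mtime = mtime
        return list(self._uris)
```

An application would call uris() each time it needs to bootstrap, so an operator's edit to the file is picked up on the next attempt.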

The sequence of operations for a client can be fairly easily described.

h3. Bootstrap of Client With a List of Three Nodes, All Healthy

# Client determines whether RANDOM or ORDERED is specified by the application developer, via either file-based configuration (e.g., .properties \[Java\] or app.config \[.NET\]) or simple code changes (e.g., editing the .php or .py file).
## If RANDOM, client selects one of the nodes in the list at random
## If ORDERED, client selects the first node in the supplied list
# Client connects to and bootstraps against that node, following the hyperlink chain in a proper RESTful style
# After getting a valid configuration from the bootstrap, the client updates its internal list with all nodes the configuration deems valid.
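
The node selection in step 1 could be sketched as follows. This is a minimal illustration, not the implementation of any particular client; the enum and function names are hypothetical, and the node strings are placeholders.

```python
import random
from enum import Enum


class BootstrapMode(Enum):
    """The two selection strategies described above."""
    RANDOM = "random"
    ORDERED = "ordered"


def select_first_node(nodes, mode, rng=random):
    """Pick the node to bootstrap against first.

    `rng` is injectable so tests can pass a seeded random.Random.
    """
    if not nodes:
        raise ValueError("bootstrap list is empty")
    if mode is BootstrapMode.RANDOM:
        return rng.choice(nodes)
    return nodes[0]  # ORDERED: always the first node supplied
```

With ORDERED, every client started with the same list contacts the same node first; with RANDOM, the initial bootstrap load is spread across the list.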

On topology change...
# Client receives configuration updates from the cluster as nodes are rebalanced in and start supplying services.  When the new configuration is received, the client's internal list of nodes is atomically swapped after configuration validation.
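
One way to read "atomically swapped after configuration validation" is sketched below: the new list is fully built and validated first, and only then is a single reference replaced. The configuration shape (a dict with a "nodes" list of strings) and all names are assumptions for illustration only.

```python
import threading


def validate_config(config):
    """Return the node tuple from a config, or None if it is unusable.

    Assumes a hypothetical shape: {"nodes": ["host:port", ...]}.
    """
    nodes = config.get("nodes") if isinstance(config, dict) else None
    if not isinstance(nodes, list) or not nodes:
        return None
    if not all(isinstance(n, str) and n for n in nodes):
        return None
    return tuple(nodes)


class NodeList:
    """Holds the client's internal bootstrap list."""

    def __init__(self, initial_nodes):
        self._nodes = tuple(initial_nodes)
        self._lock = threading.Lock()

    def snapshot(self):
        # Readers get an immutable tuple, so they never see a half-updated list.
        return self._nodes

    def apply_config(self, config):
        """Swap in the new list only if the configuration validates."""
        nodes = validate_config(config)
        if nodes is None:
            return False  # keep the previous list untouched
        with self._lock:  # serialize concurrent writers
            self._nodes = nodes
        return True
```

Because an invalid configuration leaves the previous list in place, a bad update from the cluster cannot leave the client with an empty bootstrap list.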

h3. Bootstrap of Client With a List of Four Nodes, One Down
# Client determines whether RANDOM or ORDERED is specified by the application developer, via either file-based configuration (e.g., .properties \[Java\] or app.config \[.NET\]) or simple code changes (e.g., editing the .php or .py file).
## If RANDOM, client selects one of the nodes in the list at random
## If ORDERED, client selects the first node in the supplied list
# Client attempts to bootstrap against the _down_ node, and fails to connect after a reasonable time interval of not more than five seconds. If possible, the client logs the problem at the warning level.
# Client revisits the list
## If RANDOM, client selects another node at random, ensuring that the same node is not selected
## If ORDERED, client selects the next node in the list
# Client connects to and bootstraps against that node, following the hyperlink chain in a proper RESTful style
# After getting a valid configuration from the bootstrap, the client updates its internal list with all nodes the configuration deems valid.
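
The retry sequence above can be sketched as a loop over a candidate order. Shuffling the list once for RANDOM is one simple way to ensure the same node is not selected twice. The fetch_config callable, the mode strings, and the other names are hypothetical stand-ins; a real client would perform the REST bootstrap at that point.

```python
import logging
import random

log = logging.getLogger("bootstrap")

CONNECT_TIMEOUT_SECONDS = 5.0  # "not more than five seconds" per attempt


def candidate_order(nodes, mode, rng=random):
    """Return the order in which nodes are tried, each at most once."""
    order = list(nodes)
    if mode == "RANDOM":
        rng.shuffle(order)  # random order without repeats
    return order


def bootstrap(nodes, mode, fetch_config, rng=random):
    """Try each candidate in turn; return (node, config) on first success."""
    for node in candidate_order(nodes, mode, rng):
        try:
            return node, fetch_config(node, timeout=CONNECT_TIMEOUT_SECONDS)
        except OSError as exc:  # covers refused connections and timeouts
            log.warning("bootstrap against %s failed: %s", node, exc)
    raise ConnectionError("no bootstrap node reachable")
```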

On topology change...
# Client receives configuration updates from the cluster as nodes are rebalanced in and start supplying services.  When the new configuration is received, the client's internal list of nodes is atomically swapped after configuration validation.

h3. Bootstrap of Client With a List of Four Nodes, All Down or Unreachable
# Client determines whether RANDOM or ORDERED is specified by the application developer, via either file-based configuration (e.g., .properties \[Java\] or app.config \[.NET\]) or simple code changes (e.g., editing the .php or .py file).
## If RANDOM, client selects one of the nodes in the list at random
## If ORDERED, client selects the first node in the supplied list
# Client attempts to bootstrap against the _down_ node, and fails to connect after a reasonable time interval of not more than five seconds. If possible, the client logs the problem at the warning level.
# Client revisits the list _if_ not all nodes have been tried yet
## If RANDOM, client selects another node at random, ensuring that the same node is not selected
## If ORDERED, client selects the next node in the list
# If all nodes have been tried and the client still cannot get a configuration, the client reports an error appropriately.  Note that this may be *up to 20 seconds later* in this particular case.
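
The 20-second figure follows from four nodes each tried once with a timeout of at most five seconds (4 x 5 = 20). The sketch below shows that bound and one possible shape for the final error; the exception class and function names are hypothetical, and fetch_config stands in for the real bootstrap request.

```python
class BootstrapError(Exception):
    """Raised when every node in the list has been tried without success."""

    def __init__(self, attempts):
        self.attempts = attempts  # list of (node, reason) pairs
        tried = ", ".join(node for node, _ in attempts)
        super().__init__(f"could not bootstrap from any node (tried: {tried})")


def worst_case_seconds(node_count, per_node_timeout=5.0):
    """Upper bound on time to failure when each node is tried once."""
    return node_count * per_node_timeout


def bootstrap_or_fail(nodes, fetch_config, per_node_timeout=5.0):
    """Return the first configuration obtained, or raise BootstrapError."""
    attempts = []
    for node in nodes:
        try:
            return fetch_config(node, timeout=per_node_timeout)
        except OSError as exc:  # TimeoutError and ConnectionError are OSErrors
            attempts.append((node, str(exc)))
    raise BootstrapError(attempts)
```

Carrying the list of attempts in the error lets the application log which nodes were tried and why each failed.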

On topology change...
# Client receives configuration updates from the cluster as nodes are rebalanced in and start supplying services.  When the new configuration is received, the client's internal list of nodes is atomically swapped after configuration validation.

h2. Issues/Issues to be Opened:


Initial publication for REVIEW on 27 August, 2013.
Updated with a more technical description of the sequence, following some discussion, on 06 September, 2013.