Hello,
From your Sync Gateway doc page:
"Your workers should be idempotent or you should track the last_seq somewhere durable."
I think this relates to the fact that if the server where the worker runs goes offline, all changes that happen during the downtime will be lost.
So the two solutions suggested are:
- Store the last_seq somewhere durable
- Have an idempotent worker
Starting a project from scratch, I would like to implement both for my worker.
Store the last_seq somewhere durable
This is the way you handle the problem in the CouchChat-iOS sample (link found at the end of the doc page).
It is a very outdated example, but still useful to understand the logic.
Essentially, you first connect directly to the bucket using the Couchbase Node SDK.
Then you retrieve the last processed sequence using the SDK (called _push:seq in the sample),
and you start to follow document changes from that seq.
When a document that needs to be handled arrives, you run some business logic (e.g. sending a push notification to mobile apps) and store the new sequence number in the bucket using the SDK.
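To make the flow concrete, here is a minimal sketch of how I understand it. It is not the sample's actual code: Change, CheckpointStore, MemoryCheckpointStore and runWorker are hypothetical names, and the in-memory store only stands in for the bucket document that holds the sequence.

```typescript
// Hypothetical shapes: these are NOT the Couchbase SDK or the Sync Gateway API.
interface Change {
  seq: number;
  id: string;
}

interface CheckpointStore {
  load(): Promise<number>; // last stored sequence, 0 if nothing was stored yet
  save(seq: number): Promise<void>;
}

// In-memory stand-in for the bucket document that holds the checkpoint.
class MemoryCheckpointStore implements CheckpointStore {
  private seq = 0;

  async load(): Promise<number> {
    return this.seq;
  }

  async save(seq: number): Promise<void> {
    this.seq = seq;
  }
}

// Reads the checkpoint, fetches the changes after it, then handles and
// checkpoints each change one at a time, in feed order.
async function runWorker(
  feed: (since: number) => Promise<Change[]>,
  store: CheckpointStore,
  handle: (change: Change) => Promise<void>,
): Promise<number> {
  const since = await store.load();
  const changes = await feed(since);
  for (const change of changes) {
    await handle(change); // business logic, e.g. a push notification
    await store.save(change.seq); // only stored after the logic has finished
  }
  return store.load();
}
```

Awaiting each step keeps handling and storing in order in this sketch; a worker that fires callbacks without waiting for them would not have that guarantee.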
Now this seems to work, but we are in a Node environment, based on a non-blocking I/O model.
Since every operation uses a callback (writing to or reading from a bucket, sending an email, sending an API request, etc.), I suppose that a changes feed client like follow will not block the changes feed until the business logic ends.
So this example could happen:
Worker starts from seq=0
Doc 1 created [seq=1]
Doc 2 created [seq=2]
Storing [seq=2]
Storing [seq=1]
Worker stops
Worker starts from seq=1
Doc 2 received again [seq=2] <-- This is a serious problem!
How can we be sure that the stored sequence number is really the last one?
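One idea I can think of is to store only a sequence number for which every earlier change has also finished, whatever order the callbacks complete in. The sketch below is only an assumption about how that could look; SequenceTracker and its methods are hypothetical names, and it assumes the feed delivers changes in increasing sequence order.

```typescript
// Hypothetical helper: tracks in-flight changes and reports the highest
// sequence below which every received change has finished processing.
class SequenceTracker {
  private pending: { seq: number; done: boolean }[] = [];
  private safeSeq: number;

  constructor(initialSeq: number) {
    this.safeSeq = initialSeq;
  }

  // Call when a change is received from the feed.
  start(seq: number): void {
    this.pending.push({ seq: seq, done: false });
  }

  // Call when the business logic for `seq` has finished.
  // Returns the sequence number that is currently safe to persist.
  complete(seq: number): number {
    for (const entry of this.pending) {
      if (entry.seq === seq) {
        entry.done = true;
      }
    }
    // Advance only over the finished prefix; stop at the first unfinished change.
    let head = this.pending[0];
    while (head !== undefined && head.done) {
      this.safeSeq = head.seq;
      this.pending.shift();
      head = this.pending[0];
    }
    return this.safeSeq;
  }
}
```

With this, if Doc 2 finishes before Doc 1, complete(2) still returns the old checkpoint, and complete(1) then returns 2. The writes to the bucket would still have to be serialized, or guarded so that an older value never overwrites a newer one.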
Have an idempotent worker
According to Wikipedia, a function is called idempotent if f(f(x)) = f(x), meaning that the result does not change whether the function is applied once or multiple times.
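As a small illustration of the definition (the Doc shape and both functions are made up for this example), setting a flag is idempotent while incrementing a counter is not:

```typescript
// Hypothetical document shape, used only for illustration.
interface Doc {
  id: string;
  notified: boolean;
  counter: number;
}

// Idempotent: applying it once or many times yields the same document.
function markNotified(doc: Doc): Doc {
  return { ...doc, notified: true };
}

// Not idempotent: every additional call changes the result again.
function incrementCounter(doc: Doc): Doc {
  return { ...doc, counter: doc.counter + 1 };
}
```

Here markNotified(markNotified(x)) equals markNotified(x), whereas incrementCounter(incrementCounter(x)) differs from incrementCounter(x).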
From a mathematical point of view I understand what it is, but from a development point of view I have absolutely no idea how to implement it.
Are there any examples of idempotent workers? And if my worker is idempotent, could I scale it horizontally across multiple servers?
UPDATE
Searching for idempotence, I found a wonderful article [link] that helped me understand what it means to create an idempotent worker.
In short:
- Always keep in the document a state for every business operation you have performed (sending a mail, a push notification, etc.)
- Always check that the business operation has not already been done before you execute it; this implies that you always need to retrieve the latest revision of the document
So if my worker is idempotent, the first problem resolves itself: the business operation will not be triggered again, because I check whether it was already done. There is still a small possibility that the worker stops just after the operation has finished and before the state has been updated.
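This is roughly how I picture the two rules above in code. It is a sketch under assumptions, not a tested implementation: MessageDoc, the pushSent flag, the in-memory bucket object and handleChange are all hypothetical, and the I/O is synchronous only to keep the example short.

```typescript
// Hypothetical document shape: `pushSent` records that the operation already ran.
interface MessageDoc {
  id: string;
  body: string;
  pushSent?: boolean;
}

// In-memory stand-in for the bucket; a real worker would go through the SDK.
const bucket: { [id: string]: MessageDoc } = {};

// Returns true if the business operation was executed, false if it was skipped.
function handleChange(id: string, sendPush: (doc: MessageDoc) => void): boolean {
  const doc = bucket[id]; // always read the latest revision first
  if (doc === undefined || doc.pushSent === true) {
    return false; // unknown document, or operation already done
  }
  sendPush(doc); // the business operation (side effect)
  // If the worker stopped right here, pushSent would still be unset and a
  // replay would send the notification a second time.
  bucket[id] = { ...doc, pushSent: true };
  return true;
}
```

Replaying the same change calls handleChange again, but the second call sees pushSent and does nothing. The gap between sendPush and the write is the small window mentioned above.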
Are there any common problems when scaling an idempotent worker across multiple servers?
If so, are there known solutions?