[MB-7053] Expired items are not excluded from production/dev views until the expiry pager runs (which permanently deletes the items from the database) Created: 30/Oct/12  Updated: 08/Apr/13  Resolved: 21/Nov/12

Status: Closed
Project: Couchbase Server
Component/s: documentation
Affects Version/s: None
Fix Version/s: 2.0
Security Level: Public

Type: Bug Priority: Blocker
Reporter: Deepkaran Salooja Assignee: MC Brown (Inactive)
Resolution: Fixed Votes: 0
Labels: 2.0-release-notes
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified
Environment: build 1908

<manifest>
<remote name="couchbase" fetch="git://github.com/couchbase/"/>
<remote name="membase" fetch="git://github.com/membase/"/>
<remote name="apache" fetch="git://github.com/apache/"/>
<remote name="erlang" fetch="git://github.com/erlang/"/>
<default remote="couchbase" revision="master"/>
<project name="tlm" path="tlm" revision="23261639414808ce20ffc59508346db5efab6832">
<copyfile src="Makefile.top" dest="Makefile"/>
</project>
<project name="bucket_engine" path="bucket_engine" revision="70b3624abc697b7d18bf3d57f331b7674544e1e7"/>
<project name="ep-engine" path="ep-engine" revision="7dff0cf9ee1012382350b3e438682782e279c294"/>
<project name="libconflate" path="libconflate" revision="2cc8eff8e77d497d9f03a30fafaecb85280535d6"/>
<project name="libmemcached" path="libmemcached" revision="ca739a890349ac36dc79447e37da7caa9ae819f5" remote="membase"/>
<project name="libvbucket" path="libvbucket" revision="00d3763593c116e8e5d97aa0b646c42885727398"/>
<project name="membase-cli" path="membase-cli" revision="7fe4121e7e83952a4cb032e25a2cb9fca1709354" remote="membase"/>
<project name="memcached" path="memcached" revision="7ea975a93a0231393502af4ca98976eee8a83386" remote="membase"/>
<project name="moxi" path="moxi" revision="52a5fa887bfff0bf719c4ee5f29634dd8707500e"/>
<project name="ns_server" path="ns_server" revision="c5dab8c2fe5144517b50b4500b4ddcaa9432c517"/>
<project name="portsigar" path="portsigar" revision="1bc865e1622fb93a3fe0d1a4cdf18eb97ed9d600"/>
<project name="sigar" path="sigar" revision="63a3cd1b316d2d4aa6dd31ce8fc66101b983e0b0"/>
<project name="couchbase-examples" path="couchbase-examples" revision="21e6161a1d064979b5c6aa99cd34ccc41c9d7aca"/>
<project name="couchbase-python-client" path="couchbase-python-client" revision="006c1aa8b76f6bce11109af8a309133b57079c4c"/>
<project name="couchdb" path="couchdb" revision="1a5f6290bc21a37a4eb3bf4be75f4e5c995e05cf"/>
<project name="couchdbx-app" path="couchdbx-app" revision="a28314c04c9d89da861b400ff37eae1fdaa693f8"/>
<project name="couchstore" path="couchstore" revision="d0e70d0dece8e4f4d0a782f4ac5452509fb3919b"/>
<project name="geocouch" path="geocouch" revision="849d5443689b1924f097548af864c539bffcc929"/>
<project name="mccouch" path="mccouch" revision="88701cc326bc3dde4ed072bb8441be83adcfb2a5"/>
<project name="testrunner" path="testrunner" revision="78474c34f4bbb457507a0323af4a638db88f05b5"/>
<project name="otp" path="otp" revision="b6dc1a844eab061d0a7153d46e7e68296f15a504" remote="erlang"/>
<project name="icu4c" path="icu4c" revision="26359393672c378f41f2103a8699c4357c894be7" remote="couchbase"/>
<project name="snappy" path="snappy" revision="5681dde156e9d07adbeeab79666c9a9d7a10ec95" remote="couchbase"/>
<project name="v8" path="v8" revision="447decb75060a106131ab4de934bcc374648e7f2" remote="couchbase"/>
<project name="gperftools" path="gperftools" revision="8f60ba949fb8576c530ef4be148bff97106ddc59" remote="couchbase"/>
<project name="pysqlite" path="pysqlite" revision="0ff6e32ea05037fddef1eb41a648f2a2141009ea" remote="couchbase"/>
</manifest>

Attachments: GZip Archive 127.0.0.1-9000-diag.txt.gz    

 Description   

Expired items are not excluded from production/dev views. An expired item is dropped from the view results only after a memcached get is issued for it.

Steps to reproduce:

1. Create default bucket and create 1 production view
curl -X PUT -H 'Content-Type: application/json' 'http://Administrator:asdasd@127.0.0.1:9500/default/_design/d1' -d '{"views":{"v1":{"map":"function(doc,meta){\nemit(meta.id,doc);\n}"}}}'
{"ok":true,"id":"_design/d1"}

2. Insert 2 items from memcached client with expiry set to 5 seconds

>>> import mc_bin_client
>>> mc = mc_bin_client.MemcachedClient(port=12001)
>>> mc.set("ab", 5, 0, "val")
(3306888435, 7973028514358, '')
>>> mc.set("ab1", 5, 0, "val")
(2896330942, 7997559610975, '')

3. Query the view with stale=false to build the index
curl -X GET 'http://127.0.0.1:9500/default/_design/d1/_view/v1?stale=false'
{"total_rows":2,"rows":[
{"id":"ab","key":"ab","value":"dmFs"},
{"id":"ab1","key":"ab1","value":"dmFs"}
]
}

4. Query the view again after a couple of minutes; the expired items are still returned
curl -X GET 'http://127.0.0.1:9500/default/_design/d1/_view/v1?stale=false'
{"total_rows":2,"rows":[
{"id":"ab","key":"ab","value":"dmFs"},
{"id":"ab1","key":"ab1","value":"dmFs"}
]
}

If a memcached get is issued for these items, they are then excluded from the views. Otherwise they continue to be returned in view results.

Diagnostic is attached.

Checked with Mike on this and ep-engine seems to be doing things correctly:

"This does look like an issue, but not on the ep-engine side. Since ep-engine might take an hour to actually remove an expired item, it should be up to the view engine to filter out any expired items too. The reason why doing a get will cause the item to disappear from the view results is that ep-engine will actually do the deletion."
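
To make the behavior described above concrete, here is a toy Python model of lazy expiration. It is only a sketch under stated assumptions, not ep-engine's or the view engine's actual code: the class name LazyExpiryStore and all of its methods are hypothetical, time is passed in explicitly as "now", and the "index" is a plain dict that only learns about real deletions.

```python
class LazyExpiryStore:
    """Toy model of lazy expiration (hypothetical; not ep-engine code)."""

    def __init__(self):
        self.items = {}   # key -> (value, absolute expiry time, 0 = never)
        self.index = {}   # stand-in for a view index: key -> value

    def set(self, key, value, ttl, now):
        self.items[key] = (value, now + ttl if ttl else 0)

    def _expired(self, key, now):
        _, expiry = self.items[key]
        return expiry != 0 and expiry <= now

    def _delete(self, key):
        # Only a real deletion is propagated to the index.
        del self.items[key]
        self.index.pop(key, None)

    def get(self, key, now):
        if key not in self.items:
            return None
        if self._expired(key, now):
            self._delete(key)   # a get on an expired item deletes it
            return None
        return self.items[key][0]

    def run_expiry_pager(self, now):
        # Periodic sweep that deletes every expired item.
        for key in [k for k in self.items if self._expired(k, now)]:
            self._delete(key)

    def update_index(self):
        # The index picks up stored items without checking expiry times.
        for key, (value, _) in self.items.items():
            self.index[key] = value

    def query_view(self):
        return sorted(self.index)


store = LazyExpiryStore()
store.set("ab", "val", ttl=5, now=0)
store.set("ab1", "val", ttl=5, now=0)
store.update_index()
print(store.query_view())      # ['ab', 'ab1']
store.update_index()           # two minutes later: nothing was deleted
print(store.query_view())      # ['ab', 'ab1'], both expired but still listed
store.get("ab", now=120)       # the get triggers the actual deletion
print(store.query_view())      # ['ab1']
store.run_expiry_pager(now=120)
print(store.query_view())      # []
```

In this model an expired item leaves the view only through a get or a pager run, which matches the symptom in the steps to reproduce.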

 Comments   
Comment by Filipe Manana [ 30/Oct/12 ]
Won't work unless the docs are deleted from the vbucket databases.

See:

http://hub.internal.couchbase.com/confluence/display/QA/Debugging+view+engine+issues+and+reporting+them#Debuggingviewengineissuesandreportingthem-section5

Simply doesn't work for several architectural reasons.
See the discussion there.
Comment by Dipti Borkar [ 31/Oct/12 ]
At least a client side solution needs to be put in.
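
One possible shape for such a client-side filter, as a rough sketch only: it assumes the map function is changed to emit the expiration as the row value, e.g. emit(meta.id, meta.expiration), and that this value is an absolute Unix timestamp in seconds with 0 meaning "never expires". Both points are assumptions here and should be checked against the server documentation. The function name drop_expired_rows is hypothetical.

```python
import time


def drop_expired_rows(rows, now=None):
    """Return only the view rows whose emitted expiration has not passed.

    Assumes each row's "value" is the document's expiration as an absolute
    Unix timestamp in seconds, with 0 meaning no expiration (an assumption).
    """
    if now is None:
        now = time.time()
    live = []
    for row in rows:
        expiration = row["value"]
        if expiration == 0 or expiration > now:
            live.append(row)
    return live


rows = [
    {"id": "ab", "key": "ab", "value": 100},   # expired at t=100
    {"id": "cd", "key": "cd", "value": 0},     # never expires
    {"id": "ef", "key": "ef", "value": 500},   # still live at t=200
]
print([r["id"] for r in drop_expired_rows(rows, now=200)])  # ['cd', 'ef']
```

This only helps for map rows; a reduced value cannot be corrected on the client this way, and total_rows would still count the expired items.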
Comment by Deepkaran Salooja [ 31/Oct/12 ]
Another user on the forum is facing the same issue:
http://www.couchbase.com/forums/thread/expired-items-still-appear-document-list
Comment by Deepkaran Salooja [ 31/Oct/12 ]
Copying full email conversation with Mike on this.

On Oct 29, 2012, at 11:56 AM, Mike Wiederhold <mike@couchbase.com> wrote:

Yes, that's what I would expect.

- Mike

On Oct 29, 2012, at 11:53 AM, Farshid Ghods wrote:


Okay, in that case the view index should be updated and a stale=false query should pick that up.


On Oct 29, 2012, at 11:46 AM, Mike Wiederhold <mike@couchbase.com> wrote:


Yes expiration time is updated on disk by ep-engine.

- Mike

On Oct 29, 2012, at 11:44 AM, Farshid Ghods wrote:

Mike,

Does ep-engine update the expiration time for this key on disk?
If so, then the view engine can skip this item when generating the results, I guess.

-Farshid

On Oct 29, 2012, at 11:41 AM, Mike Wiederhold <mike@couchbase.com> wrote:


This does look like an issue, but not on the ep-engine side. Since ep-engine might take an hour to actually remove an expired item, it should be up to the view engine to filter out any expired items too. The reason why doing a get will cause the item to disappear from the view results is that ep-engine will actually do the deletion.

- Mike
Comment by Farshid Ghods (Inactive) [ 31/Oct/12 ]
One thing I am not quite sure about in ep-engine's behavior is whether the disk is updated before the expiry pager runs.

Yaseen,
can we confirm this with the ep-engine folks, or ideally find out whether there is a spec describing how ep-engine handles expirations before/after the expiry pager runs?
Comment by Farshid Ghods (Inactive) [ 31/Oct/12 ]
Dipti/Yaseen,

This is a change that can be addressed post-2.0 instead of changing the view engine or ep-engine at this stage. I agree that it is confusing for users, so maybe we can simply change how often the expiry pager runs if they want faster updates?
Comment by Farshid Ghods (Inactive) [ 31/Oct/12 ]
Dipti/Yaseen

Filipe has also explained the options in more detail here, including why skipping expired results at reduction time or in the view results won't work, so this bug is basically a duplicate.

http://www.couchbase.com/issues/browse/MB-6219?focusedCommentId=36489&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-36489
Comment by Steve Yen [ 01/Nov/12 ]
options...

* idea from Damien (might repeat something Filipe had on MB-6219?): track expiration time in secondary index.

* run expiry pager more often - this scans the entire hashtable, locking one hashtable partition at a time. This only narrows the window in which the issue occurs; it doesn't fundamentally solve it. Ask some customers to set their expiry pager to run more often (e.g., once every 10 minutes for a small/medium cluster).
Comment by Steve Yen [ 02/Nov/12 ]
For 2.0, options discussed in bug-scrub...

Recommend that impacted customers run the expiry pager more often, e.g., once every 10 minutes for small/medium clusters. This can help mitigate (but not fully solve) the inconsistency. Running the expiry pager more often might also have a disk I/O impact, since it actually writes deletes to disk. Users should also be aware of the trade-off between the expiry pager and all the other dispatcher tasks.

Frank, Dipti, Yaseen to discuss today (2012/11/02) outside of bug-scrub mtg and resolve.
Comment by Filipe Manana [ 05/Nov/12 ]
Perhaps, I was not fully clear before.

Tracking the expiration times in the indexes (values at the leaf nodes) would solve the issue for map view queries. True.
However, we already have plenty of metadata tracked in the indexes (vbucket id for each value, 1024 bits/128 bytes bitmasks), which makes us lose some performance (deeper trees, smaller branching factor per tree node). At the moment, querying Apache CouchDB is faster than Couchbase Server (single node of course, to be fair).

Now, for reduce views... Excluding values that were contributed by now-expired documents means going down to all the leaf nodes to find out which values come from expired documents and which come from non-expired documents, then grabbing all the values from non-expired documents, applying the reduce function to those values, and going back up the tree applying re-reduces until reaching the root. In other words, we would be doing no better than a linear scan, defeating the whole purpose of the intermediary reductions and the efficiency for which CouchDB trees are known.
Basically, doing this would, at the very best, give query response times of a few seconds (being very optimistic here) for any reasonably sized index (perhaps even one with fewer than 1M items).
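
The cost difference described above can be illustrated with a toy two-level tree in Python. This is a sketch only, not the CouchDB or Couchbase B-tree code: the helper names (build_tree, reduce_fast, reduce_skipping_expired) and the dict layout are hypothetical, the reduce function is assumed to be a plain sum, and each leaf value carries an is_expired flag purely for illustration.

```python
def build_tree(leaves):
    """Build a toy 2-level tree from lists of (value, is_expired) pairs.

    Each node stores a precomputed reduction (here a sum) of its values,
    and the root stores the re-reduction of its children.
    """
    nodes = [{"values": leaf, "reduction": sum(v for v, _ in leaf)}
             for leaf in leaves]
    return {"children": nodes,
            "reduction": sum(n["reduction"] for n in nodes)}


def reduce_fast(tree):
    # Full-range reduce: the answer is already stored at the root,
    # so no individual leaf value has to be read.
    return tree["reduction"], 0


def reduce_skipping_expired(tree):
    # Excluding expired values invalidates the stored reductions,
    # so every leaf value has to be visited: a linear scan.
    total, values_read = 0, 0
    for node in tree["children"]:
        for value, is_expired in node["values"]:
            values_read += 1
            if not is_expired:
                total += value
    return total, values_read


tree = build_tree([[(1, False), (2, True)], [(3, False), (4, False)]])
print(reduce_fast(tree))              # (10, 0)
print(reduce_skipping_expired(tree))  # (8, 4)
```

The second number in each result is how many leaf values were read: zero when the stored reduction can be used, and every value in the index when expired entries must be excluded.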

Comment by MC Brown (Inactive) [ 21/Nov/12 ]
The documentation has been updated at multiple points to highlight the inclusion of documents that may have expired, but not been fully deleted yet.
Comment by kzeller [ 06/Dec/12 ]
Added to Release Notes as:

Couchbase Server does lazy expiration, that is, expired items are flagged as
deleted rather than being immediately erased. Couchbase Server has
a maintenance process that will periodically look through all information and erase expired items.
This means expired items may still be indexed and appear in result sets of views. The workarounds
are available here <ulink url="http://www.couchbase.com/docs/couchbase-devguide-2.0/about-ttl-values.html">
About Document Expiration</ulink>.

Generated at Sun Sep 21 12:20:39 CDT 2014 using JIRA 5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9.