<!-- 
RSS generated by JIRA (5.2.4#845-sha1:c9f4cc41abe72fb236945343a1f485c2c844dac9) at Wed May 22 04:28:52 CDT 2013

It is possible to restrict the fields that are returned in this document by specifying the 'field' parameter in your request.
For example, to request only the issue key and summary add field=key&field=summary to the URL of your request.
For example:
http://www.couchbase.com/issues/si/jira.issueviews:issue-xml/MB-6219/MB-6219.xml?field=key&field=summary
-->
<rss version="0.92" >
<channel>
    <title>Couchbase</title>
    <link>http://www.couchbase.com/issues</link>
    <description>This file is an XML representation of an issue</description>
    <language>en-us</language>
    <build-info>
        <version>5.2.4</version>
        <build-number>845</build-number>
        <build-date>26-12-2012</build-date>
    </build-info>

<item>
            <title>[MB-6219] items are not marked as deleted/expired in couchstore after they expire (View query results with stale=false include expired items)</title>
                <link>http://www.couchbase.com/issues/browse/MB-6219</link>
                <project id="10010" key="MB">Couchbase Server</project>
                        <description>&lt;br/&gt;
View query results with stale=false include expired items.&lt;br/&gt;
&lt;br/&gt;
Steps to reproduce (build#1580):&lt;br/&gt;
1. Create default bucket.&lt;br/&gt;
2. Load 10 JSON docs with expiry set to 30 seconds.&lt;br/&gt;
3. Create a view (default map func) and query with stale=false.&lt;br/&gt;
4. Wait for 2-3 minutes.&lt;br/&gt;
5. Query view again with stale=false.&lt;br/&gt;
&lt;br/&gt;
Some of the items are still returned in the query results even when the index is rebuilt.&lt;br/&gt;
I observed that the number of rows returned by the view query is always the same as curr_items.&lt;br/&gt;
&lt;br/&gt;
Diagnostics are attached. &lt;br/&gt;
&lt;br/&gt;
</description>
                <environment>build#1580 on Ubuntu 64bit</environment>
            <key id="19013">MB-6219</key>
            <summary>items are not marked as deleted/expired in couchstore after they expire (View query results with stale=false include expired items)</summary>
                <type id="1" iconUrl="http://www.couchbase.com/issues/images/icons/issuetypes/bug.png">Bug</type>
                                <priority id="3" iconUrl="http://www.couchbase.com/issues/images/icons/priorities/major.png">Major</priority>
                    <status id="6" iconUrl="http://www.couchbase.com/issues/images/icons/statuses/closed.png">Closed</status>
                    <resolution id="2">Won&apos;t Fix</resolution>
                    <security id="10011">Public</security>
                        <assignee username="peter">Peter Wansch</assignee>
                                <reporter username="deepkaran.salooja">Deepkaran Salooja</reporter>
                        <labels>
                    </labels>
                <created>Tue, 14 Aug 2012 10:32:01 -0500</created>
                <updated>Wed, 26 Sep 2012 19:37:32 -0500</updated>
                    <resolved>Wed, 5 Sep 2012 12:21:45 -0500</resolved>
                            <version>recent-builds-2.0</version>
                                <fixVersion>2.0</fixVersion>
                                <component>couchbase-bucket</component>
                <component>view-engine</component>
                                <votes>0</votes>
                        <watches>4</watches>
                                                    <comments>
                    <comment id="35408" author="FilipeManana" created="Tue, 14 Aug 2012 10:35:05 -0500"  >That&amp;#39;s not unexpected.&lt;br/&gt;
Items are lazily expired by ep-engine, meaning that it will not perform document deletes in the database after the 30 seconds.&lt;br/&gt;
&lt;br/&gt;
There&amp;#39;s no way to control that or know that from the view-engine.</comment>
                    <comment id="35409" author="farshid" created="Tue, 14 Aug 2012 10:55:21 -0500"  >Seems like something could be modified in ep-engine so that when items expire we don&amp;#39;t see them in views anymore.</comment>
                    <comment id="35412" author="peter" created="Tue, 14 Aug 2012 11:16:12 -0500"  >Chiyoung, is this something Jin or Mike can help out with if it&amp;#39;s in ep_engine? If not, it may need to be passed to Aaron. Thank you.</comment>
                    <comment id="35414" author="FilipeManana" created="Tue, 14 Aug 2012 11:24:52 -0500"  >This was discussed internally a few times, but I don&amp;#39;t think any decision was made.&lt;br/&gt;
&lt;br/&gt;
Mike gave some info in the forum to a user about this:&lt;br/&gt;
&lt;br/&gt;
&lt;a href=&quot;http://www.couchbase.com/forums/thread/expiration-time-docs-dp4&quot;&gt;http://www.couchbase.com/forums/thread/expiration-time-docs-dp4&lt;/a&gt;&lt;br/&gt;
&lt;br/&gt;
</comment>
                    <comment id="35419" author="chiyoung" created="Tue, 14 Aug 2012 12:08:44 -0500"  >Neither the item pager nor the expiry pager had been scheduled yet to clear the expired items from the in-memory hashtable and disk. That&amp;#39;s why you still see those expired items in the view query.&lt;br/&gt;
&lt;br/&gt;
The item pager will be scheduled if the current memory usage is above the high water mark. The expiry pager is scheduled once every hour by default, but you can change the expiry pager&amp;#39;s interval to a shorter period (e.g., 5 minutes) at runtime.</comment>
                    <comment id="35426" author="farshid" created="Tue, 14 Aug 2012 12:27:45 -0500"  >Dipti,&lt;br/&gt;
&lt;br/&gt;
This means that users may see expired items in the index for up to an hour, which is the default interval for the expiry pager.</comment>
                    <comment id="35427" author="dipti" created="Tue, 14 Aug 2012 12:48:32 -0500"  >Peter, as discussed this is something we should be able to do at query time. We do need to fix this for 2.0. Can you please help understand the options with the view engine team? &lt;br/&gt;
Let me know if you need additional feedback from me.</comment>
                    <comment id="35433" author="FilipeManana" created="Tue, 14 Aug 2012 14:34:23 -0500"  >There&amp;#39;s no efficient way to do this in the view engine. It would mean scanning all documents in every vbucket for each stale=false request and checking whether they have expired, not to mention other smaller issues.</comment>
                    <comment id="35516" author="peter" created="Wed, 15 Aug 2012 10:17:41 -0500"  >Deep, can you confirm that after an hour, once the expiry pager has run and the next time the indexes are updated, they disappear from the view? If so, then we don&amp;#39;t have a bug. There is still a valid discussion going on about how the situation around queries can be improved but I want to find out if things are working as designed for now.</comment>
                    <comment id="35628" author="deepkaran.salooja" created="Thu, 16 Aug 2012 10:12:39 -0500"  >Yes, that&amp;#39;s correct. Once the expiry pager has run and indexes have been updated, the queries do not return the expired items.</comment>
                    <comment id="36487" author="perry" created="Wed, 22 Aug 2012 06:48:08 -0500"  >Just a thought as I came across this bug. What if, for each query result, the query engine contacted memcached to see if each doc was still valid before including it in the query response? That way, the view engine wouldn&amp;#39;t have to keep track of all documents in all vbuckets, only the ones that it is sending out. This would not only take care of expiration (since memcached would return &amp;quot;not_found&amp;quot;) but also deleted documents that have not yet been removed from disk. Rather than doing a &amp;#39;get&amp;#39; (which would fetch it from disk in DGM), we could use the &amp;quot;stats key&amp;quot; operation to just check whether the key is still valid within memcached. Since there would be some (albeit small) overhead on the query response, this could be an optional check?&lt;br/&gt;
&lt;br/&gt;
The rows would eventually get cleaned from the index; this just prevents the client from getting a massive amount of already expired items at the 59-minute mark, before the hourly process is run.</comment>
                    <comment id="36489" author="FilipeManana" created="Wed, 22 Aug 2012 07:04:12 -0500"  >Thanks for the suggestion Perry.&lt;br/&gt;
Unfortunately it wouldn&amp;#39;t work for several reasons.&lt;br/&gt;
&lt;br/&gt;
First, the view-engine has no way to communicate with memcached currently.&lt;br/&gt;
&lt;br/&gt;
Second, it would slow things down significantly.&lt;br/&gt;
&lt;br/&gt;
Third, how could that work for reduces? For precomputed reduce values, which is the strength of couchdb&amp;#39;s btrees + mapreduce, how do you &amp;quot;unreduce&amp;quot;, exclude values produced for expired documents, and re-compute reductions? Not only would you need to know the map values produced by expired documents, you would also need to know the map values for the non-expired documents. Not to mention the big performance penalty here.&lt;br/&gt;
&lt;br/&gt;
There are a lot of other technical issues that would impact either correctness, the incremental view update approach or performance. The 3 listed above are just the ones that people not familiar with the implementation/design would grasp quickly.</comment>
                    <comment id="37684" author="peter" created="Wed, 5 Sep 2012 12:21:45 -0500"  >Filipe explained in the comments how this works. To speed up deletion from the indexes, the expiry pager interval can be shortened, which may have an adverse effect on performance.</comment>
                </comments>
                    <attachments>
                    <attachment id="14335" name="10.1.3.73-8091-diag.txt.gz" size="135734" author="deepkaran.salooja" created="Tue, 14 Aug 2012 10:32:01 -0500" />
                </attachments>
            <subtasks>
        </subtasks>
                <customfields>
                                                                        <customfield id="customfield_10180" key="com.atlassian.jira.ext.charting:firstresponsedate">
                <customfieldname>Date of First Response</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>Tue, 14 Aug 2012 10:35:05 -0500</customfieldvalue>

                </customfieldvalues>
            </customfield>
                                                                                                                                                                                                            <customfield id="customfield_10081" key="com.pyxis.greenhopper.jira:gh-global-rank">
                <customfieldname>Rank</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>4362</customfieldvalue>
                </customfieldvalues>
            </customfield>
                                                                                                                            <customfield id="customfield_10050" key="com.atlassian.jira.plugin.system.customfieldtypes:float">
                <customfieldname>Sprint Priority</customfieldname>
                <customfieldvalues>
                    <customfieldvalue>5.0</customfieldvalue>
                </customfieldvalues>
            </customfield>
                                                                                    <customfield id="customfield_10181" key="com.atlassian.jira.ext.charting:timeinstatus">
                <customfieldname>Time In Status</customfieldname>
                <customfieldvalues>
                    
                </customfieldvalues>
            </customfield>
                                                </customfields>
    </item>
</channel>
</rss>