Be aware that Couchbase Server does lazy expiration; that is, expired items are flagged as deleted rather than being immediately erased. Couchbase Server has a maintenance process, called the expiry pager, that periodically looks through all information and erases expired items. By default this maintenance process runs every 60 minutes, but it can be configured to run at a different interval. Couchbase Server will also immediately remove an item flagged for deletion the next time the item is requested; the server will respond to the requesting process that the item does not exist.
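The behavior described above can be sketched with a toy in-memory model. This is an illustration only, not Couchbase Server's implementation: the class `LazyExpiryStore`, its method names, and the injectable `clock` are all hypothetical and exist just to show removal on access versus removal by a periodic sweep.

```python
import time


class LazyExpiryStore:
    """Toy model of lazy expiration (hypothetical, not Couchbase's code)."""

    def __init__(self, clock=time.time):
        self._items = {}      # key -> (value, expires_at or None)
        self._clock = clock   # injectable so the example can fake time

    def set(self, key, value, ttl=None):
        # ttl is in seconds; None means the item never expires.
        expires_at = self._clock() + ttl if ttl is not None else None
        self._items[key] = (value, expires_at)

    def _is_expired(self, key):
        _, expires_at = self._items[key]
        return expires_at is not None and expires_at <= self._clock()

    def get(self, key):
        # An expired item is erased when requested and reported as missing.
        if key not in self._items:
            return None
        if self._is_expired(key):
            del self._items[key]
            return None
        return self._items[key][0]

    def run_expiry_pager(self):
        # Periodic sweep: erase every expired item, return how many were removed.
        expired = [k for k in self._items if self._is_expired(k)]
        for k in expired:
            del self._items[k]
        return len(expired)

    def stored_keys(self):
        # Everything still stored, including expired items not yet erased.
        return sorted(self._items)
```

In this model, an expired key stays in `stored_keys()` until either `get()` is called for it or `run_expiry_pager()` runs, which is the window in which an index built from stored data can still see it.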
The result set from a view will contain any items stored on disk that meet the requirements of your view function. Therefore, information that has expired but has not yet been removed from disk may appear as part of a result set when you query a view.
Using Couchbase views, you can also apply reduce functions to data to perform calculations or other aggregations. For instance, if you want to count the instances of a type of object, you would use a reduce function. Once again, if an item is on disk, it will be included in any calculation performed by your reduce functions. Because of this behavior, which is due to disk persistence, here are guidelines on handling expiration with views:
Detecting Expired Documents in Result Sets: If you are using views for indexing items from Couchbase Server, items that have not yet been removed as part of the expiry pager maintenance process will be part of a result set returned by querying the view. To detect these items so that you can exclude them from a result set, use the query parameter include_docs set to true. This parameter typically includes all JSON documents associated with the keys in a result set. For example, if you use the parameter include_docs=true, Couchbase Server will return a result set with an additional "doc" object which contains the JSON or binary data for that key:
{"total_rows":2,"rows":[ {"id":"test","key":"test","value":null,"doc":{"meta":{"id":"test","rev":"4-0000003f04e86b040000000000000000","expiration":0,"flags":0},"json":{"testkey":"testvalue"}}}, {"id":"test2","key":"test2","value":null,"doc":{"meta":{"id":"test2","rev":"3-0000004134bd596f50bce37d00000000","expiration":1354556285,"flags":0},"json":{"testkey":"testvalue"}}} ] }
For expired documents, if you set include_docs=true, Couchbase Server will return a result set indicating that the document no longer exists. Specifically, a key that has expired but has not yet been removed by the cleanup process will appear in the result set as a row where "doc":null:
{"total_rows":2,"rows":[ {"id":"test","key":"test","value":null,"doc":{"meta":{"id":"test","rev":"4-0000003f04e86b040000000000000000","expiration":0,"flags":0},"json":{"testkey":"testvalue"}}}, {"id":"test2","key":"test2","value":null,"doc":null} ] }
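A client can use this marker to drop expired rows after the query returns. The following is a minimal sketch, assuming the response body is already available as text; the function name `split_expired_rows` is hypothetical and the sample is the response shown above.

```python
import json


def split_expired_rows(view_response_text):
    """Split view rows into (live, expired) using the "doc": null marker."""
    response = json.loads(view_response_text)
    live, expired = [], []
    for row in response.get("rows", []):
        # Only rows that carry an explicit "doc": null count as expired.
        # Rows without a "doc" field (include_docs not set) are left as live.
        if "doc" in row and row["doc"] is None:
            expired.append(row)
        else:
            live.append(row)
    return live, expired


SAMPLE = '''{"total_rows":2,"rows":[
 {"id":"test","key":"test","value":null,"doc":{"meta":{"id":"test",
  "rev":"4-0000003f04e86b040000000000000000","expiration":0,"flags":0},
  "json":{"testkey":"testvalue"}}},
 {"id":"test2","key":"test2","value":null,"doc":null} ] }'''

live, expired = split_expired_rows(SAMPLE)
print([row["id"] for row in live])     # ['test']
print([row["id"] for row in expired])  # ['test2']
```

Note that total_rows in the response still counts the expired row, so a client that needs an accurate count should use the length of the filtered list instead.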
Reduces and Expired Documents: In some cases, you may want to use a reduce function to perform aggregations and calculations on data in Couchbase Server 2.0. In this case, Couchbase Server takes pre-calculated values which are stored for an index and derives a final result. This also means that any expired items still on disk will be part of the reduction. This may not be an issue for your final result if the number of expired items is low in proportion to the other items. For instance, if you have 10 expired scores still on disk for an average performed over 1 million players, there may be only a minimal difference in the final result. However, if you have 10 expired scores on disk for an average performed over 20 players, you would get a very different result from the average you would expect.
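The effect of the ratio can be seen with a small calculation. The numbers below are invented for illustration (every live player scores 100 and every expired entry scores 0), and the helper `average_with_expired` is hypothetical; it simply compares the average over live scores with the average that also counts expired scores still on disk.

```python
def average(scores):
    return sum(scores) / len(scores)


def average_with_expired(live_scores, expired_scores):
    """Return (expected, actual): the average over live scores only, and the
    average when expired scores still on disk are included as well."""
    expected = average(live_scores)
    actual = average(live_scores + expired_scores)
    return expected, actual


expired = [0] * 10  # 10 expired scores still on disk (made-up values)

# 1 million live players: the expired scores barely move the average.
expected, actual = average_with_expired([100] * 1_000_000, expired)
print(expected, round(actual, 3))  # 100.0 99.999

# 20 live players: the same 10 expired scores change the result drastically.
expected, actual = average_with_expired([100] * 20, expired)
print(expected, round(actual, 2))  # 100.0 66.67
```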
In this case, you may want to run the expiry pager process more frequently to ensure that items that have expired are not included in calculations used in the reduce function. We recommend an interval of 10 minutes for the expiry pager on each node of a cluster. Do note that this interval will have some slight impact on node performance as it will be performing cleanup more frequently on the node.
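As a rough sketch of what the 10-minute setting looks like, the helper below converts the interval to seconds and builds the command line without running it. The `cbepctl ... set flush_param exp_pager_stime` form, the default `localhost:11210` address, and the helper name `expiry_pager_command` are assumptions here, so check the exact syntax against the manual for your server version before using it.

```python
def expiry_pager_command(interval_minutes, host="localhost", port=11210,
                         tool="cbepctl"):
    """Build (but do not run) a command that sets exp_pager_stime.

    Assumption: the tool takes the interval in seconds, in the form
    `cbepctl host:port set flush_param exp_pager_stime <seconds>`.
    """
    if interval_minutes <= 0:
        raise ValueError("interval must be positive")
    interval_seconds = int(interval_minutes * 60)
    return [tool, f"{host}:{port}", "set", "flush_param",
            "exp_pager_stime", str(interval_seconds)]


print(" ".join(expiry_pager_command(10)))
# cbepctl localhost:11210 set flush_param exp_pager_stime 600
```

Because the interval is configured per node, a command like this would need to be issued against each node of the cluster.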
For more information about setting intervals for the maintenance process, refer to the Couchbase Server Manual 2.0, Specifying Disk Cleanup Interval, and to the command line tool examples for exp_pager_stime. For more information about views and view query parameters, see Finding Data with Views.