D.4. Wrong documents or rows when querying with include_docs=true

Imagine you have the following design document:

{
     "meta": {"id": "_design/test"},
     "views":
     {
         "view1": {
             "map": "function(doc, meta) { emit(meta.id,  doc.value); }"
         }
     }
 }

Suppose the bucket holds only two documents: doc1 with JSON value {"value": 1} and doc2 with JSON value {"value": 2}. You first query the view with stale=false and include_docs=true and get:

shell> curl -s 'http://localhost:9500/default/_design/test/_view/view1?include_docs=true&stale=false' | json_xs
 {
    "total_rows" : 2,
    "rows" : [
       {
          "value" : 1,
          "doc" : {
             "json" : {
                "value" : 1
             },
             "meta" : {
                "flags" : 0,
                "expiration" : 0,
                "rev" : "1-000000367916708a0000000000000000",
                "id" : "doc1"
             }
          },
          "id" : "doc1",
          "key" : "doc1"
       },
       {
          "value" : 2,
          "doc" : {
             "json" : {
                "value" : 2
             },
             "meta" : {
                "flags" : 0,
                "expiration" : 0,
                "rev" : "1-00000037b8a32e420000000000000000",
                "id" : "doc2"
             }
          },
          "id" : "doc2",
          "key" : "doc2"
       }
    ]
 }

Later on you update both documents, such that document doc1 has the JSON value {"value": 111111} and document doc2 has the JSON value {"value": 222222}. You then query the view with stale=update_after (default) or stale=ok and get:

shell> curl -s 'http://localhost:9500/default/_design/test/_view/view1?include_docs=true' | json_xs
 {
    "total_rows" : 2,
    "rows" : [
       {
          "value" : 1,
          "doc" : {
             "json" : {
                "value" : 111111
             },
             "meta" : {
                "flags" : 0,
                "expiration" : 0,
                "rev" : "2-0000006657aeed6e0000000000000000",
                "id" : "doc1"
             }
          },
          "id" : "doc1",
          "key" : "doc1"
       },
       {
          "value" : 2,
          "doc" : {
             "json" : {
                "value" : 222222
             },
             "meta" : {
                "flags" : 0,
                "expiration" : 0,
                "rev" : "2-00000067e3ee42620000000000000000",
                "id" : "doc2"
             }
          },
          "id" : "doc2",
          "key" : "doc2"
       }
    ]
 }

The document included in each row no longer matches that row's value field: the included documents are the latest (updated) versions, but the index row values still reflect the previous (first) versions of the documents.
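A client can spot this mismatch by comparing each row's value with the document included alongside it. The sketch below is only an illustration: findStaleRows is a hypothetical helper, not part of any Couchbase SDK, and it assumes the view emits doc.value as the row value (as view1 does) and that the response has the shape shown above.

```javascript
// Hypothetical helper (not a Couchbase API): returns the ids of rows whose
// indexed value differs from the value in the included document.
// Assumes the map function emits doc.value as the row value, as view1 does.
function findStaleRows(response) {
  return response.rows
    .filter((row) => row.doc && row.doc.json.value !== row.value)
    .map((row) => row.id);
}

// Trimmed copy of the second response above (meta fields shortened).
const response = {
  total_rows: 2,
  rows: [
    { id: "doc1", key: "doc1", value: 1,
      doc: { json: { value: 111111 }, meta: { id: "doc1" } } },
    { id: "doc2", key: "doc2", value: 2,
      doc: { json: { value: 222222 }, meta: { id: "doc2" } } },
  ],
};

console.log(findStaleRows(response)); // [ 'doc1', 'doc2' ]
```

A check like this only detects the inconsistency after the fact; it does not recover the document revision the index row was built from.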

Why does this happen? With include_docs=true, the latest revision of each document is fetched from disk at query time, once for each row. There is no way to include a previous revision of a document. Previous revisions are not accessible through the latest MVCC snapshots of the vbucket databases (http://en.wikipedia.org/wiki/Multiversion_concurrency_control), and there is no efficient way to find which earlier MVCC snapshot of a vbucket database holds a specific revision of a document. Furthermore, vbucket database compaction removes all previous MVCC snapshots (document revisions). In short, this is a deliberate design limitation of the database engine.
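The effect can be reproduced with a toy in-memory model. This is not how Couchbase is implemented: the Map standing in for the bucket and the buildIndex and queryWithDocs functions are all assumptions made for illustration. The point it shows is that index rows are fixed when the index is built, while included documents are read when the query runs.

```javascript
// Toy model only: a Map stands in for the bucket, an array for the view index.
const store = new Map([
  ["doc1", { value: 1 }],
  ["doc2", { value: 2 }],
]);

// Building the index copies the emitted key and value at that moment.
function buildIndex(bucket) {
  return [...bucket].map(([id, doc]) => ({ id, key: id, value: doc.value }));
}

// include_docs-style query: rows come from the (possibly stale) index,
// documents are read from the live store at query time.
function queryWithDocs(index, bucket) {
  return index.map((row) => ({ ...row, doc: bucket.get(row.id) }));
}

const index = buildIndex(store);

// Both documents are updated after the index was built.
store.set("doc1", { value: 111111 });
store.set("doc2", { value: 222222 });

const rows = queryWithDocs(index, store);
console.log(rows[0].value, rows[0].doc.value); // 1 111111
```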

The only way to ensure full consistency here is to include the documents themselves in the values emitted by the map function. Queries with stale=false are not 100% reliable either: just after the index is updated, while rows are being streamed from disk to the client, document updates and deletes can still happen, resulting in the same behaviour as in the example above.
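A map function that emits the whole document as the value might look like the sketch below. The emit function and the rows array here are local stand-ins so the snippet runs outside the server; in a real design document only the body of map would be stored, as a string, and emit is provided by the view engine.

```javascript
// Stand-in for the emit() that the view engine provides to map functions.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Variant of view1's map function: the value is the whole document,
// so each row carries the exact revision that was indexed.
function map(doc, meta) {
  emit(meta.id, doc);
}

map({ value: 1 }, { id: "doc1" });
map({ value: 2 }, { id: "doc2" });

console.log(JSON.stringify(rows));
// [{"key":"doc1","value":{"value":1}},{"key":"doc2","value":{"value":2}}]
```

With a view like this, the query is run without include_docs=true and the document is read from each row's value. The trade-off is a larger index, since every document body is stored in it a second time.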