{"id":5777,"date":"2018-09-10T04:27:56","date_gmt":"2018-09-10T11:27:56","guid":{"rendered":"http:\/\/www.couchbase.com\/blog\/?p=5777"},"modified":"2025-06-13T20:19:57","modified_gmt":"2025-06-14T03:19:57","slug":"ycsb-json-benchmarking-json-databases-by-extending-ycsb","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/","title":{"rendered":"Using YCSB to Benchmark JSON Databases"},"content":{"rendered":"<p><a href=\"https:\/\/queue.acm.org\/detail.cfm?id=1036486\">Bruce\u00a0Lindsay<\/a>\u00a0<a href=\"https:\/\/sigmod.org\/publications\/interviews\/pdf\/p71-column-winslet.pdf\">once said<\/a>, &#8220;There are three things important in the database world: Performance, Performance, and Performance&#8221;.\u00a0 Most enterprise architects know, as we progress in database features and architectures, it&#8217;s important to measure performance in an open way so they can compare total cost of ownership reliably.<\/p>\n<p><a href=\"https:\/\/www2.cs.duke.edu\/courses\/fall13\/cps296.4\/838-CloudPapers\/ycsb.pdf\">YCSB<\/a> did a great job of benchmarking datastores serving the &#8220;Cloud OLTP&#8221; applications. 
These data stores were simple, with basic get, put, and delete operations.\u00a0 The original <a href=\"https:\/\/github.com\/brianfrankcooper\/YCSB\">YCSB benchmark<\/a> consists of simple insert, update, delete, and scan operations on a simple document of 10 key-value pairs; workloads are defined as a mix of these operations in various percentages.<\/p>\n<p><a href=\"https:\/\/www.json.org\">JSON<\/a> databases like <a href=\"https:\/\/www.couchbase.com\">Couchbase<\/a> and <a href=\"https:\/\/www.mongodb.com\">MongoDB<\/a> have a more advanced data model with scalars, nested objects, arrays, arrays of arrays, and arrays of objects.\u00a0 JSON databases also have more sophisticated <a href=\"https:\/\/docs.couchbase.com\/server\/5.5\/n1ql\/n1ql-language-reference\/index.html\">query<\/a> languages, indexes, and capabilities. In addition to CRUD operations, applications routinely use the declarative query languages in these databases to search, paginate, and run reports.\u00a0 So, to help architects evaluate platforms effectively, we need an additional benchmark that measures these capabilities alongside the basic CRUD operations. <span style=\"font-weight: 400\">This tutorial explains how YCSB-JSON fills that gap.<\/span><\/p>\n<blockquote><p>The <a href=\"https:\/\/www.cs.duke.edu\/courses\/fall13\/cps296.4\/838-CloudPapers\/ycsb.pdf\">YCSB paper<\/a> states: We also hope to foster the development of additional cloud benchmark suites that represent other classes of applications by making our benchmark tool available via open source. 
In this regard, a key feature of the YCSB framework\/tool is that it is extensible\u2014it supports easy definition of new workloads, in addition to making it easy to benchmark new systems.<\/p><\/blockquote>\n<p>This benchmark extends YCSB to JSON databases by extending existing operations to JSON and then defining new operations and new workloads.<\/p>\n<p><strong><span style=\"text-decoration: underline;color: #0000ff\">Here&#8217;s the outline.<\/span><\/strong><\/p>\n<ol>\n<li><span style=\"color: #0000ff\">Introduction<\/span><\/li>\n<li><span style=\"color: #0000ff\">Data Model<\/span><\/li>\n<li><span style=\"color: #0000ff\">Benchmark Operations<\/span><\/li>\n<li><span style=\"color: #0000ff\">Benchmark Workloads<\/span><\/li>\n<li><span style=\"color: #0000ff\">YCSB-JSON implementation<\/span><\/li>\n<li><span style=\"color: #0000ff\">How to run YCSB-JSON?<\/span><\/li>\n<li><span style=\"color: #0000ff\">References<\/span><\/li>\n<\/ol>\n<h5><strong><span style=\"color: #0000ff\">1. Introduction<\/span><\/strong><\/h5>\n<p><span style=\"font-weight: 400\">YCSB was developed to measure the performance of scalable NoSQL key-value datastores. The YCSB infrastructure does that job well.\u00a0 YCSB uses a simple, flat key-value model. Couchbase uses a JSON model, which customers use to build massively interactive applications.\u00a0 We\u2019ve built, and continue to build, features into the product to enable customers to build these applications effectively. We need performance measurements for these use cases.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Other databases also support the JSON model: MongoDB, DocumentDB, DynamoDB, RethinkDB, and Oracle NoSQL.\u00a0 When running YCSB on JSON databases (Couchbase, MongoDB, etc.), the driver simply stores and retrieves strings in the JSON key-value structure. 
All of these databases require a new benchmark to measure the processing of JSON\u2019s rich structure (nested objects, arrays) and operations like paging, grouping, and aggregation.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The purpose of YCSB-JSON is to extend the YCSB benchmark to measure JSON database capabilities in two ways: <\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Define operations representative of massively interactive applications.<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Operations on the JSON data model, including nested objects and arrays.<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Create workloads that represent operations from these applications.<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">See these customer use cases:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><a href=\"https:\/\/www.couchbase.com\/customers\/marriott\/\">Marriott<\/a> <\/span><span style=\"font-weight: 400\">built its reservation system on an IBM mainframe and DB2. They ran into cost and performance challenges as more and more customers tried to browse the available inventory.\u00a0 The DB2 system was originally built to take reservations from a phone-in system or from agents, where the look-to-book ratio was low. Today, this ratio is high since the number of lookup requests has gone up exponentially.\u00a0 This has increased the database cost dramatically as well.\u00a0 Marriott moved all of its inventory data to Couchbase with continuous synchronization from its mainframe systems; web applications use Couchbase for the lookup\/search operations.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><a href=\"https:\/\/www.couchbase.com\/customers\/\">Cars.com<\/a> is a portal to list and sell cars. 
They have the listing data on Oracle.\u00a0 When they serve it up on the web, they not only have to present the basic car information but also provide additional insights like how many users are looking into a car or have saved it in their wish list. This is a way of increasing the engagement and sense of urgency.\u00a0 All the data required for these interactive operations are stored in Couchbase.<\/span><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">More generally, the massively interactive applications include the following:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Browse rooms availability, pricing details, amenities <\/span><i><span style=\"font-weight: 400\">(lookups by end customers) <\/span><\/i><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Browse information on car make\/model or repair shops <\/span><i><span style=\"font-weight: 400\">(enable web-scale consumers &amp; partners)<\/span><\/i><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Provide information to the customer in context \u00a0<\/span><i><span style=\"font-weight: 400\">(location-based services)<\/span><\/i><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Serve both Master Data and Transactional Data <\/span><i><span style=\"font-weight: 400\">(at scale)<\/span><\/i><\/li>\n<\/ol>\n<p><span style=\"font-weight: 400\">To support these requirements, the applications &amp; databases do the following:<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Query offload from high-cost Systems of Record (mainframe, Oracle) databases <\/span>\n<ul>\n<li style=\"font-weight: 400\"><i><span style=\"font-weight: 400\">(reservations &amp; revenue apps)<\/span><\/i><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Opening up back-office functions \u00a0to web \/ mobile access <\/span>\n<ul>\n<li style=\"font-weight: 
400\"><i><span style=\"font-weight: 400\">(enable web users to check room details)<\/span><\/i><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Scale database\/queries with better TCO \u00a0<\/span>\n<ul>\n<li style=\"font-weight: 400\"><i><span style=\"font-weight: 400\">(scale mainframes with commodity servers) <\/span><\/i><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Modernize legacy systems with capabilities demanded by new collaboration\/engagement applications <\/span>\n<ul>\n<li style=\"font-weight: 400\"><i><span style=\"font-weight: 400\">(browse inventory, flight, room availability, departmental analysis)<\/span><\/i><\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<p>The new benchmark needs to measure the performance of queries implementing these operations.<\/p>\n<h5><strong><span style=\"color: #0000ff\">2. Data Model<\/span><\/strong><\/h5>\n<p>We&#8217;ve taken customer and orders as two distinct collections of JSON documents.\u00a0 Each order has a reference to its customer.<\/p>\n<p>Below are the sample customer and order document.\u00a0 This has been generated via the\u00a0fakeit data generator.\u00a0 This tool is available at:\u00a0<a href=\"https:\/\/github.com\/bentonam\/fakeit\">https:\/\/github.com\/bentonam\/fakeit<\/a><\/p>\n<p>See the appendix for the YAML file used to define the data model and domain.<\/p>\n<pre class=\"theme:github font-size:14 line-height:16 wrap:true show-plain:1 scroll:true expand:true whitespace-before:2 whitespace-after:2 lang:js range:1-20 decode:true\" title=\"Sample CUSTOMER document\">Sample customer document\r\nDocument Key: 100_advjson\r\n{\r\n  \"_id\": \"100_advjson\",\r\n  \"doc_id\": 100,\r\n  \"gid\": \"48a8e177-15e5-5116-95d0-41478601bbdd\",\r\n  \"first_name\": \"Stella\",\r\n  \"middle_name\": \"Jackson\",\r\n  \"last_name\": \"Toy\",\r\n  \"ballance_current\": \"$1084.94\",\r\n  \"dob\": \"2016-05-11\",\r\n  \"email\": 
\"Alysson83@yahoo.com\",\r\n  \"isActive\": true,\r\n  \"linear_score\": 31,\r\n  \"weighted_score\": 40,\r\n  \"phone_country\": \"fr\",\r\n  \"phone_by_country\": \"01 80 03 25 39\",\r\n  \"age_group\": \"child\",\r\n  \"age_by_group\": 12,\r\n  \"url_protocol\": \"http\",\r\n  \"url_site\": \"twitter\",\r\n  \"url_domain\": \"gov\",\r\n  \"url\": \"https:\/\/www.twitter.gov\/Stella\",\r\n  \"devices\": [\r\n    \"EE-245\",\r\n    \"FF-012\",\r\n    \"GG-789\",\r\n    \"HH-246\"\r\n  ],\r\n  \"linked_devices\": [\r\n    [\r\n      \"AA-038\",\r\n      \"BB-577\"\r\n    ],\r\n    [\r\n      \"OO-565\",\r\n      \"KK-448\",\r\n      \"FF-281\"\r\n    ],\r\n    [\r\n      \"BB-495\",\r\n      \"AA-374\"\r\n    ],\r\n    [\r\n      \"BB-609\",\r\n      \"VV-899\",\r\n      \"LL-675\",\r\n      \"BB-291\"\r\n    ],\r\n    [\r\n      \"CC-048\"\r\n    ]\r\n  ],\r\n  \"address\": {\r\n    \"street\": \"6392 Crona Rue Curve\",\r\n    \"city\": \"Simeonland\",\r\n    \"zip\": \"98316\",\r\n    \"country\": \"Bahrain\",\r\n    \"prev_address\": {\r\n      \"street\": \"9063 Johns Islands Divide\",\r\n      \"city\": \"South Jayme\",\r\n      \"zip\": \"34950-8194\",\r\n      \"country\": \"Bulgaria\",\r\n      \"property_current_owner\": {\r\n        \"first_name\": \"Weston\",\r\n        \"middle_name\": \"Clyde\",\r\n        \"last_name\": \"Considine\",\r\n        \"phone\": \"(665) 343-9468\"\r\n      }\r\n    }\r\n  },\r\n  \"children\": [\r\n    {\r\n      \"first_name\": \"Darrel\",\r\n      \"gender\": null,\r\n      \"age\": 10\r\n    },\r\n    {\r\n      \"first_name\": \"Shea\",\r\n      \"gender\": null,\r\n      \"age\": 6\r\n    }\r\n  ],\r\n  \"visited_places\": [\r\n    {\r\n      \"country\": \"Iran\",\r\n      \"cities\": [\r\n        \"Heidenreichshire\",\r\n        \"West Luciano\",\r\n        \"Haroldmouth\",\r\n        \"West Jakeburgh\"\r\n      ]\r\n    },\r\n    {\r\n      \"country\": \"Comoros\",\r\n      \"cities\": [\r\n        \"New 
Valliemouth\",\r\n        \"East Kaleighland\"\r\n      ]\r\n    },\r\n    {\r\n      \"country\": \"Israel\",\r\n      \"cities\": [\r\n        \"East Kali\",\r\n        \"Pabloport\"\r\n      ]\r\n    },\r\n    {\r\n      \"country\": \"French Guiana\",\r\n      \"cities\": [\r\n        \"North Zachary\",\r\n        \"Kielmouth\"\r\n      ]\r\n    }\r\n  ]\r\n}\r\n<\/pre>\n<h5><strong><span style=\"color: #0000ff\">3. Benchmark Operations<\/span><\/strong><\/h5>\n<p><span style=\"font-weight: 400\">The first five operations are the same as in standard YCSB, except that they operate on JSON documents. The rest of the operations are new.<\/span><\/p>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>Insert<\/strong>: Insert a new JSON document. <\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>Update<\/strong>: Update a JSON document by replacing the value of one scalar field. <\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>Read<\/strong>: Read a JSON document, either one randomly chosen field or all fields.<\/span><\/li>\n<li><span style=\"font-weight: 400\"><strong>Delete<\/strong>: Delete a JSON document with a given key.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>Scan<\/strong>: Scan JSON documents in order, starting at a randomly chosen record key. 
The number of records to scan is randomly chosen (LIMIT).<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>Search<\/strong><\/span>: Search JSON documents based on range predicates on 3 fields (customizable to n fields).<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>Page<\/strong><\/span>: Paginate result set of a query with predicate on a field in the document.<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">All customers in zip with randomly chosen OFFSET and LIMIT in SQL, N1QL.<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>NestScan<\/strong><\/span>: Query JSON documents based on a predicate on a 1-level nested field.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong><span style=\"color: #0000ff\">ArrayScan<\/span><\/strong>: Query JSON documents based on a predicate within the single-level array field.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong><span style=\"color: #0000ff\">ArrayDeepScan<\/span><\/strong>: Query JSON documents based on a predicate within a two-level array field (array of arrays).<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>Report<\/strong><\/span>: Query customer order details for customers in specific zipcode.<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Each customer has multiple orders.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Order document has order details.<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong><span style=\"color: #0000ff\">Report2<\/span><\/strong>: Generate sales order summary for a given day, 
grouped by zip.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>Load<\/strong><\/span>: Data loading.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>Sync<\/strong><\/span>: Data streaming and synchronization from another system.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>Aggregate<\/strong><\/span>: Do some grouping and aggregation.<\/span><\/li>\n<\/ol>\n<h5><span style=\"text-decoration: underline;color: #0000ff\"><strong>For Couchbase: Benchmark Operations implementation examples<\/strong><\/span><\/h5>\n<p>Couchbase implements YCSB in two modes, selected by the KV setting.<\/p>\n<p>KV stands for key-value. The simple YCSB operations INSERT, UPDATE, READ, and DELETE can be implemented via KV APIs instead of queries.\u00a0 Setting KV=true uses the KV API; setting KV=false uses <a href=\"https:\/\/docs.couchbase.com\/server\/5.5\/n1ql\/n1ql-language-reference\/index.html\">N1QL<\/a> (SQL for JSON) queries. See the tutorial for N1QL at <a href=\"https:\/\/query-tutorial.couchbase.com\">https:\/\/query-tutorial.couchbase.com<\/a><\/p>\n<ol>\n<li><span style=\"font-weight: 400\"><strong>Insert<\/strong>: Insert a new JSON document. <\/span><\/li>\n<\/ol>\n<pre class=\"theme:github font-size:14 line-height:16 lang:mysql decode:true\" title=\"INSERT implementation.\">KV=true: KV call to insert a single document.\r\nKV=false: INSERT INTO customer VALUES(...)<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"font-weight: 400\"><strong>2. 
Update<\/strong>: Update a JSON document by replacing the value of one scalar field.<\/span><\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:16 whitespace-before:1 whitespace-after:1 lang:mysql decode:true\">KV=true: KV call to UPDATE a single document.\r\nKV=false: UPDATE customer USE KEYS [documentkey] SET field1 = value<\/pre>\n<p><strong>3. Read: <\/strong>Read a JSON document with a given key, either one randomly chosen field or all the fields.<\/p>\n<pre class=\"theme:github font-size:14 line-height:16 whitespace-before:1 whitespace-after:1 lang:mysql decode:true\">KV=true: KV call to fetch a single document.\r\nKV=false: SELECT * FROM customer USE KEYS [documentkey]<\/pre>\n<p><strong>4. Delete: <\/strong>Delete a JSON document with a given key.<\/p>\n<pre class=\"theme:github font-size:14 line-height:16 whitespace-before:1 whitespace-after:1 lang:mysql decode:true\">KV=true: KV call to delete a single document.\r\nKV=false: DELETE FROM customer USE KEYS [documentkey]<\/pre>\n<p><span style=\"font-weight: 400\"><strong>5. Scan<\/strong>: Scan JSON documents in order, starting at a randomly chosen record key. 
The number of records to scan is randomly chosen (LIMIT).<\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:16 wrap:true whitespace-before:1 whitespace-after:1 lang:mysql decode:true\">KV=TRUE:\r\nSELECT META().id FROM customer WHERE META().id &gt; \u201cval\u201d ORDER BY META().id LIMIT &lt;num&gt;\r\nFetch the actual documents directly using KV calls from the benchmark driver.\r\n\r\nKV=false: SELECT * FROM customer WHERE META().id &gt; \u201cval\u201d ORDER BY META().id LIMIT &lt;num&gt;<\/pre>\n<p><span style=\"font-weight: 400\"><strong><span style=\"color: #0000ff\">6. Page<\/span><\/strong>: Paginate result set of a query with predicate on a field in the document.<\/span><\/p>\n<pre class=\"font-size:14 line-height:17 wrap:true scroll:true whitespace-before:2 whitespace-after:2 lang:mysql decode:true\">All customers in address.zip with randomly chosen OFFSET and LIMIT in SQL, N1QL\r\nKV=TRUE:\r\nSELECT META().id FROM customer WHERE address.zip = \u201cvalue\u201d OFFSET &lt;num&gt; LIMIT &lt;num&gt;\r\nFetch the actual documents directly using KV calls from the benchmark driver.\r\n\r\nKV=false: SELECT * FROM customer WHERE address.zip = \u201cvalue\u201d OFFSET &lt;num&gt; LIMIT &lt;num&gt;<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>7. 
Search<\/strong><\/span>: Search JSON documents based on range predicates on <\/span><span style=\"font-weight: 400\">3 fields <\/span><span style=\"font-weight: 400\">(customizable to n fields).<\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:17 wrap:true scroll:true whitespace-before:2 whitespace-after:2 lang:default decode:true\">All customers WHERE (address.country = \"value1\" AND age_group = \"value2\" AND YEAR(dob) = \"value\")\r\nAll customers retrieved with randomly chosen OFFSET and LIMIT in SQL, N1QL\r\n\r\nKV=TRUE:\r\nSELECT META().id FROM customer WHERE address.country = \"value1\" AND age_group = \"value2\" AND YEAR(dob) = \"value\" ORDER BY address.country, age_group, YEAR(dob) OFFSET &lt;num&gt; LIMIT &lt;num&gt;\r\nFetch the actual documents directly using KV calls from the benchmark driver.\r\n\r\nKV=false: SELECT * FROM customer WHERE address.country = \"value1\" AND age_group = \"value2\" AND YEAR(dob) = \"value\" ORDER BY address.country, age_group, YEAR(dob) OFFSET &lt;num&gt; LIMIT &lt;num&gt;\r\n<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>8. NestScan<\/strong><\/span>: Query JSON documents based on a predicate on a 1-level nested field.<\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:17 wrap:true scroll:true whitespace-before:2 whitespace-after:2 lang:mysql decode:true\">KV=TRUE:\r\nSELECT META().id FROM customer WHERE address.prev_address.zip = \"value\" LIMIT &lt;num&gt;\r\nFetch the actual documents directly using KV calls from the benchmark driver.\r\n\r\nKV=false: SELECT * FROM customer WHERE address.prev_address.zip = \"value\" LIMIT &lt;num&gt;\r\n<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>9. 
ArrayScan<\/strong><\/span>: Query JSON documents based on a predicate within a single-level array field.<\/span><\/p>\n<pre class=\"font-size:14 line-height:17 wrap:true scroll:true whitespace-before:2 whitespace-after:2 lang:default decode:true\">Find all customers who have a device with a given value, for example FF-012.\r\nSample devices field\r\n\u00a0\"devices\": [\r\n\u00a0\u00a0\u00a0\"EE-245\",\r\n\u00a0\u00a0\u00a0\"FF-012\",\r\n\u00a0\u00a0\u00a0\"GG-789\",\r\n\u00a0\u00a0\u00a0\"HH-246\"\r\n\u00a0],\r\nKV=TRUE:\r\nSELECT META().id FROM customer WHERE ANY v IN devices SATISFIES v = \"FF-012\" END ORDER BY META().id LIMIT &lt;num&gt;\r\nFetch the actual documents directly using KV calls from the benchmark driver.\r\nKV=false: SELECT * FROM customer WHERE ANY v IN devices SATISFIES v = \"FF-012\" END ORDER BY META().id LIMIT &lt;num&gt;<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>10. ArrayDeepScan<\/strong><\/span>: Query JSON documents based on a predicate within a two-level array field (array of arrays).<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">List all customers who have visited Paris, France.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\"><strong>KV=true:<\/strong> <\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:16 whitespace-before:1 whitespace-after:1 lang:js decode:true\">SELECT META().id FROM customer\r\nWHERE ANY v IN visited_places SATISFIES\r\nv.country = \"France\" AND\r\nANY c IN v.cities SATISFIES c = \"Paris\" END\r\nEND\r\nORDER BY META().id\r\nLIMIT &lt;num&gt;<\/pre>\n<p><span style=\"font-weight: 400\">Fetch the actual documents directly using KV calls from the benchmark driver.<\/span><\/p>\n<p><strong>KV=false: <\/strong><\/p>\n<pre class=\"theme:github font-size:14 line-height:16 whitespace-before:1 whitespace-after:1 lang:js decode:true\">SELECT * FROM customer\r\nWHERE ANY v IN visited_places 
SATISFIES v.country = \"France\" AND\r\n           ANY c IN v.cities SATISFIES c = \"Paris\" END\r\n      END\r\nORDER BY META().id\r\nLIMIT &lt;num&gt;<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>11. Report<\/strong><\/span>: Query customer order details for customers in a specific zip code.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\">\n<pre class=\"theme:github font-size:14 line-height:16 wrap:true whitespace-before:1 whitespace-after:1 lang:default decode:true\">Each customer has multiple orders.\r\nOrder document has order details.\r\nKV=TRUE:\r\nNot easily possible without significant performance impact.\r\nKV=false:\r\n\r\nSELECT *\r\nFROM customer c INNER JOIN orders o\r\nON (META(o).id IN c.order_list)\r\nWHERE c.address.zip = \"val\"\r\n\r\nANSI JOIN with HASH join:\r\nSELECT *\r\nFROM customer c INNER JOIN orders o USE HASH (probe)\r\nON (META(o).id IN c.order_list)\r\nWHERE c.address.zip = \"val\"\r\n\r\n<\/pre>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\"><strong><span style=\"color: #0000ff\">12. 
Report2<\/span><\/strong>: Generate sales order summary for a given day, grouped by zip.<\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:17 wrap:true scroll:true lang:mysql decode:true\">KV=TRUE:\r\nNeed to write a program.\r\nKV=false:\r\n\r\n-- ANSI join\r\n\r\nSELECT o.day, c.address.zip, SUM(o.salesamt)\r\nFROM customer c INNER JOIN orders o\r\nON (META(o).id IN c.order_list)\r\nWHERE c.address.zip = \"value\"\r\nAND o.day = \"value\"\r\nGROUP BY o.day, c.address.zip\r\nORDER BY SUM(o.salesamt)\r\n\r\n-- ANSI join with HASH join\r\n\r\nSELECT o.day, c.address.zip, SUM(o.salesamt)\r\nFROM customer c INNER JOIN orders o USE HASH (probe)\r\nON (META(o).id IN c.order_list)\r\nWHERE c.address.zip = \"value\"\r\nAND o.day = \"value\"\r\nGROUP BY o.day, c.address.zip\r\nORDER BY SUM(o.salesamt)<\/pre>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>13. Load<\/strong><\/span>: Data loading.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">LOAD 1 million documents.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">LOAD 10 million documents.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>14. Sync<\/strong><\/span>: Data streaming and synchronization from another system.<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Need to measure the data sync performance.<\/span>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Sync 1 million documents. 50% update, 50% insert.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Sync 10 million documents. 
80% update, 20% insert.<\/span><\/li>\n<\/ol>\n<\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Ideally, this sync would be done from Kafka or some other connector pulling data from a different source.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\"><span style=\"color: #0000ff\"><strong>15. Aggregate<\/strong><\/span>: Do some grouping and aggregation.<\/span><\/p>\n<pre class=\"lang:default decode:true \">-- GROUP BY query 1\r\n\r\nSELECT c.address.zip, COUNT(1)\r\nFROM customer c\r\nWHERE c.address.zip BETWEEN \"value1\" AND \"value2\"\r\nGROUP BY c.address.zip<\/pre>\n<p>&nbsp;<\/p>\n<pre class=\"font-size:14 line-height:17 wrap:true scroll:true whitespace-before:2 whitespace-after:2 lang:mysql decode:true\">-- GROUP BY query 2\r\n\r\nSELECT o.day, SUM(o.salesamt)\r\nFROM orders o\r\nWHERE o.day BETWEEN \"value1\" AND \"value2\"\r\nGROUP BY o.day;<\/pre>\n<h5><span style=\"color: #0000ff\"><strong>4. Benchmark Workloads<\/strong><\/span><\/h5>\n<p><span style=\"font-weight: 400\">Workloads are a combination of these operations.<\/span><\/p>\n<p><span style=\"font-weight: 400\">To begin with, the workload definitions can reuse the standard YCSB core workloads: workload A through workload F. Details are available at <\/span><a href=\"https:\/\/github.com\/brianfrankcooper\/YCSB\/wiki\/Core-Workloads\"><span style=\"font-weight: 400\">https:\/\/github.com\/brianfrankcooper\/YCSB\/wiki\/Core-Workloads<\/span><\/a><span style=\"font-weight: 400\">.\u00a0 We\u2019ll need to define additional workloads with a combination of the operations defined above.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Workload SA is the same as workload A on the new model. <\/span><span style=\"font-weight: 400\">The same applies to workloads B through F. 
\u00a0We\u2019ll call them SB through SF to differentiate them from the original workloads B through F.<\/span><\/p>\n<table>\n<tbody>\n<tr>\n<td><span style=\"color: #0000ff\"><strong>Workload<\/strong><\/span><\/td>\n<td><span style=\"color: #0000ff\"><strong>Operations<\/strong><\/span><\/td>\n<td><span style=\"color: #0000ff\"><strong>Record selection<\/strong><\/span><\/td>\n<td><span style=\"color: #0000ff\"><strong>Application Example<\/strong><\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SA &#8212; Update heavy<\/span><\/td>\n<td><span style=\"font-weight: 400\">Read: 50%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 50%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">Session store recording recent actions in a user session<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SB &#8212; Read heavy<\/span><\/td>\n<td><span style=\"font-weight: 400\">Read: 95%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">Photo tagging; adding a tag is an update, but most operations are to read tags<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SC &#8212; Read only<\/span><\/td>\n<td><span style=\"font-weight: 400\">Read: 100%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">User profile cache, where profiles are constructed elsewhere (e.g., Hadoop)<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SD &#8212; Read latest<\/span><\/td>\n<td><span style=\"font-weight: 400\">Read: 95%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Latest<\/span><\/td>\n<td><span style=\"font-weight: 400\">User status updates; people want to read the latest 
statuses<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SE &#8212; Short ranges<\/span><\/td>\n<td><span style=\"font-weight: 400\">Scan: 95%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian\/Uniform<\/span><\/td>\n<td><span style=\"font-weight: 400\">Threaded conversations, where each scan is for the posts in a given thread (assumed to be clustered by thread id)<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SF &#8212; Read, modify, write<\/span><\/td>\n<td><span style=\"font-weight: 400\">Read: 50%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Write: 50%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">user database, where user records are read and modified by the user or to record user activity.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SG &#8212; Page heavy<\/span><\/td>\n<td><span style=\"font-weight: 400\">Page: 90%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update:5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">User database, where new users are added, existing records are updated, pagination queries on the system.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SH &#8212; Search heavy<\/span><\/td>\n<td><span style=\"font-weight: 400\">Search: 90%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">User database, where new users are added, existing records are updated, search queries on the system.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SI &#8212; NestScan heavy<\/span><\/td>\n<td><span 
style=\"font-weight: 400\">Nestscan: 90%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><span style=\"font-weight: 400\">User database, where new users are added, existing records are updated, nestscan queries on the system.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SJ &#8212; Arrayscan heavy<\/span><\/td>\n<td><span style=\"font-weight: 400\">Arrayscan: 90%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SK &#8212; ArrayDeepscan heavy<\/span><\/td>\n<td><span style=\"font-weight: 400\">ArrayDeepScan: 90%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Insert: 5%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 5%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Zipfian<\/span><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SL &#8212; Report<\/span><\/td>\n<td><span style=\"font-weight: 400\">Report: 100%<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SL &#8212; Report2<\/span><\/td>\n<td><span style=\"font-weight: 400\">Report2: 100%<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SLoad &#8212; Load <\/span><\/td>\n<td><span style=\"font-weight: 400\">Load: 100%<\/span><\/td>\n<td><span style=\"font-weight: 400\">Everything<\/span><\/td>\n<td><span style=\"font-weight: 400\">Data load to setup SoE<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SN &#8212; Aggregate<\/span><\/p>\n<p><span style=\"font-weight: 400\">(SN1, SN2)<\/span><\/td>\n<td><span style=\"font-weight: 400\">Aggregation: 90%<\/span><\/p>\n<p><span style=\"font-weight: 
400\">Insert: 5%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Update: 5%<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SMIX &#8212; Mixed workload<\/span><\/td>\n<td><span style=\"font-weight: 400\">Page:20%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Search:20%<\/span><span style=\"font-weight: 400\"><br \/>\n<\/span><span style=\"font-weight: 400\">Nestscan:15%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Arrayscan:15%<\/span><\/p>\n<p><span style=\"font-weight: 400\">ArrayDeepscan:10%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Aggregate: 10%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Report: 10%<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">See below.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">SSync &#8212; Sync<\/span><\/td>\n<td><span style=\"font-weight: 400\">Sync: 100%<\/span><\/p>\n<p><span style=\"font-weight: 400\">Merge\/Update: 70%<\/span><\/p>\n<p><span style=\"font-weight: 400\">New\/Insert: 30%<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">Continuous sync of data from other systems to systems of engagement. 
See below.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>&nbsp;<\/p>\n<p><span style=\"text-decoration: underline\">Example Configuration for YCSB\/JSON Workload<\/span><\/p>\n<pre class=\"theme:github font-size:14 line-height:17 wrap:true scroll:true whitespace-before:2 whitespace-after:2 lang:tex decode:true \">recordcount=1000\r\noperationcount=1000\r\nworkload=com.yahoo.ycsb.workloads.CoreWorkload\r\nFilternumlow = 2\r\nFilternumhigh = 14\r\nSortnumlow = 3\r\nSortnumhigh = 6\r\npage1propotion=0.95\r\ninsertproportion=0.05\r\nrequestdistribution=zipfian\r\nmaxscanlength=100\r\nscanlengthdistribution=uniform<\/pre>\n<p><span style=\"color: #0000ff\"><strong>Acknowledgments<\/strong><\/span><\/p>\n<p>Thanks to <strong>Raju Suravarjjala,\u00a0<\/strong>Couchbase Senior Director of QE and Performance, for pushing us to do this, and to the entire performance team for supporting this effort.\u00a0The YCSB-JSON benchmark was developed in collaboration with\u00a0<strong>Alex Gyryk,\u00a0<\/strong>Couchbase Principal Performance Engineer.\u00a0 He developed the customer and orders data models used in this article and implemented the operations and workloads in YCSB-JSON for Couchbase and MongoDB.\u00a0 The YCSB-JSON implementation is available at:\u00a0<a href=\"https:\/\/github.com\/couchbaselabs\/YCSB\">https:\/\/github.com\/couchbaselabs\/YCSB<\/a><\/p>\n<p>Thanks to\u00a0<strong>Aaron Benton,\u00a0<\/strong>Couchbase Solution Architect, for developing an easy-to-use and efficient JSON data generator, fakeit.\u00a0 He developed this prior to joining Couchbase. 
It is available at:\u00a0<a href=\"https:\/\/github.com\/bentonam\/fakeit\">https:\/\/github.com\/bentonam\/fakeit<\/a><\/p>\n<h5><span style=\"color: #0000ff\"><strong>Next part<\/strong><\/span><\/h5>\n<h5>In the next article on YCSB-JSON, Alex will explain the implementations of this benchmark for Couchbase and MongoDB.\u00a0 The source code for the implementation is available\u00a0at:\u00a0<a href=\"https:\/\/github.com\/couchbaselabs\/YCSB\">https:\/\/github.com\/couchbaselabs\/YCSB<\/a><\/h5>\n<h5><span style=\"color: #0000ff\"><strong>References<\/strong><\/span><\/h5>\n<ol>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Benchmarking Cloud Serving Systems with YCSB: <\/span><a href=\"https:\/\/www.cs.duke.edu\/courses\/fall13\/cps296.4\/838-CloudPapers\/ycsb.pdf\"><span style=\"font-weight: 400\">https:\/\/www.cs.duke.edu\/courses\/fall13\/cps296.4\/838-CloudPapers\/ycsb.pdf<\/span><\/a><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">JSON: <\/span><a href=\"https:\/\/json.org\"><span style=\"font-weight: 400\">https:\/\/json.org<\/span><\/a><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">JSON Generator: <\/span><span style=\"font-weight: 400\"><a href=\"https:\/\/www.json-generator.com\/\">https:\/\/www.json-generator.com\/<\/a><\/span><\/li>\n<li style=\"font-weight: 400\">YCSB-JSON Implementation: <a href=\"https:\/\/github.com\/couchbaselabs\/YCSB\">https:\/\/github.com\/couchbaselabs\/YCSB<\/a><\/li>\n<\/ol>\n<h5><span style=\"color: #0000ff\"><strong>Appendix<\/strong><\/span><\/h5>\n<p><b>YAML to generate the customer dataset.<\/b><\/p>\n<pre class=\"theme:github font-size:14 line-height:17 wrap:true scroll:true whitespace-before:1 whitespace-after:1 lang:yaml range:1-25 decode:true \">name: AdvJSON\r\ntype: object\r\nkey: _id\r\ndata:\r\n  fixed: 10000\r\nproperties:\r\n  _id:\r\n    type: string\r\n    data:\r\n      post_build: \"return '' + this.doc_id + '_advjson';\"\r\n  
doc_id:\r\n    type: integer\r\n    description: The document id\r\n    data:\r\n      build: \"return document_index + 1\"\r\n  gid:\r\n    type: string\r\n    description: \"guid\"\r\n    data:\r\n      build: \"return chance.guid();\"\r\n  first_name:\r\n    type: string\r\n    description: \"First name - string, linked to url as the personal page\"\r\n    data:\r\n      fake: \"{{name.firstName}}\"\r\n  middle_name:\r\n    type: string\r\n    description: \"Middle name - string\"\r\n    data:\r\n      build: \"return chance.bool() ? chance.name({middle: true}).split(' ')[1] : null;\"\r\n  last_name:\r\n    type: string\r\n    description: \"Last name - string\"\r\n    data:\r\n      fake: \"{{name.lastName}}\"\r\n  ballance_current:\r\n    type: string\r\n    description: \"currency\"\r\n    data:\r\n      build: \"return chance.dollar();\"\r\n  dob:\r\n    type: string\r\n    description: \"Date\"\r\n    data:\r\n      build: \"return chance.bool() ? new Date(faker.date.past()).toISOString().split('T')[0] : null;\"\r\n  email:\r\n    type: string\r\n    description: \"email\"\r\n    data:\r\n      fake: \"{{internet.email}}\"\r\n  isActive:\r\n    type: boolean\r\n    description: \"active boolean\"\r\n    data:\r\n      build: \"return chance.bool();\"\r\n  linear_score:\r\n    type: integer\r\n    description: \"integer 0 - 100\"\r\n    data:\r\n      build: \"return chance.integer({min: 0, max: 100});\"\r\n  weighted_score:\r\n    type: integer\r\n    description: \"integer 0 - 100 with zipf distribution\"\r\n    data:\r\n      build: \"return chance.weighted([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [1, 0.4, 0.3, 0.25, 0.2, 0.17, 0.13, 0.11, 0.1, 0.09]) * 10 + chance.integer({min: 0, max: 10});\"\r\n  phone_country:\r\n    type: string\r\n    description: \"field linked to phone, choices: us, uk, fr\"\r\n    data:\r\n      build: \"return  chance.pickone(['us', 'uk', 'fr']);\"\r\n  phone_by_country:\r\n    type: string\r\n    description: \"phone number by country code, 
linked to phone_country field\"\r\n    data:\r\n      post_build: \"return chance.phone({country: this.phone_country});\"\r\n  age_group:\r\n    type: string\r\n    description: \"field linked to age, choices: child, teen, adult, senior\"\r\n    data:\r\n      build: \"return  chance.pickone(['child', 'teen', 'adult', 'senior']);\"\r\n  age_by_group:\r\n    type: integer\r\n    description: \"age by group, linked to age_group field\"\r\n    data:\r\n      post_build: \"return chance.age({type: this.age_group});\"\r\n  url_protocol:\r\n    type: string\r\n    description: \"linked to url\"\r\n    data:\r\n      build: \"return  chance.pickone(['http', 'https']);\"\r\n  url_site:\r\n    type: string\r\n    description: \"linked to url\"\r\n    data:\r\n      build: \"return  chance.pickone(['twitter', 'facebook', 'flixter', 'instagram', 'last', 'linkedin', 'xing', 'google', 'snapchat', 'tumblr', 'pinterest', 'youtube', 'vine', 'whatsapp']);\"\r\n  url_domain:\r\n    type: string\r\n    description: \"linked to url\"\r\n    data:\r\n      build: \"return  chance.pickone(['com', 'org', 'net', 'int', 'edu', 'gov', 'mil', 'us', 'uk', 'ft', 'it', 'de']);\"\r\n  url:\r\n    type: string\r\n    description: \"user profile url, linked to other document fields\"\r\n    data:\r\n      post_build: \"return '' + this.url_protocol + ':\/\/www.' + this.url_site + '.' 
+ this.url_domain + '\/' + this.first_name;\"\r\n  devices:\r\n    type: array\r\n    description: \"Array of strings - device\"\r\n    items:\r\n      $ref: '#\/definitions\/Device'\r\n      data:\r\n        min: 2\r\n        max: 6\r\n  linked_devices:\r\n    type: array\r\n    description: \"Array of array of string\"\r\n    items:\r\n      $ref: '#\/definitions\/Device'\r\n      data:\r\n        min: 3\r\n        max: 6\r\n        submin: 1\r\n        submax: 4\r\n  address:\r\n    type: object\r\n    description: An object of the Address\r\n    schema:\r\n      $ref: '#\/definitions\/Address'\r\n  children:\r\n    type: array\r\n    description: \"An array of Children objects\"\r\n    items:\r\n      $ref: '#\/definitions\/Children'\r\n      data:\r\n        min: 0\r\n        max: 5\r\n  visited_places:\r\n    type: array\r\n    description: \"Array of objects with arrays\"\r\n    items:\r\n      $ref: '#\/definitions\/Visited_places'\r\n      data:\r\n        min: 1\r\n        max: 4\r\n\r\ndefinitions:\r\n  Device:\r\n    type: string\r\n    description: \"string AA-001 with zipf step distribution\"\r\n    data:\r\n      build: \"return chance.weighted(['AA', 'BB', 'CC', 'DD', 'EE', 'FF', 'GG', 'HH', 'II', 'JJ', 'KK', 'LL', 'MM', 'NN', 'OO', 'PP', 'QQ', 'RR', 'SS', 'TT', 'UU', 'VV', 'WW', 'XX', 'YY', 'ZZ'], [1, 0.5, 0.333, 0.25, 0.2, 0.167, 0.143, 0.125, 0.111, 0.1, 0.091, 0.083, 0.077, 0.071, 0.067, 0.063, 0.059, 0.056, 0.053, 0.050, 0.048, 0.045, 0.043, 0.042, 0.04, 0.038]).concat('-').concat(chance.string({length: 3, pool: '0123456789'}));\"\r\n  Address:\r\n    type: object\r\n    properties:\r\n      street:\r\n        type: string\r\n        description: The address 1\r\n        data:\r\n          build: \"return faker.address.streetAddress() + ' ' + faker.address.streetSuffix();\"\r\n      city:\r\n        type: string\r\n        description: The locality\r\n        data:\r\n          build: \"return faker.address.city();\"\r\n      zip:\r\n        
type: string\r\n        description: The zip code \/ postal code\r\n        data:\r\n          build: \"return faker.address.zipCode();\"\r\n      country:\r\n        type: string\r\n        description: The country\r\n        data:\r\n          build: \"return faker.address.country();\"\r\n      prev_address:\r\n        type: object\r\n        description: An object of the Address\r\n        schema:\r\n          $ref: '#\/definitions\/Previous_address'\r\n  Previous_address:\r\n    type: object\r\n    properties:\r\n      street:\r\n        type: string\r\n        description: The address 1\r\n        data:\r\n          build: \"return faker.address.streetAddress() + ' ' + faker.address.streetSuffix();\"\r\n      city:\r\n        type: string\r\n        description: The locality\r\n        data:\r\n          build: \"return faker.address.city();\"\r\n      zip:\r\n        type: string\r\n        description: The zip code \/ postal code\r\n        data:\r\n          build: \"return faker.address.zipCode();\"\r\n      country:\r\n        type: string\r\n        description: The country\r\n        data:\r\n          build: \"return faker.address.country();\"\r\n      property_current_owner:\r\n        type: object\r\n        description: \"owner object\"\r\n        schema:\r\n          $ref: '#\/definitions\/Property_owner'\r\n  Children:\r\n    type: object\r\n    properties:\r\n      first_name:\r\n        type: string\r\n        description: \"first name - string\"\r\n        data:\r\n          fake: \"{{name.firstName}}\"\r\n      gender:\r\n        type: string\r\n        description: \"gender M or F\"\r\n        data:\r\n          build: \"return chance.bool({likelihood: 50})? 
faker.random.arrayElement(['M', 'F']) : null;\"\r\n      age:\r\n        type: integer\r\n        description: \"age - 1 to 17\"\r\n        data:\r\n          build: \"return chance.integer({min: 1, max: 17})\"\r\n  Visited_cities:\r\n    type: string\r\n    description: \"city\"\r\n    data:\r\n      build: \"return faker.address.city();\"\r\n  Visited_places:\r\n    type: object\r\n    properties:\r\n      country:\r\n        type: string\r\n        data:\r\n          build: \"return faker.address.country();\"\r\n      cities:\r\n        type: array\r\n        description: \"Array of strings - cities visited\"\r\n        items:\r\n          $ref: '#\/definitions\/Visited_cities'\r\n          data:\r\n            min: 1\r\n            max: 5\r\n  Property_owner:\r\n    type: object\r\n    properties:\r\n      first_name:\r\n        type: string\r\n        description: \"First name - string\"\r\n        data:\r\n          fake: \"{{name.firstName}}\"\r\n      middle_name:\r\n        type: string\r\n        description: \"Middle name - string\"\r\n        data:\r\n          build: \"return chance.bool() ? 
chance.name({middle: true}).split(' ')[1] : null;\"\r\n      last_name:\r\n        type: string\r\n        description: \"Last name - string\"\r\n        data:\r\n          fake: \"{{name.lastName}}\"\r\n      phone:\r\n        type: string\r\n        description: \"phone\"\r\n        data:\r\n          build: \"return chance.phone();\"\r\n<\/pre>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Bruce\u00a0Lindsay\u00a0once said, &#8220;There are three things important in the database world: Performance, Performance, and Performance&#8221;.\u00a0 Most enterprise architects know, as we progress in database features and architectures, it&#8217;s important to measure performance in an open way so they can compare [&hellip;]<\/p>\n","protected":false},"author":55,"featured_media":5800,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[1814,1821,1816,1819,9417,1812,2201],"tags":[2279,1572,1261,1309,1725,2278],"ppma_author":[8929],"class_list":["post-5777","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-application-design","category-couchbase-architecture","category-couchbase-server","category-data-modeling","category-performance","category-n1ql-query","category-tools-sdks","tag-benchmark","tag-database","tag-json","tag-mongodb","tag-nosql-database","tag-ycsb"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v27.3 (Yoast SEO v27.3) - https:\/\/yoast.com\/product\/yoast-seo-premium-wordpress\/ -->\n<title>Using YCSB to Benchmark JSON Databases - The Couchbase Blog<\/title>\n<meta name=\"description\" content=\"This post covers the YCSB benchmark and provides examples of benchmark operations, workloads, YCSB-JSON implementation and explains how to run YCSB-JSON.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, 
max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Using YCSB to Benchmark JSON Databases\" \/>\n<meta property=\"og:description\" content=\"This post covers the YCSB benchmark and provides examples of benchmark operations, workloads, YCSB-JSON implementation and explains how to run YCSB-JSON.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2018-09-10T11:27:56+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-14T03:19:57+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2018\/09\/Screen-Shot-2018-09-10-at-4.21.45-AM.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2048\" \/>\n\t<meta property=\"og:image:height\" content=\"498\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Keshav Murthy\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@rkeshavmurthy\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Keshav Murthy\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/\"},\"author\":{\"name\":\"Keshav Murthy\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/person\\\/c261644262bf98e146372fe647682636\"},\"headline\":\"Using YCSB to Benchmark JSON Databases\",\"datePublished\":\"2018-09-10T11:27:56+00:00\",\"dateModified\":\"2025-06-14T03:19:57+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/\"},\"wordCount\":1962,\"commentCount\":6,\"publisher\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/1\\\/2018\\\/09\\\/Screen-Shot-2018-09-10-at-4.21.45-AM.png\",\"keywords\":[\"benchmark\",\"database\",\"JSON\",\"mongodb\",\"NoSQL Database\",\"ycsb\"],\"articleSection\":[\"Application Design\",\"Couchbase Architecture\",\"Couchbase Server\",\"Data Modeling\",\"High Performance\",\"SQL++ \\\/ N1QL Query\",\"Tools &amp; 
SDKs\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/\",\"name\":\"Using YCSB to Benchmark JSON Databases - The Couchbase Blog\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/1\\\/2018\\\/09\\\/Screen-Shot-2018-09-10-at-4.21.45-AM.png\",\"datePublished\":\"2018-09-10T11:27:56+00:00\",\"dateModified\":\"2025-06-14T03:19:57+00:00\",\"description\":\"This post covers the YCSB benchmark and provides examples of benchmark operations, workloads, YCSB-JSON implementation and explains how to run 
YCSB-JSON.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#primaryimage\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/1\\\/2018\\\/09\\\/Screen-Shot-2018-09-10-at-4.21.45-AM.png\",\"contentUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/sites\\\/1\\\/2018\\\/09\\\/Screen-Shot-2018-09-10-at-4.21.45-AM.png\",\"width\":2048,\"height\":498},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Using YCSB to Benchmark JSON Databases\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#website\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#organization\",\"name\":\"The 
Couchbase Blog\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/04\\\/admin-logo.png\",\"contentUrl\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/wp-content\\\/uploads\\\/2023\\\/04\\\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/logo\\\/image\\\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/#\\\/schema\\\/person\\\/c261644262bf98e146372fe647682636\",\"name\":\"Keshav Murthy\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g4e51d72fc07c662aa791316deafffac4\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g\",\"caption\":\"Keshav Murthy\"},\"description\":\"Keshav Murthy is a Vice President at Couchbase R&amp;D. Previously, he was at MapR, IBM, Informix, Sybase, with more than 20 years of experience in database design &amp; development. He lead the SQL and NoSQL R&amp;D team at IBM Informix. He has received two President's Club awards at Couchbase, two Outstanding Technical Achievement Awards at IBM. 
Keshav has a bachelor's degree in Computer Science and Engineering from the University of Mysore, India, and has received twenty four US patents.\",\"sameAs\":[\"https:\\\/\\\/blog.planetnosql.com\\\/\",\"https:\\\/\\\/x.com\\\/rkeshavmurthy\"],\"url\":\"https:\\\/\\\/www.couchbase.com\\\/blog\\\/author\\\/keshav-murthy\\\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Using YCSB to Benchmark JSON Databases - The Couchbase Blog","description":"This post covers the YCSB benchmark and provides examples of benchmark operations, workloads, YCSB-JSON implementation and explains how to run YCSB-JSON.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/","og_locale":"en_US","og_type":"article","og_title":"Using YCSB to Benchmark JSON Databases","og_description":"This post covers the YCSB benchmark and provides examples of benchmark operations, workloads, YCSB-JSON implementation and explains how to run YCSB-JSON.","og_url":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/","og_site_name":"The Couchbase Blog","article_published_time":"2018-09-10T11:27:56+00:00","article_modified_time":"2025-06-14T03:19:57+00:00","og_image":[{"width":2048,"height":498,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2018\/09\/Screen-Shot-2018-09-10-at-4.21.45-AM.png","type":"image\/png"}],"author":"Keshav Murthy","twitter_card":"summary_large_image","twitter_creator":"@rkeshavmurthy","twitter_misc":{"Written by":"Keshav Murthy","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/"},"author":{"name":"Keshav Murthy","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/c261644262bf98e146372fe647682636"},"headline":"Using YCSB to Benchmark JSON Databases","datePublished":"2018-09-10T11:27:56+00:00","dateModified":"2025-06-14T03:19:57+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/"},"wordCount":1962,"commentCount":6,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2018\/09\/Screen-Shot-2018-09-10-at-4.21.45-AM.png","keywords":["benchmark","database","JSON","mongodb","NoSQL Database","ycsb"],"articleSection":["Application Design","Couchbase Architecture","Couchbase Server","Data Modeling","High Performance","SQL++ \/ N1QL Query","Tools &amp; SDKs"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/","url":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/","name":"Using YCSB to Benchmark JSON Databases - The Couchbase 
Blog","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2018\/09\/Screen-Shot-2018-09-10-at-4.21.45-AM.png","datePublished":"2018-09-10T11:27:56+00:00","dateModified":"2025-06-14T03:19:57+00:00","description":"This post covers the YCSB benchmark and provides examples of benchmark operations, workloads, YCSB-JSON implementation and explains how to run YCSB-JSON.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2018\/09\/Screen-Shot-2018-09-10-at-4.21.45-AM.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2018\/09\/Screen-Shot-2018-09-10-at-4.21.45-AM.png","width":2048,"height":498},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/ycsb-json-benchmarking-json-databases-by-extending-ycsb\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Using YCSB to Benchmark JSON Databases"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL 
Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","width":218,"height":34,"caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/c261644262bf98e146372fe647682636","name":"Keshav Murthy","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g4e51d72fc07c662aa791316deafffac4","url":"https:\/\/secure.gravatar.com\/avatar\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g","caption":"Keshav Murthy"},"description":"Keshav Murthy is a Vice President at Couchbase R&amp;D. Previously, he was at MapR, IBM, Informix, Sybase, with more than 20 years of experience in database design &amp; development. He lead the SQL and NoSQL R&amp;D team at IBM Informix. He has received two President's Club awards at Couchbase, two Outstanding Technical Achievement Awards at IBM. 
Keshav has a bachelor's degree in Computer Science and Engineering from the University of Mysore, India, and has received twenty four US patents.","sameAs":["https:\/\/blog.planetnosql.com\/","https:\/\/x.com\/rkeshavmurthy"],"url":"https:\/\/www.couchbase.com\/blog\/author\/keshav-murthy\/"}]}},"acf":[],"authors":[{"term_id":8929,"user_id":55,"is_guest":0,"slug":"keshav-murthy","display_name":"Keshav Murthy","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/af74df754db27152971d0aed2f323ead5a1f9fe5afd0209af91e12e784451224?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/5777","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/users\/55"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/comments?post=5777"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/5777\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media\/5800"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media?parent=5777"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/categories?post=5777"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/tags?post=5777"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=5777"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}