{"id":16790,"date":"2025-01-23T10:44:46","date_gmt":"2025-01-23T18:44:46","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/?p=16790"},"modified":"2025-01-28T08:19:12","modified_gmt":"2025-01-28T16:19:12","slug":"synthetic-data-generation-capella-datastudio","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/","title":{"rendered":"Synthetic Data Generation with Capella DataStudio"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">If you\u2019re a developer working with Couchbase or Capella, you\u2019ll want to know about <\/span><a href=\"https:\/\/capelladatastudio.com\/\"><b>Capella DataStudio<\/b><\/a><span style=\"font-weight: 400;\">. It\u2019s a free, community-supported tool with a slick, single-pane-of-glass UI for managing <\/span><b>Capella Operational<\/b><span style=\"font-weight: 400;\">, <\/span><b>Capella Columnar<\/b><span style=\"font-weight: 400;\">, and <\/span><b>Couchbase Server Clusters<\/b><span style=\"font-weight: 400;\">. Not only does it boost developer productivity, but it also makes your experience a whole lot smoother (and cooler).\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now, it comes with a brand new feature: <\/span><b>Synthetic Data Generator.<\/b><\/p>\n<p><b>Capella DataStudio&#8217;s Synthetic Data Generator<\/b><span style=\"font-weight: 400;\"> is designed to empower developers with a seamless, no-code way to create realistic and meaningful data for their projects. Whether you\u2019re testing applications, training machine learning models, or simulating large-scale systems, this feature provides unparalleled flexibility and power.<\/span><\/p>\n<h2><span style=\"font-weight: 400;\">What is Synthetic Data?<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Synthetic data is not just &#8220;fake&#8221; data; it\u2019s designed to mimic the properties, distributions, and relationships of real-world data. 
While fake data might generate random values without context, synthetic data aims to:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Maintain logical relationships between fields (e.g., city and state are consistent)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Follow realistic distributions, such as generating values that adhere to normal or weighted distributions<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Be statistically relevant for testing, analysis, and simulation<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">This makes synthetic data incredibly useful in scenarios where real data is unavailable, sensitive, or insufficient.<\/span><\/p>\n<p>Read on to dig into synthetic data generation or watch this video to see it in action.<\/p>\n<p><iframe loading=\"lazy\" title=\"Capella DataStudio Synthetic Data Generator\" width=\"900\" height=\"506\" src=\"https:\/\/www.youtube.com\/embed\/_21RzBoCA_0?feature=oembed&#038;enablejsapi=1&#038;origin=https:\/\/www.couchbase.com\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe><\/p>\n<h2><span style=\"font-weight: 400;\">Key features of Capella DataStudio&#8217;s Synthetic Data Generator<\/span><\/h2>\n<p style=\"padding-left: 40px;\"><b>Realistic, correlated data<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Our generator ensures data relationships are meaningful. For example, addresses include matched city, state, zip code, latitude, and longitude values. 
Names and demographics are logically consistent.<\/span><\/p>\n<p style=\"padding-left: 40px;\"><b>Built-in typesets, fully configurable<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Choose from a wide array of built-in typesets to jumpstart your data generation. Each type can be customized to suit your specific needs, whether it\u2019s names, locations, dates, or numeric fields.<\/span><\/p>\n<p style=\"padding-left: 40px;\"><b>Extensible: bring your own typesets<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Got your own datasets or specific requirements? Import custom typesets to extend the generator\u2019s capabilities and create tailored data that fits your unique use case.<\/span><\/p>\n<p style=\"padding-left: 40px;\"><b>Primary Key \/ Foreign Key relationships<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Model complex datasets with ease by defining relationships between fields. Foreign keys can reference primary key data, enabling realistic relational data structures.<\/span><\/p>\n<p style=\"padding-left: 40px;\"><b>Expression handling with powerful functions<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Leverage built-in functions to create complex expressions without writing a single line of code. Combine and manipulate fields dynamically for ultimate control over your data.<\/span><\/p>\n<p style=\"padding-left: 40px;\"><b>No restrictions on data size<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Generate data at any scale, from a few rows for small tests to millions of documents for large-scale simulations. 
There are no limits to what you can create.<\/span><\/p>\n<p style=\"padding-left: 40px;\"><b>Seamless integration with Capella Operational and Couchbase Server<\/b><\/p>\n<p style=\"padding-left: 40px;\"><span style=\"font-weight: 400;\">Take your synthetic data further by importing it directly into Capella Operational or Couchbase Server. This ensures a streamlined workflow from generation to deployment.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Why choose Capella DataStudio for synthetic data generation?<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">With its intuitive UI and robust feature set, Capella DataStudio\u2019s Synthetic Data Generator is the ultimate tool for creating high-quality, meaningful datasets. Whether you\u2019re a developer, data scientist, or tester, this feature will save time, reduce complexity, and enhance your projects with realistic data. Explore its endless possibilities and redefine your data creation experience.<\/span><\/p>\n<hr \/>\n<h2><span style=\"font-weight: 400;\">Synthetic data generation<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Let&#8217;s look at how the Synthetic Data Generator works.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">Schema builder<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">The schema is built Field by Field, one row at a time.\u00a0<\/span><span style=\"font-weight: 400;\">Each row has a minimum of two attributes:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The field name<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The Data Type of the Field &#8211; t<\/span><span style=\"font-weight: 400;\">his could come from the <\/span><i><span style=\"font-weight: 400;\">core<\/span><\/i><span style=\"font-weight: 400;\"> or <\/span><i><span style=\"font-weight: 400;\">user<\/span><\/i><span style=\"font-weight: 
400;\"> typeset<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">Depending on the Data Type, more attributes may be exposed:<\/span><\/p>\n<div id=\"attachment_16791\" style=\"width: 910px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16791\" class=\"wp-image-16791 size-large\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2-1024x573.png\" alt=\"\" width=\"900\" height=\"504\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2-1024x573.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2-300x168.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2-768x430.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2-1536x860.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2-1320x739.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image1-2.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16791\" class=\"wp-caption-text\">Example of the orders schema<\/p><\/div>\n<h4><span style=\"font-weight: 400;\">Field name<\/span><\/h4>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Field names can be any JSON compliant field name<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Nested JSON objects are specified by dotted format<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Deeply nested JSON supported<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" 
aria-level=\"1\"><span style=\"font-weight: 400;\">Field names with a double dash prefix will be treated as a Primary Key<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">When generating datasets, these keys are also exported and saved as the <\/span><i><span style=\"font-weight: 400;\">localStore\/SyntheticData\/DataSets\/schemaName.pk<\/span><\/i><span style=\"font-weight: 400;\"> file<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Primary Keys can be specified only on fields in the root document<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"3\"><span style=\"font-weight: 400;\">JSON Objects, Nested Fields, and Hidden Fields cannot be Primary Keys<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Field names with a single dash prefix will be treated as a Hidden Field<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Hidden fields serve as temporary storage for use in field references<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Hidden fields cannot be Primary Keys<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Hidden fields do not appear in the generated JSON document<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">JSON Objects cannot be hidden<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Nested fields can be hidden<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h4><span style=\"font-weight: 400;\">Data type<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">The data type is selected from a dialog box:<\/span><\/p>\n<div id=\"attachment_16792\" style=\"width: 910px\" class=\"wp-caption aligncenter\"><a 
href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16792\" class=\"size-large wp-image-16792\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2-1024x830.png\" alt=\"\" width=\"900\" height=\"729\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2-1024x830.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2-300x243.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2-768x622.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2-1536x1244.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2-1320x1069.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image3-2.png 1802w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16792\" class=\"wp-caption-text\">Picture shows both the core typesets and a user supplied typeset (acme.pizzas)<\/p><\/div>\n<h4><span style=\"font-weight: 400;\">Core typesets<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">Provided by Capella DataStudio:<\/span><\/p>\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-16793\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2-1024x278.png\" alt=\"\" width=\"900\" height=\"244\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2-1024x278.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2-300x81.png 300w, 
https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2-768x208.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2-1536x417.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2-1320x358.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image2-2.png 1748w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><\/p>\n<h4><span style=\"font-weight: 400;\">User typesets<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">Provided by you to extend the functionality of the Data Generator.\u00a0<\/span><span style=\"font-weight: 400;\">You need to provide two files:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A CSV file with data<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A manifest file describing the Typeset<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h4><span style=\"font-weight: 400;\">User typeset process<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">When a document is generated with user typesets, the following happens:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">One random row is read from the file<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The row is cached in a row-cache<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The fields are then read from this row-cache<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Once any field is read, that field is nulled in the row-cache<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">If the field is null, the entire 
row-cache is invalidated and a new random row is read<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">The fields are read from the row cache and for a given document, the data is correlated<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">Each document starts with a new row-cache<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<div id=\"attachment_16794\" style=\"width: 910px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16794\" class=\"size-large wp-image-16794\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1-1024x319.png\" alt=\"\" width=\"900\" height=\"280\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1-1024x319.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1-300x93.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1-768x239.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1-1536x478.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1-1320x411.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image5-1.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16794\" class=\"wp-caption-text\">Example of pizzas typeset<\/p><\/div>\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-large wp-image-16795\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1-1024x159.png\" alt=\"\" width=\"900\" 
height=\"140\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1-1024x159.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1-300x47.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1-768x119.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1-1536x239.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1-1320x205.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image7-1.png 1750w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><\/p>\n<h3><span style=\"font-weight: 400;\">Core function<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">There are three special data types:<\/span><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">expression<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">foreignKey<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">jsonArray<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h3><span style=\"font-weight: 400;\">1. 
core.function.expression<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">Expressions are a powerful way of customizing the schema:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Expressions are just strings<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">They can have embedded <\/span><b>references<\/b><span style=\"font-weight: 400;\"> (enclosed in <em>%%<\/em>) and <\/span><b>functions<\/b><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h4><span style=\"font-weight: 400;\">Document and expression architecture<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">Let&#8217;s see how the document is built:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The document is built, top down, row by row.<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">We always have a partial document at every row stage.<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">First, the expression is a string<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">It goes to an <\/span><b>Expression Evaluator<\/b>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">The partial document, with its fields and values is supplied to the evaluator.<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"3\"><span style=\"font-weight: 400;\">This means, the previous fields and their evaluated values are now available.<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The string is then examined for <\/span><b>references<\/b>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span 
style=\"font-weight: 400;\">References are the names of previously defined fields; their values are taken from the partial document.<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">References are replaced by the values<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">This means that references can also appear inside <\/span><b>functions<\/b><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The string is then examined for <\/span><b>functions<\/b>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">The functions are then executed and their values are replaced in the partial document.<\/span><\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Finally, the Evaluator returns the output.<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3><span style=\"font-weight: 400;\">2. core.function.foreignKey<\/span><\/h3>\n<h4><span style=\"font-weight: 400;\">Foreign keys and data correlation<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">When working with relational data, maintaining referential integrity through foreign keys is crucial. Here&#8217;s how our synthetic data generator handles foreign key relationships:<\/span><\/p>\n<h4><span style=\"font-weight: 400;\">How foreign keys work<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">First, you&#8217;ll need to generate your primary dataset. Let&#8217;s say you have a schema for <em>Departments<\/em>\u00a0that generates a CSV file containing department IDs and names. These department IDs serve as primary keys in the Departments dataset.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When you create another schema, say for <em>Employees<\/em>, you can specify fields that reference these existing primary keys. 
The schema builder provides two drop-down menus:<\/span><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A dropdown to select the source dataset (e.g., &#8220;Departments&#8221;)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">A dropdown to select which primary key field to reference (e.g., &#8220;id&#8221;)<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ol>\n<h4><span style=\"font-weight: 400;\">Data generation process<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">When generating data with foreign key references, the system:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Randomly selects a row from the source dataset<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Reads the primary key value(s) from that row<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Uses these values in the new dataset being generated<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h4><span style=\"font-weight: 400;\">Maintaining data correlation<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">An important feature is how we handle multiple foreign key references. If your schema references multiple columns from the same source dataset, the values are pulled from the same row to maintain logical correlation.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For example, if your Employee schema references both department_id and department_location from the Departments dataset, both values will come from the same department record. 
This ensures that the synthetic data maintains realistic relationships between related fields.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This approach helps create more realistic synthetic datasets by preserving the referential integrity and logical relationships present in real-world data.<\/span><\/p>\n<h3><span style=\"font-weight: 400;\">3. core.function.jsonArray<\/span><\/h3>\n<h4><span style=\"font-weight: 400;\">JSON array configuration<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">When configuring a JSON array field, you can specify:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Minimum number of objects in the array<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Maximum number of objects in the array<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The generator will then create arrays with a random number of objects within your specified range.<\/span><\/p>\n<h4><span style=\"font-weight: 400;\">Structure and limitations<\/span><\/h4>\n<p><span style=\"font-weight: 400;\">The JSON arrays follow these rules:<\/span><\/p>\n<ol>\n<li style=\"list-style-type: none;\">\n<ol>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Each array contains simple, flat JSON objects<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Nesting of arrays is not supported (no arrays within arrays)<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Each object in the array follows the same structure<\/span><\/li>\n<\/ol>\n<\/li>\n<\/ol>\n<h2><span style=\"font-weight: 400;\">Data generation<\/span><\/h2>\n<p><span style=\"font-weight: 400;\">Once the schema has been built to your satisfaction, it&#8217;s time to generate data.<\/span><\/p>\n<div 
id=\"attachment_16796\" style=\"width: 910px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image8-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16796\" class=\"size-large wp-image-16796\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image8-1-1024x477.png\" alt=\"\" width=\"900\" height=\"419\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image8-1-1024x477.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image8-1-300x140.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image8-1-768x358.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image8-1.png 1094w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16796\" class=\"wp-caption-text\">Picture shows generating synthetic dataset<\/p><\/div>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Dataset is generated and written to <\/span><i><span style=\"font-weight: 400;\">LocalStore\/SyntheticData\/DataSets\/<\/span><\/i><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">The dataset filename is <\/span><i><span style=\"font-weight: 400;\">schemaName.json<\/span><\/i>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">This is a <\/span><i><span style=\"font-weight: 400;\">JSON Lines<\/span><\/i><span style=\"font-weight: 400;\"> file<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"2\"><span style=\"font-weight: 400;\">If the document has fields marked as Primary Key (prefixed with double-dash), then, a <\/span><i><span style=\"font-weight: 400;\">schemaName.pk<\/span><\/i><span style=\"font-weight: 400;\"> is also 
produced<\/span>\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"3\"><span style=\"font-weight: 400;\">The .pk file is a CSV file<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">If any field has the <\/span><i><span style=\"font-weight: 400;\">seq()<\/span><\/i><span style=\"font-weight: 400;\"> function, the sequences are incremented by 1<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">There is no limit to the number of documents<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<h3><span style=\"font-weight: 400;\">Example datasets<\/span><\/h3>\n<p><i><span style=\"font-weight: 400;\">customer.json<\/span><\/i><\/p>\n<pre class=\"nums:false wrap:true lang:js decode:true\">[\r\n\r\n{\"id\":\"customer_1\",\"name\":\"Lula Kuhic\",\"gender\":\"Demi-man\",\"age\":65,\"email\":\"Electa29@yahoo.com\",\"address\":{\"street\":\"46938 VonRueden Village Suite 474\",\"city\":\"Los Angeles\",\"state\":\"California\",\"zip\":\"90001\",\"geo\":{\"latitude\":33.7423,\"longitude\":-117.4412}},\"phones\":{\"home\":\"(310) 788-5382\",\"cell\":\"(310) 923-5319\"}},\r\n\r\n{\"id\":\"customer_2\",\"name\":\"Chelsea Wilderman\",\"gender\":\"Transsexual female\",\"age\":58,\"email\":\"Augusta_Mann27@yahoo.com\",\"address\":{\"street\":\"8409 Jesse Mill Apt. 289\",\"city\":\"Sacramento\",\"state\":\"California\",\"zip\":\"95814\",\"geo\":{\"latitude\":38.8607,\"longitude\":-121.0356}},\"phones\":{\"home\":\"(916) 879-6009\",\"cell\":\"(916) 503-2269\"}},\r\n\r\n\u2026\r\n\r\n]<\/pre>\n<p><i><span style=\"font-weight: 400;\">customer.pk<\/span><\/i><\/p>\n<pre class=\"nums:false lang:js decode:true \">id,name\r\n\"customer_1\",\"Lula Kuhic\"\r\n\"customer_2\",\"Chelsea Wilderman\"\r\n\u2026\r\n<\/pre>\n<h3><span style=\"font-weight: 400;\">Dataset preview<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">You can preview the generated datasets. 
The preview panel supports previewing the data either in JSON format or table format.<\/span><\/p>\n<div id=\"attachment_16797\" style=\"width: 910px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16797\" class=\"wp-image-16797 size-large\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1-1024x571.png\" alt=\"\" width=\"900\" height=\"502\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1-1024x571.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1-300x167.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1-768x428.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1-1536x856.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1-1320x736.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image6-1.png 1999w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16797\" class=\"wp-caption-text\">Picture shows the preview panel and the table preview<\/p><\/div>\n<h3><span style=\"font-weight: 400;\">Import<\/span><\/h3>\n<p><span style=\"font-weight: 400;\">You can import the generated dataset into your Couchbase collection:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">Import uses the cbimport utility and offers all its import options<\/span><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><span style=\"font-weight: 400;\">There is no file limit to import<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<div id=\"attachment_16798\" style=\"width: 910px\" class=\"wp-caption 
aligncenter\"><a href=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-16798\" class=\"size-large wp-image-16798\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1-1024x859.png\" alt=\"\" width=\"900\" height=\"755\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1-1024x859.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1-300x252.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1-768x645.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1-1536x1289.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1-1320x1108.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/image4-1.png 1804w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/a><p id=\"caption-attachment-16798\" class=\"wp-caption-text\">Picture shows the import dialog box and options<\/p><\/div>\n<hr \/>\n<h2>Ready to boost your productivity?<\/h2>\n<p><span style=\"font-weight: 400;\">Capella DataStudio is the tool developers have been waiting for. 
Whether you\u2019re managing Couchbase Server, Capella Operational, or Capella Columnar clusters, this app makes your job easier, faster, and yes, cooler.<\/span><\/p>\n<p><b>Try <a href=\"https:\/\/capelladatastudio.com\/\">Capella DataStudio for free<\/a><\/b> <span style=\"font-weight: 400;\">and check out our <\/span><b>tutorial videos<\/b><span style=\"font-weight: 400;\">:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none;\">\n<ul>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/www.youtube.com\/watch?v=IqMLtgl84-E\"><span style=\"font-weight: 400;\">Capella Operational<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/www.youtube.com\/watch?v=LSh26boiHdQ\"><span style=\"font-weight: 400;\">Capella Columnar<\/span><\/a><\/li>\n<li style=\"font-weight: 400;\" aria-level=\"1\"><a href=\"https:\/\/www.youtube.com\/watch?v=_21RzBoCA_0\"><span style=\"font-weight: 400;\">Synthetic Data Generator<\/span><\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">With Capella DataStudio, managing data has never been this fun or productive!<\/span><\/p>\n<hr \/>\n<h2><span style=\"font-weight: 400;\">Appendix &#8211; functions supported in expressions<\/span><\/h2>\n<p><i><span style=\"font-weight: 400;\">The table lists the functions available for use in 
expressions:<\/span><\/i><\/p>\n<table>\n<tbody>\n<tr>\n<td><b>Function<\/b><\/td>\n<td><b>Example<\/b><\/td>\n<td><b>Output<\/b><\/td>\n<\/tr>\n<tr>\n<td>int(min,max)<\/td>\n<td>int(1,10)<\/td>\n<td>6<\/td>\n<\/tr>\n<tr>\n<td>float(min,max)<\/td>\n<td>float(1.234,10.587)<\/td>\n<td>5.824<\/td>\n<\/tr>\n<tr>\n<td>float(min,max,dec)<\/td>\n<td>float(1,10,2)<\/td>\n<td>5.82<\/td>\n<\/tr>\n<tr>\n<td>normal(mean,std,dec)<\/td>\n<td>normal(50,10,3)<\/td>\n<td>56.48<\/td>\n<\/tr>\n<tr>\n<td>bool()<\/td>\n<td>bool()<\/td>\n<td>FALSE<\/td>\n<\/tr>\n<tr>\n<td>bool(bias)<\/td>\n<td>bool(0.8)<\/td>\n<td>TRUE<\/td>\n<\/tr>\n<tr>\n<td>date(from,to)<\/td>\n<td>date(01\/01\/2024,12\/31\/2024)<\/td>\n<td>&quot;02\/02\/2024&quot;<\/td>\n<\/tr>\n<tr>\n<td>time(from,to)<\/td>\n<td>time(08:00 am, 5:00 pm)<\/td>\n<td>&quot;08:47 AM&quot;<\/td>\n<\/tr>\n<tr>\n<td>arrayItem(array)<\/td>\n<td>arrayItem([&quot;cat&quot;,&quot;mouse&quot;,&quot;dog&quot;])<\/td>\n<td>&quot;cat&quot;<\/td>\n<\/tr>\n<tr>\n<td>arrayItem(array)<\/td>\n<td>arrayItem([&quot;cat:2&quot;,&quot;mouse:1&quot;,&quot;dog:7&quot;])<\/td>\n<td>&quot;dog&quot;<\/td>\n<\/tr>\n<tr>\n<td>arrayItems(array,length)<\/td>\n<td>arrayItems([&quot;cat&quot;,&quot;mouse&quot;,&quot;dog&quot;],2)<\/td>\n<td>[&quot;cat&quot;,&quot;mouse&quot;]<\/td>\n<\/tr>\n<tr>\n<td>arrayItems(array,length)<\/td>\n<td>arrayItems([&quot;cat:2&quot;,&quot;mouse:1&quot;,&quot;dog:7&quot;],2)<\/td>\n<td>[&quot;cat&quot;,&quot;dog&quot;]<\/td>\n<\/tr>\n<tr>\n<td>arrayKV(array,field)<\/td>\n<td>arrayKV([&quot;cat:2&quot;,&quot;mouse:1&quot;,&quot;dog:7&quot;],&quot;cat&quot;)<\/td>\n<td>2<\/td>\n<\/tr>\n<tr>\n<td>gps(latitude,longitude)<\/td>\n<td>gps(37.3382,-121.8863)<\/td>\n<td>gpsObject<\/td>\n<\/tr>\n<tr>\n<td>gpsNearby(gps,radius)<\/td>\n<td>gpsNearby(%gps%,20)<\/td>\n<td>gpsObject<\/td>\n<\/tr>\n<tr>\n<td>seq(startNumber)<\/td>\n<td>seq(1000)<\/td>\n<td>1030<\/td>\n<\/tr>\n<tr>\n<td>uuid()<\/td>\n<td>uuid()<
\/td>\n<td>&#8220;e46b493a-&#8230;&#8221;<\/td>\n<\/tr>\n<tr>\n<td>add(num1,num2)<\/td>\n<td>add(1.23,3.45)<\/td>\n<td>4.68<\/td>\n<\/tr>\n<tr>\n<td>subtract(num1,num2)<\/td>\n<td>subtract(1.23,3.45)<\/td>\n<td>-2.22<\/td>\n<\/tr>\n<tr>\n<td>multiply(num1,num2)<\/td>\n<td>multiply(1.23,3.45)<\/td>\n<td>4.24<\/td>\n<\/tr>\n<tr>\n<td>percent(num,den)<\/td>\n<td>percent(1.23,3.45)<\/td>\n<td>&#8220;35.65%&#8221;<\/td>\n<\/tr>\n<tr>\n<td>accumulate(num,name)<\/td>\n<td>accumulate(%orders.subTotal%,sale)<\/td>\n<td>1304.84<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><br style=\"font-weight: 400;\" \/><br style=\"font-weight: 400;\" \/><\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you\u2019re a developer working with Couchbase or Capella, you\u2019ll want to know about Capella DataStudio. It\u2019s a free, community-supported tool with a slick, single-pane-of-glass UI for managing Capella Operational, Capella Columnar, and Couchbase Server Clusters. Not only does it [&hellip;]<\/p>\n","protected":false},"author":57747,"featured_media":16803,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[1815,2225,1816,1819],"tags":[10080,9984,10081],"ppma_author":[9106],"class_list":["post-16790","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-best-practices-and-tutorials","category-cloud","category-couchbase-server","category-data-modeling","tag-capella-datastudio","tag-orm","tag-synthetic-data"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.7.1 (Yoast SEO v25.7) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Synthetic Data Generation with Capella DataStudio - The Couchbase Blog<\/title>\n<meta name=\"description\" content=\"Generate realistic data effortlessly with Capella DataStudio&#039;s Synthetic Data Generator. 
Perfect for testing, machine learning, and simulations.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Synthetic Data Generation with Capella DataStudio\" \/>\n<meta property=\"og:description\" content=\"Generate realistic data effortlessly with Capella DataStudio&#039;s Synthetic Data Generator. Perfect for testing, machine learning, and simulations.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2025-01-23T18:44:46+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-01-28T16:19:12+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2400\" \/>\n\t<meta property=\"og:image:height\" content=\"1256\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Prasad Doddi\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Prasad Doddi\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\"},\"author\":{\"name\":\"Prasad Doddi\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/7870a85b21341a1cdbdd737ba6e6e077\"},\"headline\":\"Synthetic Data Generation with Capella DataStudio\",\"datePublished\":\"2025-01-23T18:44:46+00:00\",\"dateModified\":\"2025-01-28T16:19:12+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\"},\"wordCount\":1792,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png\",\"keywords\":[\"Capella DataStudio\",\"orm\",\"synthetic data\"],\"articleSection\":[\"Best Practices and Tutorials\",\"Couchbase Capella\",\"Couchbase Server\",\"Data Modeling\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\",\"name\":\"Synthetic Data Generation with Capella DataStudio - The Couchbase 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png\",\"datePublished\":\"2025-01-23T18:44:46+00:00\",\"dateModified\":\"2025-01-28T16:19:12+00:00\",\"description\":\"Generate realistic data effortlessly with Capella DataStudio's Synthetic Data Generator. Perfect for testing, machine learning, and simulations.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png\",\"width\":2400,\"height\":1256},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.couchbase.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Synthetic Data Generation with Capella DataStudio\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"name\":\"The Couchbase 
Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/7870a85b21341a1cdbdd737ba6e6e077\",\"name\":\"Prasad Doddi\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/eefad0ed7be820b285621aa4d67f7578\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/a9ce547feba43afcbcf1425142725c663678810966eaa0ddc7d38702e647ee63?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/a9ce547feba43afcbcf1425142725c663678810966eaa0ddc7d38702e647ee63?s=96&d=mm&r=g\",\"caption\":\"Prasad Doddi\"},\"description\":\"Prasad is a Senior Product Manager in Couchbase Cloud. Prior to Couchbase, he worked at IBM in various departments including Development, QA, Support and Technical Sales. Prasad holds a master\u2019s degree in Chem. Engg. 
from Clarkson University, NY.\",\"sameAs\":[\"www.linkedin.com\/in\/krishna-prasad-doddi\"],\"url\":\"https:\/\/www.couchbase.com\/blog\/author\/prasad-doddi\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Synthetic Data Generation with Capella DataStudio - The Couchbase Blog","description":"Generate realistic data effortlessly with Capella DataStudio's Synthetic Data Generator. Perfect for testing, machine learning, and simulations.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/","og_locale":"en_US","og_type":"article","og_title":"Synthetic Data Generation with Capella DataStudio","og_description":"Generate realistic data effortlessly with Capella DataStudio's Synthetic Data Generator. Perfect for testing, machine learning, and simulations.","og_url":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/","og_site_name":"The Couchbase Blog","article_published_time":"2025-01-23T18:44:46+00:00","article_modified_time":"2025-01-28T16:19:12+00:00","og_image":[{"width":2400,"height":1256,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png","type":"image\/png"}],"author":"Prasad Doddi","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Prasad Doddi","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/"},"author":{"name":"Prasad Doddi","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/7870a85b21341a1cdbdd737ba6e6e077"},"headline":"Synthetic Data Generation with Capella DataStudio","datePublished":"2025-01-23T18:44:46+00:00","dateModified":"2025-01-28T16:19:12+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/"},"wordCount":1792,"commentCount":0,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png","keywords":["Capella DataStudio","orm","synthetic data"],"articleSection":["Best Practices and Tutorials","Couchbase Capella","Couchbase Server","Data Modeling"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/","url":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/","name":"Synthetic Data Generation with Capella DataStudio - The Couchbase 
Blog","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png","datePublished":"2025-01-23T18:44:46+00:00","dateModified":"2025-01-28T16:19:12+00:00","description":"Generate realistic data effortlessly with Capella DataStudio's Synthetic Data Generator. Perfect for testing, machine learning, and simulations.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2025\/01\/blog-synthetic-data-generation.png","width":2400,"height":1256},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/synthetic-data-generation-capella-datastudio\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Synthetic Data Generation with Capella DataStudio"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL 
Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","width":218,"height":34,"caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/7870a85b21341a1cdbdd737ba6e6e077","name":"Prasad Doddi","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/eefad0ed7be820b285621aa4d67f7578","url":"https:\/\/secure.gravatar.com\/avatar\/a9ce547feba43afcbcf1425142725c663678810966eaa0ddc7d38702e647ee63?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/a9ce547feba43afcbcf1425142725c663678810966eaa0ddc7d38702e647ee63?s=96&d=mm&r=g","caption":"Prasad Doddi"},"description":"Prasad is a Senior Product Manager in Couchbase Cloud. Prior to Couchbase, he worked at IBM in various departments including Development, QA, Support and Technical Sales. Prasad holds a master\u2019s degree in Chem. Engg. 
from Clarkson University, NY.","sameAs":["www.linkedin.com\/in\/krishna-prasad-doddi"],"url":"https:\/\/www.couchbase.com\/blog\/author\/prasad-doddi\/"}]}},"authors":[{"term_id":9106,"user_id":57747,"is_guest":0,"slug":"prasad-doddi","display_name":"Prasad Doddi","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/a9ce547feba43afcbcf1425142725c663678810966eaa0ddc7d38702e647ee63?s=96&d=mm&r=g","author_category":"","last_name":"Doddi","first_name":"Prasad","job_title":"","user_url":"","description":"Prasad is a Senior Product Manager for Couchbase Supportability, Manageability and Tools. Prior to Couchbase, he worked at IBM in various departments including Development, QA, Support and Technical Sales. Prasad holds a master\u2019s degree in Chem. Engg. from Clarkson University, NY."}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/16790","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/users\/57747"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/comments?post=16790"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/16790\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media\/16803"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media?parent=16790"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/categories?post=16790"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/tags?post=16790"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/ppma_
author?post=16790"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}