{"id":13763,"date":"2022-09-27T12:14:50","date_gmt":"2022-09-27T19:14:50","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/?p=13763"},"modified":"2024-05-08T06:23:23","modified_gmt":"2024-05-08T13:23:23","slug":"databricks-couchbase-spark-sql-quickstart","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/","title":{"rendered":"QuickStart: Couchbase with Apache Spark on Databricks"},"content":{"rendered":"<p><span style=\"font-weight: 400\">Couchbase is the world\u2019s leading NoSQL document database. It offers unmatched performance, flexibility and scalability on the edge, on-premise and in the cloud. Spark is one of the most popular in-memory computing environments. The two platforms can be combined to execute blazingly fast query, data engineering, data science and machine learning functions.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In this QuickStart, I will guide you through the simple steps to set up Couchbase with Databricks* and run Couchbase data queries and Spark SQL queries.<\/span><\/p>\n<p><i><span style=\"font-weight: 400\">*Note: The steps in this QuickStart have been validated against Databricks runtime 10.4 LTS.<\/span><\/i><\/p>\n<h2><span style=\"font-weight: 400\">Setup<\/span><\/h2>\n<h3><span style=\"font-weight: 400\">Prerequisites<\/span><\/h3>\n<p><span style=\"font-weight: 400\">To complete this QuickStart, you will need the following:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none\">\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A Couchbase cluster and <em>travel-sample<\/em> bucket accessible to the Databricks cluster. 
I used a Couchbase cluster on an AWS EC2 machine.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A<\/span>\u00a0<a href=\"https:\/\/databricks.com\/try-databricks?utm_medium=paid+search&amp;utm_source=google&amp;utm_campaign=14272820537&amp;utm_adgroup=133937946470&amp;utm_content=trial&amp;utm_offer=try-databricks&amp;utm_ad=564241575790&amp;utm_term=databricks%20login&amp;gclid=CjwKCAiAgvKQBhBbEiwAaPQw3JW8bFOKAc13o921CzNkQf0KwC39UQ_x8NZEBFiihDxpvNutxbSpNhoCjIQQAvD_BwE\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">Databricks account<\/span><\/a><span style=\"font-weight: 400\"> &#8211; free trials that require an AWS, Azure, or GCP account are available.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The Couchbase <em>spark-connector<\/em> library, version 3.2.2 \u2013 available via<\/span>\u00a0<a href=\"https:\/\/mvnrepository.com\/artifact\/com.couchbase.client\/spark-connector\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">Maven<\/span><\/a><span style=\"font-weight: 400\">:\u00a0<\/span>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">In the cluster creation screen under the <strong>Libraries<\/strong> tab.\u00a0 Select <strong>Install<\/strong> new and search for the package on Maven Central.\u00a0 See the example below:<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-13764\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2022\/09\/image1-1024x356.png\" alt=\"\" width=\"900\" height=\"313\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image1-1024x356.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image1-300x104.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image1-768x267.png 768w, 
https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image1-1536x533.png 1536w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image1-1320x458.png 1320w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image1.png 1768w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><\/p>\n<ul>\n<li style=\"list-style-type: none\">\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The <strong>Install<\/strong> library setting will be configured as in the example below:<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\"><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-large wp-image-13767\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2022\/09\/image4-1024x562.png\" alt=\"\" width=\"900\" height=\"494\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image4-1024x562.png 1024w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image4-300x165.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image4-768x421.png 768w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image4.png 1192w\" sizes=\"auto, (max-width: 900px) 100vw, 900px\" \/><br \/>\n<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Configuration<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Before we begin, we need to configure the following parameters in the Databricks cluster<\/span> <span style=\"font-weight: 400\"><strong>advanced options<\/strong> Spark config. 
This can be done<\/span> <span style=\"font-weight: 400\">when you create a cluster (please see screen print below):<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-13765\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2022\/09\/image2.png\" alt=\"\" width=\"337\" height=\"172\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image2.png 337w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image2-300x153.png 300w\" sizes=\"auto, (max-width: 337px) 100vw, 337px\" \/><\/p>\n<p><span style=\"font-weight: 400\">You can copy and paste the settings below and replace parameters in <em>&lt;&gt;<\/em> with the values for your Couchbase cluster in the <\/span><span style=\"font-weight: 400\"><strong>advanced options<\/strong> Spark config<\/span><span style=\"font-weight: 400\">:\u00a0<\/span><\/p>\n<pre class=\"lang:default decode:true\">\u00a0 spark.couchbase.password &lt;password&gt;\r\n\u00a0 spark.couchbase.implicitBucket &lt;travel-sample&gt;\r\n\u00a0 spark.couchbase.connectionString &lt;hostname&gt;\r\n\u00a0 spark.couchbase.username &lt;username&gt;\r\n\u00a0 spark.databricks.delta.preview.enabled true<\/pre>\n<p><span style=\"font-weight: 400\">First, let\u2019s run the necessary imports. 
Copy the sample code below to a blank notebook attached to a cluster with the configuration above:<\/span><\/p>\n<pre class=\"lang:default decode:true\">  import com.couchbase.spark._\r\n\u00a0 import org.apache.spark.sql._\r\n\u00a0 import com.couchbase.client.scala.json.JsonObject\r\n\u00a0 import com.couchbase.spark.kv.Get\r\n\u00a0 import com.couchbase.client.scala.kv.MutateInSpec\r\n\u00a0 import com.couchbase.spark.kv.MutateIn\r\n\u00a0 import com.couchbase.client.scala.kv.LookupInSpec\r\n\u00a0 import com.couchbase.spark.kv.LookupIn\r\n\u00a0 import com.couchbase.client.scala.query.QueryOptions\r\n  import com.couchbase.spark.query.QueryOptions\r\n  import com.couchbase.client.scala.analytics.AnalyticsOptions<\/pre>\n<p><span style=\"font-weight: 400\">Now, let\u2019s get some documents by key from the Couchbase <em>travel-sample<\/em> bucket using the code below:<\/span><\/p>\n<pre class=\"lang:default decode:true\"> sc\r\n\u00a0 .couchbaseGet(Seq(Get(\"airline_10\"), Get(\"airline_10642\")))\r\n\u00a0 .collect()\r\n\u00a0 .foreach(result =&gt; println(result.contentAs[JsonObject]))<\/pre>\n<p>Great, we have connected to the cluster and returned our first RDD (Resilient Distributed Dataset).<\/p>\n<p><span style=\"font-weight: 400\">We can query the data using SQL++ (the Couchbase query language based on SQL).\u00a0 Run the code below as an example:<\/span><\/p>\n<pre class=\"lang:default decode:true\"> sc\r\n\u00a0 .couchbaseQuery[JsonObject](\"select country, count(*) as count from `travel-sample` where type = 'airport' group by country order by count desc\")\r\n\u00a0 .collect()\r\n\u00a0 .foreach(println)<\/pre>\n<h2><span style=\"font-weight: 400\">Analytics Service Query<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Couchbase also offers an Analytics service for operational and real-time analytics. Below is an example of an Analytics query:<\/span><\/p>\n<pre class=\"lang:default decode:true\">val query = \"SELECT ht.city,ht.state,COUNT(*) 
AS num_hotels FROM `travel-sample`.inventory.hotel ht GROUP BY ht.city,ht.state HAVING COUNT(*) &gt; 30\"\r\nsc.couchbaseAnalyticsQuery[JsonObject](query).collect().foreach(println)<\/pre>\n<h2><span style=\"font-weight: 400\">Now on to some Spark SQL<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Use the code below to create <em>airlines<\/em> and <em>airports<\/em> DataFrames and register them as temp views:<\/span><\/p>\n<pre class=\"lang:default decode:true\">val airlines = spark.read.format(\"couchbase.query\")\r\n\u00a0 .option(QueryOptions.Filter, \"type = 'airline'\")\r\n\u00a0 .load()\r\nairlines.createOrReplaceTempView(\"airlines\")\r\n\u00a0\r\nval airports = spark.read.format(\"couchbase.query\")\r\n\u00a0 .option(QueryOptions.Filter, \"type = 'airport'\")\r\n\u00a0 .load()\r\nairports.createOrReplaceTempView(\"airports\")<\/pre>\n<p><span style=\"font-weight: 400\">We can now run Spark SQL queries on the views, for example:<\/span><\/p>\n<p><span style=\"font-weight: 400\">Get the first 10 airlines in ascending order by name:<\/span><\/p>\n<pre class=\"nums:false lang:default decode:true\">%sql select * from airlines order by name asc limit 10<\/pre>\n<p><span style=\"font-weight: 400\">Count airlines by country:<\/span><\/p>\n<pre class=\"nums:false lang:default decode:true\">%sql select country, count(*) from airlines group by country;<\/pre>\n<p><span style=\"font-weight: 400\">And finally, let\u2019s visualize the airports per country using a <\/span><a href=\"https:\/\/docs.couchbase.com\/server\/current\/guides\/create-user-defined-function.html\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">UDF<\/span><\/a><span style=\"font-weight: 400\"> (User Defined Function) along with the Databricks mapping feature.\u00a0 Create the UDF using the Scala code below:<\/span><\/p>\n<pre class=\"lang:default decode:true \">val countrymap = (s: String) =&gt; {\r\n s match {\r\n  case \"France\" =&gt; \"FRA\"\r\n  case \"United States\" =&gt; \"USA\"\r\n  case \"United 
Kingdom\" =&gt; \"GBR\"\r\n  case other =&gt; other \/\/ unmapped countries pass through unchanged\r\n }\r\n}\r\nspark.udf.register(\"countrymap\", countrymap)<\/pre>\n<p><span style=\"font-weight: 400\">Select the airport counts by country and visualize the results:<\/span><\/p>\n<pre class=\"nums:false lang:default decode:true\">%sql select countrymap(country), count(*) from airports group by country;<\/pre>\n<p><span style=\"font-weight: 400\">After completing this QuickStart, your result should be similar to the visualization below:<\/span><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-13766\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2022\/09\/image3.png\" alt=\"\" width=\"469\" height=\"191\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image3.png 469w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/image3-300x122.png 300w\" sizes=\"auto, (max-width: 469px) 100vw, 469px\" \/><\/p>\n<h2><span style=\"font-weight: 400\">What we have accomplished<\/span><\/h2>\n<p><span style=\"font-weight: 400\">In this QuickStart, I have outlined how to use the Couchbase spark-connector with Databricks to create RDDs, run Couchbase and Spark SQL queries, create a UDF, and use the Databricks mapping feature to visualize the results. 
These steps demonstrate the process used to access, analyze and visualize data in a Couchbase cluster from a Databricks notebook interface.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">Next steps<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Learn more about<\/span><a href=\"https:\/\/www.couchbase.com\/products\/capella\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400\">Couchbase Capella<\/span><\/a><span style=\"font-weight: 400\">:<\/span><\/p>\n<ul>\n<li style=\"list-style-type: none\">\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Take Capella for a test drive by signing up for a<\/span><a href=\"https:\/\/cloud.couchbase.com\/?href=Playground\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400\">free 30-day trial<\/span><\/a><span style=\"font-weight: 400\">.<\/span><\/li>\n<li style=\"font-weight: 400\"><a href=\"https:\/\/cloud.couchbase.com\/sign-up\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">Connect your trial cluster to the Playground<\/span><\/a> <span style=\"font-weight: 400\">or connect a project to test it out for yourself.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Visit the<\/span><a href=\"https:\/\/www.couchbase.com\/developers\/\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400\">Couchbase Developer Portal<\/span><\/a><span style=\"font-weight: 400\">, which has tons of<\/span><a href=\"https:\/\/developer.couchbase.com\/tutorials\" target=\"_blank\" rel=\"noopener\"> <span style=\"font-weight: 400\">tutorials\/quickstart guides<\/span><\/a> <span style=\"font-weight: 400\">and learning paths to help you get started!<\/span><\/li>\n<li style=\"font-weight: 400\"><a href=\"https:\/\/docs.couchbase.com\/home\/sdk.html\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">See the documentation<\/span><\/a> <span style=\"font-weight: 400\">to learn more about the Couchbase 
SDKs.<\/span><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Thank you for reading this post! If you have any questions or comments, please connect with us on the<\/span>\u00a0<a href=\"https:\/\/www.couchbase.com\/forums\/\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400\">Couchbase<\/span><\/a> <span style=\"font-weight: 400\">Forums!<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Couchbase is the world\u2019s leading NoSQL document database. It offers unmatched performance, flexibility and scalability on the edge, on-premise and in the cloud. Spark is one of the most popular in-memory computing environments. 
The two platforms can be combined to [&hellip;]<\/p>\n","protected":false},"author":70772,"featured_media":13769,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[2242,2225,1818,1812,2201],"tags":[9719,1610],"ppma_author":[9208],"class_list":["post-13763","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-connectors","category-cloud","category-java","category-n1ql-query","category-tools-sdks","tag-databricks","tag-spark"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.8 (Yoast SEO v25.8) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>QuickStart: Couchbase with Apache Spark on Databricks<\/title>\n<meta name=\"description\" content=\"Get started quickly with Apache Spark SQL and more using the Couchbase provider on the Databricks platform.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"QuickStart: Couchbase with Apache Spark on Databricks\" \/>\n<meta property=\"og:description\" content=\"Get started quickly with Apache Spark SQL and more using the Couchbase provider on the Databricks platform.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2022-09-27T19:14:50+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-05-08T13:23:23+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1610\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Rick Jacobs\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Rick Jacobs\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\"},\"author\":{\"name\":\"Rick Jacobs\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/ecb4001e1e4b88a5c44d20c7bf39fcd3\"},\"headline\":\"QuickStart: Couchbase with Apache Spark on Databricks\",\"datePublished\":\"2022-09-27T19:14:50+00:00\",\"dateModified\":\"2024-05-08T13:23:23+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\"},\"wordCount\":593,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg\",\"keywords\":[\"databricks\",\"spark\"],\"articleSection\":[\"Connectors\",\"Couchbase Capella\",\"Java\",\"SQL++ \/ N1QL Query\",\"Tools &amp; 
SDKs\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\",\"name\":\"QuickStart: Couchbase with Apache Spark on Databricks\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg\",\"datePublished\":\"2022-09-27T19:14:50+00:00\",\"dateModified\":\"2024-05-08T13:23:23+00:00\",\"description\":\"Get started quickly with Apache Spark SQL and more using the Couchbase provider on the Databricks platform.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg\",\"width\":2560,\"height\":1610,\"caption\":\"an overview of retrieval augmentation 
generation\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.couchbase.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"QuickStart: Couchbase with Apache Spark on Databricks\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/ecb4001e1e4b88a5c44d20c7bf39fcd3\",\"name\":\"Rick 
Jacobs\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/398e492dda1c41103d3dfa60dfd80cfe\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/4df9d5daa89732e9e520a2ded9e366daf2b32b5aea74313c561073fbc3784be9?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/4df9d5daa89732e9e520a2ded9e366daf2b32b5aea74313c561073fbc3784be9?s=96&d=mm&r=g\",\"caption\":\"Rick Jacobs\"},\"description\":\"Rick Jacobs is the Technical Product Marketing Manager at Couchbase. His varied background includes experience at many of the world\u2019s leading organizations such as Computer Sciences Corporation, IBM, Cloudera etc. He comes with over 15 years of general technology experience garnered from serving in development, consulting, data science, sales engineering and technical marketing roles. He holds several academic degrees including an MS in Computational Science from George Mason University.\",\"url\":\"https:\/\/www.couchbase.com\/blog\/author\/rick\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"QuickStart: Couchbase with Apache Spark on Databricks","description":"Get started quickly with Apache Spark SQL and more using the Couchbase provider on the Databricks platform.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/","og_locale":"en_US","og_type":"article","og_title":"QuickStart: Couchbase with Apache Spark on Databricks","og_description":"Get started quickly with Apache Spark SQL and more using the Couchbase provider on the Databricks platform.","og_url":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/","og_site_name":"The Couchbase Blog","article_published_time":"2022-09-27T19:14:50+00:00","article_modified_time":"2024-05-08T13:23:23+00:00","og_image":[{"width":2560,"height":1610,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg","type":"image\/jpeg"}],"author":"Rick Jacobs","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Rick Jacobs","Est. 
reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/"},"author":{"name":"Rick Jacobs","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/ecb4001e1e4b88a5c44d20c7bf39fcd3"},"headline":"QuickStart: Couchbase with Apache Spark on Databricks","datePublished":"2022-09-27T19:14:50+00:00","dateModified":"2024-05-08T13:23:23+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/"},"wordCount":593,"commentCount":0,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg","keywords":["databricks","spark"],"articleSection":["Connectors","Couchbase Capella","Java","SQL++ \/ N1QL Query","Tools &amp; SDKs"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/","url":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/","name":"QuickStart: Couchbase with Apache Spark on 
Databricks","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg","datePublished":"2022-09-27T19:14:50+00:00","dateModified":"2024-05-08T13:23:23+00:00","description":"Get started quickly with Apache Spark SQL and more using the Couchbase provider on the Databricks platform.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2022\/09\/databricks-couchbase-spark-sql-scaled.jpg","width":2560,"height":1610,"caption":"an overview of retrieval augmentation generation"},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/databricks-couchbase-spark-sql-quickstart\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"QuickStart: Couchbase with Apache Spark on Databricks"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL 
Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","width":218,"height":34,"caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/ecb4001e1e4b88a5c44d20c7bf39fcd3","name":"Rick Jacobs","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/398e492dda1c41103d3dfa60dfd80cfe","url":"https:\/\/secure.gravatar.com\/avatar\/4df9d5daa89732e9e520a2ded9e366daf2b32b5aea74313c561073fbc3784be9?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4df9d5daa89732e9e520a2ded9e366daf2b32b5aea74313c561073fbc3784be9?s=96&d=mm&r=g","caption":"Rick Jacobs"},"description":"Rick Jacobs is the Technical Product Marketing Manager at Couchbase. His varied background includes experience at many of the world\u2019s leading organizations such as Computer Sciences Corporation, IBM, Cloudera etc. He comes with over 15 years of general technology experience garnered from serving in development, consulting, data science, sales engineering and technical marketing roles. 
He holds several academic degrees including an MS in Computational Science from George Mason University.","url":"https:\/\/www.couchbase.com\/blog\/author\/rick\/"}]}},"authors":[{"term_id":9208,"user_id":70772,"is_guest":0,"slug":"rick","display_name":"Rick Jacobs","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/4df9d5daa89732e9e520a2ded9e366daf2b32b5aea74313c561073fbc3784be9?s=96&d=mm&r=g","author_category":"","last_name":"Jacobs","first_name":"Rick","job_title":"","user_url":"","description":"Rick Jacobs is the Technical Product Marketing Manager at Couchbase.  His varied background includes experience at many of the world\u2019s leading organizations such as Computer Sciences Corporation, IBM, Cloudera etc. He comes with over 15 years of general technology experience garnered from serving in development, consulting, data science, sales engineering and technical marketing roles.  He holds several academic degrees including an MS in Computational Science from George Mason University."}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/13763","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/users\/70772"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/comments?post=13763"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/13763\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media\/13769"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media?parent=13763"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/categories?post=13763"},{"taxonom
y":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/tags?post=13763"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=13763"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}