{"id":9061,"date":"2020-08-12T08:42:37","date_gmt":"2020-08-12T15:42:37","guid":{"rendered":"https:\/\/www.couchbase.com\/blog\/?p=9061"},"modified":"2025-06-13T20:19:28","modified_gmt":"2025-06-14T03:19:28","slug":"external-datasets-extend-your-reach-with-couchbase-analytics","status":"publish","type":"post","link":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/","title":{"rendered":"External Datasets: Accessing AWS S3 in Couchbase Analytics"},"content":{"rendered":"<h3>Introduction to external datasets<\/h3>\r\n<p>Couchbase is very excited to announce its new \u201cExternal Datasets\u201d <a href=\"https:\/\/www.couchbase.com\/products\/analytics\/\">Analytics Services<\/a> feature in the latest <a href=\"https:\/\/www.couchbase.com\/blog\/whats-new-and-improved-in-couchbase-server-6-6\/\">Couchbase Server 6.6 release<\/a>. External datasets empower customers to access externally stored data in real-time from Amazon Web Services (AWS) Simple Storage Service (S3) and to combine S3-resident data with existing Couchbase data for analysis.<\/p>\r\n<h3>Customer use case<\/h3>\r\n<p>Some customers use AWS S3 to reduce storage costs and store data (e.g., multiple years of historical data, offline business data for machine learning, product reviews, etc.). They have expressed a desire to combine, query, and utilize S3 data in real-time to make this data available to business users for analytics. <span style=\"font-weight: 400\">You can read more about other Analytics use cases <\/span><span style=\"font-weight: 400\"><a href=\"https:\/\/www.couchbase.com\/blog\/analytics-customer-use-cases\/\">here<\/a>.<\/span><\/p>\r\n<h3>How do external datasets work?<\/h3>\r\n<p>External datasets provide the ability to dynamically query and analyze data residing in AWS S3, allowing users to easily combine data in real-time from both inside and outside their Couchbase analytics nodes. 
This is achieved in three simple steps:<\/p>\r\n<ol>\r\n<li>Set up an S3 link by using a <a href=\"https:\/\/docs.couchbase.com\/server\/6.6\/analytics\/rest-links.html\">REST API call<\/a> or the <a href=\"https:\/\/docs.couchbase.com\/server\/6.6\/cli\/cbcli\/couchbase-cli-analytics-link-setup.html\">command-line interface (CLI)<\/a><\/li>\r\n<li>Create an external dataset on the S3 link<\/li>\r\n<li>Query the dataset using <a href=\"https:\/\/www.couchbase.com\/sqlplusplus\/\">SQL++<\/a> (or your favorite BI tool)<\/li>\r\n<\/ol>\r\n<p>Let\u2019s walk through a simple example. iMaz, an e-commerce company, sells consumer products online. Their order, product, and user data are stored on a Couchbase cluster that runs both the Data and Analytics services (on separate sets of nodes in the cluster). They use the Analytics Service to run ad hoc and complex queries to analyze their business. iMaz also stores their product reviews on AWS S3, and they would like to combine those reviews with their product data to find the 3 most highly rated products using the Couchbase Analytics Service.<\/p>\r\n<p>Sample product data:<\/p>\r\n<pre class=\"\">[\r\n{\r\n\"id\": \"Product_1\",\r\n\"docType\": \"Product\",\r\n\"productId\": 1,\r\n\"price\": 811.76,\r\n\"salePrice\": 70.14,\r\n\"productName\": \"Ergonomic Cotton Ball\",\r\n\"desc\": \"Plastic fused metallic Ergonomic Cotton Ball\"\r\n}\r\n]<\/pre>\r\n<p>Sample review data:<\/p>\r\n<pre class=\"\">{\r\n\"id\": \"Review_0001764a17a844279a2227e137cc4e36\",\r\n\"docType\": \"Review\",\r\n\"reviewId\": \"0001764a17a844279a2227e137cc4e36\",\r\n\"productId\": 1,\r\n\"userId\": 5862,\r\n\"reviewerName\": \"M. 
Schaefer\",\r\n\"reviewerEmail\": \"...@mmail.com\",\r\n\"rating\": 5,\r\n\"title\": \"Works well and meets expectations.\",\r\n\"review\": \"Product works great and will buy one more for my extended family.\",\r\n\"reviewDate\": 1597273484\r\n}<\/pre>\r\n<p>Let\u2019s follow the three steps from above with sample setup code along with a SQL++ query.<\/p>\r\n<h4>Step 1: Set up S3 link<\/h4>\r\n<p>We\u2019ll create an S3 link using a <a href=\"https:\/\/docs.couchbase.com\/server\/6.6\/analytics\/rest-links.html\">REST API call<\/a>. (Alternatively, you can use the <a href=\"https:\/\/docs.couchbase.com\/server\/6.6\/cli\/cbcli\/couchbase-cli-analytics-link-setup.html\">CLI to create S3 links<\/a>.) We\u2019ll need to provide:<\/p>\r\n<ul>\r\n<li>Analytics Service hostname<\/li>\r\n<li>Analytics user credentials<\/li>\r\n<li>S3 link name (in this case myS3Link)<\/li>\r\n<li>Dataverse name (if different from default)<\/li>\r\n<li>Link type (S3)<\/li>\r\n<li>AWS access key ID (required)<\/li>\r\n<li>AWS secret access key (required)<\/li>\r\n<li>AWS region (required; e.g., us-west-2)<\/li>\r\n<\/ul>\r\n<pre class=\"decode-attributes:false lang:default decode:true\">curl -u &lt;username&gt;:&lt;pwd&gt; \\\r\n-X POST \"https:\/\/&lt;analytics_hostname&gt;\/analytics\/link\" \\\r\n-d dataverse=Default \\\r\n-d name=myS3Link \\\r\n-d type=S3 \\\r\n-d accessKeyId=... \\\r\n-d secretAccessKey=... \\\r\n-d region=us-west-2<\/pre>\r\n<h4>Step 2: Create an External Dataset<\/h4>\r\n<p>Using the Analytics workbench, we\u2019ll now create an external dataset named \u201cS3productreviews\u201d. 
We\u2019ll need to specify:<\/p>\r\n<ul>\r\n<li>S3 bucket name (in this case cb-analytics-6.6-demo)<\/li>\r\n<li>Dataverse name (if different from default)<\/li>\r\n<li>Directory location (optional) inside the bucket from which files will be read and recursively collected (in this case product reviews are stored in a \u201creviews\u201d folder)<\/li>\r\n<li>File format (in this case we\u2019ll use JSON) with the ability to specify a search pattern (in this case *.json indicates that all JSON files will be included when querying data)<\/li>\r\n<\/ul>\r\n<pre class=\"\">CREATE EXTERNAL DATASET S3productreviews\r\nON `cb-analytics-6.6-demo`\r\nAT myS3Link\r\nUSING \"reviews\"\r\nWITH { \"format\": \"json\", \"include\": \"*.json\" } ;<\/pre>\r\n<p>Currently, the external datasets feature supports the <strong>json<\/strong>, <strong>csv<\/strong> (comma-separated values), and <strong>tsv<\/strong> (tab-separated values) file formats, including compressed gzip files (filenames ending with .gz or .gzip). Both the csv and tsv formats require you to specify an inlined type definition (more about this shortly). Additional file formats will be supported in future releases. The DDL documentation listed under Resources below has the details.<\/p>\r\n<h4>Step 3: Query using SQL++<\/h4>\r\n<p>As the last step, we can now run the SQL++ query listed below (which looks exactly like SQL :)). It joins the existing products dataset from the Couchbase Analytics Service and the product reviews data from AWS S3 to get the top 3 highly rated products.<\/p>\r\n<pre class=\"\">SELECT p.productName, AVG(s.rating) AS Rating\r\nFROM   S3productreviews s, products p\r\nWHERE  s.productId = p.productId\r\nGROUP BY 
p.productName\r\nORDER BY AVG(s.rating) DESC\r\nLIMIT  3;<\/pre>\r\n<p>Here are the JSON query results:<\/p>\r\n<pre class=\"\">[\r\n{ \"Rating\": 4.33, \"productName\": \"Licensed Rubber Tuna\"},\r\n{ \"Rating\": 4.29, \"productName\": \"Gorgeous Plastic Salad\"},\r\n{ \"Rating\": 3.86, \"productName\": \"Intelligent Cotton Bike\"}\r\n]<\/pre>\r\n<p>This is great \u2013 we\u2019re now able to combine and analyze external data located in AWS S3 from the Couchbase Analytics Service. Notice how few steps it took to enable us to analyze our data; no ETL was involved, and the data was immediately available!<\/p>\r\n<p>You might now be wondering: How would this have worked if the S3 review files had been in csv format instead of JSON? The answer is simple: you would define your external dataset accordingly. Below, we show what the CREATE EXTERNAL DATASET statement from above would look like to support csv:<\/p>\r\n<pre class=\"\">CREATE EXTERNAL DATASET S3productreviews\r\n(\r\nid STRING NOT UNKNOWN,\r\ndocType STRING NOT UNKNOWN,\r\nreviewId STRING NOT UNKNOWN,\r\nproductId BIGINT,\r\nuserId BIGINT,\r\nreviewerName STRING NOT UNKNOWN,\r\nreviewerEmail STRING NOT UNKNOWN,\r\nrating BIGINT,\r\ntitle STRING NOT UNKNOWN,\r\nreview STRING NOT UNKNOWN,\r\nreviewDate BIGINT\r\n)\r\nON `cb-analytics-6.6-demo`\r\nAT myS3Link\r\nUSING \"reviews\"\r\nWITH { \"format\": \"csv\", \"include\": \"*.csv\", \"header\": false };<\/pre>\r\n<p>Notice how the create statement now includes inlined type information. This is needed to tell Analytics how to interpret the csv data (e.g., not just as strings).<\/p>\r\n<p>The SQL++ query remains exactly the same. That\u2019s right, no change at all! External datasets are easy to set up, flexible, and simple to use thanks to the power of the SQL++ language. 
Users can develop complex ad hoc queries for further data exploration, answer new business questions, and combine external data with data from <a href=\"https:\/\/www.couchbase.com\/blog\/remote-links-analyze-your-enterprise-with-couchbase-analytics\/\">Remote Links<\/a> to bring in other Couchbase data sources as well.<\/p>\r\n<h3>Benefits<\/h3>\r\n<p>Here are key benefits that come from using external datasets:<\/p>\r\n<ol>\r\n<li>Data enrichment. Couchbase data can now be enriched with additional information obtained from files that reside in an enterprise\u2019s existing S3-based data lake.<\/li>\r\n<li>Dynamic data access. The latest data can be dynamically retrieved, streamed, combined, and analyzed from any S3 bucket in any AWS region during Analytics query execution.<\/li>\r\n<li>Parallel query processing. Users can configure and arrange access to S3 data using Analytics\u2019 massively parallel processing (MPP) query processing architecture for fast response to queries involving external data.<\/li>\r\n<\/ol>\r\n<h3>Summary<\/h3>\r\n<p>External Datasets unlock the value of external live and archived data residing in S3-based data lakes. Users can combine and analyze data in real-time, sourced from both AWS S3 and Couchbase Analytics Service. This enables faster and more comprehensive data analysis and agile decision making.<\/p>\r\n<h3>Resources<\/h3>\r\n<p>You can learn more about External Datasets statements <a href=\"https:\/\/docs.couchbase.com\/server\/6.6\/analytics\/5_ddl.html\">here<\/a>. 
Register <a href=\"https:\/\/event.on24.com\/wcc\/r\/2566405\/9DB74CF2A4251458E10D64B86B68C0EF?partnerref=blog\">here<\/a> for our upcoming \u201cWhat\u2019s new in Couchbase Server release 6.6\u201d webinar.<\/p>\r\n<h3>Explore Couchbase Server 6.6 resources<\/h3>\r\n<table width=\"624\">\r\n<tbody>\r\n<tr>\r\n<td>\r\n<p><strong>Blogs<\/strong><\/p>\r\n<\/td>\r\n<td>\r\n<p><strong>Docs and Tutorials<\/strong><\/p>\r\n<\/td>\r\n<td>\r\n<p><strong>Webpages and Webinars<\/strong><\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/whats-new-and-improved-in-couchbase-server-6-6\/\">What\u2019s New in Couchbase Server 6.6<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/docs.couchbase.com\/server\/current\/introduction\/whats-new.html\">What\u2019s New in Couchbase Server 6.6?<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/event.on24.com\/eventRegistration\/EventLobbyServlet?target=reg20.jsp&amp;partnerref=website&amp;eventid=2566405&amp;sessionid=1&amp;key=9DB74CF2A4251458E10D64B86B68C0EF&amp;regTag=&amp;sourcepage=register\">New Features in Couchbase Server 6.6: Analytics, Backup, Query, and More<\/a><\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/eventing-improvements-timers-handlers-and-statistics\/\">Eventing Improvements (Timers, Handlers, and Statistics)<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/docs.couchbase.com\/server\/6.6\/release-notes\/relnotes.html\">Couchbase Server 6.6 Release Notes<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/products\/analytics\/\">Couchbase Analytics Service<\/a><\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/remote-links-analyze-your-enterprise-with-couchbase-analytics\/\">Remote Links \u2013 Analyze Your Enterprise With Couchbase Analytics<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/index-advisor.couchbase.com\/indexadvisor\/#1\">Try the 
Couchbase Index Advisor Service<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/products\/server\/whats-new\/\">What\u2019s New in Couchbase Server (Product Page)<\/a><\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\">External Datasets \u2013 Extend Your Reach With Couchbase Analytics<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/docs.couchbase.com\/server\/current\/analytics\/rest-links.html\">Set Up Analytics Remote and S3 Links Using REST API<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/products\/editions\/\">Compare Editions<\/a><\/p>\r\n<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/announcing-flex-index-with-couchbase\/\">Announcing Flex Index With Couchbase<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/docs.couchbase.com\/server\/current\/analytics\/5_ddl.html\">Create External Datasets Using Data Definition Language (DDL)<\/a><\/p>\r\n<\/td>\r\n<td>\u00a0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/introducing-backing-up-to-object-store-s3\/\">Introducing Backing Up to Object Store (S3)<\/a><\/p>\r\n<\/td>\r\n<td>\r\n<p><a href=\"https:\/\/docs.couchbase.com\/server\/current\/cli\/cbcli\/couchbase-cli-analytics-link-setup.html\">Set Up Analytics Remote and S3 Links Using CLI<\/a><\/p>\r\n<\/td>\r\n<td>\u00a0<\/td>\r\n<\/tr>\r\n<tr>\r\n<td>\r\n<p><a href=\"https:\/\/www.couchbase.com\/blog\/import-documents-with-admin-ui\/\">Import Documents With the Web Admin Console<\/a><\/p>\r\n<\/td>\r\n<td>\u00a0<\/td>\r\n<td>\u00a0<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<p><em style=\"color: inherit;font-size: 1em;font-weight: 600\">Thanks Till Westmann for co-authoring and Michael Carey for valuable contributions and review of this post.<\/em><\/p>\r\n\r\n<div class=\"wp-block-group alignwide 
has-very-light-gray-background-color has-background\">\r\n<div class=\"wp-block-group__inner-container is-layout-flow wp-block-group-is-layout-flow\">\r\n<div class=\"wp-block-media-text alignwide\" style=\"grid-template-columns: 30% auto\">\r\n<figure class=\"wp-block-media-text__media\"><img loading=\"lazy\" decoding=\"async\" width=\"300\" height=\"300\" class=\"wp-image-9084\" src=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2020\/08\/Till_Westmann-removebg-300px.png\" alt=\"till westman engineering director analytics\" srcset=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/Till_Westmann-removebg-300px.png 300w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/Till_Westmann-removebg-300px-150x150.png 150w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/Till_Westmann-removebg-300px-65x65.png 65w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/Till_Westmann-removebg-300px-50x50.png 50w, https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/Till_Westmann-removebg-300px-20x20.png 20w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/figure>\r\n<div class=\"wp-block-media-text__content\">\r\n<p>&nbsp;<\/p>\r\n\r\n\r\n\r\n<p style=\"font-size: 14px\"><strong><em>Co-author<\/em><\/strong><\/p>\r\n\r\n\r\n\r\n<p style=\"font-size: 12px\"><em>Till Westmann, Engineering Director at Couchbase<\/em><\/p>\r\n\r\n\r\n\r\n<p class=\"has-small-font-size\">Till Westmann is an Engineering Director at Couchbase working on the Analytics Service. Before joining Couchbase Till built data management software at Oracle, 28msec, SAP, BEA Systems, XQRL, and Xyleme. He is a member of the Apache Software Foundation and the Vice President of the Apache AsterixDB project. 
Till holds a PhD from the University of Mannheim in Germany.<\/p>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n","protected":false},"excerpt":{"rendered":"<p>Introduction to external datasets Couchbase is very excited to announce its new \u201cExternal Datasets\u201d Analytics Services feature in the latest Couchbase Server 6.6 release. External datasets empower customers to access externally stored data in real-time from Amazon Web Services (AWS) [&hellip;]<\/p>\n","protected":false},"author":58630,"featured_media":10426,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"inline_featured_image":false,"footnotes":""},"categories":[2294,1816,9417,1812],"tags":[1572],"ppma_author":[8967],"class_list":["post-9061","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-analytics","category-couchbase-server","category-performance","category-n1ql-query","tag-database"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v25.8 (Yoast SEO v25.8) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Accessing AWS S3 with External Datasets in Couchbase Analytics<\/title>\n<meta name=\"description\" content=\"External datasets provide the ability to dynamically query and analyze data residing in AWS S3. 
Combine data in real-time with Couchbase analytics.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"External Datasets: Accessing AWS S3 in Couchbase Analytics\" \/>\n<meta property=\"og:description\" content=\"External datasets provide the ability to dynamically query and analyze data residing in AWS S3. Combine data in real-time with Couchbase analytics.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\" \/>\n<meta property=\"og:site_name\" content=\"The Couchbase Blog\" \/>\n<meta property=\"article:published_time\" content=\"2020-08-12T15:42:37+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-06-14T03:19:28+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1588\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Idris Motiwala\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Idris Motiwala\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\"},\"author\":{\"name\":\"Idris Motiwala\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/2fc07a18d91ce2e4e0f1f7c5c9e620b8\"},\"headline\":\"External Datasets: Accessing AWS S3 in Couchbase Analytics\",\"datePublished\":\"2020-08-12T15:42:37+00:00\",\"dateModified\":\"2025-06-14T03:19:28+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\"},\"wordCount\":1164,\"commentCount\":1,\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png\",\"keywords\":[\"database\"],\"articleSection\":[\"Couchbase Analytics\",\"Couchbase Server\",\"High Performance\",\"SQL++ \/ N1QL Query\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\",\"name\":\"Accessing AWS S3 with External Datasets in Couchbase 
Analytics\",\"isPartOf\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png\",\"datePublished\":\"2020-08-12T15:42:37+00:00\",\"dateModified\":\"2025-06-14T03:19:28+00:00\",\"description\":\"External datasets provide the ability to dynamically query and analyze data residing in AWS S3. Combine data in real-time with Couchbase analytics.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png\",\"width\":1588,\"height\":628},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.couchbase.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"External Datasets: Accessing AWS S3 in Couchbase 
Analytics\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#website\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"name\":\"The Couchbase Blog\",\"description\":\"Couchbase, the NoSQL Database\",\"publisher\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#organization\",\"name\":\"The Couchbase Blog\",\"url\":\"https:\/\/www.couchbase.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"contentUrl\":\"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png\",\"width\":218,\"height\":34,\"caption\":\"The Couchbase Blog\"},\"image\":{\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/2fc07a18d91ce2e4e0f1f7c5c9e620b8\",\"name\":\"Idris Motiwala\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/28d4b56674680cd3d7fe940321c3e98a\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/41b4ee771dab1b1ff8152be7b5545a13ff3cca8ca7e9021e762e3d7af21763f0?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/41b4ee771dab1b1ff8152be7b5545a13ff3cca8ca7e9021e762e3d7af21763f0?s=96&d=mm&r=g\",\"caption\":\"Idris Motiwala\"},\"description\":\"Idris is a Principal Product Manager, Analytics at Couchbase with 20+ years experience in design, 
development and execution of software products at both Fortune 500s and startups leading teams in digital transformation, cloud and analytics. Idris holds an MS in Technology Management and certifications in product management .\",\"sameAs\":[\"https:\/\/www.linkedin.com\/in\/idrismotiwala\/\"],\"url\":\"https:\/\/www.couchbase.com\/blog\/author\/idris-motiwala\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. -->","yoast_head_json":{"title":"Accessing AWS S3 with External Datasets in Couchbase Analytics","description":"External datasets provide the ability to dynamically query and analyze data residing in AWS S3. Combine data in real-time with Couchbase analytics.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/","og_locale":"en_US","og_type":"article","og_title":"External Datasets: Accessing AWS S3 in Couchbase Analytics","og_description":"External datasets provide the ability to dynamically query and analyze data residing in AWS S3. Combine data in real-time with Couchbase analytics.","og_url":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/","og_site_name":"The Couchbase Blog","article_published_time":"2020-08-12T15:42:37+00:00","article_modified_time":"2025-06-14T03:19:28+00:00","og_image":[{"width":1588,"height":628,"url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png","type":"image\/png"}],"author":"Idris Motiwala","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Idris Motiwala","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#article","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/"},"author":{"name":"Idris Motiwala","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/2fc07a18d91ce2e4e0f1f7c5c9e620b8"},"headline":"External Datasets: Accessing AWS S3 in Couchbase Analytics","datePublished":"2020-08-12T15:42:37+00:00","dateModified":"2025-06-14T03:19:28+00:00","mainEntityOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/"},"wordCount":1164,"commentCount":1,"publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png","keywords":["database"],"articleSection":["Couchbase Analytics","Couchbase Server","High Performance","SQL++ \/ N1QL Query"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/","url":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/","name":"Accessing AWS S3 with External Datasets in Couchbase 
Analytics","isPartOf":{"@id":"https:\/\/www.couchbase.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage"},"thumbnailUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png","datePublished":"2020-08-12T15:42:37+00:00","dateModified":"2025-06-14T03:19:28+00:00","description":"External datasets provide the ability to dynamically query and analyze data residing in AWS S3. Combine data in real-time with Couchbase analytics.","breadcrumb":{"@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#primaryimage","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/sites\/1\/2020\/08\/external-links-blog-2.png","width":1588,"height":628},{"@type":"BreadcrumbList","@id":"https:\/\/www.couchbase.com\/blog\/external-datasets-extend-your-reach-with-couchbase-analytics\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.couchbase.com\/blog\/"},{"@type":"ListItem","position":2,"name":"External Datasets: Accessing AWS S3 in Couchbase Analytics"}]},{"@type":"WebSite","@id":"https:\/\/www.couchbase.com\/blog\/#website","url":"https:\/\/www.couchbase.com\/blog\/","name":"The Couchbase Blog","description":"Couchbase, the NoSQL 
Database","publisher":{"@id":"https:\/\/www.couchbase.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.couchbase.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.couchbase.com\/blog\/#organization","name":"The Couchbase Blog","url":"https:\/\/www.couchbase.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","contentUrl":"https:\/\/www.couchbase.com\/blog\/wp-content\/uploads\/2023\/04\/admin-logo.png","width":218,"height":34,"caption":"The Couchbase Blog"},"image":{"@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/2fc07a18d91ce2e4e0f1f7c5c9e620b8","name":"Idris Motiwala","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.couchbase.com\/blog\/#\/schema\/person\/image\/28d4b56674680cd3d7fe940321c3e98a","url":"https:\/\/secure.gravatar.com\/avatar\/41b4ee771dab1b1ff8152be7b5545a13ff3cca8ca7e9021e762e3d7af21763f0?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/41b4ee771dab1b1ff8152be7b5545a13ff3cca8ca7e9021e762e3d7af21763f0?s=96&d=mm&r=g","caption":"Idris Motiwala"},"description":"Idris is a Principal Product Manager, Analytics at Couchbase with 20+ years experience in design, development and execution of software products at both Fortune 500s and startups leading teams in digital transformation, cloud and analytics. 
Idris holds an MS in Technology Management and certifications in product management .","sameAs":["https:\/\/www.linkedin.com\/in\/idrismotiwala\/"],"url":"https:\/\/www.couchbase.com\/blog\/author\/idris-motiwala\/"}]}},"authors":[{"term_id":8967,"user_id":58630,"is_guest":0,"slug":"idris-motiwala","display_name":"Idris Motiwala","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/41b4ee771dab1b1ff8152be7b5545a13ff3cca8ca7e9021e762e3d7af21763f0?s=96&d=mm&r=g","author_category":"","last_name":"Motiwala","first_name":"Idris","job_title":"","user_url":"","description":"Idris is a Principal Product Manager, Analytics at Couchbase with 20+ years experience in design, development and execution of software products at both Fortune 500s and startups leading teams in digital transformation, cloud and analytics. Idris holds an MS in Technology Management and certifications in product management ."}],"_links":{"self":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/9061","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/users\/58630"}],"replies":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/comments?post=9061"}],"version-history":[{"count":0,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/posts\/9061\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media\/10426"}],"wp:attachment":[{"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/media?parent=9061"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/categories?post=9061"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/tags?post=9061"},{
"taxonomy":"author","embeddable":true,"href":"https:\/\/www.couchbase.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=9061"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}