Couchbase 6.6 comes with a much-needed feature: importing documents from the Couchbase Admin Web Console. This provides an easy way to quickly import small datasets in a variety of formats, and it complements cbimport, a more comprehensive command-line tool with many more data import options.

In this blog post, we will look at some use cases and some gotchas when importing data.

Checking out the feature

Import Documents is accessed by clicking the Documents link in the left panel and then the Import Document button in the blue panel at the top of the page.

The fields are all self-explanatory, but I'll import a small dataset (just 5 lines) to demonstrate the feature. I have created 4 files in different formats on my laptop for this purpose. Note that the Destination Bucket does not need to be empty, but for this test I created a bucket named teste, which does not have any documents yet.

Let’s import our JSON List dataset.

  1. I clicked on the Select File to Import button and selected airport.json, a file with 5 JSON documents on my laptop.
  2. Before actually importing the data, the screen shows a sample of the File Contents in different formats. This serves as a quick check.
  3. Next, I have a choice of how to set the document key during the import, using the Import With Document ID radio buttons. The choices are UUID or, where possible, Value of Field.
    • Note when choosing Value of Field:
      • This field has to be in every document with a non-null, unique value.
      • The tool does not ensure that candidate ID fields have unique values across every document. The Import UI only checks that the field is present in every document.
      • If you select an ID field with duplicate values, older documents will be overwritten by newer documents with the same ID.
    • For now, I will stick with the UUID choice and go ahead with the Import.
  4. Next, I selected teste as my Destination Bucket.
  5. Then, I clicked the Import Data button at the bottom of the screen.
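Since the Import UI does not check uniqueness for you, it can be worth checking a candidate ID field yourself before choosing Value of Field. The following is a minimal sketch in Python, not part of the Couchbase tooling: the function name check_id_field and the sample documents are hypothetical, and it assumes the file is a JSON List small enough to load into memory.

```python
import json
from collections import Counter


def check_id_field(docs, field):
    """Return (indexes of docs where `field` is missing or null, values that repeat)."""
    missing = [i for i, doc in enumerate(docs) if doc.get(field) is None]
    counts = Counter(doc[field] for doc in docs if doc.get(field) is not None)
    duplicates = sorted(value for value, n in counts.items() if n > 1)
    return missing, duplicates


# Hypothetical sample standing in for the contents of a JSON List file.
docs = json.loads("""
[
  {"id": 1, "type": "airport", "city": "Calais"},
  {"id": 2, "type": "airport", "city": "Peronne"},
  {"id": 2, "type": "airport", "city": "Nangis"}
]
""")

missing, duplicates = check_id_field(docs, "id")
print("missing or null:", missing)
print("duplicate values:", duplicates)
```

If either list is non-empty, the field is not a safe document key: a missing value rules the field out, and a duplicate value means a later document would overwrite an earlier one.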

The import succeeds, and a pop-up box displays the number of documents imported.
The screen also shows a helpful cbimport command in case the data size is large.
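For reference, here is a rough sketch of how an equivalent cbimport invocation could be assembled from a script. It is not the exact command the console prints. The cluster address, credentials, and file path are placeholders I made up, and the helper name cbimport_json_command is hypothetical. The flags used are the long forms of cbimport json options as I understand them, so check them against the documentation for your version before relying on this.

```python
import shlex


def cbimport_json_command(cluster, username, password, bucket, dataset, fmt, key_expr):
    """Build (but do not run) an argument list for `cbimport json`."""
    return [
        "cbimport", "json",
        "--cluster", cluster,
        "--username", username,
        "--password", password,
        "--bucket", bucket,
        "--dataset", dataset,        # URI of the file to import
        "--format", fmt,             # "list" for a JSON List, "lines" for JSON Lines
        "--generate-key", key_expr,  # e.g. "%id%" for a field value, "#UUID#" for a UUID
    ]


# All connection values below are placeholders, not values from this post.
argv = cbimport_json_command(
    cluster="couchbase://127.0.0.1",
    username="Administrator",
    password="password",
    bucket="teste",
    dataset="file:///tmp/airport.json",
    fmt="list",
    key_expr="%id%",
)

# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(arg) for arg in argv))
```

Building the argument list separately from running it makes it easy to review the command first, and the same list can be passed to subprocess.run when you are ready to execute it.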

Let's import the same set of documents, but this time choose the Value of Field option, like so:


I will choose id.
Now that I have imported the same set of documents twice, but with different keys, I will have 10 documents in the bucket. Let's check this by clicking the Document Editor button in the blue panel at the top.

Here we see the 10 documents, a set of 5 with the id field as the document key and another set of 5 with the server generated UUID as the key.

Let’s check out 1 document:

Looks good.

Further tests

You can also test importing the same set of data in different formats.

File Formats

JSON List

A JSON list is a list (indicated by square brackets) of any number of JSON objects (indicated by curly braces), separated by commas.

JSON Lines

JSON Lines is a file in which each line holds one complete JSON object.
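To make the difference between the two JSON formats concrete, here is a small Python sketch that parses the same two documents from each layout. The sample data and the function names are made up for illustration. This only shows the file shapes and says nothing about how the Import UI parses them internally.

```python
import json

# The same two hypothetical documents in each layout.
json_list_text = '[{"id": 1, "city": "Calais"}, {"id": 2, "city": "Peronne"}]'
json_lines_text = '{"id": 1, "city": "Calais"}\n{"id": 2, "city": "Peronne"}\n'


def parse_json_list(text):
    """Parse a JSON List: one top-level array containing the objects."""
    docs = json.loads(text)
    if not isinstance(docs, list):
        raise ValueError("expected a top-level JSON array")
    return docs


def parse_json_lines(text):
    """Parse JSON Lines: one complete JSON object per non-empty line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


list_docs = parse_json_list(json_list_text)
lines_docs = parse_json_lines(json_lines_text)
print(list_docs == lines_docs)
```

A JSON List has to be read as a whole before any document is available, while JSON Lines can be processed one line at a time, which is why the latter is often the more convenient choice for larger files.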

CSV (Comma Separated Variables)

Note:

  • The CSV format “flattens” JSON data and does not support arrays or nested values.
  • The CSV format doesn’t have a well-defined way to support null values. String values in CSV are optionally quoted, so there is no standard way to distinguish the string “null” from the value null. So, after importing from a CSV dataset, the value null will be imported as the string “null”.
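The null ambiguity is a property of the CSV format itself, and it is easy to see with any CSV reader. The sketch below uses Python's standard csv module on made-up sample data. It illustrates the format and is not the importer's own code.

```python
import csv
import io

# Hypothetical sample: one unquoted null and one quoted "null".
csv_text = 'id,city,tower\n1,Calais,null\n2,Peronne,"null"\n'

rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    # Every field arrives as a string, so both spellings look identical.
    print(row["id"], repr(row["tower"]), type(row["tower"]).__name__)
```

Once the quotes are stripped by the parser, nothing distinguishes the two rows, so a reader has no basis for turning one of them into a real null. If you need true null values, the JSON formats are the safer choice.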

TSV (Tab Separated Variables)

Note:

  • The TSV format “flattens” JSON data and does not support arrays or nested values.
  • The TSV format doesn’t have a well-defined way to support null values. String values in TSV are optionally quoted, so there is no standard way to distinguish the string “null” from the value null. So, after importing from a TSV dataset, the value null will be imported as the string “null”.
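As a rough illustration of what “flattening” means, the sketch below turns a nested document into a single flat row before writing it as TSV. The dotted column names (geo.lat, geo.lon) are just a convention I picked for the example, and I am not claiming the importer treats them specially. The sample document and the flatten helper are hypothetical.

```python
import csv
import io


def flatten(doc, prefix=""):
    """Collapse nested objects into one level, joining key names with dots."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


# Hypothetical nested document.
doc = {"id": 1, "city": "Calais", "geo": {"lat": 50.96, "lon": 1.95}}
flat = flatten(doc)

# Write the flat row as tab-separated text.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=list(flat), delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerow(flat)
tsv_text = buf.getvalue()
print(tsv_text)
```

The resulting row has one column per leaf value and no remaining structure, so the nested geo object cannot be recovered from the TSV file alone. The same applies to CSV.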

Explore the features of Couchbase Server 6.6

Author

Posted by Prasad Doddi

Prasad is a senior product manager for supportability, manageability, and tools at Couchbase. Before Couchbase, he worked at IBM in several departments, including development, quality assurance, support, and technical sales. Prasad holds a master's degree in Chemical Engineering from Clarkson University, NY.
