Couchbase 6.6 comes with a much-needed feature: importing documents using the Couchbase Admin Web Console. This provides an easy way to quickly import small datasets in a variety of formats, and it complements cbimport, a more comprehensive command-line solution with many more data import options.
In this blog post, we will look at some use cases and some gotchas when importing data.
Checking out the feature
Import Documents is accessed by clicking the Documents link in the left panel and then the Import Document button in the blue panel at the top of the page.
The fields are all self-explanatory, but I'll import a small dataset (just 5 lines) to demonstrate the feature. I have created 4 files in different formats on my laptop for this. Note that the Destination Bucket does not need to be empty, but for this test I created a bucket named test that does not contain any documents yet.
Let’s import our JSON List dataset.
- I clicked on the Select File to Import button and selected airport.json, a file with 5 JSON documents on my laptop.
- Before actually importing the data, the screen shows a sample of the File Contents in different formats. This serves as a quick check.
- Next, I can choose how the document key is set during import, using the Import With Document ID radio buttons. The choices are UUID or, where possible, Value of Field.
- Note when choosing Value of Field:
  - The field has to be present in every document, with a non-null, unique value.
  - The tool does not ensure that candidate ID fields have unique values across every document. The Import UI only checks that the field is present in every document.
  - If you select an ID field with duplicate values, older documents will be overwritten by newer documents with the same ID.
- For now, I will stick with the UUID choice and go ahead with the Import.
- Next, I selected test as my Destination Bucket.
- Then, I clicked on the Import Data button at the bottom of the screen.
The import succeeds, and a pop-up box displays the number of documents imported.
The screen also shows a helpful cbimport command in case the data size is large.
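For reference, a command-line import of the same file might look roughly like the sketch below, which only assembles and prints a cbimport invocation and does not run it. The cluster address, credentials, and file path are placeholder assumptions, and the helper name is hypothetical; the command shown by the console for your own import is the one to trust.

```python
import shlex

def build_cbimport_command(dataset_path, bucket, key_field=None,
                           cluster="couchbase://127.0.0.1",
                           username="Administrator", password="password",
                           file_format="list"):
    """Assemble a cbimport argument list from placeholder values (nothing is executed)."""
    # "#UUID#" asks cbimport to generate a UUID key;
    # "%field%" uses the value of that field as the key.
    key_expr = f"%{key_field}%" if key_field else "#UUID#"
    return [
        "cbimport", "json",
        "-c", cluster,                   # cluster to connect to (assumed local)
        "-u", username,                  # placeholder credentials
        "-p", password,
        "-b", bucket,                    # destination bucket
        "-d", f"file://{dataset_path}",  # local dataset, absolute path
        "-f", file_format,               # "list" for JSON List, "lines" for JSON Lines
        "-g", key_expr,                  # document key generator
    ]

# Print a shell-ready command for review before running it by hand.
print(shlex.join(build_cbimport_command("/tmp/airport.json", "test", key_field="id")))
```

Leaving out key_field falls back to UUID keys, which mirrors the UUID radio button in the console.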
Let's import the same set of documents, but this time choose the Value of Field option, like so:
I will choose id.
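Because the Import UI only checks that the chosen field is present, it can be worth checking a candidate field for nulls and duplicates yourself before importing. The following is a minimal sketch, assuming the dataset is small enough to load in memory and that the candidate field holds scalar values; the function name and the inline sample (a subset of the fields used in this post) are for illustration only.

```python
import json
from collections import Counter

def check_id_field(documents, field):
    """Return (missing_count, duplicates) for a candidate document-ID field.

    A value counts as missing when the field is absent or null.
    Assumes the field holds scalar (hashable) values.
    """
    values = [doc.get(field) for doc in documents]
    missing = sum(1 for value in values if value is None)
    counts = Counter(value for value in values if value is not None)
    duplicates = [value for value, n in counts.items() if n > 1]
    return missing, duplicates

# Inline sample with a few of the fields from the airport dataset.
sample = json.loads("""
[
  {"airportname": "Calais Dunkerque", "faa": "CQF", "id": 1254},
  {"airportname": "Peronne St Quentin", "faa": null, "id": 1255},
  {"airportname": "Bray", "faa": null, "id": 1258}
]
""")

print(check_id_field(sample, "id"))   # id: nothing missing, no duplicates
print(check_id_field(sample, "faa"))  # faa: null in two documents, so a poor key
```

A field is a safe choice for Value of Field only when the check reports zero missing values and an empty duplicates list.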
Now that I have imported the same set of documents twice, but with different keys, I should have 10 documents in the bucket. Let's check this by clicking on the Document Editor button in the blue panel at the top.
Here we see the 10 documents, a set of 5 with the id field as the document key and another set of 5 with the server generated UUID as the key.
Let’s check out 1 document:
Looks good.
Further tests
You can also test importing the same set of data in different formats.
File Formats
JSON List
A JSON list is a list (indicated by square brackets) of any number of JSON objects (indicated by curly braces) separated by commas.
    [
      {
        "airportname": "Calais Dunkerque",
        "city": "Calais",
        "country": "France",
        "faa": "CQF",
        "geo": {
          "alt": 12,
          "lat": 50.962097,
          "lon": 1.954764
        },
        "icao": "LFAC",
        "id": 1254,
        "type": "airport",
        "tz": "Europe/Paris"
      },
      ....
      {
        "airportname": "Bray",
        "city": "Albert",
        "country": "France",
        "faa": null,
        "geo": {
          "alt": 364,
          "lat": 49.971531,
          "lon": 2.697661
        },
        "icao": "LFAQ",
        "id": 1258,
        "type": "airport",
        "tz": "Europe/Paris"
      }
    ]
JSON Lines
A JSON Lines file is a file in which each line holds one complete JSON object.
    {"airportname":"Peronne St Quentin","city":"Peronne","country":"France","faa":null,"geo":{"alt":295,"lat":49.868547,"lon":3.029578},"icao":"LFAG","id":1255,"type":"airport","tz":"Europe/Paris"}
    {"airportname":"Bray","city":"Albert","country":"France","faa":null,"geo":{"alt":364,"lat":49.971531,"lon":2.697661},"icao":"LFAQ","id":1258,"type":"airport","tz":"Europe/Paris"}
    {"airportname":"Calais Dunkerque","city":"Calais","country":"France","faa":"CQF","geo":{"alt":12,"lat":50.962097,"lon":1.954764},"icao":"LFAC","id":1254,"type":"airport","tz":"Europe/Paris"}
    {"airportname":"Les Loges","city":"Nangis","country":"France","faa":null,"geo":{"alt":428,"lat":48.596219,"lon":3.006786},"icao":"LFAI","id":1256,"type":"airport","tz":"Europe/Paris"}
    {"airportname":"Couterne","city":"Bagnole-de-l'orne","country":"France","faa":null,"geo":{"alt":718,"lat":48.545836,"lon":-0.387444},"icao":"LFAO","id":1257,"type":"airport","tz":"Europe/Paris"}
CSV (Comma Separated Variables)
    airportname,city,country,faa,geo.alt,geo.lat,geo.lon,icao,id,type,tz
    "Calais Dunkerque","Calais","France","CQF",12,50.962097,1.954764,"LFAC",1254,"airport","Europe/Paris"
    "Peronne St Quentin","Peronne","France","null",295,49.868547,3.029578,"LFAG",1255,"airport","Europe/Paris"
    "Les Loges","Nangis","France",null,428,48.596219,3.006786,"LFAI",1256,"airport","Europe/Paris"
    "Couterne","Bagnole-de-l'orne","France",null,718,48.545836,-0.387444,"LFAO",1257,"airport","Europe/Paris"
    "Bray","Albert","France",null,364,49.971531,2.697661,"LFAQ",1258,"airport","Europe/Paris"
Note:
- The CSV format “flattens” JSON data and does not support arrays or nested values.
- The CSV format doesn’t have a well-defined way to support null values. String values in CSV are optionally quoted, so there is no standard way to distinguish the string “null” from the value null. So, after importing from a CSV dataset, the value null will be imported as the string “null”.
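The null ambiguity is easy to reproduce with any ordinary CSV parser. The sketch below uses Python's standard csv module (not the Couchbase importer itself) on two rows modeled on the sample above, one with a quoted "null" and one with a bare null.

```python
import csv
import io

# Two rows modeled on the sample file: one quoted "null", one bare null.
csv_text = (
    'airportname,faa\n'
    '"Peronne St Quentin","null"\n'
    '"Les Loges",null\n'
)

rows = list(csv.DictReader(io.StringIO(csv_text)))
faa_values = [row["faa"] for row in rows]

# Quoting is optional in CSV, so both cells parse to the same string.
print(faa_values)  # ['null', 'null']
```

Since the parser cannot tell the two cells apart, any cleanup of "null" strings has to happen after import, or the data has to be imported from one of the JSON formats.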
TSV (Tab Separated Variables)
    airportname	city	country	faa	geo.alt	geo.lat	geo.lon	icao	id	type	tz
    "Calais Dunkerque"	"Calais"	"France"	"CQF"	12	50.962097	1.954764	"LFAC"	1254	"airport"	"Europe/Paris"
    "Peronne St Quentin"	"Peronne"	"France"	null	295	49.868547	3.029578	"LFAG"	1255	"airport"	"Europe/Paris"
    "Les Loges"	"Nangis"	"France"	null	428	48.596219	3.006786	"LFAI"	1256	"airport"	"Europe/Paris"
    "Couterne"	"Bagnole-de-l'orne"	"France"	null	718	48.545836	-0.387444	"LFAO"	1257	"airport"	"Europe/Paris"
    "Bray"	"Albert"	"France"	null	364	49.971531	2.697661	"LFAQ"	1258	"airport"	"Europe/Paris"
Note:
- The TSV format “flattens” JSON data and does not support arrays or nested values.
- The TSV format doesn’t have a well-defined way to support null values. String values in TSV are optionally quoted, so there is no standard way to distinguish the string “null” from the value null. So, after importing from a TSV dataset, the value null will be imported as the string “null”.
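To see what "flattening" means in practice, here is a rough sketch of how a nested document could be turned into a TSV row with dotted column names such as geo.alt. This illustrates the idea and is not how Couchbase produces or reads these files; the flatten helper is hypothetical, and writing null as the text null is an assumption made to match the sample files above.

```python
import csv
import io

def flatten(doc, prefix=""):
    """Flatten nested objects into dotted column names (arrays are not handled)."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat

doc = {
    "airportname": "Bray",
    "faa": None,
    "geo": {"alt": 364, "lat": 49.971531, "lon": 2.697661},
    "id": 1258,
}

# Write null as the text "null", as in the sample files (assumption for this sketch).
row = {name: ("null" if value is None else value) for name, value in flatten(doc).items()}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row), delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerow(row)
print(buffer.getvalue(), end="")
```

Once the row is written, nothing in the file records that geo was a nested object or that faa was a real null, which is why the JSON formats are the safer choice for data with structure.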