Cbimport --field_separator

sarahlunette · February 9, 2024, 1:09pm

Hi !
I have an issue with the field_separator option on the cbimport command line. My CSV file is a regular one: lines are separated by tabs, and values by spaces or semicolons. I cannot find the right command to make the first line supply the keys in the JSON representation of each record.
Do you know how to tune this ?
Thank you very much for your answer.

dh · February 9, 2024, 2:08pm

Do you mean your Comma-Separated-Value file is a single line (no newline characters) with each “row” separated by a TAB character and within each “row”, values have spaces or semi-colons between them ? ("row"s to be converted to documents.)

Could you provide a (fabricated?) example of the file?

sarahlunette · February 9, 2024, 3:39pm

ANN;JAN;FEB;MAR tab (tab is newline), semicolons are separators, the first line holds the keys for the JSON dictionary
1989;10;20;30 tab
1990;11;25;28 tab

It works when I replace the separators with commas, but not with spaces or other separators in the commands I used.

dh · February 9, 2024, 4:40pm

What version are you using? I’ve used 7.2.0 for these:

$ cat t.csv
a;b;c
1;2;3
4;5;6

$ xxd t.csv
00000000: 613b 623b 630a 313b 323b 330a 343b 353b  a;b;c.1;2;3.4;5;
00000010: 360a                                     6.

$ cbimport csv -c localhost -u Administrator -p password -b default -d file:///tmp/t.csv -g '#UUID#' --field-separator ';'
CSV `/tmp/t.csv` imported to `localhost` successfully
Documents imported: 2 Documents failed: 0

$ curl -su Administrator:password http://localhost:8093/query/service -d 'statement=select * from default'
{
"requestID": "1b27f235-5708-49f0-b3da-28dfbb99aa96",
"signature": {"*":"*"},
"results": [
{"default":{"a":"1","b":"2","c":"3"}},
{"default":{"a":"4","b":"5","c":"6"}}
],
"status": "success",
"metrics": {"elapsedTime": "61.525682ms","executionTime": "61.40333ms","resultCount": 2,"resultSize": 74,"serviceLoad": 3}
}

$ cat t.csv
a b c
1 2 3
4 5 6

$ xxd t.csv
00000000: 6120 6220 630a 3120 3220 330a 3420 3520  a b c.1 2 3.4 5 
00000010: 360a                                     6.

$ cbimport csv -c localhost -u Administrator -p password -b default -d file:///tmp/t.csv -g '#UUID#' --field-separator ' '
CSV `/tmp/t.csv` imported to `localhost` successfully
Documents imported: 2 Documents failed: 0

$ curl -su Administrator:password http://localhost:8093/query/service -d 'statement=select * from default'
{
"requestID": "854e8518-4f68-4747-95b0-4009e2baf5be",
"signature": {"*":"*"},
"results": [
{"default":{"a":"1","b":"2","c":"3"}},
{"default":{"a":"4","b":"5","c":"6"}}
],
"status": "success",
"metrics": {"elapsedTime": "64.229753ms","executionTime": "63.97735ms","resultCount": 2,"resultSize": 74,"serviceLoad": 3}
}

sarahlunette · February 9, 2024, 8:20pm

I don’t know what version, but it returns this error: missing field-separator

dh · February 9, 2024, 8:49pm

Could you:

cat /opt/couchbase/VERSION.txt

or run:

SELECT ds_version();

to let us know the version?

dh · February 12, 2024, 9:42am

I had a scratch through the source and “field-separator” was added as an option in 2016, so it has been present since at least version 5. This makes me think it is a command-line syntax error rather than a missing option (unless you are on a really out-of-date version).

Could you share your command line (along with the version obtained from a method noted above) ?

sarahlunette · February 12, 2024, 11:03am

No, because it works with other field separators; the CSV is just not loaded in the right format.

dh · February 12, 2024, 12:03pm

Do my commands with their example data not work for you? I included the hex dumps so you could see the precise content of the files - does that hold any clues as to why your files won’t load and report the “missing field-separator” error?

I’ve assumed you’ve specified “(tab is newline)” to indicate clearly where newline characters exist in your data. Is this not the case? Can you hex-dump your data and verify only the expected (0x0d, 0x0a or 0x0d0a) sequences terminate each line?

Have you got any embedded new line characters in your data?

I included my commands to show how I passed the --field-separator argument. Does this differ from your specification?
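
For anyone following along, the hex-dump check could be sketched roughly as below. This is only an illustration: the path /tmp/sample.csv is an assumption and the sample rows are taken from the example posted earlier in the thread. It counts newline, carriage-return and tab bytes so you can see which one actually ends each row.

```shell
# Assumed sample file, built from the example rows posted earlier in the thread.
printf 'ANN;JAN;FEB;MAR\n1989;10;20;30\n1990;11;25;28\n' > /tmp/sample.csv

# Count the bytes that could act as row terminators or stray delimiters.
lines=$(wc -l < /tmp/sample.csv | tr -d ' ')               # 0x0a bytes
crs=$(tr -cd '\r' < /tmp/sample.csv | wc -c | tr -d ' ')   # 0x0d bytes
tabs=$(tr -cd '\t' < /tmp/sample.csv | wc -c | tr -d ' ')  # 0x09 bytes

echo "newlines=$lines carriage_returns=$crs tabs=$tabs"

# Show the first bytes in hex (xxd works too where it is installed).
head -c 32 /tmp/sample.csv | od -An -tx1
```

A file that really uses TAB as its row terminator would show a newline count of 0 and a non-zero tab count, which is a different layout from the files in my examples.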

sarahlunette · February 12, 2024, 12:14pm

It is not a problem of tab and indeed it is the right interpretation. The commands don’t work with my file

dh · February 12, 2024, 12:20pm

OK, so you’d need to compare your file content - If it is as demonstrated then it should load. You’ll have to investigate to see where the specified format is broken in your file.

(I have tried breaking the file in various ways but have not succeeded in reproducing the error you’ve noted.)

I presume you’ve tried using just the first two lines of your file for testing - “head -2 src.csv > test.csv” or similar ?
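
One way to look for a broken row, sketched under some assumptions: the file names are placeholders, the third sample row is deliberately short, and the split on ';' is naive (it does not understand quoted fields), so treat it as a rough check rather than a CSV validator.

```shell
# Assumed sample: the last row is deliberately missing one field.
printf 'ANN;JAN;FEB;MAR\n1989;10;20;30\n1990;11;25\n' > /tmp/src.csv

# Keep just the header and the first data row for a minimal import test.
head -2 /tmp/src.csv > /tmp/test.csv

# Report every row whose field count differs from the header's.
bad_rows=$(awk -F';' '
  NR == 1 { n = NF; next }
  NF != n { print NR ": " NF " fields, expected " n }
' /tmp/src.csv)

echo "$bad_rows"
```

With the sample above this reports line 3 as having 3 fields where 4 were expected.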

sarahlunette · February 12, 2024, 12:58pm

Not yet, but I was wondering whether it could be a syntax issue: when I type the semicolon, the quote character (') changes from a straight one to an italic one. This happens only with the semicolon.

dh · February 12, 2024, 2:46pm

That will be something in your shell - how different delimiters are entered and interpreted. And yes, the meaning will change. You can use " " and ";" too. (If you’re on Windows, then double quotes (ASCII character code 0x22) are a must.)
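
To see why the quote style matters, here is a small sketch that prints the bytes a program would receive. The helper name hex_of is made up for this illustration, and the typographic quotes are built from octal escapes so the snippet itself stays plain ASCII.

```shell
# Hypothetical helper: print the bytes of a string as hex, without spacing.
hex_of() {
  printf '%s' "$1" | od -An -tx1 | tr -d ' \n'
}

# Straight quotes are consumed by the shell; the program receives only ';'.
straight=$(hex_of ';')

# Typographic quotes (U+2018, U+2019) are ordinary characters to the shell,
# so they would be passed along as part of the value.
curly_arg=$(printf '\342\200\230;\342\200\231')
curly=$(hex_of "$curly_arg")

echo "straight: $straight"
echo "curly:    $curly"
```

The straight-quoted value is the single byte 3b, while the curly-quoted one is seven bytes. At a real prompt it is worse still: typographic quotes do not quote anything, so the bare ; ends the command and only the opening curly quote is passed as the value.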

sarahlunette · February 12, 2024, 3:08pm

Even with double quotes

dh · February 12, 2024, 4:11pm

OK, so the cbimport command requires the single-character value be passed as the argument value. How precisely this is supplied to cbimport is a function of your shell.

What is your OS & shell ?

If you can’t get these characters passed by your shell (have you tried a copy-n-paste of the commands from the examples?) try a different shell, if one is available.

sarahlunette · February 12, 2024, 4:44pm

macOS Terminal. I tried copy and paste, but the style changes anyway.

dh · February 12, 2024, 5:06pm

Do you experience this character change when using an editor running in your terminal, or only at the command line? (Which shell are you using? - ps is the simplest way to tell.) If it is just at the command line, try switching shell.

I’m not a Mac user so I’ve asked about and with iTerm2 there is apparently no issue with input of ';' etc. – perhaps that’s an option if it is your terminal emulator itself.

An alternative may be to use a GUI editor to save the precise (i.e. without the character changes) command to a file and just run that instead.

HTH.

sarahlunette · February 12, 2024, 10:45pm

Actually, it does not appear in the editor (nano for instance). However, if I copy and paste from an alternative GUI editor, there is a change when the command is pasted.

dh · February 13, 2024, 7:35am

OK, so likely your terminal emulator itself. Pasting will be processed like key presses typically.

So I’d suggest examining your terminal settings and/or writing the command in nano and saving it to a file, then execute the file (set permissions and execute or simply sh /path/to/file).
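
A rough sketch of that approach, with some assumptions: /tmp/import.sh is a placeholder path, the cbimport line is the one from my earlier example, and a quoted heredoc stands in for saving the file from nano. The check afterwards simply looks for any byte outside printable ASCII, which is where converted quotes would show up.

```shell
# Save the command to a file. The quoted 'EOF' stops the shell from
# expanding or altering anything inside the heredoc.
cat > /tmp/import.sh <<'EOF'
cbimport csv -c localhost -u Administrator -p password -b default \
  -d file:///tmp/t.csv -g '#UUID#' --field-separator ';'
EOF

# Confirm the saved command holds only printable ASCII before running it.
if LC_ALL=C grep -q '[^ -~]' /tmp/import.sh; then
  status="non-ASCII characters found"
else
  status="ASCII only"
fi
echo "$status"

# If the check passes, run it with:
#   sh /tmp/import.sh
```

If the check reports non-ASCII characters, the quotes were most likely converted on the way into the file, and they need retyping as plain ' or " before the script is run.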

sarahlunette · February 13, 2024, 9:13am

That is a good idea ! Thanks I’ll try that !
