There are a few things at play here. The source is comma-separated; it is converted to JSON for storage. JSON is by definition going to be larger, as each document repeats every field name plus the necessary supplemental syntax (quotes, braces, colons, etc.).
The first two lines of the CSV are:
Date,Low,High,Mean,Region
2013-1-2,-2,5,1.5,UK
$ head -2 regular-time-series.csv|awk '{print length($0)}'
26
21
Translated to JSON this is at minimum:
$ cbc cat 4c88b6ed-eb5e-48e8-9436-602223f66cad -u Administrator -P password -U couchbase://192.168.2.22/travel-sample --scope=time --collection=regular
4c88b6ed-eb5e-48e8-9436-602223f66cad CAS=0x17ab6b2042a20000, Flags=0x0, Size=62, Datatype=0x01(JSON)
{"Date":"2013-1-2","High":5,"Low":-2,"Mean":1.5,"Region":"UK"}
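You can reproduce that arithmetic outside Couchbase. A minimal Python sketch (assuming compact JSON with alphabetically sorted keys, which is how the stored document above happens to be laid out):

```python
import csv
import io
import json

# The first two lines of the source CSV, as quoted above.
raw = "Date,Low,High,Mean,Region\n2013-1-2,-2,5,1.5,UK\n"

row = next(csv.DictReader(io.StringIO(raw)))

# Give each field the type it has in the stored document
# (numbers unquoted, strings quoted).
doc = {
    "Date": row["Date"],
    "Low": int(row["Low"]),
    "High": int(row["High"]),
    "Mean": float(row["Mean"]),
    "Region": row["Region"],
}

csv_bytes = len("2013-1-2,-2,5,1.5,UK\n")  # data line including its newline
json_bytes = len(json.dumps(doc, separators=(",", ":"), sort_keys=True))

print(csv_bytes, json_bytes)  # 21 62
```

The compact, key-sorted form here happens to match the stored document byte for byte; a serializer that adds whitespace would grow it further.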
or
select encoded_size(r) from `travel-sample`.time.regular r use keys["4c88b6ed-eb5e-48e8-9436-602223f66cad"];
{
"requestID": "c30511cc-55b8-4b05-a3c3-14ef106079c0",
"signature": {
"$1": "number"
},
"results": [
{
"$1": 62
}
],
i.e. from 21 source bytes (the data line only, including its newline) to 62 bytes of JSON.
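The difference is purely structural: against the 20 data characters of the CSV row, the JSON adds 42 bytes of braces, quoted field names, colons, and string-value quotes (the four commas appear in both formats). A quick tally in Python:

```python
# Tally the structural bytes JSON adds over the CSV row's 20 data characters.
names = ["Date", "High", "Low", "Mean", "Region"]
string_fields = 2                      # the Date and Region values are quoted

overhead = (
    2                                  # outer braces { }
    + sum(len(n) + 2 for n in names)   # each field name plus its quotes
    + len(names)                       # one colon per field
    + 2 * string_fields                # quotes around the string values
)
print(overhead, 20 + overhead)  # 42 62
```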
cbq> select sum(encoded_size(r)) total, count(1) cnt, sum(encoded_size(r))/count(1) avg from `travel-sample`.time.regular r;
{
"requestID": "aa02b584-eea3-44e0-bbf4-1738fd3d31b1",
"signature": {
"total": "number",
"cnt": "number",
"avg": "number"
},
"results": [
{
"total": 22760,
"cnt": 365,
"avg": 62.35616438356164
}
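A quick sanity check of that result, using only the figures the query returned:

```python
total, cnt = 22760, 365  # sum and count reported by the query above
avg = total / cnt
print(avg)               # matches the avg column reported by the query
```

The average sits slightly above the 62-byte minimum, presumably because some rows have dates or values a character or two wider than the example document.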
There is nothing special about time series data here; this is just the basics of CSV vs JSON formats.
On disk, the data service will compress documents when it deems it beneficial. (This compressed size isn’t reflected in querying.)
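Compression helps precisely because every document repeats the same five field names. The data service uses Snappy per document; as a rough stdlib stand-in, zlib over a batch of synthetic documents (the values, and even the "dates", are made up and seeded for repeatability) shows how much of the JSON is redundancy:

```python
import json
import random
import zlib

random.seed(0)

# 365 synthetic documents shaped like the ones above (made-up values;
# the "dates" are just labels, not real calendar dates).
docs = [
    json.dumps(
        {
            "Date": f"2013-1-{d}",
            "Low": random.randint(-5, 5),
            "High": random.randint(5, 15),
            "Mean": round(random.uniform(0, 10), 1),
            "Region": "UK",
        },
        separators=(",", ":"),
    )
    for d in range(1, 366)
]

raw = b"".join(d.encode() for d in docs)
packed = zlib.compress(raw)
print(len(raw), len(packed))  # the repeated field names largely compress away
```

This is only an illustration of why repetitive JSON compresses well; the data service actually compresses each document individually, and only when it deems it beneficial.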
I presume you’re looking at the collection statistics in the UI for the memory and disk sizes? These will not be constants. For example:

- 365 documents (the file loaded once): 54.7 KiB in memory, 196 KiB on disk (which includes other overhead, not just the raw data)
- 3650 documents (loaded another 9 times): 546 KiB in memory, but only 608 KiB on disk
- 7300 documents (a further ten loads): 1.06 MiB in memory, 1.04 MiB on disk

The point being that it isn’t linear, so extrapolation from a small sample is unlikely to be accurate.
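Dividing those stats through by document count makes the point concrete: memory sits near a constant ~153 bytes per document, while the per-document disk footprint falls sharply as fixed overhead is amortised. A quick calculation using the figures quoted above:

```python
# Per-document sizes from the three measurements reported above.
KIB, MIB = 1024, 1024 * 1024
runs = [
    (365,  54.7 * KIB, 196 * KIB),   # (documents, memory, disk)
    (3650, 546 * KIB,  608 * KIB),
    (7300, 1.06 * MIB, 1.04 * MIB),
]

for n, mem, disk in runs:
    print(f"{n:5d} docs: {mem / n:6.1f} B/doc memory, {disk / n:6.1f} B/doc disk")
```

So extrapolating disk usage from the 365-document sample would overestimate by several times.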
Buckets, Memory, and Storage | Couchbase Docs provides more detail on storage, caching, compaction etc.
HTH.