How to partially update my items?

Hi,
I want to update some items in my collection. But there is too much data to update so I want to update my data day by day.

For example;

update testBucket.testScope.testCollection
set detail.testField = detail.test_field
unset detail.test_field
where field2 = 'value2' and occurred_at between 'beginDate' and 'aDayAfterBeginning';

Here is my example query. Normally these dates are 3 months long but I want to use it like occurred_at between STR_TO_MILLIS('2023-07-10T00:00:00.000Z') and STR_TO_MILLIS('2023-07-11T00:00:00.000Z') and so on.

I wanted to solve this problem by using this for loop and transactions logic like that →

FOR date IN (date_range_millis(STR_TO_MILLIS('2023-07-10T00:00:00.000Z'),STR_TO_MILLIS('2023-08-15T00:00:00.000Z'), 'day')
  START TRANSACTION
      update testBucket.testScope.testCollection
      set detail.testField = detail.test_field
      unset detail.test_field
      where field2 = 'value2' and occurred_at between date and (date+86400000);
   COMMIT
END

But I couldn’t implement this logic.

Can anyone help me about implementing this or for finding a better solution?

Is the transaction per day important?

You would have to write a programme to have per-day transactions; then you could simply use DATE_TRUNC_MILLIS(occurred_at,'day') = $day with $day perhaps being the programme iterating over the DATE_RANGE_MILLIS() function results.

(If not, then:

UPDATE testBucket.testScope.testCollection
SET detail.testField = detail.test_field
UNSET detail.test_field
WHERE field2 = 'value2' 
AND occurred_at BETWEEN str_to_millis('2023-07-10T00:00:00.000Z') AND str_to_millis('2023-08-15T23:59:59.999Z')

will of course update the same range of documents, but I’m sure you were aware of that.)

Transaction is not important that much. If I get timeout on any iteration I can start my query again from last date.

The main reason behind my request is, when I want to select all the items between 2023-07-10T00:00:00.000Z , 2023-08-15T23:59:59.999Z these dates I’m having timeout because of the amount of items that return from occurred_at index. So that is the why I want to keep dates just 1 day long. I need 1 day long occurred_at dates on my every update.

OK, is the timeout the transaction that’s timing out? Or are you seeing an index scan timing out?

If it is the transaction, can you not include a longer transaction timeout to facilitate the request you’re wanting to run? (Just to keep things simple.)

Otherwise you’d still have to have a programme to run one statement per day if you’re having to issue multiple statements.

Here is a simplistic (Linux) shell script using the cbq shell that would run a statement per day:

cbq -u Administrator -p password -pretty=false -quiet -terse -s 'select date_range_millis(STR_TO_MILLIS("2023-07-10T00:00:00.000Z"),STR_TO_MILLIS("2023-08-15T00:00:00.000Z"), "day") dates'|grep "^{"|sed 's/[^[:digit:]]/\n/g'|grep "^[0-9]"|while read day
do
  cbq -u Administrator -p password -s "UPDATE testBucket.testScope.testCollection SET detail.testField = detail.test_field UNSET detail.test_field WHERE detail.test_field IS VALUED AND field2 = 'value2' AND DATE_TRUNC_MILLIS(occurred_at,'day') = ${day}"           
done

(I added the IS VALUED so it could be re-run and would only pick items that had yet to be updated. You could do alternative processing of the DATE_RANGE_MILLIS results with a tool such as ‘jq’.)

This uses the DATE_TRUNC_MILLIS approach but that may not perform as well as the BETWEEN range.

If it is problematic you could use:

cbq -u Administrator -p password -pretty=false -quiet -terse -s 'select date_range_millis(STR_TO_MILLIS("2023-07-10T00:00:00.000Z"),STR_TO_MILLIS("2023-08-16T00:00:00.000Z"), "day") dates'|grep "^{"|sed 's/[^[:digit:]]/\n/g'|awk '/^[0-9]/{if (s!=0) print s" "$1; s=$1}'|while read f t
do
  cbq -u Administrator -p password -s "UPDATE testBucket.testScope.testCollection SET detail.testField = detail.test_field UNSET detail.test_field WHERE detail.test_field IS VALUED AND field2 = 'value2' AND occurred_at BETWEEN ${f} AND ${t}"
done

(One day more in the date_range_millis results as overlapping pairs of results are used in the statement.)

HTH.

[Edit: I should note that if your version isn’t current you may not have the -terse cbq option; alternative text processing could be used then.]

I am using 7.1.1 Community Edition.

How can I write these queries on Query Editor?

Thanks for your quick responses :slight_smile:

The Query Editor can only run SQL++ statements; as I noted you typically need a programme to “automatically” run multiple individual statements. Of course you could submit a different statement for each day one at a time in the editor - you’d just update the day constant each time:

UPDATE testBucket.testScope.testCollection
SET detail.testField = detail.test_field
UNSET detail.test_field
WHERE field2 = 'value2' 
AND DATE_TRUNC_MILLIS(occurred_at,'day') = STR_TO_MILLIS('2023-01-10')

( You could use the REST endpoint instead of cbq to submit the statements, but you’d still need to write a client programme. )

You could also write a UDF - JavaScript Functions with Couchbase | Couchbase Docs - to do this. You would have to have access to save the UDF (and of course be able to write JavaScript).

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.