Copy documents within the same bucket matching a condition

While copying 15000 documents that match a condition, and at the same time trying to update two attribute values using the N1QL query below, I get a "the query was canceled" exception. I am using the .NET SDK.
I want to modify two attributes (id, scenarioId) in each newly copied document.

INSERT INTO Schedules (KEY x.id, VALUE x)
SELECT OBJECT_PUT(Y, "scenarioId", newScenarioId) AS x
FROM (SELECT RAW OBJECT_PUT(d, "id", newScheduleId)
      FROM Schedules AS d USE KEYS $oldScheduleId
      LET newScheduleId = $newScheduleId) AS Y
LET newScenarioId = $newScenarioId;

var copySchedulesQueryResults = await cluster.QueryAsync<dynamic>(
    copySchedulesQuery, options => options
        .Parameter("oldScheduleId", (string)scheduleId)
        .Parameter("newScheduleId", newScheduleId.ToString())
        .Parameter("newScenarioId", newScenarioId.ToString()));

What else can be done to quickly make copies of such documents?

If you already know the keys, why not use a KV get and a KV insert, changing the document in the SDK?
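For example, here is a minimal sketch of that KV approach with the .NET SDK 3.x; the connection details, the Schedules bucket/collection layout, and the helper function name are assumptions based on this thread, not code from it:

using Couchbase;
using Couchbase.KeyValue;
using Newtonsoft.Json.Linq;

// Hypothetical sketch: copy one schedule document via a KV get + insert.
var cluster = await Cluster.ConnectAsync("couchbase://localhost", "user", "password");
var bucket = await cluster.BucketAsync("Schedules");
// Adjust this if your schedules live in a named collection rather than the default one.
var collection = bucket.DefaultCollection();

async Task CopyScheduleAsync(string oldScheduleId, string newScheduleId, string newScenarioId)
{
    // Read the existing document.
    var getResult = await collection.GetAsync(oldScheduleId);
    var doc = getResult.ContentAs<JObject>();

    // Change the two attributes in memory.
    doc["id"] = newScheduleId;
    doc["scenarioId"] = newScenarioId;

    // Write the copy under the new key.
    await collection.InsertAsync(newScheduleId, doc);
}

await CopyScheduleAsync("oldScheduleId1", Guid.NewGuid().ToString(), Guid.NewGuid().ToString());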

Here is a simplified version of your query. It may not fix the error itself:

INSERT INTO Schedules(KEY x.id, VALUE x)
SELECT OBJECT_CONCAT(d, {"id":$newScheduleId, "scenarioId": $newScenarioId}) AS x
FROM Schedules AS d USE KEYS $oldScheduleId ;

You can also do it in one query by sending more than one key/value pair as an object:


INSERT INTO Schedules(KEY x.id, VALUE x)
SELECT OBJECT_CONCAT(d, $obj.[META(d).id]) AS x
FROM Schedules AS d USE KEYS OBJECT_NAMES($obj);

$obj = {"oldScheduleId1":{"id":"newScheduleId1", "scenarioId": "newScenarioId1"},
        "oldScheduleId2":{"id":"newScheduleId2", "scenarioId": "newScenarioId2"},
         ......
       }

Could you please explain how the query below works? Does it mean I have to build $obj through a N1QL query?
SELECT OBJECT_CONCAT(d, $obj.[META(d).id]) AS x
FROM Schedules AS d USE KEYS OBJECT_NAMES($obj);

Not through N1QL; you need to construct it yourself in the SDK.

You mentioned 15000 documents. Are you doing a loop and calling 15000 queries?
Instead, construct a single object with 15000 fields, each mapping an old key to the new values you want to substitute, and call the query once (see the sketch below).
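As an illustration, a minimal sketch of building that $obj in C# and passing it as a named parameter; the oldScheduleIds list of matching keys, the newScenarioId Guid, and the cluster variable are assumed from the rest of the thread:

// Hypothetical sketch: one query for all copies.
// $obj maps each old schedule key to the new "id"/"scenarioId" values.
var obj = new Dictionary<string, object>();
foreach (var oldScheduleId in oldScheduleIds)
{
    obj[oldScheduleId] = new
    {
        id = Guid.NewGuid().ToString(),
        scenarioId = newScenarioId.ToString()
    };
}

var copyAllQuery =
    "INSERT INTO Schedules (KEY x.id, VALUE x) " +
    "SELECT OBJECT_CONCAT(d, $obj.[META(d).id]) AS x " +
    "FROM Schedules AS d USE KEYS OBJECT_NAMES($obj)";

var copyAllResult = await cluster.QueryAsync<dynamic>(
    copyAllQuery, options => options.Parameter("obj", obj));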

var copySchedulesQueryResults = await cluster.QueryAsync<dynamic>(
    copySchedulesQuery, options => options
        .Parameter("oldScheduleId", (string)scheduleId)
        .Parameter("newScheduleId", newScheduleId.ToString())
        .Parameter("newScenarioId", newScenarioId.ToString()));

Or post the query that matches the 15000 documents, and how you are generating the new values newScheduleId and newScenarioId.

@vsr1, @btburnett3 Here is the entire query/logic pasted below, and yes, I do loop through the 15000 individual IDs to copy each old schedule to a new one.

var getScheduleQuery = "SELECT RAW d.id AS scheduleIds FROM SomeBucket.Schedules AS d WHERE d.type = $type AND d.scenarioId = $oldScenarioId";
var getSchedulesResult = await cluster.QueryAsync<dynamic>(
    getScheduleQuery, options => options
        .Parameter("type", "Schedule")
        .Parameter("oldScenarioId", oldScenarioId));
var scheduleIds = await getSchedulesResult.Rows.ToListAsync();
foreach (var scheduleId in scheduleIds)
{
    var newScheduleId = Guid.NewGuid();
    // First replace id with the new schedule id, then change scenarioId to the new scenario id.
    var copySchedulesQuery = "INSERT INTO SomeBucket.Schedules(KEY x.id, VALUE x) " +
        "SELECT OBJECT_PUT(Y, \"scenarioId\", newScenarioId) AS x " +
        "FROM (SELECT RAW OBJECT_PUT(d, \"id\", newScheduleId) " +
        "FROM SomeBucket.Schedules AS d USE KEYS $oldScheduleId LET newScheduleId = $newScheduleId) AS Y " +
        "LET newScenarioId = $newScenarioId";
    var copySchedulesQueryResults = await cluster.QueryAsync<dynamic>(
        copySchedulesQuery, options => options
            .Parameter("oldScheduleId", (string)scheduleId)
            .Parameter("newScheduleId", newScheduleId.ToString())
            .Parameter("newScenarioId", newScenarioId.ToString()));
}

var newScheduleId = Guid.NewGuid();

What is newScenarioId?

My bad, newScenarioId is a new Guid passed as a parameter to the function.

With the following query, all matching documents are copied and inserted as new documents in a single statement (no loop needed).
If required, increase the query timeout; a .NET SDK sketch follows the query.
If this is a one-time operation, you can run it from the Query Workbench.

INSERT INTO Schedules(KEY x.id, VALUE x)
    SELECT OBJECT_CONCAT(s, {"id":UUID(), "scenarioId": $newScenarioId}) AS x
    FROM Schedules AS s
    WHERE s.type = "Schedule" AND s.scenarioId = $oldScenarioId;
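For reference, here is a minimal sketch of running this bulk statement from the .NET SDK with a raised query timeout; the SomeBucket.Schedules keyspace, the two-minute timeout value, and reuse of the cluster/oldScenarioId/newScenarioId variables from the code above are assumptions, not something prescribed in the thread:

// Hypothetical sketch: one bulk INSERT ... SELECT instead of 15000 per-document queries.
var bulkCopyQuery =
    "INSERT INTO SomeBucket.Schedules (KEY x.id, VALUE x) " +
    "SELECT OBJECT_CONCAT(s, {\"id\": UUID(), \"scenarioId\": $newScenarioId}) AS x " +
    "FROM SomeBucket.Schedules AS s " +
    "WHERE s.type = \"Schedule\" AND s.scenarioId = $oldScenarioId";

var bulkCopyResult = await cluster.QueryAsync<dynamic>(
    bulkCopyQuery, options => options
        .Parameter("oldScenarioId", oldScenarioId)
        .Parameter("newScenarioId", newScenarioId.ToString())
        .Timeout(TimeSpan.FromMinutes(2))); // raise further if the statement still gets cancelled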

@vsr1 one doubt about the above query: since each matching "Schedule" already has these two attributes (id and scenarioId), when I use OBJECT_CONCAT(s, {"id": UUID(), "scenarioId": $newScenarioId}), will it not duplicate these attributes?

This works like a charm. @vsr1, you are a lifesaver. :)

OBJECT_CONCAT() replaces the value when the field is already present; when it is not present, it adds the field. So there is no duplication.

Thanks @vsr1. For some reason the query takes 4.6 s to execute in the console, but via the .NET SDK it takes 12 s, without any serialization, just to retrieve the results as dynamic. It's not the first time I've seen this; it happens in other examples as well.

Not sure why it took so long via the .NET SDK. Try the following to find the issue (a profiling sketch follows this list):

  1. Enable the request profile for the statement in question and analyze it.
  2. Force USE INDEX in the SELECT (the same index used in the Query Workbench console) and check again.
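For suggestion 1, a minimal sketch of requesting the profile from the .NET SDK; the choice of QueryProfile.Timings and the reuse of cluster/bulkCopyQuery and its parameters from the earlier sketch are assumptions, and the same profile can also be viewed in the Query Workbench:

using Couchbase.Query;

// Hypothetical sketch: ask the server to include per-operator timings,
// reusing the cluster, bulkCopyQuery and parameters from the sketch above.
var profiledResult = await cluster.QueryAsync<dynamic>(
    bulkCopyQuery, options => options
        .Parameter("oldScenarioId", oldScenarioId)
        .Parameter("newScenarioId", newScenarioId.ToString())
        .Profile(QueryProfile.Timings));

// Drain the rows so the trailing response metadata (including the profile) is read.
await foreach (var row in profiledResult.Rows) { }

// If the server returned a profile, it is exposed on the result metadata.
Console.WriteLine(profiledResult.MetaData?.Profile);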