N1QL - Order By On Date fields with Offset not working correctly

Problem Statement:
The data is stored in the Couchbase with the date fields in the ISO 8601 format. Also, the date contains Offset information. When trying to SORT the records by the date-time field using the Order By Clause, the offset is not considered in calculating the sorting order.

Question

  1. Why does the default sorting order not working with DateTime offset?
  2. What is the recommended way to sort the date with offset information?

Reproducing the issue
INSERT INTO Bucket VALUES (“k01”, {“created”:“2020-11-23T03:55:12.5922223-06:00”, “docType” : “TEST”});
INSERT INTO Bucket VALUES (“k02”, {“created”:“2020-11-23T03:59:51.0496435-05:00”, “docType” : “TEST”});
INSERT INTO Bucket VALUES (“k03”, {“created”:“2020-11-23T04:00:07.7449352-05:00”, “docType” : “TEST”});
INSERT INTO Bucket VALUES (“k04”, {“created”:“2020-11-23T04:10:04.4547325-06:00”, “docType” : “TEST”});
INSERT INTO Bucket VALUES (“k05”, {“created”:“2020-11-23T04:20:19.6912516-05:00”, “docType” : “TEST”});

Query to Select data by created Ascending order
SELECT created, STR_TO_UTC(created) as utc FROM Bucket
WHERE docType = “TEST” and created > “2020-11-23”
ORDER BY created ASC

Actual Result
created utc
“2020-11-23T03:55:12.5922223-06:00” “2020-11-23T09:55:12.592Z”
“2020-11-23T03:59:51.0496435-05:00” “2020-11-23T08:59:51.049Z”
“2020-11-23T04:00:07.7449352-05:00” “2020-11-23T09:00:07.744Z”
“2020-11-23T04:10:04.4547325-06:00” “2020-11-23T10:10:04.454Z”
“2020-11-23T04:20:19.6912516-05:00” “2020-11-23T09:20:19.691Z”

Expected Result
created utc
“2020-11-23T03:59:51.0496435-05:00” “2020-11-23T08:59:51.049Z”
“2020-11-23T04:00:07.7449352-05:00” “2020-11-23T09:00:07.744Z”
“2020-11-23T04:20:19.6912516-05:00” “2020-11-23T09:20:19.691Z”
“2020-11-23T03:55:12.5922223-06:00” “2020-11-23T09:55:12.592Z”
“2020-11-23T04:10:04.4547325-06:00” “2020-11-23T10:10:04.454Z”

created of type ISO-8601 formatted string.

Any comparison operators (ORDER BY, predicates) on string will do left to right by byte comparison.
ISO-8601 formatted string are reliable string comparable when all the values are converted to same timezone.
As JSON doesn’t have timestamp data type, no way to know it is timestamp or actual string.
As if all the times are local time (instead of absolute UTC time). If the timestamp has different timezones and want absolute UTC comparison user must explicitly convert to common timezone using date functions in comparison or convert to number of MILLIS.

Thanks @vsr1 for the explanation.
I am able to understand how the underlying Date comparison is working with N1QL.

In certain programming languages with rich Data Types like C#, the DateTime comparison works fine, even when the JSON data is having DateTime with offset information. So, it was a bit confusing and definitely a learning.

Regarding the solution, which would you recommend from the following w.r.t performance.

  1. Use the function, STR_TO_UTC(created) in the Query and then sort by the STR_TO_UTC(created).
  2. Add the UTC information in the Document, and then sort by the UTC date as a string comparison.

Does adding the function in the query incur a considerable cost. The approximate data size is around 1 million records.

That all depends on how do use.
Adding two fields can be maintenance.
If you never need timezone the storing as UTC will be better.

I realize that I should benchmark the performance with any of the solutions to pick the best one.

However, from the theoretical behavior of the Couchbase Query engine, Is adding a function like STR_TO_UTC(created) adds any considerable cost?

Any language accessing field vs doing function on top will have overhead. If that only few times don’t see it doing millions times you might see difference.