Hello, I want to perform join using not a primary key. Is it possible?
What I need is something like the SQL below:
select *
from activity
left join activity as future_activity on activity.user_id = future_activity.user_id and activity.date = future_activity.date - interval ‘1 day’
I understand syntax does not support it. But maybe you guys know a workaround? (I tried to perform a join of select, but obviously failed).
You should try to have parent-child relationship via child referencing the parent’s document-id.
Logically, this will be the equivalent query:
select *
from activity
left join activity as future_activity ON KEYS
activity.user_id = meta(future_activity).id /* future_activity.user_id */
WHERE date_diff_str(activity.date, future_activity.date, “day”) = 1;
actual query is:
select *
from activity
left join activity as future_activity ON KEYS
activity.user_id
WHERE date_diff_str(activity.date, future_activity.date, “day”) = 1;
If the document id is a composite key, you can concatnate the values to form a ON KEYS expression.
With Couchbase, you can join from parent-to-child as well.
You commented out future_activity.user_id: but this is what I actually need: activity.user_id = future_activity.user_id as activity.user_id definitely can not be a primary key in future_activity because it’s the same “table”.
ON [ PRIMARY ] KEY rhs-expression.lhs-expression-key FOR lhs-expression
I like the word expression in lhs-expression. This is an expression! A subquery?? But again use keys is mandatory for subqueries.
As I understand you offer somehow to turn it to a situation with “parent-child”, and I don’t mind to modify my data to make it possible. But I don’t see how. Maybe you know a trick or a common practice having such an impressive experience?
The query you mentioned is the LOGICAL equivalent. Below is the ACTUAL query syntax.
select *
from activity
left join activity as future_activity
ON KEYS activity.user_id
WHERE date_diff_str(activity.date, future_activity.date, "day") = 1;
What we are trying to achieve is finding out if a user is a new subscriber to our services during a calendar month. We would know if they are a new subscriber if they did not make a subscription last month (note: some subscribers are resurrected so querying the first time they made a subscription payment ever does not work).
The subscription payment document has the following schema:
The created property is the date when a subscription payment was made. If the account made a payment in October we want to know if they also made a payment in September. If they did not then they are a new user. We would like to be able to get a data series (daily / weekly / monthly) data series of new users.
This type of query is extremely useful for SAAS companies who need to calculate retention metrics etc. for reporting purposes.
I tried this, trying to mimic the model you gave. See the query below.
If you give full sample documents with document-keys, we can try that.
In Couchbase (and NoSQL in general), document key design is important since they form the reference/relationship.
select meta().id as dockey, * from pay;
[
{
"dockey": "subpayment::ca0bef29-0007-4d1b-a8cc-6671151a6b36",
"pay": {
"account_uid": "user::acc_HAYPTvMCDGCShwmBFeJqSc",
"created": "2015-08-21T12:30:40",
"doc_type": "subspayment"
}
},
{
"dockey": "subpayment::cc9d320b-6eb0-48f2-a59b-b187739318f2",
"pay": {
"account_uid": "user::acc_HAYPTvMCDGCShwmBFeJqSc",
"created": "2015-04-21T12:30:40",
"doc_type": "subspayment"
}
},
{
"dockey": "subpayment::f52cfa5f-1ca1-4c04-a475-dead49988d97",
"pay": {
"account_uid": "user::acc_HAYPTvMCDGCShwmBFeJqSc",
"created": "2015-03-23T12:30:40",
"doc_type": "subspayment"
}
},
{
"dockey": "subpayment::fbb6c992-e189-4edd-97ce-c64a8ef16faa",
"pay": {
"account_uid": "user::acc_HAYPTvMCDGCShwmBFeJqSc",
"created": "2015-05-21T12:30:40",
"doc_type": "subspayment"
}
},
{
"dockey": "user::acc_HAYPTvMCDGCShwmBFeJqSc",
"pay": {
"fname": "John",
"lname": "Miller",
"udate": "2015-10-10T12:30:40",
"zip": 94523
}
}
]
select *
FROM pay p USE KEYS ["user::acc_HAYPTvMCDGCShwmBFeJqSc"]
INNER JOIN pay s ON KEY s.account_uid FOR p
WHERE date_diff_str(p.udate, s.created, "month") = 1;
Here is a sample of data for each subscription document. There are 3 accounts and what we need to know is when they started subscribing by looking at whether they made a payment the month before also. If they did not make a payment the month before then they are considered newly subscribed.
SCROLL the window above to see additional queries.
Add the grouping of these results by month (use the date functions…)
SELECT (TOSTR(date_part_str(dcreated, "year")) || "-" || TOSTR(date_part_str(dcreated, "month"))) as month, count(1) total
FROM
(SELECT account_uid, max(created) as dcreated
FROM c
GROUP BY account_uid
HAVING (DATE_DIFF_STR("2016-11-15T12:30:40", max(created), "month")) > 0 ) AS newacts
GROUP BY (TOSTR(date_part_str(dcreated, "year")) || "-" || TOSTR(date_part_str(dcreated, "month")))
[
{
"month": "2015-11",
"total": 2
},
{
"month": "2015-10",
"total": 1
}
]