Clicks SUM group by hour in the day

I have many documents with click information ( > 100 000 000 ) I need to calculate sum clicks in day grouped by hours
But sample query like

SELECT  sum(done)
FROM `task_bot_arh` AS tba    WHERE _type='BotTask' and 
                
stamp BETWEEN ( 1619481600000 and  1619567999000)  AND   done>0

Request failed due to timeout

I maked index
CREATE INDEX adv_done_stamp_type ON task_bot_arh(done,stamp) WHERE (_type = ‘BotTask’)

Please сan you tell me how to execut this task?

CREATE INDEX adv_done_stamp_type ON task_bot_arh ( stamp, done ) WHERE ( _type = ‘BotTask’ AND done > 0)

OR

CREATE INDEX adv_done_stamp_type ON task_bot_arh ( stamp, done ) WHERE ( _type = ‘BotTask’ )

I maked it but

SELECT  count(done)
FROM `task_bot_arh` AS tba    WHERE _type='BotTask' and 
                
stamp BETWEEN  1619481600000 and  1619567999000

execution: 1m20.4s

Index Currently Used

CREATE INDEX adv_done_stamp_type ON task_bot_arh(done,stamp) WHERE (_type = ‘BotTask’)

This is very slowly !

Use the following index

CREATE INDEX ix1 ON task_bot_arh ( stamp, done ) WHERE ( _type = “BotTask” )

I maked index

Index Currently Used

CREATE INDEX ix1 ON task_bot_arh(stamp,done) WHERE (_type = ‘BotTask’)

BUT execution: 8.8s

SELECT  count(done)
FROM `task_bot_arh` AS tba    WHERE _type='BotTask' and 
                
stamp BETWEEN  1619481600000 and  1619567999000

Any ideas?

Maybe couchbase does such a functional?
MYSQL takes about a second

Post the EXPLAIN plan. Also CE version of query services is limited 4 cores and query services needs to get all the qualified data from index and do count. It also depends on how much data you count and size of indexer memory allocated, storage type etc.

I used CE edition and now try Couchbase Server Enterprise Edition 6.6.1 build 9213
Data: 220000
Index memory:36864
Analytics 31751
Search 4096
Explain is

{
    "#operator": "Sequence",
    "~children": [
        {
            "#operator": "IndexScan3",
            "as": "tba",
            "covers": [
                "cover ((`tba`.`stamp`))",
                "cover ((`tba`.`done`))",
                "cover ((meta(`tba`).`id`))",
                "cover (count(cover ((`tba`.`done`))))"
            ],
            "filter_covers": {
                "cover ((`tba`.`_type`))": "BotTask"
            },
            "index": "ix1",
            "index_group_aggs": {
                "aggregates": [
                    {
                        "aggregate": "COUNT",
                        "depends": [
                            1
                        ],
                        "expr": "cover ((`tba`.`done`))",
                        "id": 3,
                        "keypos": 1
                    }
                ],
                "depends": [
                    1
                ]
            },
            "index_id": "f22688c61b155938",
            "index_projection": {
                "entry_keys": [
                    3
                ]
            },
            "keyspace": "task_bot_arh",
            "namespace": "default",
            "spans": [
                {
                    "exact": true,
                    "range": [
                        {
                            "high": "1619567999000",
                            "inclusion": 3,
                            "low": "1619481600000"
                        },
                        {
                            "inclusion": 0,
                            "low": "null"
                        }
                    ]
                }
            ],
            "using": "gsi"
        },
        {
            "#operator": "Parallel",
            "~child": {
                "#operator": "Sequence",
                "~children": [
                    {
                        "#operator": "InitialProject",
                        "result_terms": [
                            {
                                "expr": "cover (count(cover ((`tba`.`done`))))"
                            }
                        ]
                    },
                    {
                        "#operator": "FinalProject"
                    }
                ]
            }
        }
    ]
}

The plan (EE) looks right and using index aggregation. I.e. Indexer does count. cc @deepkaran.salooja

And why is it so slow?
Maybe I need to change some system settings?

@Lenty_Me , what is the execution time with the EE 6.6.1? Can you also check what is the resident percent of the index ix1 (available on the UI)?

Execution time: 10.5s

Index resident ratio 17%
Index items 46,670,934

Thank you for your support, I look forward to your recommendations.

@Lenty_Me , one option you can try is to give more memory for the indexer to increase the resident percent. With 17% resident, if any page needs to be fetched from the disk, that will increase the execution time.
You can experiment by running just a single scan with same input params repeatedly. If the execution time improves(and the single scan needs to access <17% of the data), that would indicate that the subsequent scans are having better cache hits.

Also, you can try using partitioned index. This can help spread out the index on multiple nodes and utilize the multiple cores, disk IO on the box and reduce the execution time significantly.

1 Like

I want sum of activity by day and my work hours (8am-8pm), off hours (the rest). So for every day I’ll get 2 rows, group of work hours and … When you want to group by minute, hour , day , week, etc., you may be tempted to just group by your timestamp column. If you do that, clicker counter though, you’ll get one group … To add minutes to a datetime you can use DATE_ADD() function from MySQL . In PHP, you can use strtotime(). select date_add(yourColumnName,interval 30 minute) from yourTableName; To use the above syntax, let us create a table.

To add minutes to a datetime you can use DATE_ADD() function from MySQL* . In PHP, you can use strtotime(). select date_add(yourColumnName,interval 30 minute) from yourTableName; To use the above s clicker counter yntax, let us create a table. We can **use DATEADD() function like below to add hours to DateTime in Sql Server. DATEADD() functions first parameter value can be hour or hh all will return the same result.