Your Enterprise Gatekeeping Just Made Me Build a Better Solution

TL;DR: Tried to use FIRST_VALUE() and LAST_VALUE() for time series aggregation, discovered they’re enterprise-only, built a superior JavaScript solution instead. Thanks for the motivation!

The Problem That Started It All

I needed simple OHLC aggregation from market data. Coming from a SQL background, I naturally reached for what should be basic window functions:

-- What I wanted to write (seems reasonable, right?)
WITH range_start AS (${ts_start}), 
     range_end AS (${ts_end})
SELECT 
  date,
  FIRST_VALUE(t._v3) OVER (PARTITION BY IDIV(t._t, ${intervalMilliseconds}) ORDER BY t._t ASC) AS open,
  MAX(t._v1) AS high,
  MIN(t._v2) AS low, 
  LAST_VALUE(t._v0) OVER (PARTITION BY IDIV(t._t, ${intervalMilliseconds}) ORDER BY t._t ASC) AS close,
  SUM(t._v4) AS volume
FROM `${cbConnection.bucketName}` AS d
UNNEST _timeseries(d, {'ts_ranges': [range_start, range_end]}) AS t
WHERE d.ticker = '${symbol}'
.....

The Enterprise Wall

Imagine my surprise when I discovered that FIRST_VALUE() and LAST_VALUE() - functions that exist in literally every other database (PostgreSQL, MySQL, SQL Server, Oracle, even SQLite!) - are locked behind your Enterprise Edition.

For basic aggregation functions. In the BIG 2025?

Plot Twist: JavaScript Was Actually Better

Instead, I wrote this in JavaScript:

const aggregated = aggregateByInterval(rawData, '5m', {
  open: 'first',
  high: 'max', 
  low: 'min',
  close: 'last',
  volume: 'sum'
});

And you know what? It’s actually BETTER than your enterprise functions:

Why My JavaScript Solution Destroys Your Enterprise Features:

:white_check_mark: Better Performance

  • Your enterprise: SQL parsing, query planning, network round trips
  • My solution: In-memory aggregation with efficient groupBy operations

:white_check_mark: Infinitely Extensible

// Try doing THIS with your enterprise functions:
const result = createAggregator(data, '5m', {
  open: 'first',
  high: 'max',
  low: 'min', 
  close: 'last',
  volume: 'sum',
  vwap: (values) => calculateVWAP(values),
  rsi: (values) => calculateRSI(values, 14),
  ema: (values) => calculateEMA(values, 20),
  volatility: (values) => calculateVolatility(values),
  customMetric: (values) => myComplexCalculation(values)
});

What This Says About Couchbase’s Strategy

You’re gate-keeping basic SQL window functions that have been standard in databases for decades. Functions that cost you essentially nothing to implement, but you’re using as enterprise bait.

Meanwhile, developers like me are building better solutions that completely bypass your database for these operations.

The Irony

I still use Couchbase Community Edition for data storage (it’s great for that!), but now I do ALL my aggregation in JavaScript. Your enterprise gatekeeping literally drove me to build a superior solution that doesn’t need your database at all for the complex parts.

Message to Couchbase Product Team

Maybe consider what PostgreSQL, MySQL, and even SQLite figured out years ago: basic aggregation functions shouldn’t be enterprise features.

You’re not protecting valuable IP - you’re just annoying developers into building better alternatives.


P.P.S. - Couchbase, when you inevitably make these functions available in Community Edition, you’ll find that many of us have already moved on to better solutions. :man_shrugging:


Built with: JavaScript, Lodash, and spite :smiling_face_with_horns:

Basic aggregates refer as Without OVER() clause are available in CE.
Online Analytic (Window) functions (OVER clause) are advanced futures of SQL.
Depends on how one can write (Partition by, order by, window frames) those can be very resource(CPU, memory) intensive due to sorts, repeated computation of each row for many values.
In Couchbase CE Query service is limited by max 4 cores due to that Window functions not available

1 Like

Also check validity of SQL because you are using basic aggregates and will not able to use any non group by columns

As your description The following query might satisfy your requirement without Window functions.
FYI: Key here is MIN/MAX calculated over array [t._t, t._v3] and from final result extract index 1

WITH range_start AS (${ts_start}),
     range_end AS (${ts_end})
SELECT
  ts,
  MIN(t._v2) AS low,
  MAX(t._v1) AS high,
  SUM(t._v4) AS volume,
  MIN([t._t, t._v3])[1] AS open,
  MAX([t._t, t._v4])[1] AS close
FROM `${cbConnection.bucketName}` AS d
UNNEST _timeseries(d, {'ts_ranges': [range_start, range_end]}) AS t
LET ts = IDIV(t._t, ${intervalMilliseconds})
WHERE d.ticker = '${symbol}'
GROUP BY ts;
1 Like