https://questdb.io logo
Title
l

Lars Riis Olsen

03/29/2023, 3:30 PM
Hi!, is something like quasarDb’s RESTRICT TO operator possible in QuestDb?. It seems somewhat similar to LATEST ON but latest on can’t have a time stamp as its PARTITION BY. We are trying to meet the use case described here: https://doc.quasar.ai/master/user-guide/howto/forecast.html
j

Jaromir Hamala

03/29/2023, 4:13 PM
hi Lars, this looks like an omission in questdb. I see no good reason why
latest on
could not partition by
timestamp
given it can partition by
long
. I’ll explore this further and will let you know.
here is a possible workaround: consider this schema:
create table forecasts (when _timestamp_, version _timestamp_, temperature _double_) _timestamp_(version) partition by day;
Data in the table are sorted by the version (=when the forecast was created) then insert 6 forecasts. 3 versions for each days:
insert into forecasts values 
  (now(), dateadd('h', -10, now()), 40), 
  (now(), dateadd('h', -9, now()), 30), 
  (now(), dateadd('h', -1, now()), 42);

insert into forecasts values 
  (dateadd('d', 1, now()), dateadd('h', -10, now()), 140), 
  (dateadd('d', 1, now()), dateadd('h', -9, now()), 140), 
  (dateadd('d', 1, now()), dateadd('h', -1, now()), 142);
ideally, you would like to do this:
select * from forecasts
latest on version PARTITION by when;
but this does not work due to partitioning by a timestamp not being supported. so this is a possible workaround:
(select *, cast (when as long) as when_long from forecasts)
latest on version PARTITION by when_long;
the subquery casts the
timestamp
to
long
and then it’s used for partitioning. that being said - it’s still a workaround. I will explore why partitioning by timestamp is not implemented. and will add it if that’s just an omission. and one more warning: bear in mind
latest on
in most efficient when partitioning by a symbol - because questdb knows all symbols currently present in a table. and it can short-circuit full table scanning. when partitioning by
long
it cannot use this optimization.
j

javier ramirez

03/29/2023, 4:39 PM
I believe you should be able to do this today using row_number() over (partition by timestamp order by version DESC), then filtering only rows with number 1
it would be slower than LATEST ON, but should produce accurate results
using the table Jaromir showed above you can do
with numbered AS (
    select *, row_number OVER(partition by `when` order by `version` desc) as r_num
  from forecasts)
  select * from numbered   where r_num = 1
and get correct results
try executing without the where if you want to see the individual ranked results
l

Lars Riis Olsen

03/29/2023, 4:51 PM
Thanks. I will give it a try.
v

Vlad

03/30/2023, 11:01 AM
another workaround is to cast timestamp to
long
and have only one date in the table (to save on storage). For example this query can execute on demo.questdb.io and does what you're looking for:
select timestamp, tt::timestamp, symbol, side, price, amount from 
(SELECT timestamp, (date_trunc('day', timestamp))::long tt, symbol, side, price, amount FROM trades
WHERE symbol in ('BTC-USD', 'ETH-USD') and timestamp > dateadd('d', -10, now()))
LATEST ON timestamp PARTITION BY symbol, tt, side;