Hristo Katsarski

05/16/2023, 8:27 AM
Hello QDB team, is there a way/workaround to prevent ingesting rows into a table with the same values in the "symbol" and "timestamp" columns? Those are de facto a composite primary key, and a regular SQL DB would throw a "duplicate PK" error. We're using the Sender interface from the Java ILP client to ingest new rows, and accidentally our software may attempt to persist new rows with the same symbol and timestamp values.

Imre

05/16/2023, 10:25 AM
hi @Hristo Katsarski, unfortunately there is no way to set a UNIQUE constraint in QuestDB or to deduplicate records automatically. you can try to deal with it in SQL, but that is likely to come with a performance penalty.
• you can use `first()` / `last()` to collapse duplicates into one row per (symbol, timestamp) pair:
select symbol, first(col1), first(col2), timestamp from table;
select symbol, last(col1), last(col2), timestamp from table;

(you can add a GROUP BY clause, but QuestDB can also figure it out from the list of columns, as above)
• you can also try to use `LATEST ON` instead of `last()`:
select symbol, col1, col2, timestamp from table latest on timestamp partition by symbol;
the other option is to remove duplicates from the table periodically, let's say overnight. you can create a temp table and, using the SELECT queries above, copy the clean data for today's partition into it (assuming you have daily partitions, but this works for hourly/monthly too), then drop the partition and insert the clean data back from the temp table.
this recreates the partition without duplicates
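A rough sketch of that overnight cleanup, assuming a daily-partitioned table; the table name `my_table`, the columns `col1`/`col2`, and the date `'2023-05-16'` are placeholders for illustration:

```sql
-- optional sanity check: list (symbol, timestamp) pairs that occur more than once
SELECT * FROM (
  SELECT symbol, timestamp, count() cnt
  FROM my_table
) WHERE cnt > 1;

-- 1) copy today's partition, de-duplicated via last(), into a temp table
CREATE TABLE my_table_clean AS (
  SELECT symbol, last(col1) col1, last(col2) col2, timestamp
  FROM my_table
  WHERE timestamp IN '2023-05-16'
);

-- 2) drop the partition that contains the duplicates
ALTER TABLE my_table DROP PARTITION LIST '2023-05-16';

-- 3) insert the clean rows back
INSERT INTO my_table SELECT * FROM my_table_clean;

-- 4) remove the temp table
DROP TABLE my_table_clean;
```

The same steps work for hourly or monthly partitions; only the partition value in steps 1 and 2 changes.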

Nicolas Hourcard

05/16/2023, 10:59 AM
Hi @Hristo Katsarski, as a separate item, @Maciej Bodek is going to follow up here to take you through how to ingest data frequently, into the same tables, via the brand-new CSV import UI that we are releasing in the coming few hours

Hristo Katsarski

05/16/2023, 11:09 AM
Hi @Imre, thank you for the detailed reply. I'll have to see if I can apply it to our use cases; at present I'm filtering duplicate timestamps from the returned ResultSets.
Hi @Nicolas Hourcard, thank you!

Maciej Bodek

05/16/2023, 12:01 PM
Hi @Hristo Katsarski, before I come back with some detailed instructions (we haven't released the new version yet), could you give me a rough estimate of how big your file would be? How many rows are we talking about on average? 🙂

Hristo Katsarski

05/16/2023, 12:24 PM
Hi @Maciej Bodek, we'd like to periodically import new data from a CSV file. The data size can vary, but it won't be bigger than 10-20K rows.

Maciej Bodek

05/16/2023, 12:25 PM
Sure, that won’t be an issue. I’ll come back with detailed instructions once the release is done. 🙂

Hristo Katsarski

05/16/2023, 12:28 PM
thank you 👍

Maciej Bodek

05/17/2023, 12:02 PM
Hey @Hristo Katsarski, 7.1.2 is already available, feel free to download it and check out the new Import UI. Writing to an existing table can be done by dropping/selecting/pasting a file with the data, then making sure Write mode is set to Append, and just hitting Upload. A few notes:
• The currently shipped version does not allow you to change the schema, the designated timestamp, or its pattern for an already existing table. You can do all of that by selecting a different target table name though; that will automatically unlock the entire form.
• Check out the Settings panel for any tweaks you might need. If you have any problems, perhaps import to a new table first, and then you can click the Details button next to the upload status to see the detailed upload result.
Let me know if you have any issues, happy to explain further 👋
I would perhaps import to a new table first, just to see if everything is correct. This way you can also play with settings as you like 🙂