#users-public
Paweł Wangryn (Vangreen)

09/29/2022, 9:21 AM
Hi, I'm trying to integrate QuestDB into my reactive Java app. I need to save a large number of documents per second (~40k pps). I use the InfluxDB Line Protocol Java client. I wrote a simple sink that buffers documents for one second, then adds a sender.table(...) row for each document, and at the end of the stream (Flux) calls sender.flush(), but the data appears in the database with a delay, or only when I stop producing documents. Is QuestDB a good choice for saving this volume of documents?
Nicolas Hourcard

09/29/2022, 9:31 AM
hey Vangreen - this is likely linked to the commit lag configuration. Have you had a look at this page? https://questdb.io/docs/guides/out-of-order-commit-lag/
9:31 AM
also, what version do you use? we very recently changed the default settings for commit lag
Paweł Wangryn (Vangreen)

09/29/2022, 9:32 AM
I use the latest Docker image and the latest Maven library
9:33 AM
Thanks for the quick reply, I will look at it 🙂
Nicolas Hourcard

09/29/2022, 9:37 AM
👍
Paweł Wangryn (Vangreen)

09/29/2022, 9:56 AM
My docker-compose configuration:
version: '3.7'

services:
  questdb:
    image: questdb/questdb
    container_name: questdb
    restart: always
    ports:
      - 9000:9000
      - 9009:9009
      - 8812:8812
      - 9003:9003
    volumes:
      - /mnt/xfs/quest:/root/.questdb
    environment:
      - QDB_LOG_W_STDOUT_LEVEL=ERROR
      - QDB_LOG_W_FILE_LEVEL=ERROR
      - QDB_LOG_W_HTTP_MIN_LEVEL=ERROR
      - QDB_SHARED_WORKER_COUNT=20 # Amount of worker threads
      - QDB_PG_USER=admin # postgresql user -> Configured in .env file
      - QDB_PG_PASSWORD=secret # postgresql password -> Configured in .env file
      - QDB_TELEMETRY_ENABLED=false # Disable telemetry
      - QDB_CARIO_COMMIT_LAG=1000
      - QDB_MAX_UNCOMMITTED_ROWS=10000
Nicolas Hourcard

09/29/2022, 9:57 AM
OK - thanks, I’ll let our engineers look into this
Paweł Wangryn (Vangreen)

09/29/2022, 9:58 AM
Thanks
10:25 AM
I noticed that when I decreased the number of columns in my collection, saving is much faster
Nicolas Hourcard

09/29/2022, 10:28 AM
how many columns do you have ?
Paweł Wangryn (Vangreen)

09/29/2022, 10:30 AM
12 (2 long, 1 symbol, 9 string)
Nicolas Hourcard

09/29/2022, 10:31 AM
OK, that is not a high number… we can easily deal with this
10:31 AM
engineering will be able to help, I'm sure
Jaromir Hamala

09/29/2022, 10:47 AM
hello, do you assign timestamps on the client (at(timestamp)) or on the server (atNow())?
10:48 AM
what is the size of each row? I see your disk utilization is 13-14 MB/s. Does that mean each of your rows is around 1 KB?
Andrey Pechkurov

09/29/2022, 10:49 AM
One more thing to mention is that our Java client has a blocking API, so make sure that you use the corresponding calls in your reactive framework when invoking it.
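To illustrate the point about the blocking API: a minimal sketch, using only the JDK, of keeping blocking writes off the reactive threads by funnelling them through one dedicated thread. `BlockingWriter`, `OffloadedSink` and `demo` are hypothetical names invented for this illustration and are not part of the QuestDB client; in a real app the writer would wrap the ILP sender, and reactive frameworks typically offer their own scheduler support for the same purpose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class BlockingSinkSketch {

    /** Hypothetical stand-in for a blocking client; NOT the QuestDB API. */
    interface BlockingWriter {
        void write(String row); // may block on network I/O
        void flush();           // may block until buffered rows are sent
    }

    /** Runs every blocking call on one dedicated thread, away from the caller's thread. */
    static final class OffloadedSink implements AutoCloseable {
        private final BlockingWriter writer;
        private final ExecutorService ioThread = Executors.newSingleThreadExecutor();

        OffloadedSink(BlockingWriter writer) {
            this.writer = writer;
        }

        /** Writes a batch, then flushes; the future completes once flush() has returned. */
        CompletableFuture<Integer> writeBatch(List<String> rows) {
            List<String> copy = List.copyOf(rows); // defensive copy of the caller's buffer
            return CompletableFuture.supplyAsync(() -> {
                for (String row : copy) {
                    writer.write(row);
                }
                writer.flush();
                return copy.size();
            }, ioThread);
        }

        @Override
        public void close() {
            ioThread.shutdown();
        }
    }

    /** Sends two batches to an in-memory writer and returns everything that was flushed. */
    static List<String> demo() {
        List<String> pending = new ArrayList<>();
        List<String> flushed = new ArrayList<>();
        BlockingWriter inMemory = new BlockingWriter() {
            public void write(String row) { pending.add(row); }
            public void flush() { flushed.addAll(pending); pending.clear(); }
        };
        try (OffloadedSink sink = new OffloadedSink(inMemory)) {
            sink.writeBatch(List.of("a", "b")).join();
            sink.writeBatch(List.of("c")).join();
        }
        return flushed;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Because a single thread owns the writer, rows are written in submission order and the writer itself never needs to be thread-safe. Flushing once per batch, rather than only at the end of the stream, is the design choice that keeps data from sitting in the client buffer.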
Jaromir Hamala

09/29/2022, 10:51 AM
now this might seem silly, but worth checking: 13 MB/s is suspiciously close to the limit of a 100 Mbit network (12.5 MB/s). I once accidentally connected my home server with an old cable rated for 100 Mbit networks. Not happy days 😃 Are you sure this is not the case here?
Paweł Wangryn (Vangreen)

09/29/2022, 10:53 AM
@Jaromir Hamala I use atNow()
10:53 AM
I have a gigabit connection to my server, I'm 100% sure it's OK.
Jaromir Hamala

09/29/2022, 10:54 AM
ok, that’s good. as a little experiment: could you try feeding data to the QuestDB server from multiple clients, to see if it makes any difference at all?
Paweł Wangryn (Vangreen)

09/29/2022, 10:55 AM
@Jaromir Hamala I also noticed that, with the reduced number of columns, write speed is OK up to about 500k documents in the table
10:56 AM
After that, the save speed slows down
Jaromir Hamala

09/29/2022, 10:57 AM
interesting. what’s the cardinality of the symbol column? in other words: how many unique symbols do you have?
10:57 AM
you said it’s running in a docker container. what networking do you use?
Paweł Wangryn (Vangreen)

09/29/2022, 11:03 AM
@Jaromir Hamala
sender.table("packet")
        .symbol("id", packet.getId())
        .stringColumn("src", packet.getSrc())
        .stringColumn("dst", packet.getDst())
        .longColumn("observationTimestamp", packet.getObservationTimestamp())
        .atNow()
11:03 AM
bridge network
Jaromir Hamala

09/29/2022, 11:06 AM
hm, what is id? Is it unique per packet? Can you execute
select count_distinct(id) from packet;
Paweł Wangryn (Vangreen)

09/29/2022, 11:07 AM
@Jaromir Hamala yes, it's unique
Jaromir Hamala

09/29/2022, 11:10 AM
ok, then it should not be the symbol type. The symbol type is good when the same string is repeated across many rows and there is a relatively small number of unique symbols. That’s not your case at all. Try changing it from symbol to string.
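One rough way to apply this rule of thumb is to measure how often values repeat in a sample before picking a column type. The sketch below is only an illustration: the `0.5` threshold, the sample values, and the helper names `distinctRatio` and `suggestType` are assumptions made up for this example, not QuestDB guidance.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SymbolOrString {

    /** Fraction of distinct values in the sample: close to 0 means heavy repetition, 1.0 means all unique. */
    static double distinctRatio(List<String> sample) {
        if (sample.isEmpty()) {
            return 0.0;
        }
        Set<String> distinct = new HashSet<>(sample);
        return (double) distinct.size() / sample.size();
    }

    /**
     * Suggests "symbol" when values repeat a lot and "string" otherwise.
     * maxDistinctRatio is an arbitrary, illustrative cut-off.
     */
    static String suggestType(List<String> sample, double maxDistinctRatio) {
        return distinctRatio(sample) <= maxDistinctRatio ? "symbol" : "string";
    }

    public static void main(String[] args) {
        // Hypothetical low-cardinality column: a few values repeated across many rows.
        List<String> protocols = List.of("tcp", "udp", "tcp", "tcp", "udp",
                                         "tcp", "tcp", "udp", "tcp", "tcp");
        // Unique per row, like the packet id in this thread.
        List<String> packetIds = List.of("p1", "p2", "p3", "p4", "p5");

        System.out.println(suggestType(protocols, 0.5)); // 2 distinct out of 10 values
        System.out.println(suggestType(packetIds, 0.5)); // 5 distinct out of 5 values
    }
}
```

On a live table, the same ratio can be read from the count_distinct query above divided by the row count.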
Paweł Wangryn (Vangreen)

09/29/2022, 11:10 AM
Now it seems to run great 😄
11:11 AM
Thanks for the help
Jaromir Hamala

09/29/2022, 11:11 AM
excellent!
11:11 AM
you are very welcome!
javier ramirez

09/29/2022, 2:21 PM
Just a small thing, did you notice the typo in the env variable?
- QDB_CARIO_COMMIT_LAG=1000
2:21 PM
it should be CAIRO
2:21 PM
you can also just set the commit lag property per table (on creation or with an alter table) so you don’t depend on the environment
Paweł Wangryn (Vangreen)

09/30/2022, 6:29 AM
@javier ramirez thanks 😄, yes, I set the lag per table because this env variable didn't work. Now I know why 😄