#users-public

gmaurice

08/18/2022, 12:07 PM
Hello, since I upgraded from 6.4.3 to 6.5 I sometimes get this error:
max txn-inflight limit reached
which I didn’t get before. Any idea about the cause? Another thing: I see memory that regularly grows over time, even prior to 6.5. We can see on this screenshot that the memory growth pattern has changed and that it grows faster since the upgrade.
12:10 PM
If you look at the memory consumption, the oscillation happens before the error
max txn-inflight limit reached
occurs; after that, the memory stops growing. Even though I can’t query data, data is still written.

Andrey Pechkurov

08/18/2022, 12:15 PM
Hello, are there any errors in the server logs?

gmaurice

08/18/2022, 12:21 PM
Hello, I only get this error, regularly:
2022-08-17T09:29:17.089462Z E i.q.c.p.PGConnectionContext error [pos=57, msg=`[0]: max txn-inflight limit reached [txn=2621753, min=2605361, size=16384]`, errno=0]
However, my logs don't go back far enough to find the context of the first occurrence. I only use docker logs for now; I will switch to writing logs to disk and get back to you on the next occurrence.

Alex Pelagenko

08/18/2022, 12:35 PM
max txn-inflight limit reached
is usually caused by one of the queries holding a table for too long, possibly because of a resource leak.
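The numbers in the logged error can be checked by hand. The sketch below parses the bracketed fields from the log line quoted earlier and prints the gap between `txn` and `min`. Reading `txn` as the latest transaction, `min` as the oldest transaction still held by a reader, and `size` as the capacity of the in-flight window is an assumption based on the explanation above, not something the log states. The class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TxnInflightGap {
    // Matches the bracketed fields exactly as they appear in the quoted log line.
    private static final Pattern FIELDS =
            Pattern.compile("txn=(\\d+), min=(\\d+), size=(\\d+)");

    /** Returns {txn, min, size} parsed from the message, or null if the fields are absent. */
    static long[] parse(String message) {
        Matcher m = FIELDS.matcher(message);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /** Distance between the txn and min fields (assumed: newest minus oldest held). */
    static long gap(long[] fields) {
        return fields[0] - fields[1];
    }

    public static void main(String[] args) {
        String msg = "[0]: max txn-inflight limit reached "
                + "[txn=2621753, min=2605361, size=16384]";
        long[] f = parse(msg);
        // Prints: gap=16392 size=16384
        System.out.println("gap=" + gap(f) + " size=" + f[2]);
    }
}
```

Under that reading, the gap of 16392 exceeds the window of 16384, which fits a reader that has been stuck on an old transaction while writes kept advancing.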
12:36 PM
To diagnose you can run
select * from reader_pool()
12:36 PM
Find the table with a positive owner and the time when it started.
12:37 PM
If it looks like a very long time ago, please find the log section for that time period to identify the query and let us know what it is.
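A minimal sketch of that check, assuming the rows returned by `reader_pool()` have already been copied into memory. The `ReaderRow` class, its field names (`table`, `owner`, `timestamp`), the `-1` used for an unowned reader and the sample rows are all hypothetical; only the rule "positive owner, acquired a long time ago" comes from the messages above.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class StaleReaders {
    /** Hypothetical in-memory copy of one reader_pool() row. */
    static final class ReaderRow {
        final String table;
        final long owner;
        final Instant timestamp;

        ReaderRow(String table, long owner, Instant timestamp) {
            this.table = table;
            this.owner = owner;
            this.timestamp = timestamp;
        }
    }

    /** Rows that have a positive owner and a timestamp older than maxAge relative to now. */
    static List<ReaderRow> stale(List<ReaderRow> rows, Instant now, Duration maxAge) {
        List<ReaderRow> out = new ArrayList<>();
        for (ReaderRow r : rows) {
            boolean owned = r.owner > 0;
            boolean old = Duration.between(r.timestamp, now).compareTo(maxAge) > 0;
            if (owned && old) {
                out.add(r);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2022-08-18T12:40:00Z");
        // Made-up sample rows, not taken from the thread.
        List<ReaderRow> rows = Arrays.asList(
                new ReaderRow("trades", 12, Instant.parse("2022-08-17T19:05:49Z")),
                new ReaderRow("trades", -1, Instant.parse("2022-08-18T12:39:58Z")),
                new ReaderRow("quotes", 7, Instant.parse("2022-08-18T12:39:59Z")));
        for (ReaderRow r : stale(rows, now, Duration.ofMinutes(10))) {
            System.out.println(r.table + " owner=" + r.owner + " since " + r.timestamp);
        }
    }
}
```

The timestamp of any row this reports is the value to look up in the server log.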
12:39 PM
Do you restart QuestDB, or why is the uptime getting shorter?

gmaurice

08/18/2022, 12:41 PM
Yes, uptime goes back to zero when I restart it.

Alex Pelagenko

08/18/2022, 12:42 PM
What’s the reason for the restart?

gmaurice

08/18/2022, 12:43 PM
Because of the error, QuestDB stopped answering read queries.
12:48 PM
However, I don’t remember why I restarted QuestDB before the upgrade (the time of the annotation).
12:55 PM
With reader_pool, how can I find the actual reader from the integer given as the owner?

Alex Pelagenko

08/18/2022, 12:56 PM
There is no such query yet; the useful part is
timestamp
12:57 PM
by this you may be able to find the query in the logs
12:57 PM
I’d expect it would be a failing query
12:58 PM
failing in the sense that it did not return data but ended in an error
12:59 PM
but not necessarily; it can be a successful query
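One way to pull the relevant log section for a given timestamp is sketched below. It assumes every log entry starts with an ISO-8601 instant followed by a space, as in the lines quoted in this thread, and that lines without such a prefix (stack frames, exception messages) belong to the preceding entry. The class name, method names and window size are illustrative.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.ArrayList;
import java.util.List;

final class LogWindow {
    /** Parses the leading timestamp of a log line, or returns null for continuation lines. */
    static Instant timestampOf(String line) {
        int space = line.indexOf(' ');
        if (space <= 0) {
            return null; // empty line or indented stack frame
        }
        try {
            return Instant.parse(line.substring(0, space));
        } catch (DateTimeParseException e) {
            return null; // first token is not a timestamp
        }
    }

    /** Keeps entries whose timestamp lies within center +/- radius, with their continuation lines. */
    static List<String> around(List<String> lines, Instant center, Duration radius) {
        Instant from = center.minus(radius);
        Instant to = center.plus(radius);
        List<String> out = new ArrayList<>();
        boolean keep = false;
        for (String line : lines) {
            Instant ts = timestampOf(line);
            if (ts != null) {
                keep = !ts.isBefore(from) && !ts.isAfter(to);
            }
            if (keep) {
                out.add(line);
            }
        }
        return out;
    }
}
```

Calling `around(lines, Instant.parse(...), Duration.ofMinutes(1))` with the timestamp taken from the reader pool narrows the log down to the entries written around the moment the reader was acquired.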

gmaurice

08/18/2022, 12:59 PM
Yes, I found a very long-running one. Does the owner correspond to some kind of connection?
1:02 PM
Ok, with the
timestamp
, I found an error:
2022-08-17T19:05:49.805401Z E i.q.c.p.PGWireServer
java.lang.StringIndexOutOfBoundsException: String index out of range: 8
        at java.lang.StringLatin1.charAt(StringLatin1.java:48)
        at java.lang.String.charAt(String.java:1515)
        at java.lang.Character.codePointAt(Character.java:8910)
        at java.util.regex.Pattern$CharPropertyGreedy.match(Pattern.java:4273)
        at java.util.regex.Pattern$Start.match(Pattern.java:3608)
        at java.util.regex.Matcher.search(Matcher.java:1728)
        at java.util.regex.Matcher.find(Matcher.java:745)
        at io.questdb.griffin.engine.functions.regex.MatchStrFunctionFactory$MatchConstPatternFunction.getBool(MatchStrFunctionFactory.java:100)
        at io.questdb.griffin.engine.table.AsyncFilteredRecordCursorFactory.filter(AsyncFilteredRecordCursorFactory.java:167)
        at io.questdb.cairo.sql.async.PageFrameReduceJob.reduce(PageFrameReduceJob.java:175)
        at io.questdb.cairo.sql.async.PageFrameReduceJob.consumeQueue(PageFrameReduceJob.java:132)
        at io.questdb.cairo.sql.async.PageFrameReduceJob.consumeQueue(PageFrameReduceJob.java:106)
        at io.questdb.cairo.sql.async.PageFrameSequence.stealWork(PageFrameSequence.java:390)
        at io.questdb.cairo.sql.async.PageFrameSequence.dispatch(PageFrameSequence.java:369)
        at io.questdb.cairo.sql.async.PageFrameSequence.next(PageFrameSequence.java:294)
        at io.questdb.griffin.engine.table.AsyncFilteredRecordCursor.fetchNextFrame(AsyncFilteredRecordCursor.java:184)
        at io.questdb.griffin.engine.table.AsyncFilteredRecordCursor.of(AsyncFilteredRecordCursor.java:226)
        at io.questdb.griffin.engine.table.AsyncFilteredRecordCursorFactory.getCursor(AsyncFilteredRecordCursorFactory.java:129)
        at io.questdb.griffin.engine.table.SelectedRecordCursorFactory.getCursor(SelectedRecordCursorFactory.java:58)
        at io.questdb.griffin.engine.groupby.SampleByFillNoneRecordCursorFactory.getCursor(SampleByFillNoneRecordCursorFactory.java:97)
        at io.questdb.cutlass.pgwire.PGConnectionContext.setupFactoryAndCursor(PGConnectionContext.java:2488)
        at io.questdb.cutlass.pgwire.PGConnectionContext$PGConnectionBatchCallback.postCompile(PGConnectionContext.java:2593)
        at io.questdb.griffin.SqlCompiler.compileBatch(SqlCompiler.java:930)
        at io.questdb.cutlass.pgwire.PGConnectionContext.processQuery(PGConnectionContext.java:2256)
        at io.questdb.cutlass.pgwire.PGConnectionContext.parse(PGConnectionContext.java:1550)
        at io.questdb.cutlass.pgwire.PGConnectionContext.handleClientOperation(PGConnectionContext.java:415)
        at io.questdb.cutlass.pgwire.PGJobContext.handleClientOperation(PGJobContext.java:81)
        at io.questdb.cutlass.pgwire.PGWireServer$1.lambda$$0(PGWireServer.java:81)
        at io.questdb.network.AbstractIODispatcher.processIOQueue(AbstractIODispatcher.java:166)
        at io.questdb.cutlass.pgwire.PGWireServer$1.run(PGWireServer.java:106)
        at io.questdb.mp.Worker.run(Worker.java:116)
I will fix it. However, why is it still an active query?

Alex Pelagenko

08/18/2022, 1:03 PM
No, not really; the query would be before that, something using a regex.

gmaurice

08/18/2022, 1:04 PM
Yes, it’s a query with a regex 😉

Alex Pelagenko

08/18/2022, 1:04 PM
Can you share the query itself? It would be useful for fixing it.
1:05 PM
it’s probably a leak, that’s why it’s still active

gmaurice

08/18/2022, 1:10 PM
Sure:
SELECT
  timestamp as time,
  exchange,
  symbol,
  count() as trades
FROM
  table
WHERE
  symbol ~ '.*-?ETH' and
  timestamp BETWEEN '2022-08-17T13:05:43.981Z' AND '2022-08-17T19:05:43.981Z'
SAMPLE BY 1m
1:11 PM
This query is created and executed by Grafana.

Alex Pelagenko

08/18/2022, 1:15 PM
Thanks, here is the bug if you want to track it: https://github.com/questdb/questdb/issues/2441

gmaurice

08/18/2022, 1:33 PM
Thank you, subscribed.

Nicolas Hourcard

08/18/2022, 1:34 PM
Hi @gmaurice , would you be able to take us through your use case at a glance? thanks a lot
3:41 PM
thanks

gmaurice

08/18/2022, 8:36 PM
After restarting QuestDB with the right log configuration, I forgot to restart my data ingesters 😄 After a while, I restarted them (about 1.5 million trades) and QuestDB crashed a couple of minutes later with this critical error:
2022-08-18T20:17:49.500012Z C server-main unhandled error [job=io.questdb.cairo.sql.async.PageFrameReduceJob@4721d212, ex=
java.lang.NullPointerException: Cannot invoke "io.questdb.cairo.sql.Function.getBool(io.questdb.cairo.sql.Record)" because "filter" is null
        at io.questdb.griffin.engine.table.AsyncFilteredRecordCursorFactory.filter(AsyncFilteredRecordCursorFactory.java:167)
        at io.questdb.cairo.sql.async.PageFrameReduceJob.reduce(PageFrameReduceJob.java:175)
        at io.questdb.cairo.sql.async.PageFrameReduceJob.consumeQueue(PageFrameReduceJob.java:132)
        at io.questdb.cairo.sql.async.PageFrameReduceJob.run(PageFrameReduceJob.java:194)
        at io.questdb.mp.Worker.run(Worker.java:116)
]
The line just before it mentioned the same
java.lang.StringIndexOutOfBoundsException: String index out of range: 8
we talked about. I don't know if it's related.

Andrey Pechkurov

08/19/2022, 5:45 AM
@gmaurice could be related. BTW do you have the
pg.worker.count
config property set to a non-default value?

gmaurice

08/19/2022, 7:40 AM
No, I didn’t change the default. However, how can I find the default
server.conf
file? I reused the one I had for 6.4.3; maybe some things have changed between versions.

Andrey Pechkurov

08/19/2022, 7:54 AM
There should be no breaking changes in the default config file
7:54 AM
It should be located in
<qdb_root_dir>/conf
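To check whether a key such as `pg.worker.count` is explicitly set in a config file, a small sketch like the following can help. It assumes `server.conf` uses the plain `key=value` layout with `#` comments that `java.util.Properties` understands. The sample file content and the helper name are made up for illustration, and no default value is implied.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class ServerConfCheck {
    /** Returns the explicitly configured value, or null if the key is absent or commented out. */
    static String explicitValue(String confText, String key) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return p.getProperty(key);
    }

    public static void main(String[] args) {
        // Made-up file content: the first line is a comment, so the key counts as unset.
        String sample = "#pg.worker.count=2\nhttp.bind.to=0.0.0.0:9000\n";
        String value = explicitValue(sample, "pg.worker.count");
        System.out.println(value == null ? "pg.worker.count not set" : "pg.worker.count=" + value);
    }
}
```

In practice the text would come from the file under `<qdb_root_dir>/conf`, for example via `Files.readString`.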

gmaurice

08/19/2022, 8:19 AM
Thanks

Andrey Pechkurov

08/22/2022, 2:03 PM
@gmaurice we have fixed the leak and it should be shipped in the next patch release. In the meantime, could you share the string column value on which you were getting StringIndexOutOfBoundsException? Knowing your JDK version is also important, since this exception looks a lot like a bug in the standard Java library.

gmaurice

08/22/2022, 2:13 PM
🙏 You mean the JDK version of the client?

Andrey Pechkurov

08/22/2022, 2:14 PM
No, the server one
2:15 PM
Or maybe you're using the one we ship with the rt version of QuestDB?

gmaurice

08/22/2022, 2:22 PM
Yes, exactly, I’m using the Docker image you provide.

Andrey Pechkurov

08/22/2022, 2:30 PM
And do you happen to know which string value led to this exception?

gmaurice

08/23/2022, 3:16 PM
You mean the one that matches the regex
.*-?ETH
?

Andrey Pechkurov

08/23/2022, 4:27 PM
No, the one that led to the exception

gmaurice

08/23/2022, 4:36 PM
So you’re talking about the query string? If not, I’m unable to find the one you want.

Andrey Pechkurov

08/23/2022, 5:05 PM
No, about the string (column value) on which the regex matcher was throwing the exception

gmaurice

08/23/2022, 5:21 PM
OK, understood. Can I find it in the logs? I haven’t seen it so far. Otherwise, maybe I can do a distinct on the column, but I won’t be able to tell you which value led to the exception.

Andrey Pechkurov

08/23/2022, 5:23 PM
All distinct values would be just fine

gmaurice

08/23/2022, 5:26 PM
OK, the next time I get the exception I’ll do that; it should be soon 😉

Andrey Pechkurov

08/23/2022, 5:53 PM
Hopefully, this doesn't happen, but if it does, you know what to do 🙂

gmaurice

08/25/2022, 11:39 AM
Hello @Andrey Pechkurov, I got the exception, and here are the distinct values of the
symbol
column:
"ALT-USD-22U30"
      "BTC-USD"
      "BTC-USD-22U30"
      "BTC-USD-PERP"
      "BTC-USDT"
      "CEL-USD"
      "CEL-USD-22U30"
      "CEL-USD-PERP"
      "ETH-USD"
      "ETH-USD-PERP"
      "ETH-USDT"
      "EXCH-USD-22U30"
      "MID-USD-22U30"
      "PRIV-USD-22U30"
      "SHIT-USD-22U30"
      "SOL-USD"
      "SOL-USD-22U30"
      "SOL-USD-PERP"
      "SOL-USDT"
      "USDT-USD"
      "USDT-USD-22U30"
      "USDT-USD-PERP"

Andrey Pechkurov

08/25/2022, 11:51 AM
Hello, many thanks for the info! I'll try to reproduce it.
3:50 PM
Tried to reproduce it on both 6.5 Docker and questdb-6.5-rt-linux-amd64 and failed to do so. No exception whatsoever 😞
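For reference, a standalone check of the pattern from the query against the distinct values listed above can look like the sketch below. It uses `Matcher.find()`, the call that appears in the stack trace, and creates a fresh `Matcher` per value on a single thread. It only shows which symbols the pattern selects and is not expected to reproduce the exception. The class and method names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

final class SymbolRegexCheck {
    /** Returns the values in which the regex is found anywhere (unanchored search). */
    static List<String> matching(String regex, List<String> values) {
        Pattern p = Pattern.compile(regex);
        List<String> out = new ArrayList<>();
        for (String v : values) {
            // A new Matcher per value; nothing is shared between iterations.
            if (p.matcher(v).find()) {
                out.add(v);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> symbols = Arrays.asList(
                "ALT-USD-22U30", "BTC-USD", "BTC-USD-22U30", "BTC-USD-PERP", "BTC-USDT",
                "CEL-USD", "CEL-USD-22U30", "CEL-USD-PERP", "ETH-USD", "ETH-USD-PERP",
                "ETH-USDT", "EXCH-USD-22U30", "MID-USD-22U30", "PRIV-USD-22U30",
                "SHIT-USD-22U30", "SOL-USD", "SOL-USD-22U30", "SOL-USD-PERP", "SOL-USDT",
                "USDT-USD", "USDT-USD-22U30", "USDT-USD-PERP");
        // Prints: [ETH-USD, ETH-USD-PERP, ETH-USDT]
        System.out.println(matching(".*-?ETH", symbols));
    }
}
```

Because `find()` is unanchored and both `.*` and `-?` can match the empty string, `.*-?ETH` selects the same values as the plain pattern `ETH`.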
1:56 PM
Awesome to hear that!

gmaurice

08/31/2022, 1:57 PM
You’re welcome, you’re doing great stuff, you deserve it 🙂