
Sebastián Torrealba

04/14/2023, 3:18 PM
Hi guys, any thoughts on efficiently exporting a table (2.5B records) to a CSV file?

javier ramirez

04/14/2023, 4:00 PM
For storage efficiency, I cannot think of anything other than paginating over the REST API. For time efficiency, maybe launch multiple queries in parallel for non-overlapping time segments that align with your partitions. You would end up with multiple files in this case that you would need to concatenate.
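A rough sketch of that parallel-slice approach (assuming QuestDB's /exp CSV export endpoint on the default port 9000; the table name, timestamp column, and date range are placeholders):
```python
# Sketch: export non-overlapping daily slices in parallel via QuestDB's
# /exp CSV endpoint, then concatenate the parts.
import concurrent.futures
import shutil
from datetime import date, timedelta

import requests

EXP_URL = "http://localhost:9000/exp"  # QuestDB REST CSV export endpoint

def export_day(day: date) -> str:
    """Stream one day's rows into its own CSV part file."""
    nxt = day + timedelta(days=1)
    # "trades" and "timestamp" are placeholder table/column names
    query = f"SELECT * FROM trades WHERE timestamp >= '{day}' AND timestamp < '{nxt}'"
    path = f"trades_{day}.csv"
    with requests.get(EXP_URL, params={"query": query}, stream=True) as r:
        r.raise_for_status()
        with open(path, "wb") as f:
            for chunk in r.iter_content(chunk_size=1 << 20):
                f.write(chunk)
    return path

days = [date(2023, 4, 1) + timedelta(days=i) for i in range(14)]
with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(export_day, days))

# Concatenate the parts, keeping the CSV header from the first file only
with open("trades_full.csv", "wb") as out:
    for i, part in enumerate(parts):
        with open(part, "rb") as f:
            if i > 0:
                f.readline()  # skip the repeated header row
            shutil.copyfileobj(f, out)
```
Slicing on the designated timestamp keeps each query within its own partitions, so the parallel exports don't scan the same data twice.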

Sebastián Torrealba

04/14/2023, 4:47 PM
We did the parallel paginated queries against the REST API, but we were looking for something at the file level

Henri Asseily

04/14/2023, 8:23 PM
Why would you export to CSV in the first place?

Sebastián Torrealba

04/14/2023, 8:26 PM
As Julia UDFs don't seem to be coming in the near future, we will do the heavy computation on BigQuery and then load the results back into QuestDB

javier ramirez

04/17/2023, 9:15 AM
Very interesting use case, Sebastián. I will raise this with the core team, as we have recently talked about both the possibility of having some sort of in-database processing and exploring portable formats like Apache Arrow. No final decisions or dates at the moment, but your use case can certainly inspire us 🙂
Depending on your experience, using Spark with QuestDB might also be an alternative to BigQuery, so you could have the read/processing/write-back in a single Spark app (see the sketch below): https://dev.to/glasstiger/integrate-apache-spark-and-questdb-for-time-series-analytics-3i3n
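A minimal sketch of that round trip in PySpark, assuming QuestDB's PostgreSQL wire protocol on its default port 8812 with the default admin/quest credentials; the table names and the hourly aggregation are placeholders:
```python
# Sketch: Spark reads from QuestDB over the PostgreSQL wire protocol,
# runs the heavy computation, and writes the result back.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("questdb-roundtrip")
    # Pull the PostgreSQL JDBC driver onto the classpath
    .config("spark.jars.packages", "org.postgresql:postgresql:42.5.4")
    .getOrCreate()
)

jdbc_url = "jdbc:postgresql://localhost:8812/qdb"
props = {"user": "admin", "password": "quest", "driver": "org.postgresql.Driver"}

# Read the source table ("trades" is a placeholder name)
df = spark.read.jdbc(jdbc_url, "trades", properties=props)

# Placeholder for the heavy computation: hourly average price
hourly = (
    df.groupBy(F.window("timestamp", "1 hour"))
      .agg(F.avg("price").alias("avg_price"))
      .select(F.col("window.start").alias("ts"), "avg_price")
)

# Write the derived table back to QuestDB
hourly.write.jdbc(jdbc_url, "trades_hourly", mode="append", properties=props)
```
For a 2.5B-row table you would also want to pass partitioning options to the JDBC read so Spark splits the scan across executors, in the same spirit as the parallel CSV export above.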

Sebastián Torrealba

04/17/2023, 2:15 PM
Thanks @javier ramirez, will have a look at Spark