-
Pagination on query from on-disk/partitioned data (HDB)
Hello All,
Paginating through a table when querying in batches is quite a useful feature in SQL-like DBs. Understand that kdb being a column-based database, it is not the same expected behaviour as in SQL.
In kdb+, however, there are a couple of similar implementations:
- limit-offset like behaviour using syntax shown here: https://code.kx.com/q/ref/select/#limit-expression
- using a virtual column i, which refers to the row number, as shown here: https://code.kx.com/q4m3/9_Queries_q-sql/#9332-the-virtual-column-i-in-where
The above 2 options have their own limitations. The first one only works on in-memory tables, and in the second one, any filters used are applied after row numbers (i) are selected, i.e. the number of required rows are filtered first and then the other where conditions are applied.
My use case: I am trying to do pagination over a HDB table that is partitioned on-disk.
One of the alternatives, although not ideal, I’ve found is to query the data from HDB first and once in memory, use the limit-offset like qsql query to achieve the pagination result, but this means, I have to query the entire dataset for every page. Other than creating a middle layer to store the entire data I need to page over in memory and create other APIs to query from this, I am happy to adopt any other more simpler approach.
Has anyone run into this problem and has found any solution or using any alternatives that they can share? Maybe its just me, but I feel like this could be a very helpful feature.
Thanks & Regards,
Sid
Log in to reply.