KX Community

Find answers, ask questions, and connect with our KX Community around the world.
KX Community Guidelines

Home Forums kdb+ Partition Table Memory Usage

  • Partition Table Memory Usage

    Posted by jattwick on August 24, 2023 at 12:00 am

    I’m seeing some weird behaviour with querying a partition table (partitioned by month)

    This table has 414 columns, so each partition (if we select all columns is about 25 GB). This is a special table (not dealing with this many columns regularly).

    rawData:select from myPartitionTable where month in 2022.11m

    This query runs almost instantly, and when I check .Q.w[], it only uses 2GB. This was surprising to me, as I thought it would load the entire partition into memory given I didn’t select any specific columns, so I was expecting about 25GB of memory usage.

    However, I then try to run this:

    rawDataFiltered:select from rawData where date>2022.11.03

    and I ran out of memory. After playing around with this and increasing the memory limit, it seems like when running this query, it loads the entire partition into memory (my memory usage skyrockets to close to 25GB). Is what I’m describing correct? How does KDB behave with regards to loading partitions into memory?

    gyorokpeter-kx replied 9 months ago 3 Members · 3 Replies
  • 3 Replies
  • gyorokpeter-kx

    Member
    August 24, 2023 at 12:00 am

    There is some smart logic behind querying partitioned tables. If you filter on the partition column only, the data is actually not loaded into memory until you actually do some operations on it. So the first query doesn’t have to load all the data, just references to that specific partition. Then the second query does need to read in the data because you are querying an actual column in the table, so now q has to read the filtering column to find which rows match, and since you didn’t specify which columns to keep, it must also load the rest of the columns and filter them.

  • rocuinneagain

    Member
    August 24, 2023 at 12:00 am
  • jattwick

    Member
    August 24, 2023 at 12:00 am

    Thank you! Another related question – after doing the initial query on the partition, when I do

    100#rawData

    That still worked (and was fast, and did not increase the memory usage). If you’re saying we are not actually loading the partition into memory, shouldn’t this not work? Is there a way to partially load the columns in the partition?

Log in to reply.