KX Community

Find answers, ask questions, and connect with our KX Community around the world.

Home Forums kdb+ HDB – map sym to int for partitioning

  • HDB – map sym to int for partitioning

    Posted by erichards on February 20, 2023 at 12:00 am

    Our access patterns are heavily based upon accessing one sym for a long history, rather than accessing everything for a given date.

    As such it would be massively more efficient for us to partition by sym rather than by date. Or even to maintain two HDBs with both partition types.

    While KDB doesn’t natively support partitioning by sym, it sounds like this is possible if we map this to an int ourselves.

    • Is this a well-trodden path?
    • What would be the best way to maintain this sym-to-int mapping? Could we re-use the indices in the sym file?
    • How could we ensure this mapping isn’t corrupted? Perhaps in a similar way to how .Q.en locks the sym file during writes?
    • Is there an upper limit to the integer that can be used for partitioning?

    Many thanks

    erichards replied 2 months, 1 week ago 2 Members · 2 Replies
  • 2 Replies
  • davidcrossey

    Member
    February 23, 2023 at 12:00 am

    Hi erichards,

    You could partition by int in this case – note, a sym is just a mapping to an integer value anyway, so you could use that for your partitioning. A blog from Aqua-Q here that discusses how this is applicable to IoT, and how you may manage the mapping as an example: kdb+ IoT Database Structure | AquaQ

    In terms of upper limit, I presume that would be the max integer value, but my best advice would be to mock up a database with some int partitions to see how performant it is and any limitations that come with it for your use-case.

    One consideration to keep in mind is that depending on your data read/write access patterns, some partitions could grow exponentially large, in which case you may want to add another layer of abstraction to your write down – e.g. int = year+sym for better query distribution later.

    Hope this helps.

  • erichards

    Member
    February 24, 2023 at 12:00 am

    Thank you David, it is re-assuring to know that we are not following an anti-pattern or something too bespoke.

     

    With a small demo we have found a sym-partitioned db to be much more performant for our access patterns.

     

    I had not considered the partition being a combination of sym + time, that might come in useful.

Log in to reply.