KX Community

  • Key Value Store

    Posted by jlucid on January 2, 2024 at 12:00 am

    Wondering if anyone has ever tried to implement a dedicated key-value store in kdb+, something like LevelDB.

    I have a situation where users wish to perform a lookup by an alphanumeric string but I don’t know which date partition contains the associated record in advance. Clearly I need to avoid an exhaustive search across all date partitions. If I had a lookup of string to date that would help narrow the search.

    I’ve tried using a keyed table, stored as a flat file, but it doesn’t scale in terms of memory. I could hold the past month’s worth in memory, which would satisfy 90% of the queries, but I need something more general with constant lookup time. I’d also like to avoid having to introduce another technology.

  • 3 Replies
  • rocuinneagain

    Member
    January 4, 2024 at 12:00 am

    A splayed table on disk with an attribute on a column would be worth testing, as these can be memory-mapped rather than having to be held entirely in memory.

    https://code.kx.com/q/ref/set-attribute/#unique

    Using 1: to write an Anymap file also creates a mappable object worth exploring

    https://code.kx.com/q/releases/ChangesIn3.6/#anymap

    If a single splay/anymap would be too large, a fixed-size int-partitioned DB keyed on a fixed-range hash of the alphanumeric string could be used.
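
    The hash-to-partition idea can be sketched as follows. This is a language-neutral illustration written in Python rather than q, and it rests on assumptions not stated in the thread: the bucket count `NUM_BUCKETS`, the choice of MD5 as the hash, and the helper names `bucket_for` and `route` are all hypothetical.

```python
import hashlib

# Hypothetical fixed range: the number of int partitions to spread keys over.
NUM_BUCKETS = 1024


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a string key to a stable bucket number in [0, num_buckets)."""
    # MD5 is used only as a stable, well-mixed hash (an assumption, not a requirement).
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Take the first 8 bytes as an unsigned integer and fold into the fixed range.
    return int.from_bytes(digest[:8], "big") % num_buckets


def route(keys):
    """Group keys by the bucket (int partition) they would be stored in."""
    buckets = {}
    for k in keys:
        buckets.setdefault(bucket_for(k), []).append(k)
    return buckets
```

    Because the bucket depends only on the key, a lookup can recompute it and read a single int partition instead of scanning every date partition.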

  • jlucid

    Member
    January 4, 2024 at 12:00 am

    Thanks for the ideas, Rian. Yes, the single anymap file would be too large, but I could try distributing the keys across a set of int partitions, grouping them in some way, perhaps using a hash. That would reduce the search space. Then I could split the partitions again if they get too big.

    Another idea was having a Bloom or Cuckoo filter associated with each date partition, using that to determine if a string is definitely not present in a partition to avoid searching, but it’s not a native feature and I can’t find any examples of people using that.
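
    A minimal sketch of the per-partition Bloom filter idea, written in Python for illustration rather than q. The class `BloomFilter`, the helper `candidate_dates`, the filter size, and the hash count are all hypothetical choices, not something taken from kdb+ or from this thread.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str):
        # Derive several bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd so strides differ
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p >> 3] |= 1 << (p & 7)

    def might_contain(self, key: str) -> bool:
        # False means the key is definitely absent; True means "possibly present".
        return all(self.bits[p >> 3] & (1 << (p & 7)) for p in self._positions(key))


def candidate_dates(filters, key):
    """Return the dates whose filter cannot rule the key out."""
    return [d for d, f in filters.items() if f.might_contain(key)]
```

    With one filter per date partition, only the dates returned by `candidate_dates` need to be searched. Some of them may be false positives, so the real lookup still has to confirm the match.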

    For the LevelDB option, I see that I can compile the C++ library into a shared library and then load that into my q process. With that approach I am just writing a wrapper library for the main “Get” and “Put” methods, so it should be relatively quick to test against and use as a benchmark.

  • conor_mahony

    Member
    February 3, 2024 at 12:00 am

    This may not be for you, but it is worth noting that a really simple method of improving performance is to persist a guid representation of the string together with the original string, for example hashguid:{0x0 sv md5 x}. This doesn’t help with regex-type queries, of course. If the topic is of interest, see https://dataintellect.com/blog/methods-for-storing-text-data-on-disk-in-kdb/
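
    For readers less familiar with q, the hashguid one-liner takes the 16-byte MD5 digest of the string and reinterprets it as a guid. A rough Python equivalent is sketched below. The function name `hash_guid` is hypothetical, and UTF-8 encoding of the input is an assumption.

```python
import hashlib
import uuid


def hash_guid(text: str) -> uuid.UUID:
    """Turn a string into a fixed-width 16-byte identifier via its MD5 digest."""
    # MD5 always yields 16 bytes, which is exactly the width of a guid/UUID.
    return uuid.UUID(bytes=hashlib.md5(text.encode("utf-8")).digest())


# Hypothetical usage: look records up by the fixed-width guid, then confirm
# against the original string to guard against the (unlikely) hash collision.
index = {hash_guid(s): s for s in ["ABC123", "XYZ789"]}
found = index.get(hash_guid("ABC123"))
```

    The gain comes from comparing fixed-width 16-byte values instead of variable-length strings, while the original string is kept alongside for verification and display.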
