-
What is the role played by key columns in a keyed table [ query join/performance ] ?
I would like to understand what is the underlying optimization or performance tweak that kdb+/q is able to extract when we apply key on one/more columns in the table. Key serves the purpose of helping join 2 or more tables via row to row match using the keys. Internally the keyed table is stored as a map [ key value pair ]. The keyed columns are the key and the rest of unkeyed columns make the payload/data tuple. since it is a map querying the map for a particular key must be faster than an unkeyed table. The keys are basically nodes on a BST ( binary search tree ) which should look up in O(log(n)) time given n is size of the table. An unkeyed table on the other hand is an aggregate of rows [ an array basically ]. Hence searching it or querying should be O(n). But when we run a simple test with some data there is no difference in speed between querying unkeyed vs keyed table at all. Why is this ? In Fact the keyed table takes slightly more time to lookup a particular record. I do understand that the table may not be sorted in memory based on the keys but still with a dictionary type keyed table should a faster search be not possible ?
Log in to reply.