KX Community

Find answers, ask questions, and connect with our KX Community around the world.

Home Forums kdb+ Performance using a table to combine where phrases

  • Performance using a table to combine where phrases

    Posted by eohara_kdb on July 6, 2023 at 12:00 am

    Hi all,

    Passing a table in to a where clause instead of the individual values significantly affects performance, even compared to using & to join all the where phrases into one. What is the query doing under the covers?

    1#.eoh.f date 
    SRC PRICE_TYPE INDEX1 INDEX2 CONTRACT1 PROFILE CONTRACT_TYPE 
    -------------------------------------------------------------------------- 
    2023.06.28 market dev JPM 2023.10.M BLA 
    // PUBLICATIONS is date-partitioned, and doesn't have attributes on the columns we filter on 
    ts select from PUBLICATIONS where ([]date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f 
    1176 177568 
    ts select from PUBLICATIONS where date=2023.06.28,src=`market,PRICE_TYPE=`dev,INDEX1=`JPM,INDEX2=`,CONTRACT1=`2023.10.M,PROFILE=`,CONTRACT_TYPE=`BLA 
    5 2101760 
    ts select from PUBLICATIONS where (date=2023.06.28)&(src=`market)&(PRICE_TYPE=`dev)&(INDEX1=`JPM)&(INDEX2=`)&(CONTRACT1=`2023.10.M)&(PROFILE=`)&CONTRACT_TYPE=`BLA 
    531 179664 
    ts select from PUBLICATIONS where (date in 2023.06.28)&(SRC in `market)&(PRICE_TYPE in `dev)&(INDEX1 in `JPM)&(INDEX2 in `)&(CONTRACT1 in `2023.10.M)&(PROFILE in `)&CONTRACT_TYPE in `BLA 
    504 179664

    Using kdb+ 3.5 2017.10.11

    Thanks,

    Eoghan

    eohara_kdb replied 2 months, 1 week ago 3 Members · 2 Replies
  • 2 Replies
  • davidcrossey

    Member
    July 6, 2023 at 12:00 am

    I would presume optimizations such as left to right sub-clause filtering as mentioned here https://code.kx.com/q4m3/9_Queries_q-sql/#9333-multiple-where-subphrases are lost when filtering using a table/dict lookup.

    i.e. we need to check all of the columns specified in the lookup simlataneouly, instead of passing each consective filter to the next sub-phrase.

    To visualize differently:

    q)show tab:([]c1:`a`b`c;c2:1 2 3) c1 c2 ----- a 1 b 2 c 3 q)tab in 1#tab 100b q)select from tab where tab in 1#tab c1 c2 ----- a 1

    Also worth having a look at the parse trees:

    q)parse "select from PUBLICATIONS where date=2023.06.28,src=`market,PRICE_TYPE=`dev,INDEX1=`JPM,INDEX2=`,CONTRACT1=`2023.10.M,PROFILE=`,CONTRACT_TYPE=`BLA" ? `PUBLICATIONS 
    ,((=;`date;2023.06.28);(=;`src;,`market);(=;`PRICE_TYPE;,`dev);(=;`INDEX1;,`JPM);(=;`INDEX2;,`);(=;`CONTRACT1;,`2023.10.M);(=;`PROFILE;,`);(=;`CONTRACT_TYPE;,`BLA)) 0b () 
    q)parse "select from PUBLICATIONS where ([]date;SRC;PRICE_TYPE;INDEX1;INDEX2;CONTRACT1;PROFILE;CONTRACT_TYPE) in 1#.eoh.f" ? `PUBLICATIONS 
    ,,(in;(+:;(!;,`date`SRC`PRICE_TYPE`INDEX1`INDEX2`CONTRACT1`PROFILE`CONTRACT_TYPE;(enlist;`date;`SRC;`PRICE_TYPE;`INDEX1;`INDEX2;`CONTRACT1;`PROFILE;`CONTRACT_TYPE)));(#;1;`.eoh.f)) 0b ()

     

  • gyorokpeter-kx

    Member
    July 6, 2023 at 12:00 am

    If you have a list of where clauses separated by a comma, it is executed sequentially, successively reducing the number of rows processed. On the other hand if you use the & operator, all operations will run on the full table.

Log in to reply.