KX Community

Find answers, ask questions, and connect with our KX Community around the world.

Home Forums KX Academy Issue with Order in the by statement

  • Issue with Order in the by statement

    Posted by mn_12 on June 2, 2023 at 12:00 am

    Hi,

     

    I intended to write this as well in my original post but decided to do so here instead for better organization as it concerns the Queries Exercise notebook.

     

    trade is:

    time         sym  size price side exchange 
    ------------------------------------------ 
    05:41:51.079 MSFT 4877 6918  B    T        
    01:05:28.400 JPM  9380 1603  B    T        
    02:18:49.437 BP   544  9927  S    L    
    ..

    meta trade is:

    c       | t f a 
    --------| ----- 
    time    | t     
    sym     | s     
    size    | j     
    price   | j     
    side    | s     
    exchange| s

    quote is:

    time         sym  bid  ask  bidSize askSize exchange 
    ---------------------------------------------------- 
    04:38:41.495 BP   6741 6390 9355    655     T        
    07:41:35.446 MSFT 7613 8622 9019    3311    T        
    02:24:10.480 GE   1300 8483 3835    5549    T       
    ..

    meta quote is:

    c       | t f a 
    --------| ----- 
    time    | t     
    sym     | s     
    bid     | j     
    ask     | j     
    bidSize | j     
    askSize | j     
    exchange| s

     

    I understand that in general, order matters in the where statement.

    I am uncertain if order matters in the by statement for the following two questions and if there is any implication (apart from the order of the columns in the output) when it comes to performance for instance.

    If it does, I wonder if there is a general guideline as to how I am able to determine whether it matters and if so, the optimal order in the by statement.

    a.

    Select the last trade price for each sym in hourly buckets

    The provided solution is:

    select last price by hour:60 xbar time.minute, sym from trade

    (I am uncertain if the one below is acceptable and if there is any implication on say performance:

    select last price by sym, hour:60 xbar time.minute from trade)

    b.

    Select the max and min askSize in the quote table for each exchange in 2 hour buckets, excluding JPM

    The provided solution is:

    select maxAskSize:max askSize, minAskSize:min askSize by exchange, hour:120 xbar time.minute from quote where sym<>`JPM

    (I am uncertain if the one below is acceptable and if there is any implication on say performance:

    select maxAskSize:max askSize, minAskSize:min askSize by hour:120 xbar time.minute, exchange from quote where sym<>`JPM)

     

    Once again, I really appreciate any assistance with this issue.

     

    Thank you very much.

    mn_12 replied 2 months, 2 weeks ago 2 Members · 1 Reply
  • 1 Reply
  • rocuinneagain

    Member
    June 2, 2023 at 12:00 am

    Link to your original post for reference:

    You can test yourself and compare time and memory usage:

     

    //Try and time the queries to compare 
    \ts select last price by hour:60 xbar time.minute, sym from trade 
    \ts select last price by sym,hour:60 xbar time.minute from trade 
    
    //If they run very fast you can run them N number of times to get more accurate result 
    //Running 100 times here 
    \ts:100 select last price by hour:60 xbar time.minute, sym from trade 
    \ts:100 select last price by sym,hour:60 xbar time.minute from trade 
    
    //Attributes are useful in kdb+ 
    //Apply grouped on sym and test 
    update `g#sym from `trade 
    \ts:100 select last price by hour:60 xbar time.minute, sym from trade 
    \ts:100 select last price by sym,hour:60 xbar time.minute from trade 
    
    //Removed grouped and test again 
    update `#sym from `trade 
    \ts:100 select last price by hour:60 xbar time.minute, sym from trade 
    \ts:100 select last price by sym,hour:60 xbar time.minute from trade

     

     

Log in to reply.