KX Community

Find answers, ask questions, and connect with our KX Community around the world.
KX Community Guidelines

Home Forums KX Academy Issue with Window Join

  • Issue with Window Join

    Posted by mn_12 on July 8, 2023 at 12:00 am

    Hi,

     

    I intended to write this as well in my previous post but decided to do so in a new post here as it concerns the Practical Guidance – Joins notebook.

     

    In the Practical Guidance – Joins notebook:

    show t:([]sym:3#`JPM;time:09:30:01 09:30:04 09:30:08;price:120 123 121) is:

    sym time     price 
    ------------------ 
    JPM 09:30:01 120   
    JPM 09:30:04 123   
    JPM 09:30:08 121

    show q:([]sym:10#`JPM;time:asc 09:30:00+10?8;ask:10?90+til 20;bid:10?90+til 20) is:

    sym time     ask bid 
    -------------------- 
    JPM 09:30:00 102 107 
    JPM 09:30:00 91  100 
    JPM 09:30:01 93  93  
    JPM 09:30:02 102 108 
    JPM 09:30:03 91  93  
    JPM 09:30:04 96  108 
    JPM 09:30:05 102 106 
    JPM 09:30:05 103 99  
    JPM 09:30:07 97  101 
    JPM 09:30:07 95  102

    t[`time]  is:

    09:30:01 09:30:04 09:30:08

    show windows: -2 1+: t[`time]  is:

    09:29:59 09:30:02 09:30:06 
    09:30:02 09:30:05 09:30:09

    wj[windows;`sym`time;t;(q;(max;`ask);(min;`bid))] is:

    sym time     price ask bid 
    -------------------------- 
    JPM 09:30:01 120   102 93  
    JPM 09:30:04 123   103 93  
    JPM 09:30:08 121   103 99

     

    I am unable to figure out why the result of the window join has 103 for ask and 99 for bid in the last row, instead of 97 for ask and 101 for bid.

    My comprehension is as follows:

    To determine both the ask and bid values in the last row, we need to look at the rows in q where the time values fall between 09:30:06 and 09:30:09 inclusive. Only the last two rows satisfy this condition.

    We then determine the higher ask value (97) and the lower bid value (101) among these two rows.

     

    This approach works for the first two rows in the result of the window join.

    (For the first row, look at the rows in q where the time values fall between 09:29:59 and 09:30:02 inclusive. Only the first four rows satisfy this condition. Then determine the highest ask value (102) and the lowest bid value (93) among these four rows.

    For the second row, look at the rows in q where the time values fall between 09:30:02 and 09:30:05 inclusive. Only the fourth to the eighth rows satisfy this condition. Then determine the highest ask value (103) and the lowest bid value (93) among these five rows.)

     

    It is my first time learning about the window join and I am uncertain if my approach is right.

     

    Once again, I really appreciate any assistance or clarification with this issue.

     

    Thank you very much.

    mn_12 replied 9 months, 3 weeks ago 3 Members · 2 Replies
  • 2 Replies
  • vkennedy

    Member
    July 10, 2023 at 12:00 am

    Hi,

    For wj, the prevailing quote on entry to the window is considered valid as quotes are a step function.

    See this link for more details.

    Window join | Reference | kdb+ and q documentation – Kdb+ and q documentation (kx.com)

  • megan_mcp

    Administrator
    July 11, 2023 at 12:00 am

    Hi , just to add to ‘s reply;

    • It’s because wj uses the last-most quote out side of the window as well as quotes in the window (that last-most quote is called the “prevailing” quote in the notes).  If this is not the desired result and you strictly want stuff in the window only, you use wj1

    If I use your example to explain:

    t:([]sym:3#`JPM;time:09:30:01 09:30:04 09:30:08;price:120 123 121) 
    q:([]sym:10#`JPM;time:asc 09:30:00,09:30:00,09:30:01,09:30:02,09:30:03,09:30:04,09:30:05,09:30:05,09:30:07,09:30:07;ask:(102,91,93,102,91,96,102,103,97,95);bid:(107,100,93,108,93,108,106,99,101,102);rn:1+til 10) 
    windows: -2 1+: t[`time] 
    wj[windows;`sym`time;t;(q;(max;`ask);(min;`bid);((::);`rn))]
    • That last-most quote only is taken when there is no quote right at the exact window start time (you can see this in the second row of the result, we had a quote exactly at window start time 09:30:02, so the last-most quote outside the window isn’t included in that case, for the third row it was because no quote exactly at 09:30:06)

Log in to reply.