KX Community

  • differ applied on each day rather than the entire dataset

    Posted by user931206 on January 12, 2024 at 12:00 am

    I am trying to run this query:

    select diffDate:differ startDate from tab where date within(.z.d-2;.z.d-1)

     

    I would expect this result:

    publishTime sym startDate difD
    2024.01.10D00:00:01.014000000 test1 2024.01.11 1
    2024.01.10D00:00:03.541000000 test1 2024.01.11 0
    2024.01.10D00:01:46.222000000 test1 2024.01.11 0
    2024.01.10D00:02:14.276000000 test1 2024.01.12 1
    2024.01.10D00:02:22.306000000 test1 2024.01.12 0
    2024.01.10D23:59:58.366000000 test1 2024.01.12 0
    2024.01.11D00:00:00.875000000 test1 2024.01.12 0
    2024.01.11D00:00:02.378000000 test1 2024.01.12 0
    2024.01.11D00:00:35.445000000 test1 2024.01.12 0
    2024.01.11D00:02:02.623000000 test1 2024.01.15 1
    2024.01.11D00:02:06.133000000 test1 2024.01.15 0

     

     

    However, I get this instead (note the difD column at the day rollover, in the first row for 2024.01.11):

    publishTime sym startDate difD
    2024.01.10D00:00:01.014000000 test1 2024.01.11 1
    2024.01.10D00:00:03.541000000 test1 2024.01.11 0
    2024.01.10D00:01:46.222000000 test1 2024.01.11 0
    2024.01.10D00:02:14.276000000 test1 2024.01.12 1
    2024.01.10D00:02:22.306000000 test1 2024.01.12 0
    2024.01.10D23:59:58.366000000 test1 2024.01.12 0
    2024.01.11D00:00:00.875000000 test1 2024.01.12 1
    2024.01.11D00:00:02.378000000 test1 2024.01.12 0
    2024.01.11D00:00:35.445000000 test1 2024.01.12 0
    2024.01.11D00:02:02.623000000 test1 2024.01.15 1
    2024.01.11D00:02:06.133000000 test1 2024.01.15 0

     

    Is this expected? I thought differ and deltas were applied to the entire startDate array, not to each day separately.

  • rocuinneagain

    Member
    January 12, 2024 at 12:00 am

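    differ is not one of the aggregations that kdb+ automatically map-reduces across partitions:

    https://code.kx.com/q4m3/14_Introduction_to_Kdb+/#1437-map-reduce

    In this case the operation is applied once per partition. Because the first item of differ's result is always 1b, the first row of each date is flagged even when startDate has not changed from the previous day.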

     

    To have it operate on the full result set, change your query so that the inner select reads the data from disk untouched and the outer select then applies differ in memory:

    select diffDate:differ startDate from select startDate from tab where date within(.z.d-2;.z.d-1)
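
    A minimal in-memory sketch of the difference, assuming two hypothetical vectors d1 and d2 that stand in for the startDate column of two consecutive date partitions (they are illustrative values, not read from the actual table):

    ```q
    / hypothetical stand-ins for startDate in two date partitions
    d1:2024.01.11 2024.01.11 2024.01.12 2024.01.12
    d2:2024.01.12 2024.01.12 2024.01.15

    / differ applied per partition, results joined afterwards:
    / the first item of every partition is flagged
    raze differ each (d1;d2)     / 1010101b

    / differ applied once to the combined vector:
    / only genuine changes are flagged
    differ d1,d2                 / 1010001b
    ```

    The two results differ only at the fifth item, the first row of the second partition, which matches the unexpected 1 at the day rollover in the question.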

     

  • user931206

    Member
    January 12, 2024 at 12:00 am

    Thanks, that link is what I was looking for.
    It is surprising that the documentation does not make this clear where it covers deltas/differ.
