KX Community

Find answers, ask questions, and connect with our KX Community around the world.

Home Forums kdb+ select avg (ignore null)

  • select avg (ignore null)

    Posted by powerpeanuts on June 29, 2022 at 12:00 am

    select avg[col1;col2] from t

    would give average of col1 and col2 including null. What is the “ignoring null” version of it? Thanks.

    powerpeanuts replied 2 months ago 3 Members · 4 Replies
  • 4 Replies
  • davidcrossey

    Member
    June 29, 2022 at 12:00 am

    Hi powerpeanuts,

    As per this note avg does ignore nulls for a single vector, so you could apply the average to each pair of values from each column as follows:

    q)t:([]c1:`a`b`c;c2:1 0N 2;c3:5 3 5) 
    q)update c4:avg each flip (c2;c3) from t 
    c1 c2 c3 c4 
    ------------ 
    a  1  5  3 
    b     3  3 
    c  2  5  3.5

    Hope this helps

  • powerpeanuts

    Member
    June 29, 2022 at 12:00 am

    perfect, exactly what I am looking for. Thanks David.

  • davidcrossey

    Member
    June 30, 2022 at 12:00 am

    avg/min/max/sum where implemented similar to SQL92 as per the note here.

    From what I gather, there are some optimizations available for a vector that aren’t available for a slice, hence the functionality we see today, and trying to change this would be difficult now as you might, (and probably will) see different results in an existing system.

  • darrenwsun

    Member
    June 30, 2022 at 12:00 am

    Thanks David for following up.

    Ignoring null makes perfect sense, especially for avg/max/sum. Imagine otherwise, the result would be null as long as the input contains one null, effectively forcing everyone to explicitly coalesce.

    I understand such change is not backward compatible, and given the ability to manually exclude null like what you shared, I’d agree it’s not worth doing it. However I’m curious if we could leave legacy burden behind, what would be the decision from language designers.

Log in to reply.