KX Community

  • How long should a name be?

    Posted by sjt on June 11, 2022 at 12:00 am

    How long should a variable or function name be?

    A well-established tradition encourages us to use clear, meaningful names; names that explain what the variable contains, what the function does.

    Here is an alternative strategy: don’t.

    A century ago, Bertrand Russell thought words denote whatever they denote by being some kind of compressed description. (His colleagues in linguistic philosophy soon devised better theories of reference.) Early programming languages commonly limited token names to e.g. eight characters. At its launch in 1981, Smalltalk abandoned the restriction with glee, hoping people might be able to use the language without formal training. It was some kind of a thrill then to see names such as ProcessMessage and incomingMessageQueue. (The tradition endures in Java; perhaps not the thrill.) In these we have descriptions without even the compression.

    The Iverson College meetings often included a few guests curious about the APL family of languages. At Oxford in 2014 Arthur Whitney demonstrated his nascent operating system kOS and the text editor he had written for it: four lines of k. A guest asked him if the code would not be clearer with longer, explanatory variable names. Whitney thought briefly, then replied, “No.” Blogging later, our guest wrote how someone else had afterwards helped her see that long variable names tend to obscure the transformations between them, which are more interesting.

    But I have a different quarrel with explanatory names: the explanation. Reading English engages a part of my brain quite distinct from what I use for math. It’s the story-telling part; it responds to poetry and explanations. The word incomingMessageQueue paws at my mind, demanding attention like a needy pet. I see people queuing. I think about messages. I think about people who send me messages, and about how messages travel. English is a wonderful medium for essays and poetry. But there is a reason mathematicians devise notations to abstract irrelevant detail away. In math we say (over and over again) x rather than “the number you first thought of”. For thinking mathematically, we need something bare, something without distracting associations.

    So not English words. (Or whatever language you think in.)

    Smalltalk’s desire for code that any English speaker could read was understandable but treacherous. English words suggest transformations or relationships that might mislead. For example, when I flip a coin it reverses a random number of times and lands heads or tails with equal probability. When I flip a matrix it transposes exactly once. As the guest on a recent Array Cast episode said, when you use an English-like reserved word you might need to reflect on whether you are getting what you think you are getting; but a symbol such as ⍉ (the APL symbol for q’s flip) leaves no uncertainty.

    So how long should a q name be?

    A strong hint comes from the default argument names. X has always marked the spot, stood for the unknown, so x is a function’s first argument; and y and z follow along. Single-letter names are fine for short functions anyway, perhaps up to three lines long.

    Keep in mind: naming a value is a request to your reader to remember the association. If you create a local variable, you ask your reader to remember the name/value association for the rest of the function. Be gentle:

    • Avoid setting and reading a variable on the same line if you have no further use for it. Instead see if you can avoid it with a tacit (or point-free) expression. For example, 1 reverse\ rather than (a;)reverse a: in front of the value. Perhaps you can use a lambda to limit the scope of the variable: {(x;reverse x)}. Worst case, have one throwaway name that you use only for ephemeral values, set and consumed almost immediately. (I use niq for n’importe quoi, but whatever.)
    • Set a variable once only: within a function have a name mean just one thing. The obvious exception is a series of amendments: updates to a table; amends to a list; building up an HTML document.
    • The distance in your code between setting and reading a variable is how long your reader has to remember the association. Keep those distances short.
    • Use a name that recalls but does not rehearse the value associated with it. For example, imq rather than incomingMessageQueue.
    • Some variables form natural groups. For example, you might partition a message into a header and a body. Give them names of the same length.
      head:msg@…
      body:msg@…
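    The first bullet can be sketched in q as follows. This is only an illustration: msg is a made-up sample value, and tac and lam are hypothetical names chosen here to hold the two results for comparison.

```q
/ sample value, assumed for illustration only
msg:"hello"

/ tacit: 1 reverse\ applies reverse once and keeps the starting value,
/ so no intermediate name is needed
tac:1 reverse\msg

/ lambda: x exists only inside the braces
lam:{(x;reverse x)} msg

/ both give the pair ("hello";"olleh")
tac~lam
```

    Either form spares the reader a name that would be set and consumed on the same line.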

    For me the sweet spot for names is 2-4 letters. My incoming message queue can be imq or msgs, with the expanded version in the comments.

     

    imq: …               / incoming message queue
    pim:{[msg]           / process incoming message
      ..
      }
    pim each imq

     

    That’s my sweet spot. What works for you?


    Further reading: Three Principles of Coding Clarity

  • 2 Replies
  • sjt

    Member
    March 19, 2024 at 10:55 am

    When I joined the ranks of APL instructors years ago it was well known that students learned APL faster if they had not been exposed to other programming languages. (Iverson always warned against confusing unfamiliar and difficult.)

    Back in the day, we supposed that evidence of the superiority of Iverson’s notation. Other languages were rubbish; one day all software would be written in APL.

    Nowadays we know better. (I have written an application GUI in APL. I intend it to remain a once-in-a-lifetime experience.) We need to be polyglots. Nothing bad in that: humans have been polyglots since the Old Stone Age. (Only in recent centuries have linguistic communities grown large enough to host monoglots.)

    Which leaves us with the barrier of the unfamiliar. Q widened access to kdb+ by letting users exploit their knowledge of SQL. KX Insights is doing more by providing access through Python and Web UI. But for those who need to write in q there remains the transition from what (in a recent Array Cast episode) Joel Kaplan called the “one-potato-two-potato” approach to (what shall we call it?) vector thinking.

    A recent article in the RSA Journal on lifelong education stresses the challenge of unlearning as part of learning new skills.

    What helped you learn vector thinking?

    Last year I led an online workshop on vector thinking. We explored vector solutions in q to a small but non-trivial problem. Participants found it helped them, and I promised to hold another one. Perhaps it’s time? (Respond here to encourage me.)

  • darrenwsun

    Member
    March 19, 2024 at 11:01 am

    As a dev using this language and a practitioner of the technology, I’m proud of its unprecedented performance and expressiveness in manipulating data. But when someone from another tech background complains about the steep learning curve and the difficulty of reading (most) q code, I usually remain silent: I felt the pain, and sometimes I still feel it, at how succinctly one can do something in q with very short names and so many chained expressions.

     

    Don’t get me wrong: the language is beautiful, especially when viewed from a mathematical perspective. It’s just that q lives in an era and ecosystem where a complex system usually involves multiple languages and stacks, and the others take a very different view of what readable or clean code means.
