KX Community

Find answers, ask questions, and connect with our KX Community around the world.

Home Forums kdb+ Read JSON file output in KDB

  • Read JSON file output in KDB

    Posted by kpc on December 21, 2023 at 12:00 am

    Hello Team,

    I have attached sample JSON file.

    I have created a python code which reads json file as an input using pandas dataframe and returns dataframe contains information about all the employees.

    requirement is  i need to load output(dataFrame) of python script as an input to KDB function.  i have used  .p.get function to run the python function in KDB but the output i am getting is  “dynamicLoad“. 

    Reading JSON file can be done directly in KDB but I wanted it to keep the reading json in python and accessing data in KDB to reduce workload.

     


    Sample Code:
    python code:
    import pandas as pd;
    import json
    def readJsonFile(fp):
    with open(fp) as data_file:
    data = json.load(data_file)
    res = pd.json_normalize(data);
    return res

     

    KDB code

    readJSONFile:.p.get[“readJsonFile;<];

    out:readJSONFile[fp];

    Tried as like below

     

    Can someone please hhelp me.

     

    Thanks in advance..!!!

    kpc replied 2 months ago 2 Members · 2 Replies
  • 2 Replies
  • rocuinneagain

    Member
    December 21, 2023 at 12:00 am

    PyKX under q is replacing embedPy and is recommended if you are starting a new project (*Except for on Windows as support is still in development for this)

    https://code.kx.com/pykx/2.2/pykx-under-q/intro.html

    Install it with:

    pip install pykx 
    python -c "import pykx;pykx.install_into_QHOME()"

    Then you can load in your python code (I placed it in a file called functions.py):

    q)\l pykx.q 
    q)loadPy:{.pykx.pyexec"exec(open('",(1_ string x),"').read())"} 
    q)loadPy`:functions.py 
    q)readJSONFile:.pykx.get[`readJsonFile;<] 
    q)readJSONFile`sampleJSONFile.json 
    Employees 
    ------------------------------------------------------------------------------- 
    +`userId`jobTitleName`firstName`lastName`preferredFullName`employeeCode`regio.. 
    
    q)(readJSONFile`sampleJSONFile.json)[0;`Employees] 
    
    userId jobTitleName firstName lastName preferredFullName employeeCode region phoneNumber emailAddress 
    --------------------------------------------------------------------------------------------------------------------- 
    rirani Developer    Romin     Irani    Romin Irani       E1           CA     408-1234567 romin.k.irani@gmail.com 
    nirani Developer    Neil      Irani    Neil Irani        E2           CA     408-1111111 neilrirani@gmail.com 
    thanks Program Directory Tom  Hanks    Tom Hanks         E3           CA     408-2222222 tomhanks@gmail.com

     

    One  issue to watch for here though is all the data being brought back as symbols to q. For the lifetime of a q process all unique symbols are interned in memory and cannot be garage collected. You would need to update your code to convert columns to suitable datatypes.  Text data should mostly be passed to q as bytes so it is converted to type C in q.

    meta (readJSONFile`sampleJSONFile.json)[0;`Employees] 
    c                | t f a 
    -----------------| ----- 
    userId           | s 
    jobTitleName     | s 
    firstName        | s 
    lastName         | s 
    preferredFullName| s 
    employeeCode     | s 
    region           | s
    phoneNumber      | s 
    emailAddress     | s

     

    Using q’s inbuilt JSON parser is simple and default textual data to type C which prevents any issues with unique symbols.

    q).j.k raze read0`:sampleJSONFile.json

    Employees| +`userId`jobTitleName`firstName`lastName`preferredFullName`employe..

    q)(.j.k raze read0`:sampleJSONFile.json)`Employees

    userId jobTitleName firstName lastName preferredFullName employeeCode region phoneNumber emailAddress 
    ----------------------------------------------------------------------------------------------------------------------------- 
    "rirani" "Developer" "Romin" "Irani" "Romin Irani" "E1" "CA" "408-1234567" "romin.k.irani@gmail.com" 
    "nirani" "Developer" "Neil" "Irani" "Neil Irani" "E2" "CA" "408-1111111" "neilrirani@gmail.com" 
    "thanks" "Program Directory" "Tom" "Hanks" "Tom Hanks" "E3" "CA" "408-2222222" "tomhanks@gmail.com" 
    
    meta (.j.k raze read0`:sampleJSONFile.json)`Employees 
    c                | t f a 
    -----------------| ----- 
    userId           | C 
    jobTitleName     | C 
    firstName        | C 
    lastName         | C 
    preferredFullName| C 
    employeeCode     | C 
    region           | C 
    phoneNumber      | C 
    emailAddress     | C
    
    
    

     

  • rocuinneagain

    Member
    December 21, 2023 at 12:00 am

    If you want to complete this with embedPy you will need the df2tab utility function

    ml/util/utilities.q at master KxSystems/ml (github.com)

     

    You can install the library following:

    https://github.com/KxSystems/ml/tree/master?tab=readme-ov-file#installation

     

    Or the function can also be defined stand alone if you prefer:

    https://stackoverflow.com/a/51612291/4256419

     

     

     

    //You cannot pass < as there is no automatic way to convert a dataframe to q readJSONFile:.p.get[`readJsonFile];

    //Pass the embedPy foreign to df2tab to do the conversion out:df2tab[readJSONFile[fp]];

     

     

     

Log in to reply.