KX Community

  • Climate Data

    Posted by joelpoplin on September 27, 2023 at 12:00 am

    I’m curious whether kdb+/q would be an appropriate solution for spatial time series data. Climate data is typically 4-D and can be quite large. I would like to be able to do some in-memory computations and also query at much faster speeds than the current standard structures allow (GRIB or NetCDF files with Python xarray).

  • 1 Reply
  • rocuinneagain

    Member
    September 29, 2023 at 12:00 am

    Yes, it could be used.

    To test this, you could look at PyKX, which provides an easy Python interface.

    A two-minute example of passing a Dataset in to q is shown below.

    PyKX supports registering custom conversions, so you could create a function that passes the Dataset to q in exactly the form you want, instead of passing it all as a dictionary as in my example.

    >>> import pykx as kx
    >>> import xarray as xr
    >>> import numpy as np
    >>> import pandas as pd
    >>> ds = xr.Dataset(
    ...     {"foo": (("x", "y"), np.random.rand(5, 5))},
    ...     coords={
    ...         "x": [10, 20, 30, 40, 50],
    ...         "y": pd.date_range("2000-01-01", periods=5),
    ...         "z": ("x", list("abcde")),
    ...     },
    ... )
    >>> kx.q['ds'] = kx.toq(ds.to_dict())
    >>> kx.q('ds')
    pykx.Dictionary(pykx.q('
    coords   | `x`y`z!+`dims`attrs`data!((,`x;,`y;,`x);(()!();()!();()!());(10 20..
    attrs    | ()!()
    dims     | `x`y!5 5
    data_vars| (,`foo)!+`dims`attrs`data!(,`x`y;,()!();,(0.7412575 0.2054306 0.10..
    '))
    >>> kx.q('flip ds[`coords;;`data]')
    pykx.Table(pykx.q('
    x  y                             z
    ----------------------------------
    10 2000.01.01D00:00:00.000000000 a
    20 2000.01.02D00:00:00.000000000 b
    30 2000.01.03D00:00:00.000000000 c
    40 2000.01.04D00:00:00.000000000 d
    50 2000.01.05D00:00:00.000000000 e
    '))
    >>> kx.q('ds[`data_vars;`foo;`data]')
    pykx.List(pykx.q('
    0.7412575 0.2054306   0.1009393 0.8792678 0.04105999
    0.1811459 0.01659637  0.2406029 0.4900055 0.551788
    0.6303767 0.0702013   0.6831359 0.5961667 0.3722388
    0.9255059 0.9202499   0.5055902 0.9767793 0.7440498
    0.7331576 0.003197568 0.4939932 0.5433492 0.01175784
    '))
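
If you would rather hand q something closer to a table than a nested dictionary, one option is to flatten the Dataset into long form (one row per grid cell) on the Python side first. The following is only a rough sketch in plain Python, assuming the same `coords` / `data_vars` layout that `ds.to_dict()` produces above. The helper name `dataset_dict_to_columns` and the small hand-written `example` dictionary are made up for illustration; they are not part of PyKX or xarray.

```python
from itertools import product


def dataset_dict_to_columns(ds_dict, var):
    """Flatten one data variable of a to_dict()-style dictionary into
    column-oriented (long-form) data: one row per grid cell.

    Assumes every dimension of `var` has a coordinate of the same name.
    """
    var_info = ds_dict["data_vars"][var]
    dims = var_info["dims"]  # e.g. ("x", "y")
    # Coordinate values for each dimension, in dimension order
    axes = [ds_dict["coords"][d]["data"] for d in dims]

    columns = {d: [] for d in dims}
    columns[var] = []
    # Walk every index combination in row-major order
    for idx in product(*(range(len(a)) for a in axes)):
        cell = var_info["data"]
        for i in idx:
            cell = cell[i]  # descend one nesting level per dimension
        for d, a, i in zip(dims, axes, idx):
            columns[d].append(a[i])
        columns[var].append(cell)
    return columns


# Hypothetical stand-in for ds.to_dict() on a 2 x 3 grid, so the sketch
# runs without xarray installed
example = {
    "coords": {
        "x": {"dims": ("x",), "attrs": {}, "data": [10, 20]},
        "y": {"dims": ("y",), "attrs": {}, "data": ["a", "b", "c"]},
    },
    "attrs": {},
    "dims": {"x": 2, "y": 3},
    "data_vars": {
        "foo": {
            "dims": ("x", "y"),
            "attrs": {},
            "data": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
        },
    },
}

columns = dataset_dict_to_columns(example, "foo")
```

The result is a dictionary of equal-length lists keyed by column name. Passing that through `kx.toq` and flipping it in q, as done with the coordinates above, should give a table, though I have not run that step here. The same loop works for 4-D variables since it only depends on the number of entries in `dims`, but for large arrays you would want a vectorised route rather than a Python loop.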

     

Log in to reply.