rocuinneagain
Forum Replies Created
-
Pass variable
a
to APIxxx
on remote handleh
and save to local variablet
:t:h(“xxx”;a)
More about IPC:
- https://code.kx.com/q/wp/ipc/#synchronous-queries
- Interprocess communication | Basics | kdb+ and q documentation
Formatting tips:
-
A user has written qrapidjson which implements .j.j but not .j.k
The underlying rapidjson library does allow a parse flag kParseNumbersAsStringsFlag
http://rapidjson.org/namespacerapidjson.html#a81379eb4e94a0386d71d15fda882ebc9
Although in a quick test the library looks to already parse numbers without decimal places as longs by default:
q)rapidjson:.p.import`rapidjson q)rapidjson[`:loads]["1471220573128024107"]` 1471220573128024107 q)type rapidjson[`:loads]["1471220573128024107"]` -7h q)type each rapidjson[`:loads]["[1471220573128024107, 1471220573128024107.0]"]` -7 -9h //Same for default json library in python q)json:.p.import`json q)json[`:loads]["1471220573128024107"]` 1471220573128024107 q)type json[`:loads]["1471220573128024107"]` -7h
-
iasc gives you the sorted idexes which you can use
q)update sVec:sVec(@)'iasc each sNum,asc each sNum from tbl sVec sNum ----------- a b c 1 2 3 a b c 1 2 3 a b c 1 2 3
-
rocuinneagain
MemberJune 17, 2022 at 12:00 am in reply to: for hdb query, why is it fast when including where date>$date$Hi renbright –
There are some posts showing how to embed code which will make your question more clear:
- Adding inline code to your questions & messages
- Creating q code block with syntax highlighting in a message
Can you share more complete example?
The second of these queries is faster because it reads less data:
//Retrieves data from all dates select from table //Retrieves only the last 5 days of data select from table where date>(.z.d-5)
-
1.
An example data publisher is included in documentation https://qpython.readthedocs.io/en/latest/usage-examples.html#data-publisher
2.
For going straight to HDB you can write a custom
.u.upd
yourself to write direct- https://code.kx.com/q/ref/upsert/#splayed-table
- https://code.kx.com/q/ref/asc/#sorting-data-on-disk
- https://code.kx.com/q/ref/set-attribute/#grouped-and-parted
Or use a file based batch load approach to backfill.
Articles:
- https://code.kx.com/q/wp/data-loaders/
- https://kx.com/blog/kx-product-insights-parsing-data-in-kdb/
- https://kx.com/blog/kdb-q-insights-parsing-json-files/
Documentation:
- https://code.kx.com/q/ref/dotj/
- https://code.kx.com/q/ref/file-text/#load-csv
- https://code.kx.com/q/interfaces/arrow/
- https://code.kx.com/q/interfaces/hdf5/
-
rocuinneagain
MemberJune 2, 2022 at 12:00 am in reply to: Need help to read hdf. file written in python to kdb+?If you run through the example in q then inspect the created file in python you will see what the interface expects
https://code.kx.com/q/interfaces/hdf5/examples/#create-a-dataset
The table columns are stored individually inside groups
>>>data = h5.File('experiments.h5', 'r') >>> data['experiment2_tables'] <HDF5 group "/experiment2_tables" (1 members)> >>> data['experiment2_tables/tab_dset'] <HDF5 group "/experiment2_tables/tab_dset" (5 members)> >>> data['experiment2_tables/tab_dset/class'] <HDF5 dataset "class": shape (10000,), type "<i2">
(The filename must end in ‘.h5’)
For you to store data from Python you should match this style using groups for columns.
import h5py as h5 import pandas as pd df = pd.DataFrame({"AA":[1, 2], "BB":[3, 4], "CC":[5, 6]}) f = h5.File('forKX.h5','w') project = f.create_group("project") table = project.create_group("table") for col in df.columns: table[col] = df[col].to_numpy() f.close()
kdb+ still does not know you intend this data to be a table.
As outline in the docs https://code.kx.com/q/interfaces/hdf5/hdf5-types/#tables-and-dictionaries
Attributes would be needed.
Without the attributes you can reshape in to a table like so:
q){flip x!{.hdf5.readData["forKX.h5";"project/table/",string x]} each x}`AA`BB`CC AA BB CC -------- 1 3 5 2 4 6
-
rocuinneagain
MemberJune 2, 2022 at 12:00 am in reply to: Need help to read hdf. file written in python to kdb+?This code creates a basic table in a file written by KX:
t:([] AA:1 2;BB:3 4;CC:5 6) .hdf5.createFile["byKX.h5"] .hdf5.createGroup["byKX.h5";"project"] .hdf5.writeData[fname;"project/table";t]
If we expand out what it is doing to match the documentation we can create the exact same file with:
https://code.kx.com/q/interfaces/hdf5/hdf5-types/#tables-and-dictionaries
t:([] AA:1 2;BB:3 4;CC:5 6) .hdf5.createFile["diy.h5"] .hdf5.createGroup["diy.h5";"project"] .hdf5.createGroup["diy.h5";"project/table"] {.hdf5.writeData["diy.h5";"project/table/",string x;t x]} each cols t .hdf5.writeAttr["diy.h5";"project/table";"datatype_kdb";"table"] .hdf5.writeAttr["diy.h5";"project/table";"kdb_columns";cols t]
Finally this would be the python equivalent:
import h5py as h5 import pandas as pd import numpy as np df = pd.DataFrame({"AA":[1, 2], "BB":[3, 4], "CC":[5, 6]}) f = h5.File('forKX.h5','w') project = f.create_group("project") table = project.create_group("table") table.attrs["datatype_kdb"] = np.array( [ord(c) for c in 'table'], dtype=np.int8) table.attrs["kdb_columns"] = [x.encode('ascii') for x in df.columns] for col in df.columns: table[col] = df[col].to_numpy() f.close()
All three read in the same way:
q).hdf5.readData[“byKX.h5″;”project/table”] AA BB CC ——– 1 3 5 2 4 6 q).hdf5.readData[“diy.h5″;”project/table”] AA BB CC ——– 1 3 5 2 4 6 q).hdf5.readData[“forKX.h5″;”project/table”] AA BB CC ——– 1 3 5 2 4 6
h5dump is useful to inspect h5 files.
https://support.hdfgroup.org/HDF5/doc/RM/Tools/h5dump.htm
Used it will print the shapes and types of the contents of your file
h5dump diy.h5
The main 3 takeaways are:
1. Only supported types are available with this interface https://code.kx.com/q/interfaces/hdf5/hdf5-types/#type-mapping
2. In the real world tabular data you have from another source in a .h5 will not read straight in to a kdb+ table. You will need to extract the data column by column as I showed in a previous example
3. If you have a file the interface is unable to read you can still use embedPy to manipulate the data and transfer to kdb+
-
rocuinneagain
MemberMay 24, 2022 at 12:00 am in reply to: What is the role played by key columns in a keyed table [ query join/performance ] ?For optimisations in qsql queries you should look to attributes.
Here applying the unique attribute to the Name column:
q)t3:update `u#Name from t1 q)\ts do[100000;select from t3 where Name=`Anand] 64 1808
-
rocuinneagain
MemberMay 24, 2022 at 12:00 am in reply to: What is the role played by key columns in a keyed table [ query join/performance ] ?You can see the speed improvement if you index rather than query using qsql:
q)select from t1 where Name=`Anand Name Age Salary ---------------- Anand 10 1000 q)select from t2 where Name=`Anand Name | Age Salary -----| ---------- Anand| 10 1000 q)t2`Anand Age | 10 Salary| 1000 q)\ts do[100000;select from t1 where Name=`Anand] 164 1808 q)\ts do[100000;select from t2 where Name=`Anand] 174 1808 q)\ts do[100000;t2`Anand] 71 960
-
This becomes easier by splitting the problem in 2 parts.
- Finding the indexes where needed number of 1b is hit
- Finding the indexes where needed number of 0b is hit
These are the only indexes that matter. Then a prototype list can be populated with nulls before having the important indexes overlaid. Then the prevailing values are carried forward by fills.
q)f:{o:count[z]#0N;o:@[o;where x=x msum z;:;1];y:y+1;"b"$0^fills @[o;where y=y msum not z;:;0]} q)f[3;2] 00001111000111001101000b 00000011110001111111110b q)f[3;2] 110010111001111b 000000001111111b
-
Something more like:
q)750{(x[0]+1;x[1]+x[2];x[2])}/0
0 2 750 1500 2
Or use a dictionary for readability:
q)750{x[`ID]+:1;x[`y]:sum x`y`z;x}/`ID`y`z!0 0 2
ID| 750
y | 1500
z | 2
-
Is the
/
for ofdo
what you are looking for?https://code.kx.com/q/ref/accumulators/#do
q)750 +/2 2 754
-
Are you on Windows/Linux/Mac?
Esc
works in Windows (PowerShell + CMD)- In Linux (Bash)
Ctrl
+E
,Ctrl
+U
https://stackoverflow.com/a/16687377/4256419
On Linux/Mac rlwrap is suggested.
-
rocuinneagain
MemberMay 5, 2022 at 12:00 am in reply to: Creating q code block with syntax highlighting in a messageInstructions now also added for adding inline code to your questions & messages.
-
rocuinneagain
MemberApril 29, 2022 at 12:00 am in reply to: Question about query that stuck the KDB processUsing parse helps to see what is happening.
Parsing a simplified version of your query helped to show the order of execution in the parse tree:
q)-1 .Q.s1 last value last parse "select sums(size)/sum(size) from tab"; ((/;`size);+;(sum;`size))
The q equivalent can then be shown to have the same parse tree
q)-1 .Q.s1 parse"size/[sums;sum size]"; ((/;`size);+;(sum;`size))