rocuinneagain
Forum Replies Created
-
rocuinneagain
Member
January 13, 2022 at 12:00 am in reply to: Johansen cointegration test kdb+ implement

You could use the Python coint_johansen function from statsmodels.tsa.vector_ar.vecm (which returns a JohansenTestResult) by importing it through embedPy and passing the data as a dataframe using .ml.tab2df.
Below is an example based on a similar Python version: http://web.pdx.edu/~crkl/ceR/Python/example14_3.py
$ pip install statsmodels

q)\l p.q
q)\l ml/ml.q
q).ml.loadfile`:init.q
q)data:flip `YEAR`Y`C!"IFF"$flip 1_-12_{{x where not ""~/:x}" " vs x} each "\r\n" vs .Q.hg "http://web.pdx.edu/~crkl/ceR/data/usyc87.txt"
q)coint_johansen:.p.import[`statsmodels.tsa.vector_ar.vecm;`:coint_johansen]
q)pd:.ml.tab2df[data][`:set_index;"YEAR"]
q)res:coint_johansen[pd;0;2]
q)res[`:lr1]`
31.78169 12.17119 -1.566747e-012
q)res[`:lr2]`
19.6105 12.17119 -1.566747e-012
q)res[`:cvm]`
18.8928 21.1314 25.865
12.2971 14.2639 18.52
2.7055  3.8415  6.6349
q)res[`:cvt]`
27.0669 29.7961 35.4628
13.4294 15.4943 19.9349
2.7055  3.8415  6.6349
q){flip y!(x@/:hsym y)@\:`}[res;`lr1`lr2`cvm`cvt]
lr1            lr2            cvm                    cvt
----------------------------------------------------------------------------
31.78169       19.6105        18.8928 21.1314 25.865 27.0669 29.7961 35.4628
12.17119       12.17119       12.2971 14.2639 18.52  13.4294 15.4943 19.9349
-1.566747e-012 -1.566747e-012 2.7055 3.8415 6.6349   2.7055 3.8415 6.6349
-
rocuinneagain
Member
January 6, 2022 at 12:00 am in reply to: Lists, dictionaries, tables and lists of dictionaries

As the example is a nested generic list, the items need to be dealt with one at a time, since the list could contain many different tables or even different datatypes.
q)cols each .[dsEg;(`html;`body)]
a b
q).[dsEg;(`html;`body);{cols each x}]
doctype| ,"html"
html   | `text`body!(,"test";,`a`b)
The use of :: may be useful to you if you have not been using it:
https://code.kx.com/q/ref/apply/#nulls-in-i
It allows you to skip levels:
q).[dsEg;(`html;`body;::;`a)]
d f g
//Better shown on an item with multiple entries in the list
q)dsEg2:(`doctype`html)!(enlist "html";`text`body!(enlist"test";2#enlist ([]a: `d`f`g;b: 23 43 777)));
q).[dsEg2;(`html;`body;::;`a)]
d f g
d f g
.Q.s1 may also be useful to you as it can help show the underlying structure of an item better than the console at times.
https://code.kx.com/q/ref/dotq/#qs1-string-representation
q).[dsEg;(`html;`body;::;`a)]
d f g
//Looks like a symbol list (type 11h) but is in fact a single item in a generic list (type 0h)
q){-1 .Q.s1 x;} .[dsEg;(`html;`body;::;`a)]
,`d`f`g
//.Q.s1 output can be ugly but always shows the exact structure
-
rocuinneagain
Member
December 7, 2021 at 12:00 am in reply to: Question : how to Integrate solacekdb.dll and run session with KX developer and q on windows 10 x64

They can be on separate machines/environments.
You can see in the White Paper that they use Kdb+ running on an ec2 instance ‘ip-172-31-70-197.ec2.internal’ and it connects to Solace broker running on ‘mr2ko4me0p6h2f.messaging.solace.cloud’
As long as your network settings are set up so that the two hosts can communicate over the required ports, you will be successful.
-
rocuinneagain
Member
December 6, 2021 at 12:00 am in reply to: Installing HTML5 Demo Dashboards on Kx Platform

You can:
1. Upload the .tgz to the server (SCP/FTP etc)
2. Extract it
3. Import the package through the UI
https://code.kx.com/platform/release_management/#import-from-a-different-location
-
The traditional tick.q is a single-threaded process, but there are many areas you can investigate to ensure performance is at its best:
- Use taskset to ensure other processes are not using the same core/resources as the TP
- As the TP persists data to a log file, evaluate your hardware configuration to validate that your storage disk is not bottlenecking your system
- Chained tickerplants can be used to balance the flow of data in a system, particularly when there are many subscribers
- Async broadcast can be used to optimise publishing the same data to multiple subscribers
- Ensure the system is set up to follow the best practices outlined in the Linux production notes
- Use Unix domain sockets when opening connections on localhost to reduce CPU usage
- Unix domain sockets are also available from C feedhandlers
- Whitepaper on Kdb+ tick profiling
KX Platform:
- Messaging performance tuning
- Tickerplant template
- Process instances core affinity
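As a small illustration of the Unix-domain-sockets point above (a sketch only; port 5010 and the query are arbitrary examples, and a q process is assumed to already be listening on that port):

```q
/ TCP over localhost:
h:hopen `::5010
hclose h

/ Unix domain socket: same port, lower CPU overhead, localhost only
h:hopen `:unix://5010
h"1+1"        / use the handle as normal
hclose h
```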
-
Yes, your understanding is correct for those four items.
These tutorial videos may be useful:
-
rocuinneagain
Member
November 20, 2021 at 12:00 am in reply to: How can I replicate "each til count subs"

q)tab:([] a:`a`b`c;b:1 2 3) //Create a table
q)tab 0 //Index into the table at row 0. A dictionary is returned
a| `a
b| 1
q)til 3
0 1 2
q)count tab //tab contains 3 rows
3
q)til count tab //'til' returns numbers up to the entered value
0 1 2
q){show tab x}each til count tab //Using 'each' and 'show' in a lambda to display each row
a| `a
b| 1
a| `b
b| 2
a| `c
b| 3
- “A simple table is a list of dictionaries” https://code.kx.com/q/learn/dicts-tables/
- https://code.kx.com/q/kb/faq/#row-indexing
- https://code.kx.com/q4m3/6_Functions/#618-anonymous-functions-and-lambda-expressions
-
Kdb+ does have built-in HTTP functions for GET (.Q.hg) and POST (.Q.hp).
These blog posts may be of interest:
Another option would be to use embedPy.
You can use it to expose Python functions to q
- Web Scraping – A Kdb+ Use case – KX – (Related code: abin-saju/qblog)
When data is returned as JSON you can use .j.k to deserialize it.
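For example, .j.k parses a JSON string into a q object (and .j.j serializes back to JSON):

```q
q).j.k "{\"a\":1,\"b\":[1,2]}"
a| 1f
b| 1 2f
q).j.j `a`b!(1 2;"text")
"{\"a\":[1,2],\"b\":\"text\"}"
```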
-
TP = Tickerplant (Relays data to subscribers + recovers from logfile after crashes)
RDB = Realtime Database (Stores data for query)
RTE = Realtime engine (Performs streaming calculations and stores caches or publishes results)
(Any process can be customised)
1. Yes
2. It is acting more like a mixture of a TP, RDB and RTE:
a) It does not store a logfile to recover in case of a crash (tp-logfile – a TP normally does this)
b) It stores data indefinitely instead of acting only as a relay (unlike a TP, more like an RDB, although an RDB will clear once every 24hrs)
c) It does not relay data untouched instead only specific data is forwarded (similar to an RTE)
getSyms – sends a list of unique symbols across tables
getQuotes – sends last row by sym from quote table
getTrades – sends last row by sym from trade table
3. Yes
The execution flow is:
1. FH sends messages to PubSub (lines 18/19) every 100ms
2. The messages arrive at PubSub and .z.pg evaluates them. This means upd/insert (pubsub.q line 8) will save the incoming data to quote/trade. PubSub now has some data cached.
3. The next time the PubSub timer (.z.ts) is triggered (every second) the ‘pub’ function will trigger and send data to subscriptions.
This code is a basic demo so it may have some holes in its logic (like never clearing data in PubSub, so eventually memory will run out).
-
.z.ts is the timer function, which is evaluated on intervals of the timer variable set by the system command \t.
In your example ‘pub’ is called for each subscriber in ‘subs’ once per second.
To start subs has count 0 but when a new subscriber connects and subscribes they will be added.
How the subscribers are added to that table in that demo is:
- websockets.html calls connect() on load of the page (code)
- websockets.js holds the connect() definition which shows it sends loadPage[] to the q process (code)
- pubsub.q holds the definition of loadPage which shows that it calls ‘sub’ (code)
- ‘sub’ then adds your subscriber to ‘subs’
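Putting those pieces together, a minimal sketch of the pattern (the table schema, function bodies and the published payload are illustrative assumptions, not the demo's exact code):

```q
subs:([] handle:`int$())               / starts empty; count subs is 0
sub:{`subs insert enlist .z.w}         / called by loadPage; adds the caller's handle
data:enlist[`msg]!enlist "tick"        / placeholder for cached data
pub:{neg[x] .j.j data}                 / async-send the cached data as JSON
.z.ts:{pub each subs`handle}           / runs on every timer tick
\t 1000                                / trigger .z.ts once per second
```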
-
Yes, defining upd in this way means it behaves (mostly) the same as insert.
q)upd:insert
q)tab:([] a:1 2)
q)insert[`tab;enlist 3]
,2
q)tab
a
-
1
2
3
q)upd[`tab;enlist 4]
,3
q)tab
a
-
1
2
3
4
But there are differences: ‘insert’ is a built-in operator, which cannot be passed as the first item by reference over a handle.
(This is causing the issue you are seeing)
q)value(`upd;`tab;enlist 5) //Pass by reference succeeds for user-defined function
,4
q)value(`insert;`tab;enlist 6) //Pass by reference fails for operator
'insert
  [0]  value(`insert;`tab;enlist 5)
       ^
q)value("insert";`tab;enlist 6) //Pass as parse string succeeds
,5
q)value(insert;`tab;enlist 6) //Pass by value succeeds
,6
User-defined functions can only use prefix notation, whereas operators can be used prefix or infix.
q)`tab insert enlist 7 //Infix with operator succeeds
,7
q)`tab upd enlist 8 //Infix with user-defined function fails
'type
  [0]  `tab upd enlist 8
       ^
q)insert[`tab;enlist 8] //Prefix with operator succeeds
,8
q)upd[`tab;enlist 9] //Prefix with user-defined function succeeds
,9
-
rocuinneagain
Member
November 12, 2021 at 12:00 am in reply to: Cannot write to handle N. OS reports: Bad file descriptor

Usually when you see this error it is due to one of:
- A fault in the network infrastructure between hosts
- One of the involved processes has died unexpectedly
- Some code in either your process or the remote has purposefully closed the connection (hclose)
There are handlers available in the .z namespace which are helpful to detect events on IPC handles:
- .z.pc is called after a connection has been closed
- .z.po is called after a connection has been opened
These could be implemented to track dropped connections and attempt to reconnect.
(.z.W is a useful dictionary of currently open handles)
There are some examples such as dotz which is a library building on these features to trace, monitor and control execution.
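A minimal sketch of using these handlers to auto-reconnect (the port and the 0N sentinel are illustrative assumptions; a production version would retry on a timer with backoff):

```q
h:0N                                       / handle to the remote process
connect:{h::@[hopen;`::5010;0N]}           / protected hopen; leaves h as 0N on failure
.z.po:{-1"connection opened on handle ",string x;}
.z.pc:{if[x=h; h::0N; connect[]]}          / on close, drop the handle and retry once
connect[]
```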
-
rocuinneagain
Member
November 12, 2021 at 12:00 am in reply to: Partitioning Tables Intraday by Custom Fields?

That would require deep changes to the codebase.
For example, when a new partition is written to an HDB:
- A new date folder is created.
- Tables are written.
- The HDB process is reloaded
If you reloaded the HDB before all tables were written, you would see errors.
q)\mkdir HDB
q)\cd HDB
q)`:2021.01.01/tab1/ set ([] a:1 2 3)
`:2021.01.01/tab1/
q)`:2021.01.01/tab2/ set ([] b:1 2 3)
`:2021.01.01/tab2/
q)\l .
q)select from tab1
date       a
------------
2021.01.01 1
2021.01.01 2
2021.01.01 3
q)select from tab2
date       b
------------
2021.01.01 1
2021.01.01 2
2021.01.01 3
q)`:2021.01.02/tab1/ set ([] a:4 5 6)
`:2021.01.02/tab1/
q)\l .
q)select from tab1
date       a
------------
2021.01.01 1
2021.01.01 2
2021.01.01 3
2021.01.02 4
2021.01.02 5
2021.01.02 6
q)select from tab2
'./2021.01.02/tab2/b. OS reports: No such file or directory
  [0]  select from tab2
       ^
.Q.bv would be one possible helpful extension: it can fill, in memory, tables missing from partitions (passing ` as its argument fills using the first partition as a template).
q)\l HDB
q)select from tab1
date       a
------------
2021.01.01 1
2021.01.01 2
2021.01.01 3
2021.01.02 4
2021.01.02 5
2021.01.02 6
q)select from tab2 //Table missing from latest partition is not found
'tab2
  [0]  select from tab2
       ^
q).Q.bv` //With ` the first partition is used as prototype and the table is found
q)select from tab2
date       b
------------
2021.01.01 1
2021.01.01 2
2021.01.01 3
(.Q.chk is unsuitable as it uses the most recent partition as the template to fill tables missing from partitions)
This is only one item that would be needed to implement what you asked.
Currently EOD is a single action for all tables; all code and processes involved would need updates to operate table by table, meaning changes in the RDB/IDB/HDB and others.
-
rocuinneagain
Member
November 10, 2021 at 12:00 am in reply to: Partitioning Tables Intraday by Custom Fields?

The Platform codebase is designed to write date-partitioned HDBs only.
The Intraday database (KX Platform) exists so that 24hrs of data does not need to be kept in memory, but it will again store to the HDB on a date partition only.
-
rocuinneagain
MemberNovember 5, 2021 at 12:00 am in reply to: Partitioning Tables Intraday by Custom Fields?Hi Leah,
This blog post will likely be of interest: https://kx.com/blog/partitioning-data-in-kdb/
It covers the basics of looking at hourly partitions and also fixed size partitions.
Regards,
Rian