

MilanGill
Forum Replies Created
-
MilanGill
Member
February 26, 2025 at 4:38 pm in reply to: Confused by serialization result of char list. (Aka “string”)
… this forum is so buggy …
-
MilanGill
Member
February 26, 2025 at 4:40 pm in reply to: Confused by serialization result of char list. (Aka “string”)
When I read the data back using a hex editor, I find a very unusual serialization format.
The first two bytes of data are different from what I would typically expect: `FE 20` instead of `FF 01`. Most of the other types of data which I have serialized begin with a header containing `FF 01`.
Following this there is a byte `0A`, which I believe corresponds to “list of char”.
Then there is a sequence of 13 zero bytes. This doesn’t make much sense to me, since I would expect to see a length (with value 2).
Finally, I see the characters “a” and “b”, which are the original data.
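To make the observation concrete, here is a minimal Python sketch that splits a buffer according to the layout described above and nothing more. The field names (`magic`, `type_tag`, `padding`, `payload`) and the helper `split_observed_layout` are hypothetical labels of mine, not names from any KDB documentation, and the 13 bytes are treated as opaque because their meaning is exactly what is unclear.

```python
from typing import NamedTuple


class ObservedLayout(NamedTuple):
    magic: bytes      # first two bytes, FE 20 in the observed file
    type_tag: int     # third byte, 0A in the observed file
    padding: bytes    # the 13 bytes that were all zero (meaning unknown)
    payload: bytes    # everything after the first 16 bytes


def split_observed_layout(data: bytes) -> ObservedLayout:
    """Split a buffer using only the layout seen in the hex editor.

    This mirrors the observation in the post; it is not a documented format.
    """
    if len(data) < 16:
        raise ValueError("expected at least 16 bytes before the payload")
    return ObservedLayout(data[0:2], data[2], data[3:16], data[16:])


# Rebuild the bytes described above: FE 20, 0A, 13 zeros, then "ab".
sample = bytes([0xFE, 0x20, 0x0A]) + bytes(13) + b"ab"
layout = split_observed_layout(sample)
print(layout.magic.hex(), hex(layout.type_tag), layout.payload)
```

If the length really is stored somewhere in those 13 bytes for longer lists, dumping `layout.padding` for inputs of different sizes should show which bytes change.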
-
MilanGill
Member
February 26, 2025 at 4:41 pm in reply to: Confused by serialization result of char list. (Aka “string”)
-
MilanGill
Member
February 28, 2025 at 5:15 pm in reply to: Confused by serialization result of char list. (Aka “string”)
The reason we are trying to do this is that we have other processes, written in other languages, which we want to be able to read data written to disk by Q/KDB.
These are obscure languages, and there are no clients for them directly supported or provided by KX (the company behind KDB).
-
MilanGill
Member
February 26, 2025 at 11:44 am in reply to: What is the implementation of the deprecated DateTime datatype?
Thank you!
-
MilanGill
Member
February 26, 2025 at 10:22 am in reply to: Does KDB have string data type? What is a string in KDB?
On the subject of interning, does the fact that KDB interns Symbols suggest that all processes it is communicating with should also intern strings?
I lean slightly towards “yes” on the basis that these datatypes would be expected to have similar performance in both systems.
However, consider the serialization format alone. It is independent of the implementation detail. Whether or not strings (symbols) are interned is a system implementation detail. A KDB “system” interns them. Some other system might not.
The serialization format does not contain any information about whether or not strings (symbols) are interned, or atomic. It contains a tag (number) followed by some data.
KDB could release a new version of their software tomorrow, and decide not to intern Symbols. They could keep the same serialization format.
This suggests leaning towards answering “no” to the above question.
Any thoughts?
-
MilanGill
Member
February 19, 2025 at 2:19 pm in reply to: Serialization and header values 0xFF 0x01
It may be exclusively related to on-disk serialization. If that process is not documented then perhaps it is the case that KDB simply writes this combination of bytes as a header.
Thanks for the link. I read through some similar pages earlier but couldn’t find anything which seemed relevant.
I do recall seeing something at some point which looked like it might have been part of a header specification. It may even have been this page which you link above. But I didn’t see any values which would match up with 0xff 0x01. I’m not sure exactly what that page was.
-
MilanGill
Member
February 19, 2025 at 11:51 am in reply to: Char datatype and Unicode? Can 8 bit chars
Thank you, that makes sense.
In this sense, `char` is very much like the C or C++ interpretation of “char”, in that it is a single byte. It may contain a valid ASCII value, or it may be one byte of a multi-byte UTF-8 encoding of a single code point.
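As a small illustration of that point, the Python sketch below encodes three characters as UTF-8 and prints the bytes each one occupies. The sample string is my own choice; the only thing it shows is that a list of single-byte chars can hold more bytes than the text has code points.

```python
# "a", "é" (U+00E9) and "€" (U+20AC), written with escapes to avoid
# any dependence on the source file's encoding.
text = "a\u00e9\u20ac"
encoded = text.encode("utf-8")

for ch in text:
    # Each code point becomes one or more bytes; an 8-bit char holds one byte.
    print(repr(ch), ch.encode("utf-8").hex(" "))

print(len(text), "code points,", len(encoded), "bytes")
```

An ASCII character fits in one 8-bit char, while the other two span two and three chars respectively, so a char-list length counts bytes, not characters.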
-
I think this is the right function to use. I guess that in this context IPC (inter-process communication) means communication via network sockets. In that case, this function appears to be equivalent to the one which someone has manually implemented in the codebase I am looking at. That function is used to talk to KDB instances via a network socket. The same logic has been repurposed for writing data to disk.
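For comparison with a hand-written serializer like the one I mention, here is a rough Python sketch of how a char list would be laid out in the IPC wire format, as I understand it from the serialization examples KX publishes. The function name `encode_char_vector_ipc` is hypothetical, and the layout (an 8-byte message header, then type byte 10, an attribute byte, a 4-byte length and the raw bytes) is my reading of those examples, not something confirmed in this thread.

```python
import struct


def encode_char_vector_ipc(text: bytes, msg_type: int = 0) -> bytes:
    """Encode a char list as a little-endian IPC message (assumed layout)."""
    # Body: type 10 (char vector), attribute 0 (none), int32 length, raw bytes.
    body = struct.pack("<bBi", 10, 0, len(text)) + text
    # Header: endianness flag (1 = little), message type, two zero bytes,
    # then the int32 total message length including this 8-byte header.
    header = struct.pack("<BBBBi", 1, msg_type, 0, 0, 8 + len(body))
    return header + body


msg = encode_char_vector_ipc(b"ab")
print(msg.hex(" "))
```

Note that this layout carries an explicit length right after the type byte, which is what I expected to find in the on-disk file as well.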
-
I believe re-using the same serialization format as is used for KDB was just for convenience. A function existed to serialize data. It was probably simply reused for serializing to disk.