Replies – Forums – pcarroll

pcarroll

Member

March 23, 2022 at 12:00 am in reply to: How to load the table from aws s3

If you included the full command it would help more . Generally speaking there might be permission issues for you to access the bucket and/or issues with the loading of the table. Some good points to confirm on this would be…

Can you use aws cli to copy the table outside of q prompt?
Can you load the table if it were stored on file storage rather than local?

pcarroll

Member

January 23, 2022 at 11:00 am in reply to: Wordle kdb

Folks, I might need to add this to the JIRA board as it is now becoming priority numero uno, but I have an interesting Friday conclusion!

1) So, last we left off, we had counted the letter usage in a given position, I decided to divide this by the number of words to assign scores to each letter in each position

// some sanity checks 
q)sum each letFreq 3145 3145 3145 3145 3145 
q)count wordz 3145 // looks good 
q)letScore:letFreq%3145 
q)letScore 
"sbcaptmdfglrehwkvnojiquyzx"!0.1383148 0.08426073 0.08330684 0.06868045 0.061.. 
"aoeriluhntpywcmsdbgxvkfqzj"!0.1558029 0.1236884 0.117965 0.1033386 0.0848966.. 
"aiorenlutsdgmbcpvywfkzhxjq"!0.1224165 0.1023847 0.0899841 0.08744038 0.07599.. 
"eainrsctolugdmkpvfhbwzyxj"!0.1243243 0.08139905 0.07980922 0.07567568 0.0705.. 
"eytnlrhasdkopmcgifxwbzuv"!0.1799682 0.1357711 0.09189189 0.08076312 0.062957..

I feel like this makes senses as an e in position five should be more valuable based on frequency than an a in position three. Although I might be putting the in front of the .

2) I then compared all the 5 letter words to find out the score of each letter in each word

q)wordzScore:wordz!@'[letScore;]each wordz 
"aaron"| 0.06868045 0.1558029 0.08744038 0.06104928 0.08076312 
"ababa"| 0.06868045 0.0063593 0.1224165 0.01240064 0.05373609 
"aback"| 0.06868045 0.0063593 0.1224165 0.0645469 0.04038156 
"abase"| 0.06868045 0.0063593 0.1224165 0.06486486 0.1799682 
"abash"| 0.06868045 0.0063593 0.1224165 0.06486486 0.05532591

3) And if we sum these and desc sort we get the following

q)desc sum each wordzScore 
"sauce"| 0.6 
"saute"| 0.6 
"salle"| 0.5974563 
"caine"| 0.5971383 
"slate"| 0.5879173

So there we have it! Are sauce and saute the optimum starting words? If not they are certainly big contenders for “most popular baby names for twins in 2022!” I will work on how we solve after line 1, but for now I must take a break before MB Games and Hasbro take action against me.

Paulyc

pcarroll

Member

January 22, 2022 at 11:00 am in reply to: Wordle kdb

Here is my morning!

1) Get me some words…

q)wordz:system "curl http://wiki.puzzlers.org/pub/wordlists/unixdict.txt";

2) Think about any any obvious misdirections?

q)words where words like "*'*" "aren't" "d'etat"

Okay we need to lose these guys

q)wordz:wordz where all each wordz in .Q.a;

And we only want to get the 5 letter words so…

q)wordz:wordz where 5=count each words

3) So my next thought was to look at most popular letters in each location for 5 letter words

q)letFreq:desc each count each 'group each flip wordz 
q)letFreq 
"sbcaptmdfglrehwkvnojiquyzx"!435 265 262 216 194 179 174 159 148 146 131 115 .. 
"aoeriluhntpywcmsdbgxvkfqzj"!490 389 371 325 267 260 247 174 99 94 67 57 50 4.. 
"aiorenlutsdgmbcpvywfkzhxjq"!385 322 283 275 239 205 198 193 161 114 103 99 8.. 
"eainrsctolugdmkpvfhbwzyxj"!391 256 251 238 222 204 203 203 192 190 117 101 9.. 
"eytnlrhasdkopmcgifxwbzuv"!566 427 289 254 198 180 174 169 167 128 127 104 65..

I decided to flip these just on the off chance it spells out something humorous , also I had to take 26 of each because as you can imagine, no 5 letter word ends in q (for example) so the flipping would give a ‘length error

q)flip 26#'key each letFreq "saaee" "boiay" "ceoit" "arrnn" "pierl"

Alas Earwax! It would have pretty cool if that first line was a word right?

Anyway, my next thought would be to assign scores to letters in words in a given location i.e. if a word starts with s it gets the most points for that location, b is the second most points? Hopefully then we would find the optimum real word? Would love to know if anyone is thinking differently on this?

pcarroll

Member

January 21, 2022 at 11:00 am in reply to: Wordle kdb

Some updates,

I did find some minor updates that can be made here:
1) I found a script of words that are being used in wordle, makes clean-up easier!

q)wordz:system "curl https://gist.githubusercontent.com/cfreshman/a03ef2cba789d8cf00c08f767e0fad7b/raw/a9e55d7e0c08100ce62133a1fa0d9c4f0f542f2c/wordle-answers-alphabetical.txt";

2) I was also informed that we could use the following method as curl is linux specific.

q)wordz:"n" vs .Q.hg"https://gist.githubusercontent.com/cfreshman/a03ef2cba789d8cf00c08f767e0fad7b/raw/a9e55d7e0c08100ce62133a1fa0d9c4f0f542f2c/wordle-answers-alphabetical.txt"

pcarroll

Member

January 21, 2022 at 12:00 am in reply to: Function of syntax `

Also, this just brings up good questions about symbols versus strings & when and why we use them.
Many languages including q intern strings so that only one copy of any distinct value is stored. Any other references to the same string are stored as just that, references.
This helps with memory management, and improves performance on equality comparison i.e.

// assuming col1 is symbol column 

q)select from tab where col1 in `a`b

Storing the same string e.g. “New York Stock Exchange” repeatedly in memory would require a greater amount of memory than its symbol counterpart.

Because q typically stores the string pool entirely in memory (that’s what your sym file is in a typical kdb+ database) consideration should be given as to what columns are symbol and which are string.

If the string list contains many short repeated values this may be a good candidate for casting to a symbol

pcarroll

Forum Replies Created

pcarroll

pcarroll

pcarroll

pcarroll

pcarroll

pcarroll