Trouble With Huge CSVs
Greetings All,
I’ve got a couple of 40GB CSVs that I’m hoping to perform some joins on.
I do not know the column format or the headers, or whether the files even have headers.
I’m working with a good bit of memory, with 256GB accessible.
Loading the files into memory clearly doesn’t work; as expected, the program crashes.
So I made my way here (the loading-from-large-files page). I understand I’ll have to convert my CSVs to splayed tables, save those tables down, and then work from there instead of using the CSVs.
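If I’m reading that page right, the splayed save itself (for something that already fits in memory) would just be along these lines, with `:db and newtable as made-up placeholders:

/ t here stands for an in-memory table; symbol columns have to be enumerated before splaying
`:db/newtable/ set .Q.en[`:db] t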
I’m able to see the rows inside the CSV with .Q.fs[0N!]`:file.csv, but I still don’t know the entirety of what’s inside.
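To at least work out whether there’s a header row and how many columns there are, I’ve been peeking at the first few thousand bytes rather than reading the whole file, something like this (the byte count is arbitrary):

chunk:read0(`:file.csv;0;2000)   / first 2000 bytes only, not the whole 40GB
lines:"\n" vs chunk              / split the chunk into lines
"," vs first lines               / fields of the first line - header row or first data row

The naive comma split won’t handle quoted commas, but it’s enough for a quick look.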
I go through this little bit, and obviously it’s too big and crashes the program. I try to insert the rows directly into a table on disk with .Q.fs[{`:newfile upsert flip colnames!("DFFFFIS";",")0:x}]`:file.csv and that crashes too.
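One variation I’ve been wondering about is writing each chunk to a splayed directory and enumerating the symbol column with .Q.en, roughly like this (colnames is a placeholder since I still don’t know the real names):

colnames:`c1`c2`c3`c4`c5`c6`sym   / 7 placeholder names to line up with "DFFFFIS"
/ append each chunk to a splayed directory, enumerating the symbol column against the current dir
.Q.fs[{`:newfile/ upsert .Q.en[`:.;] flip colnames!("DFFFFIS";",") 0: x}] `:file.csv
/ if there is a header row, the first chunk would still need it dropped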
Should I be chunking this and going from that angle or is there a better way to do this?