
Is it a text file?

Posted: Mon May 18, 2020 4:17 pm
by paulmansour
What are the ways to tell, nowadays, if a file is a text file or not in Dyalog? Specifically, I want to know if I can potentially use ⎕CSV on a particular file. To this end I can hack up something like this mess:

      IsTextFile←{
          ⍝ ⍵ ←→ Filename
          o←'Records' 64
          t←⍵ ⎕NTIE 0
          11::1⊣⎕NUNTIE t
          92::0⊣⎕NUNTIE t
          ≢⎕NUNTIE t⊣(⎕CSV⍠o)t
      }


No doubt there are shorter, better solutions. We could trap ⎕NGET, but I don't want to read the whole file as it could be large (I could have sworn ⎕NGET was enhanced to read X number of records, but that appears not to be the case).

No doubt there are also solutions using ⎕NREAD, but I think ⎕CSV is doing a lot of work under the covers to determine what type of text file the thing is, so maybe it's useful here.
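One ⎕NREAD-based possibility, sketched here as a rough heuristic and not as a description of what ⎕CSV does internally: sample only the first block of bytes and treat a NUL byte as a sign of binary content. The name LooksLikeText and the 4096-byte sample size are illustrative assumptions, and the test would wrongly reject UTF-16 text, which routinely contains NUL bytes.

```apl
LooksLikeText←{
    ⍝ ⍵ ←→ file name (hypothetical helper, not from this thread)
    ⍝ ←  1 if the sampled bytes contain no NUL byte, else 0
    t←⍵ ⎕NTIE 0              ⍝ tie, letting the system pick the tie number
    n←4096⌊⎕NSIZE t          ⍝ sample at most 4096 bytes (arbitrary choice)
    b←⎕NREAD t 83 n 0        ⍝ n bytes from offset 0, as 8-bit integers
    _←⎕NUNTIE t
    ~0∊b                     ⍝ a NUL byte suggests binary content
}
```

Because only a fixed-size prefix is read, the cost does not grow with the size of the file. Errors from ⎕NTIE (for example, a missing file) are deliberately left untrapped in this sketch.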

Any solutions, hints, or tips appreciated.

Re: Is it a text file?

Posted: Tue May 19, 2020 9:36 am
by StefanoLanzavecchia
What is a text file? Is a Japanese "text" file containing quotes in Chinese, Arabic, Hebrew, Vietnamese, Hindi, Klingon and various math symbols a text file?

EDIT: so... if I had actually bothered reading the posts in full... If a text file is something that ⎕CSV won't choke on then I think, by definition, the only way is to try to run ⎕CSV on it and see if it likes it or not. Your solution and Chris' both seem to go in that direction.

Re: Is it a text file?

Posted: Tue May 19, 2020 11:34 am
by crishog
Well you can simplify it a bit:

IsItText←{92:: ''
⎕CSV⍠'Records' 1⊣⍵}

This just uses the file name, which avoids all of the tie/untie, and Records=1, which avoids any "Invalid number of fields in record" error. It still has to be trapped in case the file cannot be translated using UTF-8, which is the default in this case and the limited definition of a "text file".

Not sure what you wanted the result to be; this just gives "here's the first record" (I could read it) and a null vector for "I failed to read it" (which doesn't tell you if it is an empty text file, but then it wouldn't be worth continuing with ⎕CSV).

Re: Is it a text file?

Posted: Tue May 19, 2020 6:25 pm
by paulmansour
Thanks Chris. Much nicer.

I can't remember (from yesterday!) why I was also trapping for the domain error, and reading a block of records. I think I was combining the test for "does it look like a text file" with "does it look like a CSV file". I've split that out into two functions now. I'm just looking for a boolean result for each test. Once a file has passed those tests, I have another function which will provide things like the header row, etc.
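For anyone following along, a split of that kind might look something like the sketch below. The names IsText and IsCSV and the 64-record sample are assumptions, not the actual functions. It also assumes, as the traps in the first post suggest, that ⎕CSV signals error 92 when the file cannot be translated and error 11 when the records do not form valid CSV.

```apl
IsText←{
    ⍝ ⍵ ←→ file name
    ⍝ ←  1 if ⎕CSV can decode the first record, else 0
    92::0                        ⍝ translation error: not text by this definition
    1⊣⎕CSV⍠'Records' 1⊢⍵
}

IsCSV←{
    ⍝ ⍵ ←→ file name
    ⍝ ←  1 if the first 64 records parse as CSV, else 0
    11 92::0                     ⍝ assumed: 11 covers malformed records
    1⊣⎕CSV⍠'Records' 64⊢⍵
}
```

Both return a plain boolean, so they can be chained as `IsText f` followed by `IsCSV f` before handing the file to whatever extracts the header row. Other errors, such as a missing file, are left untrapped here.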