I have an admission to make: I always mistrust anything that looks too complicated or uses "extra" bits of filler. I particularly dislike the "artificial" use of ⊢ in train constructs. That said, I have to admit they look clever.
I recently saw (≠⊆⊢) somewhere from Roger and thought how elegant it was, but I rarely have a single character as a separator, so I tried it with 2↑',' (well, not that expression exactly, as it clearly gives an error when ⍺ is not a scalar, but a similar train).
So I thought: what about the vector case?
'.,'(~⍤∊⍨⊆⊢)'ab,cd,efg.hij'
'.,'{⍺(~⍤∊⍨⊆⊢)⍵}'ab,cd,efg.hij'
Non-train, without parentheses:
'.,'{⍵⊆⍨~⍵∊⍺}'ab,cd,efg.hij'
Non-train, with parentheses:
'.,'{(~⍵∊⍺)⊆⍵}'ab,cd,efg.hij'
I still think the last one is clearer and more maintainable. All four expressions give:
┌──┬──┬───┬───┐
│ab│cd│efg│hij│
└──┴──┴───┴───┘
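For readers less used to ⊆, here is a minimal sketch of the two steps that every variant above performs: build a Boolean mask of non-separator characters, then partition on it. The names sep, str and mask are only illustrative, and the boxed display assumes ]box on.

```apl
      sep←'.,'
      str←'ab,cd,efg.hij'
      ⊢mask←~str∊sep      ⍝ 1 where the character is not a separator
1 1 0 1 1 0 1 1 1 0 1 1 1
      mask⊆str            ⍝ each run of 1s becomes a segment; items under 0s are dropped
┌──┬──┬───┬───┐
│ab│cd│efg│hij│
└──┴──┴───┴───┘
```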
And even Roger's can be rewritten:
'.' (≠⊆⊢) 'ab.cd.efg.hij'
'.'{⍵⊆⍨⍵≠⍺}'ab.cd.efg.hij'
'.'{(⍺≠⍵)⊆⍵} 'ab.cd.efg.hij'
I just can't help feeling a move towards white noise isn't an aid to a tool of thought for most people. I was wondering how other people felt.
My question is: are we trying to appear too clever (to gratify ourselves and our intelligence) for what we hope will become a more generally accepted language? White noise must surely put most people off?
A train too far?
Forum rules
This forum is for discussing APL-related issues. If you think that the subject is off-topic, then the Chat forum is probably a better place for your thoughts !
- MikeHughes
- Posts: 86
- Joined: Thu Nov 26, 2009 9:03 am
- Location: Market Harborough, Leicestershire, UK
-
- Posts: 62
- Joined: Sat Sep 12, 2015 1:40 pm
Re: A train too far?
A good middle ground solution:
'.,'({~⍵∊⍺}⊆⊢)'ab,cd,efg.hij'
Most unfamiliar languages start out as white noise. I admit I saw this post only a day after seeing Roger Hui use (=⊂⊢), which has the need for (1∘↓¨) and is thus less elegant.
-
- Posts: 94
- Joined: Sat Nov 28, 2009 3:12 pm
Re: A train too far?
Mike, nice to know that you are alive and kicking!
What about using the following train for the vector case:
((~∊⍨)⊆⊢)
For example
'.,'((~∊⍨)⊆⊢)'ab,cd,efg.hij'
ab cd efg hij
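As a sketch rather than anything taken from the posts, the following check confirms that both train spellings agree with the plain dfn body. The names sep and str are illustrative, and ~⍤∊ assumes a Dyalog version in which ⍤ accepts a function as its right operand (18.0 or later).

```apl
      sep←'.,' ⋄ str←'ab,cd,efg.hij'
      ⍝ fork:  sep (f ⊆ ⊢) str  ←→  (sep f str) ⊆ str
      ⍝ atop:  sep (~∊⍨) str    ←→  ~str∊sep
      (sep ((~∊⍨)⊆⊢) str) ≡ (~str∊sep)⊆str
1
      (sep (~⍤∊⍨⊆⊢) str) ≡ (~str∊sep)⊆str
1
```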
I have (slowly) started to like the function trains - in some nice cases they are even more efficient than my beloved d-fns :)
-Veli-Matti
-
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: A train too far?
0. Michael, thank you for your comments.
I did not invent the (≠⊆⊢) phrase. I believe I first saw it in Adám Brudzewsky’s code. For whatever it’s worth, in my code I never try to show how clever I am. I do have a personal quality scale: bad, OK, good, wow, “monument quality, suitable for presentation to the Galactic Emperor”. For examples of the last category, please see [Hui 2010; Hui 2020]. (≠⊆⊢) is not in the last category. (Sorry, Adám. ☺)
1. I dislike your use of the phrase “white noise”. I consider it pejorative. You know, “tacit definition” fans can consider ⍺ and ⍵ to be “white noise”, but I hope we never tell the dfnistas that. (FWIW, I consider myself a dfnista as well as a tacit definition fan.) One man’s white noise can be music to another’s ears.
Doesn’t what is and what isn’t clear depend on familiarity? There are people who view dfns with the same feelings as you do about tacit defns (“trains”).
2. You say you rarely have one character as a separator. Well, I myself rarely have more than one character as a separator, and in my case (≠⊆⊢) serves quite nicely.
Re the expressions you presented for the vector separator case:
'.,' (~⍤∊⍨⊆⊢) 'ab,cd,efg.hij'
⍝ OK, but I would insert “superfluous” spaces to aid
⍝ in readability: (~⍤∊⍨ ⊆ ⊢)
'.,' {⍺(~⍤∊⍨⊆⊢)⍵} 'ab,cd,efg.hij'
⍝ I hope I _never_ write a function sufficient unto itself and then
⍝ embrace it with {⍺ blah ⍵} to make a dfn out of it.
'.,' {(~⍵∊⍺)⊆⍵} 'ab,cd,efg.hij'
⍝ Yes, I may write that.
'.' {⍵⊆⍨⍵≠⍺} 'ab.cd.efg.hij'
'.' {(⍺≠⍵)⊆⍵} 'ab.cd.efg.hij'
⍝ I usually don’t use ⍨ like this, using it only to replace
⍝ long-scope parentheses.
3. Finally, I note that if Dyalog APL ever implements ∌ (U+220C) (probably not ∉ (U+2209) because ∊ has a defect built into it from the 1960s), then your vector separator case can be (∌⊆⊣).
-
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: A train too far?
Since we are talking about partitioning a character string, I thought I’d post a more difficult example for your amusement. tokens from [Hui and Kromberg 2020] produces the tokens in the character string argument, individually boxed. It treats a dfn as a single token and extends the rules currently in Dyalog APL (in particular, it intentionally does not handle strands the same way).
⍝
ALP←'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyzΔ⍙_0123456789'
tokens←{
q←(⊢∨≠⍀)⍵='''' ⍝ quoted strings
a←q<⍵∊ALP ⍝ alphanumerics (names)
n←q<t∨(⍵='.')∧(1↓t⍪0)∨¯1↓0⍪t←⍵∊'0123456789¯⍬∞' ⍝ numbers
d←t∨×+⍀(q<⍵='{')-t←q<⍵='}' ⍝ dfns
Δ←{⍵ ⍺⍺ ¯1↓0⍪⍵}
p←(>Δ q)∨(>Δ a)∨(>Δ n)∨(>Δ d)∨(a⍱n)>q∨d∨⍵=' '
{⍵↓⍨-⊥⍨⍵=' '}¨ ⍵ ⊂⍨ p⍀ ⍲Δ p⌿n∨q
}
(The ⍝ on the first line is because I don’t want the 6 space prompt on the first line, dammit!)
For example:
tokens '¯1.2e¯3J¯.5E¯6 ∞ ''qi'' ⍬ {⍺+{×⍵}⍵}⍤1 2 3⊢jam'
┌───────────────────────┬─────────┬─┬─────┬─┬───┐
│¯1.2e¯3J¯.5E¯6 ∞ 'qi' ⍬│{⍺+{×⍵}⍵}│⍤│1 2 3│⊢│jam│
└───────────────────────┴─────────┴─┴─────┴─┴───┘
(Misalignments in the display are due to defects in the APL Chat Forum software.)
An alternative definition by John Scholes can be found in http://dfns.dyalog.com/n_tokens.htm.
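The first line of tokens may be the least obvious one, so here is a small sketch of what the quoted-string mask does on a short made-up string (s is an illustrative name, not part of the original). The scan ≠⍀ toggles at every quote, and ∨ with the quote positions themselves pulls the closing quote into the mask.

```apl
      s←'a ''qi'' b'      ⍝ the 8 characters  a 'qi' b
      s=''''              ⍝ positions of the quotes
0 0 1 0 0 1 0 0
      ≠⍀s=''''            ⍝ on from the opening quote, off at the closing one
0 0 1 1 1 0 0 0
      (⊢∨≠⍀)s=''''        ⍝ quoted string including both quotes
0 0 1 1 1 1 0 0
```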
-
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: A train too far?
The reason why (≠⊆⊢) is not “monument quality” is not because of Adám, but because in Dyalog APL there is no partition function that can generate empty partitions directly, not ⊆ and not ⊂. I had this problem in the Roman numerals post:
⊢ r1 ← 1↓¨ ' ' (=⊂⊢) ' I II III IV V VI VII VIII IX'
┌┬─┬──┬───┬──┬─┬──┬───┬────┬──┐
││I│II│III│IV│V│VI│VII│VIII│IX│
└┴─┴──┴───┴──┴─┴──┴───┴────┴──┘
What the two partition functions produce:
⍝ unwanted leading blank in each partition
' ' (=⊂⊢) ' I II III IV V VI VII VIII IX'
┌─┬──┬───┬────┬───┬──┬───┬────┬─────┬───┐
│ │ I│ II│ III│ IV│ V│ VI│ VII│ VIII│ IX│
└─┴──┴───┴────┴───┴──┴───┴────┴─────┴───┘
⍝ missing required empty leading partition
' ' (≠⊆⊢) ' I II III IV V VI VII VIII IX'
┌─┬──┬───┬──┬─┬──┬───┬────┬──┐
│I│II│III│IV│V│VI│VII│VIII│IX│
└─┴──┴───┴──┴─┴──┴───┴────┴──┘
(Misalignments in the display are due to defects in the APL Chat Forum software.)
There is a way out without “burning up” another symbol, but it does use the much despised left argument coding technique, using either a character scalar or vector, or a complex scalar, or an enclosed vector. (All these cases are currently errors in the primitives.) e.g.
',+'⊂'/foo//upon/thee' ←→ { (⍵=⊃⍵)⊂⍵} '/foo//upon/thee'
'+'⊂'/foo//upon/thee' ←→ ⍝ ditto
',-'⊂'/foo//upon/thee' ←→ { 1↓¨(⍵=⊃⍵)⊂⍵} '/foo//upon/thee'
'-'⊂'/foo//upon/thee' ←→ ⍝ ditto
'+,'⊂'foo//upon/thee/' ←→ ⌽∘{⌽¨ (⍵=⊃⍵)⊂⍵}∘⌽ 'foo//upon/thee/'
'-,'⊂'foo//upon/thee/' ←→ ⌽∘{⌽¨1↓¨(⍵=⊃⍵)⊂⍵}∘⌽ 'foo//upon/thee/'
I leave as exercises for the reader☺:
- using a complex scalar or an enclosed vector to do the encoding
- extending the definition to work on higher-ranked ⍵ (using the leading or trailing major cell as separator)
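To make the empty-partition point concrete with today's primitives, here is a small sketch that evaluates the dfn from the right-hand side of the ',-' line above next to (≠⊆⊢). The boxed display assumes ]box on.

```apl
      '/' (≠⊆⊢) 'foo//upon/thee'          ⍝ the empty segment between // is lost
┌───┬────┬────┐
│foo│upon│thee│
└───┴────┴────┘
      {1↓¨(⍵=⊃⍵)⊂⍵} '/foo//upon/thee'     ⍝ leading delimiter; the empty segment is kept
┌───┬┬────┬────┐
│foo││upon│thee│
└───┴┴────┴────┘
```

The second form works because ⊂ starts a new partition at every 1, so two adjacent delimiters yield a partition holding only the delimiter, which 1↓¨ then empties.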
- Adam|Dyalog
- Posts: 143
- Joined: Thu Jun 25, 2015 1:13 pm
Re: A train too far?
I like the idea of being able to specify how to split, but I don't think requiring that the delimiter be present in the argument is a good idea, as that'd require appending (for performance, you don't want to prepend) the delimiter when the data is unknown, which doesn't work well for multiple delimiters (which are quite common: e.g. CRLF vs LF files).
One problem is that there are so many options when splitting:
∘ 3: Leading '/foo//upon/thee', trailing 'foo//upon/thee/', or infix 'foo//upon/thee' delimiters
∘ 3: Multiple '/foo/\upon\thee', multi-element '//foo////upon///thee', or multiple multi-element '//foo//\\upon\\thee' delimiters
∘ 3: Remove 'foo' '' 'upon' 'thee', keep '/foo' '/' '\upon' '\thee', or normalise '//foo' '//' '//upon' '//thee' delimiter
∘ 2: Keep 'foo' '' 'upon' 'thee' or remove 'foo' 'upon' 'thee' empty segments
Maybe more.
I often need to "simply" split a string on a string, i.e. infix multi-element delimiters to be removed while keeping empty segments. Easy in JavaScript:
"here##be####dragons".split("##")
[ "here", "be", "", "dragons" ]
How would you write this in APL?
P.S. I learned ≠⊆⊢ from Dan Baronet.
-
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: A train too far?
A facility that can do more unusual partitions needs to embody a language. See for example the ;: Words / Sequential Machine verb in J. Partitions created by this primitive are competitive in timing with hand-coded C.
If you don’t want to do that, there is always the boolean vector option, where you split on the 1s, keeping or discarding the items corresponding to the 1s. You use other APL primitives to construct the required boolean vector. (See “tokens” above.)
For using '##' as a substring delimiter, it would be handy to have a NOO (non-overlapping) substring primitive, which ⍷ is not. When 2<+/s⍷s,s on the substring s, to get from ⍷ to NOO requires solving the transitive closure problem.
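A short sketch of why ⍷ alone is not enough for the '##' case, using the string from the JavaScript example: inside the run of four # characters, ⍷ reports three overlapping matches, whereas a non-overlapping search would report only two.

```apl
      '##' ⍷ 'here##be####dragons'    ⍝ matches at offsets 4, 8, 9 and 10
0 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 0 0 0
      +/ '##' ⍷ '##','##'             ⍝ more than 2: '##' can overlap itself
3
      +/ 'ab' ⍷ 'ab','ab'             ⍝ exactly 2: 'ab' cannot overlap itself
2
```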
- Adam|Dyalog
- Posts: 143
- Joined: Thu Jun 25, 2015 1:13 pm
Re: A train too far?
My best so far (works only on text):
t←'here...be......dragons'
Split←{(⊢/¨r)↓¨⍵⊂⍨⍸⍣¯1∊1+⊃¨r←('|^',⍨'\W'⎕R'\\&'⊢⍺)⎕S 0 1⊢⍵}
'...' Split t
┌────┬──┬┬───────┐
│here│be││dragons│
└────┴──┴┴───────┘
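For anyone reading Split for the first time, here is one possible step-by-step unrolling of the same expression. The names pat, r and m are introduced only for this sketch. It assumes ⎕IO←1 and, like the original, a Dyalog version in which ⊂ accepts a left argument shorter than the right one (18.0 or later, as far as I know).

```apl
      t←'here...be......dragons'
      pat←'|^',⍨'\W'⎕R'\\&'⊢'...'   ⍝ escape non-word characters, append |^ : \.\.\.|^
      r←pat ⎕S 0 1⊢t                 ⍝ (offset length) of each match; offsets are 0-based
      m←⍸⍣¯1∊1+⊃¨r                   ⍝ Boolean mask with a 1 where each match starts
      (⊢/¨r)↓¨m⊂t                    ⍝ cut at the 1s, then drop each delimiter
┌────┬──┬┬───────┐
│here│be││dragons│
└────┴──┴┴───────┘
```

The |^ alternative adds a zero-length match at the start of the text, so the first segment gets its own partition with nothing to drop.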