Hi,
I'm trying to understand the Tally function. Why is there a need for this function? Why is {1⌷⍴⍵} or something like it not enough? Are there any examples of how it is useful with different kinds of arrays?
Tally
-
- Posts: 238
- Joined: Thu Jul 28, 2011 10:53 am
Re: Tally
{1⌷⍴⍵} is dependent on ⎕io, and alternatives such as {⊃⍴⍵} are dependent on ⎕ml. Moreover, they give the wrong answer for a scalar ⍵, and none of them is as terse as ≢⍵.
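To make the ⎕io dependence and the scalar case concrete, here is a minimal session sketch. It assumes Dyalog APL with ⎕ml at its default value, so that ⊃ means First; the arrays are arbitrary illustrations, not taken from the thread.

```apl
      ⎕IO←1
      {1⌷⍴⍵} 3 4⍴0       ⍝ picks the first element of the shape
3
      ⎕IO←0
      {1⌷⍴⍵} 3 4⍴0       ⍝ same code now picks the second element
4
      ≢3 4⍴0             ⍝ tally does not depend on ⎕IO
3
      ⎕IO←1
      {⊃⍴⍵} 42           ⍝ shape of a scalar is empty, so ⊃ yields 0
0
      ≢42                ⍝ tally of a scalar is 1
1
```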
As for its use, how would you define a function for the average or mean? With tally, you can say avg←{(+⌿⍵)÷≢⍵} or avg←+⌿÷≢. These definitions are superior for reasons stated in the Vector article On Average. The article used ⍬⍴(⍴⍵),1 instead of ≢⍵ because at the time of writing ≢⍵ was not available. ≢ can often be used instead of ⍴ and is better because a scalar result is preferred.
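As a rough illustration of why the tally-based definition is convenient, the sketch below applies the dfn form of avg from the paragraph above to a vector, a matrix, and a scalar. The sample data is made up for the example.

```apl
      avg←{(+⌿⍵)÷≢⍵}
      avg 1 2 3 4            ⍝ mean of a vector
2.5
      avg 3 2⍴1 2 3 4 5 6    ⍝ mean of the rows, one result per column
3 4
      avg 5                  ⍝ a scalar has tally 1, so it is its own mean
5
```

The same definition works for all three ranks because +⌿ reduces along the leading axis and ≢ counts the items along that same axis.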
Further afield, I conjecture that the lack of ≢ led to an APL design mistake 50 years ago, namely that of singleton extension for scalar functions. Singleton extension enabled expressions such as x+⍴vec.
Re: Tally
Thanks for the explanation and the link to the article. So it looks like ⍴ was not enough, since very often one needs to get the size of a vector. But what if ⍴ returned scalar 1 for scalars as well, instead of an empty array? Wouldn't that be more logical than introducing a new function?
Re: Tally
alexeyv wrote: what if ⍴ could return scalar 1 for scalars ... instead of an empty array?
then we'd have
      rankvector ← ⍴ rhovector ← ⍴ vector ←,1
      vector
1
      rhovector
1
      rankvector
1
      rankscalar ← ⍴ rhoscalar ← ⍴ scalar←1
      scalar
1
      rhoscalar
1
      rankscalar
1
Re: Tally
You wouldn't be able to identify the scalar's shape as scalar 1, because its own shape (the scalar's rank) would also be scalar 1. You would be in an endless loop of undetectability. In fact the DISPLAY function would have the same problem: it couldn't display a scalar because it wouldn't be able to detect one.
- Morten|Dyalog
- Posts: 460
- Joined: Tue Sep 09, 2008 3:52 pm
Re: Tally
alexeyv wrote:But what if ⍴ could return scalar 1 for scalars as well instead of an empty array? Wouldn't it be more logical than to introduce a new function?
This certainly would not be more "logical", since the shape function is supposed to return a result containing one element for each dimension of the right argument. You might argue that it would be more practical, but as Phil points out, this would mean that you could not distinguish between a one element vector and a scalar. Most programming languages don't recognise scalars as arrays, but the idea that a scalar is a zero-dimensional array lies at the very heart of APL; the entire foundations of the language would need to be rebuilt if you made this change. Just about every application in existence would stop working, so it might not be so practical after all.
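For comparison with the hypothetical session earlier in the thread, this is a small sketch of how the existing definition of ⍴ keeps a scalar and a one-element vector distinguishable. The expressions are illustrative only.

```apl
      ≢⍴1            ⍝ a scalar has no dimensions
0
      ≢⍴,1           ⍝ a one-element vector has one
1
      ≢⍴3 4⍴0        ⍝ a matrix has two
2
      (⍴1)≡⍴,1       ⍝ so the two shapes differ
0
```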
Re: Tally
Thanks, I understand. Returning to Tally, I played around with it to find its limits and scope of use. From the Dyalog documentation:
Tally returns the number of major cells of Y. This can also be expressed as the length of the leading axis or 1 if Y is a scalar. Tally is equivalent to the function {⍬⍴(⍴⍵),1}.
Ignoring the statement about equivalence for now, I ran a few tests:
      ≢1 2 3
3
      ≢3 1⍴1 2 3
3
      ≢1 3⍴1 2 3
1
      ≢[2]1 3⍴1 2 3
SYNTAX ERROR
      ≢[2]1 3⍴1 2 3
      ∧
It feels strange that in the 3rd case I get 1 instead of 3 again. What does "major cells" in the definition actually mean? Only cells in direction 1?
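The equivalence quoted from the documentation can be checked against the same test cases. The sketch below names the documented expression f purely for illustration; the name is not part of the documentation.

```apl
      f←{⍬⍴(⍴⍵),1}
      f 1 2 3            ⍝ shape is 3, so (3 1) reshaped to a scalar gives 3
3
      f 3 1⍴1 2 3
3
      f 1 3⍴1 2 3        ⍝ shape is 1 3, so the leading axis length is 1
1
      f 7                ⍝ shape is empty, so the appended 1 is the result
1
```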
Re: Tally
Imagine a larger array of shape 4 5 6 7 (say).
Imagine it stacked as a 4 by 5 by 6 by 7 hypercube of scalars.
Imagine it stacked as a 4 by 5 by 6 block of vectors each of shape 7. (It's easy if you try)
Imagine it stacked as a 4 by 5 plane of planes each of shape 6 7. (It isn't hard to do)
Imagine it stacked as a 4 vector of blocks each of shape 5 6 7. (I wonder if you can)
Imagine it stacked as a scalar hypercube of shape 4 5 6 7. (Damned if I can)
The "cell" in each case is the scalar, the vector, the plane, the block and the hypercube. The "frame" is the prefix, the shape of the array of cells.
So we see that the shape of an array is the catenation of its frame and cell.
In that above:
      frame      cell
      4 5 6 7    -
      4 5 6      7
      4 5        6 7
      4          5 6 7
      -          4 5 6 7
Major cells for any array have rank 1 less than the array itself and the number of them is the tally. For that above we have 4 major cells each of rank 3, shape 5 6 7.
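A short session sketch of the major-cell view, using the shape 4 5 6 7 from above and the 1 3 matrix from the earlier question. The names A and M are chosen only for this example, and the transpose at the end is one possible way to count along the other axis, not something suggested in the thread.

```apl
      A←4 5 6 7⍴0
      ≢A          ⍝ number of major cells
4
      1↓⍴A        ⍝ shape of each major cell
5 6 7
      M←1 3⍴1 2 3
      ≢M          ⍝ one major cell: the single row
1
      1↓⍴M        ⍝ that row has shape 3
3
      ≢⍉M         ⍝ after transposing, the 3 columns are the major cells
3
```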