Finding the nth occurrence of a number
Finding the nth occurrence of a number
Has anyone written an efficient Dfn to find the nth occurrence of a number in a numerical vector that examines only so much of that vector until the nth occurrence is found?
Re: Finding the nth occurrence of a number
I should have thought that almost any iterative process that looks at one item at a time is going to take longer than doing a scalar equality and a plus scan but I suppose given a sufficiently long vector, some kind of probabilistic approach might help. For the Nth occurrence of number M in vector V you could take the first N items of V and check those in case they're all equal to M. Then assuming you found H hits in the first N you look at the next (N-H)×N÷H+H=0 that'll look at a different length sub-vector proportional to the remainder required and inversely proportional to the frequency of hits so far. Of course the one you want might be the last item of V in which case rather a lot more effort has been expended than was entirely necessary.
Re: Finding the nth occurrence of a number
⍳ finds the first occurrence of a given number in a vector v, and I presume that the interpreter of this expression does not scan v in its entirety if the number occurs before the last element of v. What I am seeking is an efficient extension of ⍳.
Re: Finding the nth occurrence of a number
This:
gives identical results to and is generally quicker than this deliberately naive coding:
mf compares results of 2 functions given the same arguments.
cf compares timings of 2 functions given the same arguments.
On tracing it turns out that more often than not the answer is found on the second iteration. Perhaps it's worth it after all so long as the target number is sufficiently frequent in the source.
Could be improved.
∇
nth←{
(req num)←⍺
max←⍴vec←⍵
{
try←⍵
(try≥max)∨req≤hit←+/res←num=try↑vec:req⍴res/⍳⍴res
∇⌈req×try÷hit+hit=0
}req
⍝ ⍺ req num
⍝ ⍵ vec
⍝ ← indices of first req instances of num in vec
}
gives identical results to and is generally quicker than this deliberately naive coding:
∇
Nth←{
(req num)←⍺
req⍴{⍵/⍳⍴⍵}⍵=num
}
⎕RL←16807
z←?1000000⍴10
10 3 nth z
9 17 34 38 53 62 70 90 91 128
10 3 Nth z
9 17 34 38 53 62 70 90 91 128
mf compares results of 2 functions given the same arguments.
10 3 nth mf Nth ?100000⍴10
1
cf compares timings of 2 functions given the same arguments.
10 3 nth cf Nth ?100⍴10
2.41814946619217
10 3 nth cf Nth ?1000⍴10
1.48673946957878
10 3 nth cf Nth ?10000⍴10
0.223495702005731
10 3 nth cf Nth ?100000⍴10
0.032520325203252
10 3 nth cf Nth ?1000000⍴10
0
On tracing it turns out that more often than not the answer is found on the second iteration. Perhaps it's worth it after all so long as the target number is sufficiently frequent in the source.
Could be improved.
Re: Finding the nth occurrence of a number
Best I can do using iota is this abomination:
it doesn't work for num=0 although it can be made to with a time penalty and still it's no quicker than the naive coding above. Oh yes. And an index error ensues if there are insufficient instances of the target.
'nuff
∇
Nth←{
(req num)←⍺
vec←⍵
vec⍳{vec[vec⍳num]+←num}/⍳req
}
it doesn't work for num=0 although it can be made to with a time penalty and still it's no quicker than the naive coding above. Oh yes. And an index error ensues if there are insufficient instances of the target.
'nuff