Page 1 of 2

Combining Take and Drop on a matrix for efficiency?

Posted: Wed Jul 24, 2024 2:03 am
by Rav
Let's say I have a 3000 by 4000 matrix of 32-byte integers. I want to do this:

mat←1000 1000↑1000 1000↓mat

Is there some language feature in Dyalog APL which would allow me to "combine" the take and drop operations into some sort of unitary operation, such that Dyalog APL would recognize that what I really want is a certain 1000x1000 segment of the matrix? In other words, rather than APL first traversing the matrix and returning 1000 1000↓mat (which would return a 2000x3000 matrix containing a lot of data I don't want), and then traversing that intermediate result and returning 1000 1000↑ , can the operation be made to just extract the 1000x1000 segment I want? I know I can do this with indexing, but in timings I've done that's even more cpu-intensive than doing the separate drop and the take. I read about Function Composition but didn't understand it well enough to figure out how to make it do what I want, if it even can. Thanks for any thoughts. / Rav

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sat Jul 27, 2024 12:06 am
by Rav
If there is no such feature currently, would Dyalog consider implementing it as a primitive function, perhaps called TakeDrop?

Or, perhaps implement it as an idiom, recognizing V1↑V2↓array as TakeDrop?

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sat Jul 27, 2024 2:51 pm
by petermsiegel
I was intrigued that indexing would be significantly slower; here's my naive test code (based on your problem statement) running on a Macbook.

Code: Select all

      size← 3000 4000
      m← size⍴⍳×/size
      ⍴m
3000 4000
      ⎕dr m
323
      cmpx  '1000 1000↑1000 1000↓m' 'm[j;j←1000+⍳1000]' '(2⍴⊂1000+⍳1000)⌷m'  'm⌷⍨2⍴⊂1000+⍳1000'
  1000 1000↑1000 1000↓m → 3.7E¯3 |   0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
  m[j;j←1000+⍳1000]     → 7.5E¯4 | -80% ⎕⎕⎕⎕⎕⎕⎕⎕                                
  (2⍴⊂1000+⍳1000)⌷m     → 7.2E¯4 | -81% ⎕⎕⎕⎕⎕⎕⎕⎕                                
  m⌷⍨2⍴⊂1000+⍳1000      → 7.1E¯4 | -81% ⎕⎕⎕⎕⎕⎕⎕⎕  

⍝ for comparison: degenerate case without dropping first
      cmpx '1000 1000↑m'
  5.0E¯4                              
  
(Of course, this test code lacks any bound-checking.)

I tried pre-generating the indices (e.g. j←1000+⍳1000, etc.), but the timings were comparable to generating them inline.

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sat Jul 27, 2024 3:02 pm
by Rav
Peter: Hmmm, I just tried your timings and got the same timings you did. I don't remember how I did my timings at the time (it's gone from my session history), but I could have sworn that indexing was slower. So thanks for that. At any rate, I still think it could be significantly faster and waste less memory to have either a dedicated TakeDrop or an idiom that recognizes when that's occurring. The separate take and drop do more work and waste space, and indexing with Iota vectors likewise is inefficient as APL doesn't know it's an iota vector and it has to extract the result elements one at a time.

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sat Jul 27, 2024 4:12 pm
by Rav
I wrote a ⎕CALL-able assembly language routine for APL+Win to implement TakeDrop. Given a 1200 by 2400 matrix of integers, to do the equivalent of 100 100↑200 400↓mat, my assembly language routine was about four times faster than mat[200+⍳100;400+⍳100] and a whopping 125 times faster than 100 100↑200 400↓mat. Plus it takes much less memory. Those are timings in APL+Win, of course. Since as far as I know there's no way to do ⎕CALL in Dyalog APL I can't port my code there to try it there. I wish I could!

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sat Jul 27, 2024 7:28 pm
by petermsiegel
Rav,
By the way, I second your suggestion for V1↑V2↓array as an "idiom," as that is a valuable way Dyalog helps ensure frequently used expressions perform well. Perhaps Dyalog has a list of such candidate expressions, showing which benefit from "idiomisation" (so to speak) and which may not.

BTW, have you looked at ⎕NA in Dyalog that handles calls to DLL/shared library routines? I can't say whether or not it will be fast enough for your purposes, but back in the day I used to write APL-callable C-language routines to get at services not then available. The interface is easy enough once you familiarize yourself with it, allowing you to control which arrays are input, output, or both.

-Pete

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sun Jul 28, 2024 2:00 am
by Rav
Pete,

I have used ⎕NA frequently to call Windows built-in Win32 functions. For example I've used it to read display pixels from APL forms. I don't know C so wouldn't know how to write a C routine that ⎕NA could call. I actually thought about the idea of using BitBlt (https://learn.microsoft.com/en-us/windo ... gdi-bitblt) to perform TakeDrop on an APL array, since its argument includes starting row and column and width and height, but although ⎕NA appears to support pointing to APL arrays (type A), there aren't examples in the documentation and I couldn't figure out how to do so. It also wasn't clear if BitBlt was the appropriate solution or if it was the entire solution. It might require so many other necessary ⎕NA calls (such as CreateCompatibleBitmap, GetDC, GetDIBits, SelectObject, DeleteObject, DeleteDC) that any efficiency gained might be lost to all that overhead. At any rate, do you know how to address the data portion of APL arrays with ⎕NA (both reading and writing)? I might find that very useful. Thanks for your replies.

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Sun Jul 28, 2024 7:04 am
by Adam|Dyalog
I've long been contemplating extending Take and Drop to also be Outfix and Infix functions (what you're computing here is an infix). It would work by allowing every scalar in the traditional left argument to also be a two-element vector of a non-negative and non-positive number. This way, we can restate
1000 1000↑1000 1000↓m
to
¯1000 ¯2000↓1000 1000↓m
and my extension would then allow writing
(1000 ¯1000)(1000 ¯2000)↓m
and since this is a single operation, the interpreter could optimise it even more than we can using indexing, since it wouldn't need to build the indices.

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Mon Jul 29, 2024 7:31 am
by RichardP|Dyalog
I would still just vote for recognising N↑M↓ and N↓M↑ as idiom rather than extending the syntax.

Re: Combining Take and Drop on a matrix for efficiency?

Posted: Mon Jul 29, 2024 8:05 am
by Adam|Dyalog
↑ extended to be Outfix can do more than such idioms would allow. For example, (⊂4 ¯1)↑'Racecar' would give 'Racer', but more importantly, the extension of ↓ to Infix would allow ⌺ to work on small arguments:
{∊⍕¨'('⍺'↓'⍵' → '(⍺↓⍵)')'}⌺5⍳5
(2↓0 0 1 2 3 → 1 2 3)    
(1↓0 1 2 3 4 → 1 2 3 4)  
(0↓1 2 3 4 5 → 1 2 3 4 5)
(¯1↓2 3 4 5 0 → 2 3 4 5) 
(¯2↓3 4 5 0 0 → 3 4 5)
If we shrink the argument to less than 4 elements, we get an error because ⌺ cannot provide a left argument for the operand function such that this value can be used to remove the padded 0s (since there'll be 0s on both sides of the original data). With the extension to ↓ it becomes possible to give an enclosed 2-element value that does the trick.