repeated squaring
Posted: Sat May 16, 2020 12:51 am
If function f “multiplies” then f⍨ squares; squaring n times raises to power 2*n and is therefore a fast way to exponentiate.
One way to compute the n-th power for an arbitrary n (when n may be not a power of 2), is to find the squares corresponding to 1 bits in the binary representation of n, and then compute the product.
The examples show that it would be nice if ⍣ accepts an array right operand. (In that case, the squaring would be done 4 times, rather than 0 times and 3 times and 4 times.) Until that blessed day, the repeated squaring computation can be coded as follows:
In the opening sentence above, multiplication is in quotes because the method works for other kinds of multiplication, such as matrix multiplication or ⍵[⍵] (as in the text fitting problem). The last example 3 +RS 25 illustrates that it works for + as “multiplication”, whose “squaring” is doubling and whose “exponentiation” is multiplication.
An Application: Fibonacci Numbers
We illustrate repeated squaring by some computations on the Fibonacci numbers.
The double recursion is quite inefficient: The number of function calls for fib n is fib n+1, which is exponential in n. More efficient formulations exist:
Whither repeated squaring? The relationship (fib n)=(fib n-2)+(fib n-1) can be written in matrix form, and:
fib n for n≥1 is the entry at the lower right corner of the (n-1)-th power of A. The last parenthesized expression, the (A +.× … +.× A), is where repeated squaring can be used.
The last benchmark shows that the matrix method with repeated squaring is competitive with the other methods. We next show that it is superior to the other methods for large n.
For large n, fib3, based on computations on irrational numbers (5*0.5), is not in the running, at least not without a large amount of work (which we won’t do here). For large n, we first develop functions for extended precision arithmetic:
“Numbers” are vectors of decimal digits, most significant digit first. xp is extended precision plus and xt is extended precision times. (The FFT-based xtimes in the dfns workspace is faster for large numbers, but we don’t need it to prove the point.) Modifying fib2 and fibm for extended precision computations:
Let us compute a Fibonacci number with 200 decimal digits. From fib3, we know that
Release the hounds!
How effective is repeated squaring (xfibm)?
Collected Definitions
The initial portion of this post is translated from the J Wiki Essay Repeated Squaring, 2005-09-28.
×⍨ 3
9
×⍨ ×⍨ ×⍨ ×⍨ 3
43046721
×⍨⍣4 ⊢ 3
43046721
3 * 2*4
43046721
One way to compute the n-th power for an arbitrary n (when n may be not a power of 2), is to find the squares corresponding to 1 bits in the binary representation of n, and then compute the product.
3*25
847288609443
2⊥⍣¯1⊢25
1 1 0 0 1
⍸ ⊖ 2⊥⍣¯1⊢25
0 3 4
{×⍨⍣⍵⊢3}¨ 0 3 4
3 6561 43046721
×⌿ {×⍨⍣⍵⊢3}¨ 0 3 4
847288609443
The examples show that it would be nice if ⍣ accepts an array right operand. (In that case, the squaring would be done 4 times, rather than 0 times and 3 times and 4 times.) Until that blessed day, the repeated squaring computation can be coded as follows:
RS← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}
⍝ └──────────────────────────┘
⍝ ←→ ⍺⍺⍨⍣(⍸b)
3 ×RS 25
847288609443
3 +RS 25
75
In the opening sentence above, multiplication is in quotes because the method works for other kinds of multiplication, such as matrix multiplication or ⍵[⍵] (as in the text fitting problem). The last example 3 +RS 25 illustrates that it works for + as “multiplication”, whose “squaring” is doubling and whose “exponentiation” is multiplication.
An Application: Fibonacci Numbers
We illustrate repeated squaring by some computations on the Fibonacci numbers.
fib← {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}
fib¨ ⍳10
0 1 1 2 3 5 8 13 21 34
The double recursion is quite inefficient: The number of function calls for fib n is fib n+1, which is exponential in n. More efficient formulations exist:
fib2← {⊃ +⍀∘⊖⍣⍵ ⍳2}
fib2¨ ⍳10
0 1 1 2 3 5 8 13 21 34
fib3← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}
fib3¨ ⍳10
0 1 1 2 3 5 8 13 21 34
cmpx 'fib 30' 'fib2 30' 'fib3 30'
fib 30 → 1.56E0 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
fib2 30 → 0.00E0 | -100%
fib3 30 → 0.00E0 | -100%
cmpx 'fib2 30' 'fib3 30'
fib2 30 → 9.75E¯6 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
fib3 30 → 1.81E¯6 | -82% ⎕⎕⎕⎕⎕⎕
Whither repeated squaring? The relationship (fib n)=(fib n-2)+(fib n-1) can be written in matrix form, and:
⊢ A←2 2⍴0 1 1 1
0 1
1 1
A +.× 0 1
1 1
A +.× A +.× 0 1
1 2
A +.× A +.× A +.× 0 1
2 3
A +.× A +.× A +.× A +.× 0 1
3 5
A +.× A +.× A +.× A +.× A +.× 0 1
5 8
(A +.× A +.× A +.× A +.× A) +.× 0 1
5 8
fib 6
8
¯1 ¯1↑(A+.×A+.×A+.×A+.×A)
8
fib n for n≥1 is the entry at the lower right corner of the (n-1)-th power of A. The last parenthesized expression, the (A +.× … +.× A), is where repeated squaring can be used.
fibm← {1≥⍵:⍵ ⋄ ⊃ ¯1 ¯1↑ A +.×RS ⍵-1}
fibm¨ ⍳10
0 1 1 2 3 5 8 13 21 34
cmpx 'fib2 50' 'fib3 50' 'fibm 50'
fib2 50 → 1.58E¯5 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
fib3 50 → 1.73E¯6 | -90% ⎕⎕⎕
fibm 50 → 1.70E¯5 | +7% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
The last benchmark shows that the matrix method with repeated squaring is competitive with the other methods. We next show that it is superior to the other methods for large n.
For large n, fib3, based on computations on irrational numbers (5*0.5), is not in the running, at least not without a large amount of work (which we won’t do here). For large n, we first develop functions for extended precision arithmetic:
carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}
xp ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵} ⍝ plus
xt ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵} ⍝ times
3 1 4 1 5 9 2 6 5 3 xp 2 7 1 8 2 8
3 1 4 1 8 6 4 4 8 1
3141592653 + 271828
3141864481
3 1 4 1 5 9 2 6 5 3 xt 2 7 1 8 2 8
8 5 3 9 7 2 8 4 7 6 7 9 6 8 4
3141592653 × 271828
853972847679684
“Numbers” are vectors of decimal digits, most significant digit first. xp is extended precision plus and xt is extended precision times. (The FFT-based xtimes in the dfns workspace is faster for large numbers, but we don’t need it to prove the point.) Modifying fib2 and fibm for extended precision computations:
xfib2← {⊃ xp⍀∘⊖⍣⍵ ,¨⍳2}
xfib2¨ ⍳10
┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐
│0│1│1│2│3│5│8│1 3│2 1│3 4│
└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘
⊢xA←,¨A
┌─┬─┐
│0│1│
├─┼─┤
│1│1│
└─┴─┘
xfibm← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}
xfibm¨ ⍳10
┌─┬─┬─┬─┬─┬─┬─┬───┬───┬───┐
│0│1│1│2│3│5│8│1 3│2 1│3 4│
└─┴─┴─┴─┴─┴─┴─┴───┴───┴───┘
Let us compute a Fibonacci number with 200 decimal digits. From fib3, we know that
r←5*0.5
(2÷⍨1+r)⍟r×1e200
958.6666692945176
fib3 958
7.255617127345179E199
Release the hounds!
(xfib2 ≡ xfibm) 958
1
t←xfibm 958
⍴t
200
t
7 2 5 5 6 1 7 1 2 7 3 4 4 9 4 5 7 8 3 4 9 5 1 3 … 7 9 1 8 0 7 9
(54⍴10 1/1 0) \ 4 50 ⍴ ⎕d[t]
7255617127 3449457834 9513639664 8012357907 4308201883
3452265955 2369453901 0959098708 8044945653 6465064308
7041102422 5291077532 1041899600 8519495018 0745825717
5818735941 5023021847 2032795681 1455446059 3757918079
{⎕fr←1287 ⋄ ¯40 ⍕ r÷⍨(2÷⍨1+r←5*0.5)*958} ⍬ ⍝ sanity check
7.255617127344945783495136396648071______E199
How effective is repeated squaring (xfibm)?
cmpx 'xfib2 958' 'xfibm 958'
xfib2 958 → 1.53E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
xfibm 958 → 4.07E¯3 | -74% ⎕⎕⎕⎕⎕⎕⎕⎕
cmpx 'xfib2 2000' 'xfibm 2000'
xfib2 2000 → 4.50E¯2 | 0% ⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕⎕
xfibm 2000 → 6.50E¯3 | -86% ⎕⎕⎕⎕
Collected Definitions
⍝
RS ← {⊃ ⍺⍺⌿ b⌿ (⊢,⊂∘(⍺⍺⍨)∘⊃∘⊖)⍣(¯1+≢b) ⊂⍺ ⊣ b←⊖2⊥⍣¯1⊢⍵}
fib ← {1≥⍵:⍵ ⋄ (∇ ⍵-2)+∇ ⍵-1}
fib2 ← {⊃ +⍀∘⊖⍣⍵ ⍳2}
fib3 ← {⌊0.5+r÷⍨(2÷⍨1+r←5*0.5)*⍵}
A ← 2 2⍴0 1 1 1
fibm ← {1≥⍵:⍵ ⋄ ⊃ ¯1 ¯1↑ A +.×RS ⍵-1}
carry ← {(+⌿∧⍀0=t)↓t←{1↓+⌿1 0⌽0,0 10⊤⍵}⍣≡0,⍵}
xp ← {carry (m↑⍺)+m↑⍵ ⊣ m←-(≢⍺)⌈≢⍵} ⍝ plus
xt ← {carry +⌿ (m-0 1) ⍴ (m←(≢⍺)+0,≢⍵) ↑ ⍺∘.×⍵} ⍝ times
xA ← ,¨A
xfib2 ← {⊃ xp⍀∘⊖⍣⍵ ,¨⍳2}
xfibm ← {1≥⍵:,⍵ ⋄ ⊃ ¯1 ¯1↑ xA xp.xt RS ⍵-1}
The initial portion of this post is translated from the J Wiki Essay Repeated Squaring, 2005-09-28.