| |
Trifox Account closed
Registered: Mar 2006 Posts: 108 |
calculating of square roots ?
hi all, for my newest project i am in urgent need to calculate the length of a 2d vector, reminding pythagorian math i remember that i have to calculate the roots of a fixed point (8bits.8bits) number, how can that be mastered in a convenient way ?!?!?!
thx
|
|
... 92 posts hidden. Click here to view all posts.... |
| |
Graham Account closed
Registered: Dec 2002 Posts: 990 |
GO64 had an article about this by Krill, but the routine he presented was mine. He didn't explain much about the routine and how to get to it since I didn't explain to him. :)
Anyway here comes the explanation: Imagine you wanna have a square root:
R = sqrt(N)
That obviously gives you:
R^2 = N
Since we wanna establish a convergent algo, we simply use this during calculation:
R(n)^2 <= N
R(0) will be 0. To get closer to N we add a third value D to R as long as that formula is true. Since we work with binary computers, this D will be of the kind 2^x (128, 64, 32 etc).
Ok assuming we wanna have the root from a 16 bit number and since the output of a sqrt of max 65535 (highest 16 bit input) is maximum 255.998 we start with a D of 128 and LSR it down to 1. On each iteration we try (R+D)^2 <= N and if that formula is true, we have R=R+D. If not, R stays unmodified. The resulting algo looks like this:
R = 0
D = 128
while (D >= 1)
{
temp = R+D
if (temp*temp <= N) then R=T
D = D/2
}
As you see this algo is very simple and apart from the temp*temp it does not involve any time consuming operation. The D/2 is ofcourse just a single LSR.
Ok but we are not satisfied. One single integer multiplication is too much for our little 6510 so we have to get rid of it.
(end of part 1, part 2 soon) |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
*holds breath* |
| |
Graham Account closed
Registered: Dec 2002 Posts: 990 |
(part 2)
Ok now we are not satisfied with the mul but how to get rid of it? First you have to see that (R+D)^2 is a simple binomical formula so you can do this:
(R+D)^2 = (R+D)*(R+D) = R*R + 2*R*D + D*D
Still this doesn't look THAT promising but let's try more formula changing:
R*R + 2*R*D + D*D = R*R + D*(2*R + D)
Ok what do we have now? 2*R is easy since it's just a simple ASL in assembler. A multiplication with D is also easy since D is always of the kind 2^x so D*something is equal to ASL something several times. But damnit, there's still that R*R biatch, let's send her to hell by introducing another variable called M:
M(n) <= N - R(n)^2
And since R(0) = 0:
M(0) = N
As you see M(n) is just the difference from N and "current N during algo" which gives us a new way to decide if D should be added to R or not. The only thing we need now is a way to calculate M(n) without calculating R(n)^2. This is a bit messy but it works with these given formulas:
N(n) <= (R(n)+D(n))^2
Given the easy formula transformation from above we have:
N(n) <= R(n)^2 + D(n)*(2*R(n)+D(n))
Just substract R(n)^2 from both sides and get:
N(n) - R(n)^2 <= D(n)*(2*R(n)+D(n))
Now look a few lines above at the M(n) formula. Do you notice something? Exactly, there we have the front part of this formula too so it is proven that:
M(n) <= D(n)*(2*R(n)+D(n))
Time for a party, that R*R is gone! But how do we calculate those M(n)? Quite simple actually:
M(n+1) = M(n) - D(n)*(2*R(n)+D(n))
So we got a completely new algo:
M = N
D = 128
while (D >= 1)
{
T = D*(2*R+D)
if (T <= M) then M=M-T : R = R+D
D = D/2
}
Yeeehaw! R*R is gone, and 2*R is easy (ASL) and D*whatever is easy too (ASL multiple times). To make this clear I change the muls into ASL's:
M = N
D = 128
for n = 7 to 0
{
T = (R+R+D) ASL n
if (T <= M) then M=M-T : R = R+D
D = D LSR 1
}
Wicked! BUT: We are not satisfied because of n-times ASL :)
(end of part 2, part 3 following in the near future) |
| |
PopMilo
Registered: Mar 2004 Posts: 146 |
Now isn't it nice to see something new (at least for me... )in math on c64 :) ?
|
| |
TNT Account closed
Registered: Oct 2004 Posts: 189 |
While you hold your breath you might want to check out Apple Assemly Line Vol. 7, Issue 2.
http://bobsc5.home.comcast.net/aal/1986/aal8611.html |
| |
Graham Account closed
Registered: Dec 2002 Posts: 990 |
(part 3 of the square root talk)
Ok I stopped at the point where the multiplication was gone but a ASL with varying shift count appeared. That is no problem for big CPU's like 680x0 or x86, but our poor 6510 doesn't like this so that "ASL n" has to leave.
To do that, we don't investigate the math any further but take a look at how the variables behave during iteration of this algo:
R = 0
M = N
D = 128
for n = 7 to 0
{
T = (R+R+D) ASL n
if (T <= M) then M=M-T : R = R+D
D = D LSR 1
}
D, T, M and R now look like this:
D T M R
10000000 0100000000000000 xxxxxxxxxxxxxxxx x0000000
01000000 0x01000000000000 0xxxxxxxxxxxxxxx xx000000
00100000 00xx010000000000 00xxxxxxxxxxxxxx xxx00000
00010000 000xxx0100000000 000xxxxxxxxxxxxx xxxx0000
00001000 0000xxxx01000000 0000xxxxxxxxxxxx xxxxx000
00000100 00000xxxxx010000 00000xxxxxxxxxxx xxxxxx00
00000010 000000xxxxxx0100 000000xxxxxxxxxx xxxxxxx0
00000001 0000000xxxxxxx01 0000000xxxxxxxxx xxxxxxxx
Removing the "ASL n" would completely change the behaviour of T. But also M has to change because otherwise the T <= M compare and M-T math will not work. So if T is not left-shifted, M has to be right-shifted. The result looks like this:
D T M R
10000000 0100000000000000 xxxxxxxxxxxxxxxx x0000000
01000000 x010000000000000 xxxxxxxxxxxxxxx0 xx000000
00100000 xx01000000000000 xxxxxxxxxxxxxx00 xxx00000
00010000 xxx0100000000000 xxxxxxxxxxxxx000 xxxx0000
00001000 xxxx010000000000 xxxxxxxxxxxx0000 xxxxx000
00000100 xxxxx01000000000 xxxxxxxxxxx00000 xxxxxx00
00000010 xxxxxx0100000000 xxxxxxxxxx000000 xxxxxxx0
00000001 xxxxxxx010000000 xxxxxxxxx0000000 xxxxxxxx
So now instead of shifting T we shift M, but the advantage of shifting M is that it's only 1 bit per iteration. Look at the code now:
R = 0
M = N
D = 128
for n = 7 to 0
{
T = (R+R+D) ASL 7
if (T <= M) then M=M-T : R = R+D
M = M ASL 1
D = D LSR 1
}
Much much better since ASL 7 can be replaced by a single LSR in asm code later. This code is quite useful on 6510 already, BUT ofcourse this is not everything. Wait for part 4 :D
(part 4 soon on this station) |
| |
Graham Account closed
Registered: Dec 2002 Posts: 990 |
(part 4: finally some asm)
Taking a slightly modified version of the last pseudo code:
R = 0
M = N
D = 128
for n = 7 to 0
{
T = (R ASL 8) OR (D ASL 7)
if (T <= M) then M=M-T : R = R OR D
M = M ASL 1
D = D LSR 1
}
T can now be calculated quite easily. The high byte is simply R OR <some stuff> and the low byte is always 0 except for the last iteration where it is 128. The shifting of D is done via a table and voila, here is some 6510 code:
LDY #$00 ; R = 0
LDX #$07
.loop
TYA
ORA stab-1,X
STA THI ; (R ASL 8) | (D ASL 7)
LDA MHI
CMP THI
BCC .skip1 ; T <= M
SBC THI
STA MHI ; M = M - T
TYA
ORA stab,x
TAY ; R = R OR D
.skip1
ASL MLO
ROL MHI ; M = M ASL 1
DEX
BNE .loop
; last iteration
STY THI
LDA MLO
CMP #$80
LDA MHI
SBC THI
BCC .skip2
INY ; R = R OR D (D is 1 here)
.skip2
RTS
stab: .BYTE $01,$02,$04,$08,$10,$20,$40,$80
I hope I didn't make any mistakes in that routine :)
Input is MLO/MHI for N and output is Y-register for int(sqrt(N)).
(Anyone here who would have guessed that a routine like that calculates a square root?) |
| |
enthusi
Registered: May 2004 Posts: 677 |
weird shit for sure :)
Well, maybe there should be some hall of fame for non-intuitive algs :)
|
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
wow |
| |
Burglar
Registered: Dec 2004 Posts: 1101 |
graham, what do you do in real life? (as in work) |
Previous - 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 - Next |