[CSDb] - User Forums - How to make efficient double-sine calculations


2017-08-20 16:28

Trap

Registered: Jul 2010
Posts: 223

How to make efficient double-sine calculations

Hi,

I am trying to improve my effect animation skills a little. To that end, I'd like to hear how you guys handle double-sine table calculations. As I am by no means a math guru (not even close), please try to keep it at a practical level :)
Surely there has to be some clever way around this. Usually I'd do something like the following mock-up code:

    lda Counter1            // Copy counters to indexes
    sta Index1
    lda Counter2
    sta Index2
    ldx #TableSize          // X counts TableSize..1
!CalcAnim:
    ldy Index1
    lda SineWave1,y         // Get first value
    iny                     // Index1 delta + 1
    sty Index1
    ldy Index2
    clc
    adc SineWave2,y         // Add second value
    iny                     // Index2 delta + 1
    sty Index2
    tay
    lda Lookuptable,y       // Find the value and store it
    sta Destinationtable,x  // Fills Destinationtable+TableSize down to +1
    dex
    bne !CalcAnim-
    lda Counter1
    clc
    adc #1                  // Velocity 1
    sta Counter1
    lda Counter2
    clc
    adc #1                  // Velocity 2
    sta Counter2

Apart from unrolling the loop, I am short of good ideas on how to make this efficient. Use of ZP for the indexes saves a few cycles as well.
How do you guys approach this in your demos?

2017-08-20 17:46

Mixer

Registered: Apr 2008
Posts: 452

Consider whether some of the maths give constant results and precalculate those. For instance if the velocities are the same all the time, then (sin(a)+sin(b)) could perhaps be precalculated to a single lookup.

Align the sine tables to a page boundary so that the low byte of the address can serve as the index, and run the code from zero page.

If the add or subtract is always 1, then inc/dec may be better.

Sometimes the second lookup can be coded to the sine data bits. Depends on what is desired.
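
A minimal sketch of the first suggestion above, written in Python as a PC-side table generator rather than as 6502 code. It assumes both indexes advance at the same velocity, so the phase between them never changes and the two lookups collapse into one. The names `make_sine`, `combine` and `PHASE`, and the 64+63*sin scaling, are made up for illustration.

```python
import math

def make_sine(offset=64, amplitude=63, size=256):
    """One full sine period as unsigned bytes (assumed table layout)."""
    return [round(offset + amplitude * math.sin(i * 2 * math.pi / size))
            for i in range(size)]

def combine(table1, table2, phase):
    """Precalculate table1[i] + table2[i + phase] for every index i.

    Only valid while the phase between the two indexes is constant,
    i.e. both counters move at the same velocity.
    """
    size = len(table1)
    return [(table1[i] + table2[(i + phase) % size]) & 0xFF
            for i in range(size)]

sine = make_sine()
PHASE = 40  # assumed constant offset between the two indexes
combined = combine(sine, sine, PHASE)
```

With a table like `combined` in memory, a single indexed load would stand in for the lda/adc pair. As soon as the two velocities differ, the phase drifts and this shortcut no longer applies.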

2017-08-20 18:04

Glasnost
Account closed

Registered: Aug 2011
Posts: 26

If it is very time critical, I would use speedcode, with x and y for the two counters. The following code requires the tables to be duplicated so that each fills e.g. 2x256 bytes:

( i times)
lda sin1+i,x
*clc
adc sin2+i,y
sta destination+i

*The clc is optional in some cases. You know your sine tables and whether their sums can set the carry or not.

If you want it looped, you can init zp1 to sin1+counter1 and zp2 to sin2+counter2. This example only works up to i=128, since the bpl loop test treats Y values of $80 and above as negative.

ldy #(i-1)
!loop:
lda (zp1),y
*clc
adc (zp2),y
sta destination,y
dey
bpl !loop-

Lastly, a note on the sine addition: if you want better precision you can use:

lda sin1,x
adc sin2,y
ror
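
To see why the ror helps, here is a small Python model of the adc/ror pair (a sketch for checking the arithmetic, not code from this thread). After adc the carry flag holds the ninth bit of the sum, and ror rotates it back into bit 7, so the accumulator ends up holding the full 9-bit sum divided by two. The function names are illustrative and decimal mode is ignored.

```python
def adc(a, operand, carry_in=0):
    """8-bit add with carry (binary mode): returns (result, carry_out)."""
    total = a + operand + carry_in
    return total & 0xFF, total >> 8

def ror(a, carry_in):
    """Rotate right through carry: returns (result, carry_out)."""
    return (carry_in << 7) | (a >> 1), a & 1

def average(s1, s2):
    """Model of: clc / lda s1 / adc s2 / ror."""
    a, carry = adc(s1, s2, 0)
    return ror(a, carry)

# 200 + 100 does not fit in 8 bits, but the average does
result, carry_after = average(200, 100)
```

Note that the second value returned by `average` is the carry left behind by the ror (bit 0 of the sum), which matters if the next adc is not preceded by a clc.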

2017-08-20 19:21

Cruzer

Registered: Dec 2001
Posts: 1048

PROTIP: Code looks better in a [code] block.

2017-08-20 19:46

Cruzer

Registered: Dec 2001
Posts: 1048

Quoting Glasnost

lda sin1,x
adc sin2,y
ror

Remember a clc after the ror, since ror shifts bit 0 into the carry. Alternatively, if the sum of the two sines is always < 256 you can use:

	lda sin1,x
	adc sin2,y
	alr #$fe   //throw away least significant bit and then lsr
	           //(always results in cleared carry)
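
A quick Python model of that sequence, as a sketch for checking the claim about the carry. ALR is an undocumented NMOS 6502 opcode that ANDs the accumulator with the immediate value and then shifts right; the function names here are illustrative only.

```python
def alr(a, imm):
    """Undocumented ALR #imm: A = (A AND imm) >> 1, carry = bit 0 of (A AND imm)."""
    masked = a & imm
    return masked >> 1, masked & 1

def halved_sum(s1, s2):
    """Model of: lda s1 / adc s2 / alr #$fe, assuming carry clear on entry."""
    total = s1 + s2
    assert total < 256, "this variant requires the sum to fit in 8 bits"
    return alr(total, 0xFE)

# Masking with $fe clears bit 0 before the shift, so the carry out is always 0
carries = {alr(a, 0xFE)[1] for a in range(256)}
```

Because the carry comes out clear every time, the next adc in an unrolled sequence can skip its clc.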

2017-08-20 20:20

Digger

Registered: Mar 2005
Posts: 437

Great tip with bit shifting to smooth the sine, never thought about that.

2017-08-21 09:12

ChristopherJam

Registered: Aug 2004
Posts: 1409

So, if all your tables are page aligned, and if you also follow the sinewave table with a second copy of itself, the following should have the same result:

    lda Counter1            // Copy counters to indexes
    sta rna0+1              // (self-modify the low byte of each operand)
    lda Counter2
    sta rna1+1
    ldx #TableSize
    ldy #0
!CalcAnim:
    clc
rna0:
    lda SineWave1,y         // Get first value
rna1:
    adc SineWave2,y         // Add second value
    sta rna2+1
rna2:
    lda Lookuptable         // Find the value and store it
    sta Destinationtable,x
    iny
    dex
    bne !CalcAnim-

    lda Counter1
    clc
    adc #1             // Velocity 1
    sta Counter1
    lda Counter2
    clc
    adc #1             // Velocity 2
    sta Counter2

But the above loop only needs separate indices for source and destination because Y is increasing and X is decreasing.
Also, as others have pointed out, you don't need the CLC if you know the results will never overflow (e.g. because your sine tables contain 64+63*sin(x*pi/128) )
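
For anyone who wants to check the no-overflow claim, here is a small Python sketch that builds such a table on the PC, confirms that two entries can never sum past 255, and prints it twice over as `.byte` lines. The helper names and the 16-values-per-line layout are arbitrary choices, not anything from this thread.

```python
import math

def sine_table(offset, amplitude, size=256):
    """offset + amplitude*sin(x*pi/(size/2)), rounded to whole bytes."""
    return [round(offset + amplitude * math.sin(x * math.pi / (size // 2)))
            for x in range(size)]

def as_byte_lines(values, per_line=16):
    """Format values as assembler .byte lines with hex literals."""
    lines = []
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append(".byte " + ",".join(f"${v:02x}" for v in chunk))
    return lines

table = sine_table(64, 63)
worst_case = 2 * max(table)   # largest possible sum of two entries

# Emit the table followed by a second copy of itself
doubled = as_byte_lines(table + table)
print("\n".join(doubled))
```

With these parameters every entry stays between 1 and 127, so the worst-case sum is 254 and the adc never sets the carry.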

So, you should be able to get the same effects from an inner loop like this

!loop:
    lda sin+counter1,y
    adc sin+counter2,y
    tax
    lda lut,x
    sta dst,y
    dey
    bne !loop-
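
As a rough comparison, the per-entry cost of this loop and of the original mock-up loop can be tallied with a few lines of Python. This is only a sketch: it uses the standard NMOS 6502 cycle counts and assumes absolute addressing for Index1/Index2 (not zero page), no page crossings on the indexed reads, and a taken branch that stays within one page.

```python
# Cycles per instruction under the assumptions stated above
CYCLES = {
    "lda abs,y": 4, "lda abs,x": 4, "adc abs,y": 4,
    "ldy abs": 4, "sty abs": 4,
    "sta abs,x": 5, "sta abs,y": 5,
    "iny": 2, "dex": 2, "dey": 2, "clc": 2, "tax": 2, "tay": 2,
    "bne taken": 3,
}

def loop_cycles(instructions):
    """Sum the cycle cost of one pass through a loop body."""
    return sum(CYCLES[name] for name in instructions)

# Inner loop from the first post in this thread
ORIGINAL_LOOP = [
    "ldy abs", "lda abs,y", "iny", "sty abs",
    "ldy abs", "clc", "adc abs,y", "iny", "sty abs",
    "tay", "lda abs,y", "sta abs,x", "dex", "bne taken",
]

# The shorter loop directly above
SHORT_LOOP = [
    "lda abs,y", "adc abs,y", "tax", "lda abs,x",
    "sta abs,y", "dey", "bne taken",
]

print(loop_cycles(ORIGINAL_LOOP), loop_cycles(SHORT_LOOP))
```

Under those assumptions the totals come out at 46 and 24 cycles per table entry, before any unrolling.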

You can also halve the number of DEY/BNEs by a partial unroll, dividing the output into first/second half:

    ldy #TableSize/2
!loop:
    lda sin+counter1,y
    adc sin+counter2,y
    tax
    lda lut,x
    sta dst,y

    lda sin+counter1+TableSize/2,y
    adc sin+counter2+TableSize/2,y
    tax
    lda lut,x
    sta dst+TableSize/2,y
    dey
    bne !loop-

The ALR mentioned above is good to know about (it's news to me!), but if all you want is a straight divide by two of the eight bit result, you can fold that into the table lookup and save the cycles.

(Alternately, if you put a ROR after the ADC you can use 128+127*sin(x*pi/128) - the carry will contain noise, but even without a CLC before the ADC the result will still be more accurate than using the smaller scale factor on your tables.)

2017-08-22 05:08

Oswald

Registered: Apr 2002
Posts: 5094

depending on what your lookup values are you may skip the entire lookup by already putting the lookup values into sin1/sin2 or fabricating them so that after adc you get the right values :P :)

2017-08-22 06:16

lft

Registered: Jul 2007
Posts: 369

Or use character mode, and integrate the lookup table into the font.

2017-08-22 11:57

Trap

Registered: Jul 2010
Posts: 223

So much awesome info. Thanks guys.

I think some of you are thinking of a traditional movement system, but that is not what I was after. Pre-calculating the movements for a table would produce an awful lot of data: for every line, I'd have at least 128 bytes (for a semi-smooth experience). I am not looking for a follow-path, but creating dynamic waves.

With your input I've managed a 33% speed improvement and there are still some cycles left I can shave off.

The ALR/ROR tips were awesome <3

Lft, that's an interesting thought - could you elaborate?

2017-08-22 14:21

Oswald

Registered: Apr 2002
Posts: 5094

The examples are not about precalculating movements into a table. The most extreme suggestion here is to bake your counter+velocity offsets into the table lookups of an unrolled loop, and that unrolling (speedcode generation) can also happen in real time when the effect's movement changes.

Unrolling your loop with your labels would give you:

    ldx Counter1
    ldy Counter2

    // TableSize, Velocity1 and Velocity2 must be assemble-time constants
    .for (var i = 0; i < TableSize; i++) {
        lda sin1 + i*Velocity1,x
        clc
        adc sin2 + i*Velocity2,y
        sta Destinationtable + i
    }

(out of registers for lookuptable)

The speed increase would be manyfold instead of a lowly 33%.
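
A hedged sketch of the speedcode generation idea as a PC-side Python script that emits the unrolled source text. The function name is made up, and wrapping the offsets to 0..255 is an added assumption: it relies on each sine table being stored twice back to back, so that offset plus register never runs past the end.

```python
def generate_speedcode(table_size, velocity1, velocity2, use_clc=True):
    """Return the unrolled loop as a list of assembly source lines."""
    lines = ["ldx Counter1", "ldy Counter2"]
    for i in range(table_size):
        # Offsets wrap at 256; assumes doubled (2x256 byte) sine tables
        offset1 = (i * velocity1) & 0xFF
        offset2 = (i * velocity2) & 0xFF
        lines.append(f"lda sin1+{offset1},x")
        if use_clc:
            lines.append("clc")   # can be dropped if the sums never overflow
        lines.append(f"adc sin2+{offset2},y")
        lines.append(f"sta Destinationtable+{i}")
    return lines

# Example: 4 entries, velocities 1 and 2
for line in generate_speedcode(4, 1, 2):
    print(line)
```

Regenerating the code with different velocities changes the wave shape without touching the per-frame cost, which is the point of doing the multiplication at generation time.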

