[CSDb] - User Forums - How easy is it to calculate sine tables from the bare bones in ML?

You are not logged in - nap

CSDb User Forums

Forums > C64 Coding > How easy is it to calculate sine tables from the bare bones in ML?

2008-08-05 09:52

Conrad

Registered: Nov 2006
Posts: 849

How easy is it to calculate sine tables from the bare bones in ML?

Hi!

I'd like to have a go at coding my own routine of calculating sine/cosine tables without use of the BASIC interpreters (pi, sin(), etc) and just use pure ML instead.

Can I ask, how easy (or how SMALL) is this to do? Do I need to do additional algorithms like factorial or are there KERNAL routines for that (if any)? Multiplication code is no problem as I've seen articles on those, I'm just thinking about the rest of math involved with sine calculation.

Thanx in advance.

2008-08-05 10:22

enthusi

Registered: May 2004
Posts: 677

Im probably the wrong person to answer this :)
but you can easyly abuse the BASIC routines in ML as well. Just set up the FAC etc adresses and launch the routines.

OR if you want it really basic withou BASIC routines you can do the Taylor-approach
sin(x)=x-1/x^3+1/x^5 etc.

There probably smarter ways around though *g*

2008-08-05 10:25

doynax
Account closed

Registered: Oct 2004
Posts: 212

You can generate a sine wave iteratively easily enough. Basically you just massage the trigonometric product formula a bit..

sin(x - y) + sin(x + y) = 2 * sin x * cos y <=>
sin(x + y) = 2 * sin x * cos y - sin(x - y)

Where 'y' is your step and x is the index being calculated. In other words we end up basing the next sine on the previous two, through a multiplication by a constant and a subtraction.

k = 2 * cos d

sin[i + d] = sin[i] * k - sin[i - d]

Now I guess if you apply a scaling/steps width such that the constant becomes a nice simple power-of-two then this could be implemented with less code then invoking the basic ROM functions.
Please note that it takes quite a bit of fixed-point precision to get decent values, and you may want to consider exploiting a bit of symmetry to generate the rest of the table from the first quadrant or so.

2008-08-05 10:34

Skate

Registered: Jul 2003
Posts: 494

There are some 256b intros with source code which generates sine tables. Check this one for example.

Too(C)o(M)p(L)ex

Of course try to learn it, not copy & paste it.

2008-08-05 10:38

Radiant

Registered: Sep 2004
Posts: 639

To avoid nasty multiplications you could approximate a circle using an accelerating vector. I think it was Zed Yago who suggested this solution on IRC some time ago, and it sounds good enough for me.

2008-08-05 10:58

Conrad

Registered: Nov 2006
Posts: 849

Thanks everyone for your help! I have a rough idea what to do now! :)

radiantx: what's an accelerating vector? Is it related to I/O registers or is a much more complex formula? Again, I could ask Yago myself :)

2008-08-05 11:02

WVL

Registered: Mar 2002
Posts: 902

Quote: Thanks everyone for your help! I have a rough idea what to do now! :)

radiantx: what's an accelerating vector? Is it related to I/O registers or is a much more complex formula? Again, I could ask Yago myself :)

I guess yago meant that you can approximate a sine with 2 parabolas, and to calculate those you only need to have an acceleration (which matches the amplitude and 'length' of the sine you want.)

2008-08-05 11:06

Cruzer

Registered: Dec 2001
Posts: 1048

@Skate: Too bad it's some ugly sines that are generated that way. It's really just a couple of reverse parabolas, which is not the same as a real sine.

2008-08-05 11:07

Conrad

Registered: Nov 2006
Posts: 849

Ah, now I get ya, after looking up on parabolas ;).
Thanks again, all! .... off to code.

cruzer: that's okay, quick'n'dirty is all I need.

2008-08-05 11:18

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Anyone have an example on how to access the BASIC routines from asm to generate a sine? Also how compact is such a routine, are we talking like 50 bytes or 500 bytes?

2008-08-05 11:22

JackAsser

Registered: Jun 2002
Posts: 2014

Anything you come up with, plz write a small article about it and post it on codebase. Mkaytnku.

2008-08-05 11:36

Skate

Registered: Jul 2003
Posts: 494

@Cruzer: So, your source code comment is the liar! ;)

Well, I like using plain basic for sinus calculations (slow but more flexible with ranges) in small sized intros and precalculated tables in bigger projects.

2008-08-05 13:53

Cruzer

Registered: Dec 2001
Posts: 1048

"generate sine" - yeah, guess that's a stretch. :)

Well, sometimes these parabolic "sines" work pretty good, and other times it's obvious to see that something's wrong, especially in the middle, where it's hard to get the two parabolas to meet precisely. But this is based on 256b routines - it might be possible to tweak it better if the code doesn't have to be super small.

2008-08-05 15:33

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quote: Anyone have an example on how to access the BASIC routines from asm to generate a sine? Also how compact is such a routine, are we talking like 50 bytes or 500 bytes?

Here's the best I could come up with on short notice (31 bytes):

table	= $0400		;; preferably a low page

loop	lda #<index
	ldy #>index
	jsr $bba2
	
	jsr $e277

	lda $61
	adc #7
	sta $61

	jsr $bc9b
	lda $65

index	.byte $90
	.byte $00
	sta table

	inc *-2
	bne loop

I suspect that the float-to-int portion could be shortened by making use of some sneaky ROM function somewhere. Interesting challenge though, anyone got a shorter implementation?

2008-08-05 18:03

Frantic

Registered: Mar 2003
Posts: 1648

@Doynax: Would it be OK to post that on Codebase?

2008-08-05 18:29

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quote: @Doynax: Would it be OK to post that on Codebase?

Look.. It's a code snippet posted on a public forum, of course you can use it. Feel free to modify it, sell it, claim you wrote it yourself or print it out and use it as toilet paper.
There are few things as obnoxious as tutorial authors and the like thinking they've got some sort of claim to your program when you based your code on their work..

By the way personally I'd like to see a decent tutorial on the BASIC floating point library in the code base. This is a good beginning but it's incomplete and full of errors. For instance I'd like to know why taking sin(0) seems to crash the program.

2008-08-05 19:23

Frantic

Registered: Mar 2003
Posts: 1648

okok, I posted it and wrote that it was created by some lamer named Doynax (just kidding).

2008-08-05 22:08

Shadow
Account closed

Registered: Apr 2002
Posts: 355

Nice explanation that accompanied the codebase64 post, much appreciated!

2008-08-05 22:53

Frantic

Registered: Mar 2003
Posts: 1648

Credits for those goes to (some lamer called) Doynax. ;)

2008-08-06 02:39

doynax
Account closed

Registered: Oct 2004
Posts: 212

I managed to reduce the bloody thing by two bytes (to 29) by converting to integer through adding a 'magic' float. Well, it's actually a piece of code in the BASIC ROM but it works well enough as a float..

table	= $0400		;; Preferably a low page. Must be paged aligned!

loop	lda #<index	;; Load 5-byte float at 'index' into FAC, the fraction
	ldy #>index	;; of which is stepped between 0/256..255/256.
	jsr $bba2	;; However an integer bias is also added in order to fix
			;; the exponent and make hence it possible to increment the
			;; fraction as a normal binary byte, e.g. a version of the
			;; classic x86 float-to-int conversion trick.
	
	jsr $e277	;; Now calculate sine of FAC. Except skip the initial part
			;; of the BASIC function which divides by 2*PI to get
			;; a fraction out of radians since we've already got one.
			;; The integer bias is taken care by BASIC since sin()
			;; is supposed to be periodic.

	lda #<bias	;; Convert the output in FAC from a float in -1..+1 to a
	ldy #>bias	;; fixed-point value in -128..+127 at the LSB of the
	jsr $b867	;; mantissa by employing the same trick as before of
	lda $65		;; adding a high integer bias.

index	.byte $90	;; This is both a float *and* a piece of code. The exponent
	.byte $00	;; ($90 corresponds to 2^16) fixes our 8-bit fraction as the
	sta table	;; third byte of the mantissa and STA address' LSB (don't
			;; forget that BASIC floats are big-endian!). And $90/$00
			;; interpreted as code simply correspond to a harmless BCC *+2.
			;; Note that while the STA's opcode is an integer part which
			;; shouldn't affect the result, the table's high byte *does*
			;; serve as a small offset shifting the result by up to one
			;; index value. Some might even argue that placing the table
			;; at $8000 would produce the 'proper' rounding.

	inc *-2
	bne loop

bias	= $befa		;; A float with an exponent of $99 (2^25) and an LSB of
;;	.byte $99	;; zero is used to convert the output to binary. Such byte
;;	.byte $02	;; sequences can be found in six places in the BASIC/Kernal
;;	.byte $01	;; ROMs, at $befa/$bf04/$bf09/$fd53/$fd56/$ff38.
;;	.byte $a9	;; A version with an LSB of $80 would have been useful to
;;	.byte $00	;; create unsigned output (e.g. between $00 and $ff with the
			;; origin at $80) but unfortunately doesn't seem to exist.

2008-08-06 07:30

Cruzer

Registered: Dec 2001
Posts: 1048

Nice, much smaller than my faked routine. But how fast is it compared to Basic?

2008-08-06 08:27

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quote: Nice, much smaller than my faked routine. But how fast is it compared to Basic?

The assembler version takes 5.89 seconds ($584f0f cycles) while a BASIC implementation needs about 17.2 seconds according to my stopwatch.

Then again BASIC isn't exactly my forte, so perhaps I made some stupid newbie mistake?

0 fori=0to255:poke1024+i,sin(i*.02454)*128and255:next

2008-08-06 08:40

Skate

Registered: Jul 2003
Posts: 494

I usually use something like;

0 FORI=0TO255:POKE1024+I,128+128*SIN(I/40.7):NEXT

where 40.7 is 256/(2*PI)

don't forget, we need less bytes for small intros and /40.7 is shorter ;)

2008-08-06 09:27

Cruzer

Registered: Dec 2001
Posts: 1048

5.89 is definitely fast enough for a tinytro, but I think I'll stick to generating them with a KickAss script for normal cases.

2008-08-06 11:19

JackAsser

Registered: Jun 2002
Posts: 2014

@Cruzer: I agree and it actually SAVES memory in the end because you will not have the generator in memory, only the end result. (And buhu what I miss trig-operations and floatingpoints in ca65...)

2008-08-06 12:34

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quote: @Cruzer: I agree and it actually SAVES memory in the end because you will not have the generator in memory, only the end result. (And buhu what I miss trig-operations and floatingpoints in ca65...)

I'd say if you're not kicking out your initialization code then a sine generator is the least of your worries (drivecode anyone?).
Personally I prefer to mirror the first 64 entries to create the rest of the table, which saves 170 bytes or so and loads a tad faster. Hey.. things like this add up, especially for game code.

Oh, and I saved another byte in the generator (down to 28) at the cost of creating a negated sine table instead, though I seriously doubt that matters for 256-byte intro use.

2008-08-07 22:42

Monte Carlos

Registered: Jun 2004
Posts: 359

To avoid that the sinmax and sinmin is reached only for one byte, use a+(a-0.5)*sin(pi/180*i) instead of
a*(1+sin(pi/180*i)).

Monti Carlotti

2013-12-17 23:03

Cruzer

Registered: Dec 2001
Posts: 1048

Quoting doynax

Oh, and I saved another byte in the generator (down to 28) at the cost of creating a negated sine table instead, though I seriously doubt that matters for 256-byte intro use.

Do want! :)

2013-12-19 20:25

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quoting Cruzer

Do want! :)

It took a bit of filesystem spelunking but here you go: http://pastebin.com/dMFP3F47

Now be sure to buy yourself something nice with that extra byte.

Sidenote: I must confess to some surprise at finding no less than three 6502 assembly dialects to choose from at pastebin. It's almost enough to get an embittered atheist into the Christmas spirit.

2013-12-20 22:29

ChristopherJam

Registered: Aug 2004
Posts: 1409

An alternate approach is to fill a table with a square wave, then filter it with a few smoothing passes to eliminate the harmonics. The following hasn't been optimised for size, but it's only 38 bytes, and only takes about 170,000 cycles to create a table of 127.5+63*cosine(i*pi/128) with an RMS error of under 1%

(warning, also trashes 52 bytes after the end of the table. Number of passes, and initial square wave phase and amplitude optimised by brute force search with a Python script. Output ranges from $40 to $bf).


	ldx#127
prefill
	sta cosine,x
	lda#19
	sta cosine+52,x
	lda#250
	sta cosine+128+52,x
	dex
	bpl prefill

	ldy#30
	clc
smoothloop
	lda cosine,x
	adc cosine,y
	ror
	sta cosine,x
	iny
	inx
	bne smoothloop
	dey
	bne smoothloop

2013-12-20 23:42

Cruzer

Registered: Dec 2001
Posts: 1048

Thanx, Doynax! The sine is too different to be used in my current project, and since I'm already under 256 bytes I'm good for now. But will keep it for another time.

2013-12-20 23:47

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quoting ChristopherJam

An alternate approach is to fill a table with a square wave, then filter it with a few smoothing passes to eliminate the harmonics. The following hasn't been optimised for size, but it's only 38 bytes, and only takes about 170,000 cycles to create a table of 127.5+63*cosine(i*pi/128) with an RMS error of under 1%

Well done, that is exceedingly clever. I never would have considered that approach.

Any chance of a 16-bit version or one with (closer to) full 8-bit range?

2013-12-22 08:39

ChristopherJam

Registered: Aug 2004
Posts: 1409

Thanks doynax. I was probably somewhat influenced by having spent a couple of weeks playing with KarplusStrong string synthesis earlier this year (looking for fresh samples to try to play back on the c64 ;) - most of the sounds tended towards a sine wave eventually.

16 bit should be reasonably sane; I've just finished tuning a simulation of one that gives 16 bit results with a range from $0000 to $ffff, RMS error of around 31, or 0.04% of the full range.

It's all getting a little far from practical at this point, as it likely wouldn't save that much space over a table, which in turn you could probably LZ compress the deltas for reasonably well.

Here's the Python in any case - I leave translation to 6502 as an exercise for the reader, or my future self if I get really really bored. The N.sum(v)/256 terms could, of course, be pre-calculated.

import numpy as N

def sim(si,jo,dco,lv,hv,pw):
	v=N.zeros((256,),N.int)+lv
	v[dco:][:pw]*=0
	v[dco:][:pw]+=hv

	nv=1

	i=si
	j=(i+jo)&255
	while 1:
		nv=(v[i]+v[j])+(nv&1)
		v[i]=nv/2
		i=(i+1)&255
		j=(j+1)&255
		if(i&255==0):
			j=(j-1)&255
			if j==0:
				break

	v=(v+N.take(v,(N.arange(256)+64)%256))*256
	v-=N.sum(v)/256
	v=N.cumsum(v)/256
	v-=N.sum(v)/256
	v[0]+=0xfffe00
	v=N.cumsum(v)/256
	return v

v=sim(255, 31, 81, 0, 13097, 131)

Refresh

Subscribe to this thread: