| |
Conrad
Registered: Nov 2006 Posts: 849 |
How easy is it to calculate sine tables from the bare bones in ML?
Hi!
I'd like to have a go at coding my own routine of calculating sine/cosine tables without use of the BASIC interpreters (pi, sin(), etc) and just use pure ML instead.
Can I ask, how easy (or how SMALL) is this to do? Do I need to do additional algorithms like factorial or are there KERNAL routines for that (if any)? Multiplication code is no problem as I've seen articles on those, I'm just thinking about the rest of math involved with sine calculation.
Thanx in advance. |
|
| |
enthusi
Registered: May 2004 Posts: 677 |
Im probably the wrong person to answer this :)
but you can easyly abuse the BASIC routines in ML as well. Just set up the FAC etc adresses and launch the routines.
OR if you want it really basic withou BASIC routines you can do the Taylor-approach
sin(x)=x-1/x^3+1/x^5 etc.
There probably smarter ways around though *g* |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
You can generate a sine wave iteratively easily enough. Basically you just massage the trigonometric product formula a bit..
sin(x - y) + sin(x + y) = 2 * sin x * cos y <=>
sin(x + y) = 2 * sin x * cos y - sin(x - y) Where 'y' is your step and x is the index being calculated. In other words we end up basing the next sine on the previous two, through a multiplication by a constant and a subtraction.k = 2 * cos d
sin[i + d] = sin[i] * k - sin[i - d] Now I guess if you apply a scaling/steps width such that the constant becomes a nice simple power-of-two then this could be implemented with less code then invoking the basic ROM functions.
Please note that it takes quite a bit of fixed-point precision to get decent values, and you may want to consider exploiting a bit of symmetry to generate the rest of the table from the first quadrant or so. |
| |
Skate
Registered: Jul 2003 Posts: 494 |
There are some 256b intros with source code which generates sine tables. Check this one for example.
Too(C)o(M)p(L)ex
Of course try to learn it, not copy & paste it.
|
| |
Radiant
Registered: Sep 2004 Posts: 639 |
To avoid nasty multiplications you could approximate a circle using an accelerating vector. I think it was Zed Yago who suggested this solution on IRC some time ago, and it sounds good enough for me. |
| |
Conrad
Registered: Nov 2006 Posts: 849 |
Thanks everyone for your help! I have a rough idea what to do now! :)
radiantx: what's an accelerating vector? Is it related to I/O registers or is a much more complex formula? Again, I could ask Yago myself :) |
| |
WVL
Registered: Mar 2002 Posts: 902 |
Quote: Thanks everyone for your help! I have a rough idea what to do now! :)
radiantx: what's an accelerating vector? Is it related to I/O registers or is a much more complex formula? Again, I could ask Yago myself :)
I guess yago meant that you can approximate a sine with 2 parabolas, and to calculate those you only need to have an acceleration (which matches the amplitude and 'length' of the sine you want.) |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
@Skate: Too bad it's some ugly sines that are generated that way. It's really just a couple of reverse parabolas, which is not the same as a real sine.
|
| |
Conrad
Registered: Nov 2006 Posts: 849 |
Ah, now I get ya, after looking up on parabolas ;).
Thanks again, all! .... off to code.
cruzer: that's okay, quick'n'dirty is all I need. |
| |
Shadow Account closed
Registered: Apr 2002 Posts: 355 |
Anyone have an example on how to access the BASIC routines from asm to generate a sine? Also how compact is such a routine, are we talking like 50 bytes or 500 bytes? |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Anything you come up with, plz write a small article about it and post it on codebase. Mkaytnku. |
| |
Skate
Registered: Jul 2003 Posts: 494 |
@Cruzer: So, your source code comment is the liar! ;)
Well, I like using plain basic for sinus calculations (slow but more flexible with ranges) in small sized intros and precalculated tables in bigger projects. |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
"generate sine" - yeah, guess that's a stretch. :)
Well, sometimes these parabolic "sines" work pretty good, and other times it's obvious to see that something's wrong, especially in the middle, where it's hard to get the two parabolas to meet precisely. But this is based on 256b routines - it might be possible to tweak it better if the code doesn't have to be super small.
|
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: Anyone have an example on how to access the BASIC routines from asm to generate a sine? Also how compact is such a routine, are we talking like 50 bytes or 500 bytes?
Here's the best I could come up with on short notice (31 bytes):table = $0400 ;; preferably a low page
loop lda #<index
ldy #>index
jsr $bba2
jsr $e277
lda $61
adc #7
sta $61
jsr $bc9b
lda $65
index .byte $90
.byte $00
sta table
inc *-2
bne loop I suspect that the float-to-int portion could be shortened by making use of some sneaky ROM function somewhere. Interesting challenge though, anyone got a shorter implementation? |
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
@Doynax: Would it be OK to post that on Codebase? |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: @Doynax: Would it be OK to post that on Codebase?
Look.. It's a code snippet posted on a public forum, of course you can use it. Feel free to modify it, sell it, claim you wrote it yourself or print it out and use it as toilet paper.
There are few things as obnoxious as tutorial authors and the like thinking they've got some sort of claim to your program when you based your code on their work..
By the way personally I'd like to see a decent tutorial on the BASIC floating point library in the code base. This is a good beginning but it's incomplete and full of errors. For instance I'd like to know why taking sin(0) seems to crash the program. |
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
okok, I posted it and wrote that it was created by some lamer named Doynax (just kidding). |
| |
Shadow Account closed
Registered: Apr 2002 Posts: 355 |
Nice explanation that accompanied the codebase64 post, much appreciated! |
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
Credits for those goes to (some lamer called) Doynax. ;) |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
I managed to reduce the bloody thing by two bytes (to 29) by converting to integer through adding a 'magic' float. Well, it's actually a piece of code in the BASIC ROM but it works well enough as a float..
table = $0400 ;; Preferably a low page. Must be paged aligned!
loop lda #<index ;; Load 5-byte float at 'index' into FAC, the fraction
ldy #>index ;; of which is stepped between 0/256..255/256.
jsr $bba2 ;; However an integer bias is also added in order to fix
;; the exponent and make hence it possible to increment the
;; fraction as a normal binary byte, e.g. a version of the
;; classic x86 float-to-int conversion trick.
jsr $e277 ;; Now calculate sine of FAC. Except skip the initial part
;; of the BASIC function which divides by 2*PI to get
;; a fraction out of radians since we've already got one.
;; The integer bias is taken care by BASIC since sin()
;; is supposed to be periodic.
lda #<bias ;; Convert the output in FAC from a float in -1..+1 to a
ldy #>bias ;; fixed-point value in -128..+127 at the LSB of the
jsr $b867 ;; mantissa by employing the same trick as before of
lda $65 ;; adding a high integer bias.
index .byte $90 ;; This is both a float *and* a piece of code. The exponent
.byte $00 ;; ($90 corresponds to 2^16) fixes our 8-bit fraction as the
sta table ;; third byte of the mantissa and STA address' LSB (don't
;; forget that BASIC floats are big-endian!). And $90/$00
;; interpreted as code simply correspond to a harmless BCC *+2.
;; Note that while the STA's opcode is an integer part which
;; shouldn't affect the result, the table's high byte *does*
;; serve as a small offset shifting the result by up to one
;; index value. Some might even argue that placing the table
;; at $8000 would produce the 'proper' rounding.
inc *-2
bne loop
bias = $befa ;; A float with an exponent of $99 (2^25) and an LSB of
;; .byte $99 ;; zero is used to convert the output to binary. Such byte
;; .byte $02 ;; sequences can be found in six places in the BASIC/Kernal
;; .byte $01 ;; ROMs, at $befa/$bf04/$bf09/$fd53/$fd56/$ff38.
;; .byte $a9 ;; A version with an LSB of $80 would have been useful to
;; .byte $00 ;; create unsigned output (e.g. between $00 and $ff with the
;; origin at $80) but unfortunately doesn't seem to exist. |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
Nice, much smaller than my faked routine. But how fast is it compared to Basic?
|
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: Nice, much smaller than my faked routine. But how fast is it compared to Basic?
The assembler version takes 5.89 seconds ($584f0f cycles) while a BASIC implementation needs about 17.2 seconds according to my stopwatch.
Then again BASIC isn't exactly my forte, so perhaps I made some stupid newbie mistake?0 fori=0to255:poke1024+i,sin(i*.02454)*128and255:next |
| |
Skate
Registered: Jul 2003 Posts: 494 |
I usually use something like;
0 FORI=0TO255:POKE1024+I,128+128*SIN(I/40.7):NEXT
where 40.7 is 256/(2*PI)
don't forget, we need less bytes for small intros and /40.7 is shorter ;) |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
5.89 is definitely fast enough for a tinytro, but I think I'll stick to generating them with a KickAss script for normal cases.
|
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
@Cruzer: I agree and it actually SAVES memory in the end because you will not have the generator in memory, only the end result. (And buhu what I miss trig-operations and floatingpoints in ca65...) |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: @Cruzer: I agree and it actually SAVES memory in the end because you will not have the generator in memory, only the end result. (And buhu what I miss trig-operations and floatingpoints in ca65...)
I'd say if you're not kicking out your initialization code then a sine generator is the least of your worries (drivecode anyone?).
Personally I prefer to mirror the first 64 entries to create the rest of the table, which saves 170 bytes or so and loads a tad faster. Hey.. things like this add up, especially for game code.
Oh, and I saved another byte in the generator (down to 28) at the cost of creating a negated sine table instead, though I seriously doubt that matters for 256-byte intro use. |
| |
Monte Carlos
Registered: Jun 2004 Posts: 359 |
To avoid that the sinmax and sinmin is reached only for one byte, use a+(a-0.5)*sin(pi/180*i) instead of
a*(1+sin(pi/180*i)).
Monti Carlotti
|
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
Quoting doynaxOh, and I saved another byte in the generator (down to 28) at the cost of creating a negated sine table instead, though I seriously doubt that matters for 256-byte intro use. Do want! :) |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quoting CruzerDo want! :) It took a bit of filesystem spelunking but here you go: http://pastebin.com/dMFP3F47
Now be sure to buy yourself something nice with that extra byte.
Sidenote: I must confess to some surprise at finding no less than three 6502 assembly dialects to choose from at pastebin. It's almost enough to get an embittered atheist into the Christmas spirit. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
An alternate approach is to fill a table with a square wave, then filter it with a few smoothing passes to eliminate the harmonics. The following hasn't been optimised for size, but it's only 38 bytes, and only takes about 170,000 cycles to create a table of 127.5+63*cosine(i*pi/128) with an RMS error of under 1%
(warning, also trashes 52 bytes after the end of the table. Number of passes, and initial square wave phase and amplitude optimised by brute force search with a Python script. Output ranges from $40 to $bf).
ldx#127
prefill
sta cosine,x
lda#19
sta cosine+52,x
lda#250
sta cosine+128+52,x
dex
bpl prefill
ldy#30
clc
smoothloop
lda cosine,x
adc cosine,y
ror
sta cosine,x
iny
inx
bne smoothloop
dey
bne smoothloop
|
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
Thanx, Doynax! The sine is too different to be used in my current project, and since I'm already under 256 bytes I'm good for now. But will keep it for another time. |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quoting ChristopherJamAn alternate approach is to fill a table with a square wave, then filter it with a few smoothing passes to eliminate the harmonics. The following hasn't been optimised for size, but it's only 38 bytes, and only takes about 170,000 cycles to create a table of 127.5+63*cosine(i*pi/128) with an RMS error of under 1% Well done, that is exceedingly clever. I never would have considered that approach.
Any chance of a 16-bit version or one with (closer to) full 8-bit range? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Thanks doynax. I was probably somewhat influenced by having spent a couple of weeks playing with Karplus–Strong string synthesis earlier this year (looking for fresh samples to try to play back on the c64 ;) - most of the sounds tended towards a sine wave eventually.
16 bit should be reasonably sane; I've just finished tuning a simulation of one that gives 16 bit results with a range from $0000 to $ffff, RMS error of around 31, or 0.04% of the full range.
It's all getting a little far from practical at this point, as it likely wouldn't save that much space over a table, which in turn you could probably LZ compress the deltas for reasonably well.
Here's the Python in any case - I leave translation to 6502 as an exercise for the reader, or my future self if I get really really bored. The N.sum(v)/256 terms could, of course, be pre-calculated.
import numpy as N
def sim(si,jo,dco,lv,hv,pw):
v=N.zeros((256,),N.int)+lv
v[dco:][:pw]*=0
v[dco:][:pw]+=hv
nv=1
i=si
j=(i+jo)&255
while 1:
nv=(v[i]+v[j])+(nv&1)
v[i]=nv/2
i=(i+1)&255
j=(j+1)&255
if(i&255==0):
j=(j-1)&255
if j==0:
break
v=(v+N.take(v,(N.arange(256)+64)%256))*256
v-=N.sum(v)/256
v=N.cumsum(v)/256
v-=N.sum(v)/256
v[0]+=0xfffe00
v=N.cumsum(v)/256
return v
v=sim(255, 31, 81, 0, 13097, 131)
|