[CSDb] - User Forums - SID envelope rate counter phase alignment

2015-05-05 00:55

ChristopherJam

Registered: Aug 2004
Posts: 1409

SID envelope rate counter phase alignment

<Post edited by moderator on 17/3-2020 08:25>

Continuing the discussion that forked off Lft's post about Avoiding the ADSR bug in the decay phase.

I've been refining the part of the SID envelope reset that gets the internal rate counter into one of a number of states that are all equal modulo 9. LFT's state table was highly informative, but I still couldn't quite see what I was doing, so I wrote a script to generate diagrams from rate counter limit sequences.

In each of the images below, the horizontal axis is time, and the vertical has a pixel set for each possible rate counter value at that time. They are colour coded with (rate counter - time)%9, so any potential values that are equal modulo nine show up as streams of equal colour. Horizontal pink bars show the counter limit values. All but the last two have had the tops cut off so we can focus on the interesting bit, but you can see the highest rate counter reached in the annotations above the images.
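The colour coding described above can be reproduced with a small simulation. The following is a minimal sketch in Python, not the script that drew the diagrams. It assumes a simplified rate counter model (a 15-bit counter that increments every cycle and resets to zero when it reaches the current limit), and the names `step`, `run` and `phase_classes` are invented for illustration.

```python
RC_MASK = 0x7FFF  # assumption: the rate counter is modelled as a plain 15-bit counter


def step(states, limit):
    """Advance every candidate rate counter value by one cycle."""
    out = set()
    for rc in states:
        rc = (rc + 1) & RC_MASK
        if rc == limit:  # counter hit the limit: it restarts from zero
            rc = 0
        out.add(rc)
    return out


def run(states, schedule):
    """Apply a list of (limit, cycles) pairs; return the final states and elapsed time."""
    t = 0
    for limit, cycles in schedule:
        for _ in range(cycles):
            states = step(states, limit)
            t += 1
    return states, t


def phase_classes(states, t):
    """The colour key used in the diagrams: (rate counter - time) mod 9."""
    return {(rc - t) % 9 for rc in states}


states, t = run({0}, [(32, 32)])
print(states, phase_classes(states, t))  # {0} {4}
```

In this model one full pass through the 32 limit moves a stream from phase 0 to phase 4, whereas limits of 9 and 63 are multiples of nine and leave the phase unchanged.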

original: 567 cycles, highest rc is 535 (runlog)

This was the code I initially contributed to the discussion. It takes forever! Only harvesting a result every two cycles through the 32 rate limit is extraordinarily wasteful (slaps self). This image is half the scale of the others just to avoid it breaking the page format.

lft: 342 cycles, highest rc is 324 (runlog)

LFT's contribution. Vastly improved.

lft_compacted: 279 cycles, highest rc is 261 (runlog)

I then compacted this a bit, by dropping a redundant nine cycles from each iteration.

sieve: 248 cycles, highest rc is 252 (runlog)

First attempt at a different approach. The single cycles at Attack=0 should be doable by using an INC instruction, but that relies on the SID register reading back as zero. Not sure if this is safe?
The only way it could be faster would be to use the 220 cycle limit for the stream that would otherwise require eight 32 cycle resets.

sieve2: 280 cycles, highest rc is 288 (runlog)

A safer sieve, that only uses 4 cycle writes. Downside of the sieve is it still lets the rate count get pretty high.

bottle: 280 cycles, highest rc is 99 (runlog)

LFT's comment about recapturing got me thinking. The next phase of the reset would run faster if we could bottle as many streams as possible into the 63 cycle limit (attack=2).
We can only manage seven of them, but that still brings down max rc by a factor of three!

bottle2: 280 cycles, highest rc is 90 (runlog)

A slight reshuffle of the last couple of iterations saves enough cycles to use a rate limit of 95 for whatever comes next.

bottle3: 375 cycles, highest rc is 72 (runlog)

This last one's definitely more of theoretical interest, but for another 95 cycles we can group all the possible states into a single tight packet.

The benefit of all of the above is less than I thought when I first set out. It was only a day or two ago that I finally looked into the envelope overflow/underflow and made sense of LFT's remarks about using env3=0xff; as a result, only a couple of loops are required at a rate limit of 95: one to drop back down to env3=0xfe, and another to recapture at decay=0.
I'd initially thought I was saving thousands of cycles as we rose from env3=0xee, but it's not to be. C'est la vie!

Still, onward to implementation; bottle2 should still save a few raster lines at the point in time that there's work to be done by CPU.

[... 17 posts hidden ...]

2015-05-13 18:09

Mixer

Registered: Apr 2008
Posts: 452

Voice 1 and Voice 2 need that lengthy manipulation.

Voice 3 is a special case because ENV3 can be read. The LFSR has just restarted whenever the ENV value changes, so there is a short window after each ENV change in which changes can be made safely.

A smart implementation of the above would be a nice addition to the general case (the kind that can be implemented in a play routine).

2015-05-13 18:56

lft

Registered: Jul 2007
Posts: 369

Mixer, that's a good point! In my original code in the other thread (Avoiding the ADSR bug in the decay phase.), it should be possible to interleave code for restarting two voices. So in my original estimate of 10 rasterlines overhead, I was taking this into account (i.e. 5 rasterlines for the speedcode, times two because of the three voices). Now, that was an estimate, and ChristopherJam's figure is based on facts. But it would be really nice indeed if this could be squeezed down into 5 lines in total by using the ENV3 register as you suggest. Hmm...

2015-05-13 22:15

ChristopherJam

Registered: Aug 2004
Posts: 1409

Excellent points all, especially about voice 3 being readable. Should even be able to use that information to adjust the phase, just by letting it run at a limit of 32 for a number of loops dependent on the current phase.
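As a rough illustration of why the number of loops depends on the current phase, the sketch below counts how many full passes at a limit of 32 are needed to bring a given phase to a target phase modulo 9. It assumes the phase is defined as in the diagrams, (rate counter - time) mod 9, and that each full pass subtracts 32 from it; how the phase is derived from ENV3 reads is not shown, and `loops_at_32` is a made-up name.

```python
def loops_at_32(phase, target=0):
    """Smallest number of full 32-cycle passes that moves `phase` to `target` (mod 9)."""
    for k in range(9):
        if (phase - 32 * k) % 9 == target:
            return k
    raise ValueError("unreachable: 32 and 9 are coprime")


# Loop count for every possible starting phase.
table = {p: loops_at_32(p) for p in range(9)}
print(table)
```

Because 32 and 9 are coprime, every phase is reachable, and the worst case is eight passes, which matches the "eight 32 cycle resets" mentioned for the sieve earlier in the thread.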

Voices 1 and 2 should be doable in parallel to some extent, as per lft's original. The bottle I used is a 36 cycle loop with writes at cycles 3, 27 and 35, so interleaving additional writes to a second voice at cycles 19, 7 and 15 is pretty easy.
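The interleaving can be checked with a few lines of arithmetic. This is only a sketch: the offset of 16 cycles is inferred from the numbers quoted above (3, 27, 35 shifted to 19, 7, 15 within a 36 cycle loop), and the 4 cycle write length is an assumption based on the "4 cycle writes" mentioned for sieve2.

```python
LOOP = 36                 # bottle loop length in cycles
VOICE_A = [3, 27, 35]     # write cycles for the first voice, from the post
OFFSET = 16               # assumption: shift that yields the second voice's slots
WRITE_CYCLES = 4          # assumption: each write is a 4 cycle store

voice_b = sorted((c + OFFSET) % LOOP for c in VOICE_A)


def min_gap(cycles, loop):
    """Smallest distance between consecutive write cycles, wrapping around the loop."""
    cycles = sorted(cycles)
    gaps = [(cycles[(i + 1) % len(cycles)] - cycles[i]) % loop
            for i in range(len(cycles))]
    return min(gaps)


print(voice_b)                            # [7, 15, 19]
print(min_gap(VOICE_A + voice_b, LOOP))   # 4
```

The tightest spacing between any two writes is 4 cycles, so under these assumptions the two voices' writes just fit back to back without colliding.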

However, the implementation of new hard-restart (NHR? Need a better name for what it does..) I posted to codebase then spends another five lines on the recapture and overflow, largely because bottle1 required a rate limit of 149 (ADSR=4x44).

That last phase might be doable at speed 3 (rate limit 95) if one switched to bottle2, but implementation becomes harder, and the bottles for the two voices would then have to be only partially overlapped, as there are some fairly dense updates towards the end (cf. diagram and log above).

bottle3 may become worthwhile after all, as at least then there's a luxurious 23 cycles between the arrival of each chain of potential rc=0 events :)

2015-05-13 22:26

ChristopherJam

Registered: Aug 2004
Posts: 1409

Helpfully, if the voice is reset from a stable raster interrupt, it doesn't matter which raster line the reset is done on - the seven 9 cycle loops fit happily into a 63 cycle raster line! (sorry NTSC..)

Switching between rate limits of 9 and 63 (speeds 0 and 2) would then be safe as long as it's always done in the first nine cycles of a raster line.

Using the other speeds safely would be harder; you'd have to count frames at each rate to calculate the phase shift, then time the rate limit change accordingly..
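The raster alignment argument comes down to whether the line length is a multiple of the rate limit. A minimal sketch, assuming 63 cycles per line on PAL and 65 on NTSC (the 6567R8; older NTSC chips use 64), with the rate limits taken from those mentioned in this thread:

```python
RATE_LIMITS = [9, 32, 63, 95, 149, 220]  # limits mentioned in this thread
PAL_CYCLES_PER_LINE = 63
NTSC_CYCLES_PER_LINE = 65  # assumption: 6567R8; older NTSC VIC-II chips use 64


def drift_per_line(line_cycles, limit):
    """How far the start of a raster line moves relative to the counter period."""
    return line_cycles % limit


for limit in RATE_LIMITS:
    print(limit,
          drift_per_line(PAL_CYCLES_PER_LINE, limit),
          drift_per_line(NTSC_CYCLES_PER_LINE, limit))
```

Only the limits 9 and 63 give zero drift on PAL, which is why those two speeds stay aligned to the raster line while the others would need their phase shift tracked explicitly.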

2015-05-14 05:41

Oswald

Registered: Apr 2002
Posts: 5094

Fast Restart? :) btw no badlines and sprites allowed over that routine, right?

2015-05-14 07:26

ChristopherJam

Registered: Aug 2004
Posts: 1409

Quote: Fast Restart? :) btw no badlines and sprites allowed over that routine, right?

Sadly it's anything but fast - it still needs an ordinary hard-restart (OHR) before that last ten rasters' worth of speedcode.

There may be circumstances when the rate counter can be recaptured more efficiently, mind.

And yes, no DMA or interrupts allowed during the last 600 cycles.

Perhaps Stabilized Hard-Restart?

2015-05-14 08:13

Frantic

Registered: Mar 2003
Posts: 1648

...or simply "stable hard restart"?

2015-05-15 00:19

Mixer

Registered: Apr 2008
Posts: 452

Would this be what they call 'pushing the envelope?'

2015-05-15 01:39

ChristopherJam

Registered: Aug 2004
Posts: 1409

@Mixer: Hah! Yes, I guess it is :D

@Frantic: "stable hard restart" works for me.

2015-05-16 10:47

Laurent

Registered: Apr 2004
Posts: 40

mmmmh I am still missing the point, and sorry if the following is off-topic :)

I thought that when you set R to 0 in the release state, there was indeed a high probability of having to wait about 0x8000 cycles before the RC gets caught between values 0 and 8. Then setting attack to any value and gate to 1 should start the attack "instantly" with a jitter of 9 cycles, shouldn't it? I wonder why that isn't good enough as a "known state"?

However, I have seen many players that seem to deliberately trigger a second delay bug before the attack restart. This can be especially good for kick/drum instruments, which mostly have a 0 attack and start with a noise waveform.
To do this, after they have caught RC between 0 and 8 in the release state, they update R to a high value, wait more than 9 cycles so that RC > 9, set the AD value (with A = 0), and set gate to 1, hence always triggering another delay bug.
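The delay bug described here can be put into numbers with a simplified model. The sketch below assumes a 15-bit rate counter that only resets when it equals the limit exactly, so a counter that is already past a newly written limit has to wrap around 0x8000 first; `cycles_until_reset` is a hypothetical helper, not code from any player.

```python
RC_WRAP = 0x8000  # assumption: the rate counter wraps after 15 bits


def cycles_until_reset(rc, limit):
    """Cycles until the counter next matches `limit`, starting from value `rc`."""
    if rc < limit:
        return limit - rc
    # Already at or past the limit: count up to the wrap, then up to the limit.
    return RC_WRAP - rc + limit


print(cycles_until_reset(5, 9))    # 4
print(cycles_until_reset(10, 9))   # 32767
```

With RC at 10 and a new limit of 9, the next match is almost a full wrap away, which is the long delay that these players appear to provoke on purpose.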

I tried to understand the motivation (this is plain speculation): there is always a 33ms delay before the attack starts, so the noise waveform, held for 2 frames in the instrument table, is only heard for 7ms, which sounds much better than 20ms of noise held for one full frame (as in AHX tunes, for example).
To avoid the 33ms delay, we would need a 2x tune with AD+GATE updated before SR, so that the noise would be heard for 10ms, or a 3x tune, so that it would be heard for 6.7ms...
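The figures in this speculation can be reproduced with some quick arithmetic. This sketch assumes the PAL CPU clock of 985248 Hz and a nominal 20ms frame; it only checks the numbers and says nothing about whether the explanation itself is right.

```python
PAL_CLOCK_HZ = 985248      # PAL C64 CPU clock
RC_WRAP_CYCLES = 0x8000    # full wrap of the 15-bit rate counter
FRAME_MS = 20.0            # assumption: nominal 50 Hz frame

# Delay caused by one full rate counter wrap.
wrap_ms = RC_WRAP_CYCLES / PAL_CLOCK_HZ * 1000

# Noise held for two frames, minus the delay before the attack starts.
audible_ms = 2 * FRAME_MS - wrap_ms

# Interval between player calls for 2x and 3x tunes.
call_2x_ms = FRAME_MS / 2
call_3x_ms = FRAME_MS / 3

print(round(wrap_ms, 1), round(audible_ms, 1), call_2x_ms, round(call_3x_ms, 1))
```

The wrap works out to roughly 33ms and the remaining audible noise to roughly 7ms, in line with the values quoted above.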

I am certainly missing something big here! Please explain the overall goal :)
