CSDb User Forums

2015-05-05 00:55

Registered: Aug 2004
Posts: 1031
SID envelope rate counter phase alignment

<Post edited by moderator on 17/3-2020 08:25>

Continuing the discussion that forked off Lft's post about Avoiding the ADSR bug in the decay phase.

I've been refining the part of the SID envelope reset that gets the internal rate counter into one of a number of states that are all equal modulo 9. LFT's state table was highly informative, but I still couldn't quite see what I was doing, so I wrote a script to generate diagrams from rate counter limit sequences.

In each of the images below, the horizontal axis is time, and the vertical has a pixel set for each possible rate counter value at that time. They are colour coded with (rate counter - time)%9, so any potential values that are equal modulo nine show up as streams of equal colour. Horizontal pink bars show the counter limit values. All but the last two have had the tops cut off so we can focus on the interesting bit, but you can see the highest rate counter reached in the annotations above the images.
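The colour key can be sketched in a few lines of Python. This is an illustration only, not the script that produced the images; `colour_key` and the sample values are made up for the example. It shows why a counter that just keeps incrementing stays one colour, and why candidates that are equal modulo nine share that colour.

```python
MOD = 9  # the shortest rate counter limit discussed in this thread

def colour_key(rate_counter, time):
    """Colour index as described above: (rate counter - time) % 9."""
    return (rate_counter - time) % MOD

# A counter that increments once per cycle keeps the same key over time,
# so it shows up as a stream of a single colour.
start = 40  # hypothetical starting value
keys = [colour_key(start + t, t) for t in range(20)]

# Two candidates nine apart share a key; candidates eight apart do not.
same = colour_key(40, 0) == colour_key(49, 0)
different = colour_key(40, 0) != colour_key(48, 0)
```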

original: 567 cycles, highest rc is 535 (runlog)

This was the code I initially contributed to the discussion. It takes forever! Only harvesting a result every two cycles through the 32 rate limit is extraordinarily wasteful (slaps self). This image is half the scale of the others just to avoid it breaking the page format.

lft: 342 cycles, highest rc is 324 (runlog)

LFT's contribution. Vastly improved.

lft_compacted: 279 cycles, highest rc is 261 (runlog)

I then compacted this a bit, by dropping a redundant nine cycles from each iteration.

sieve: 248 cycles, highest rc is 252 (runlog)

First attempt at a different approach. The single cycles at Attack=0 should be doable by using an INC instruction, but that relies on SID reading zero. Not sure if this is safe?
The only way it could be faster would be to use the 220 cycle limit for the stream that would otherwise require eight 32 cycle resets.

sieve2: 280 cycles, highest rc is 288 (runlog)

A safer sieve that only uses 4 cycle writes. The downside of the sieve is that it still lets the rate counter get pretty high.

bottle: 280 cycles, highest rc is 99 (runlog)

LFT's comment about recapturing got me thinking. The next phase of the reset would run faster if we could bottle as many streams as possible into the 63 cycle limit (attack=2).
We can only manage seven of them, but that still brings down max rc by a factor of three!
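For reference, the limits quoted in this thread (9, 32, 63, 95, 220) match the usual table of rate counter periods, as found for instance in reSID's envelope code. The sketch below lists that table; `RATE_PERIOD` and `settings_fitting` are names invented for the example, and the table should be checked against your own reference before relying on it.

```python
# Rate counter periods in cycles, indexed by the 4-bit attack/decay/release
# value (assumed to match the table used by reSID).
RATE_PERIOD = [9, 32, 63, 95, 149, 220, 267, 313,
               392, 977, 1954, 3126, 3907, 11720, 19532, 31251]

def settings_fitting(max_rc):
    """Register values whose limit is at or above max_rc, i.e. limits that
    a counter no higher than max_rc has not yet passed."""
    return [n for n, period in enumerate(RATE_PERIOD) if period >= max_rc]

bottle_limit = RATE_PERIOD[2]   # the attack=2 limit used for bottling
improvement = 288 / 99          # sieve2 max rc over bottle max rc, roughly 3
```

With a highest rc of 99, the smallest limit still above the counter is `RATE_PERIOD[4]`; with a highest rc of 90, `RATE_PERIOD[3]` (95) becomes usable.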

bottle2: 280 cycles, highest rc is 90 (runlog)

A slight reshuffle of the last couple of iterations, and we save enough cycles to use a rate limit of 95 for whatever comes next.

bottle3: 375 cycles, highest rc is 72 (runlog)

This last one's definitely more of theoretical interest, but for another 95 cycles we can group all the possible states into a single tight packet.

The benefit of all of the above is less than I thought when I first set out. It was only a day or two ago that I finally looked into the envelope overflow/underflow and made sense of LFT's remarks about using env3=0xff; hence there are only a couple of loops required at a rate limit of 95: one to drop back down to env3=0xfe, and another to recapture at decay=0.
I'd initially thought I was saving thousands of cycles as we rose from env3=0xee, but it's not to be. C'est la vie!

Still, onward to implementation; bottle2 should still save a few raster lines at the point in time that there's work to be done by CPU.
[... 17 posts hidden ...]
2015-05-15 00:19

Registered: Apr 2008
Posts: 319
Would this be what they call 'pushing the envelope?'
2015-05-15 01:39

Registered: Aug 2004
Posts: 1031
@Mixer: Hah! Yes, I guess it is :D

@Frantic: "stable hard restart" works for me.
2015-05-16 10:47

Registered: Apr 2004
Posts: 21
mmmmh I am still missing the point, and sorry if the following is off-topic :)

I thought that when you set R to 0 in the release state, there was indeed a high probability of having to wait about 0x8000 cycles before the RC gets caught between values 0 and 8. Then setting attack to any value and gate to 1 should start the attack "instantly" with a jitter of 9 cycles, shouldn't it? I wonder why that isn't good enough as a "known state"?
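A rough way to see both numbers (the 0x8000 cycle wait and the 9 cycle jitter) is a toy model of the rate counter. This is a sketch under simplifying assumptions: the counter is treated as a plain 15-bit counter that increments every cycle and wraps to 0, which ignores how the real chip implements it. `cycles_until_match` is a name invented here.

```python
WRAP = 0x8000  # assumed wrap point of the 15-bit rate counter

def cycles_until_match(rc, limit):
    """Cycles until a free-running counter at `rc` next equals `limit`,
    in the simplified model described above."""
    return (limit - rc) % WRAP

# Counter already past a limit of 9: it has to go almost all the way round.
long_wait = cycles_until_match(10, 9)

# Counter caught between 0 and 8: the wait is 1..9 cycles, hence the jitter.
short_waits = [cycles_until_match(rc, 9) for rc in range(9)]
```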

However, I have seen many players that seem to deliberately trigger a second delay bug before the attack restarts. This can be especially good for kick/drum instruments, which mostly have a 0 attack and start with a noise waveform.
To do this, after catching RC between 0 and 8 in the release state, they set R to a high value, wait more than 9 cycles so that RC > 9, set the AD value (with A = 0), and set gate to 1, hence always triggering another delay bug.

I tried to understand the motivation (this is plain speculation): there is always a 33ms delay before the attack starts, so the noise waveform, held for 2 frames in the instrument table, is only heard for 7ms, which sounds much better than a 20ms noise held for one full frame (as in AHX tunes, for example).
To avoid the 33ms delay, we would need 2x tune and AD+GATE updated before SR, so that noise would be heard for 10ms, or 3x tune so it would be heard for 6.7ms...
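The figures above can be reproduced with a little arithmetic. The sketch assumes a PAL machine (CPU clock 985248 Hz) and the rounded 20ms frame used in the post; the variable names are made up for the example.

```python
PAL_CLOCK_HZ = 985248   # PAL C64 clock, assumed here
FRAME_MS = 20.0         # one 50 Hz frame, rounded as in the post

def cycles_to_ms(cycles, clock_hz=PAL_CLOCK_HZ):
    """Convert a cycle count to milliseconds."""
    return cycles * 1000.0 / clock_hz

# A full lap of the rate counter: the "33ms" delay.
delay_ms = cycles_to_ms(0x8000)

# Noise held for two frames, minus the delay before the attack starts.
noise_with_delay_ms = 2 * FRAME_MS - delay_ms

# Without the delay, noise lasts one player call: 20ms at 1x, less at 2x/3x.
noise_without_delay_ms = {speed: FRAME_MS / speed for speed in (1, 2, 3)}
```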

I am certainly missing something big here! Please explain the overall goal :)
2015-05-16 13:33

Registered: Apr 2004
Posts: 21
To illustrate the "virtuous" side of the ADSR delay bug for the attack, here's voice #2 of /MUSICIANS/P/Prosonix/Hoff_Lars/Cowshit_Jam.sid

How it's originally playing thanks to the 33ms delay (7ms noise)

If ADSR delay bug was not forced (at best we would have 20ms noise, like in this example):

This is not an isolated tune; many exploit this feature to shorten the noise time at attack, and it was a major problem for most emulators before we finally got cycle-based ones like reSID.
2015-05-16 16:06

Registered: Aug 2004
Posts: 1031
Well, I originally had no intention of using stable hard-reset (SHR) directly in music players; I developed the original 80ms routine so I could do some cycle-exact reads of env3, with the aim of developing a more accurate envelope model than reSID provided. There are still un-emulated bugs that result in some envelope settings seeming safe under emulation that nonetheless occasionally fail or glitch on the real hardware. A faster SHR means I can measure envelopes more quickly, and gather data to compare emulation with hardware in less time per run.

That said, having a player that performed an SHR even once per pattern, when combined with calling the music from a stable raster interrupt, would at least give the musician the possibility to experiment with 'unsafe' envelope settings confident that any envelope glitches will be identical every time the track is played.

SHR could also be used to develop a player that avoids the ADSR bug altogether, even with currently 'unsafe' values, by tracking the exact state of RC and only switching the RC limit when RC is below it (of course, it could also trigger the bug on demand :D). Such a player, however, would be a pretty major undertaking, for which SHR is but one component.
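The bookkeeping such a player would need might look roughly like this. It is only a sketch of the idea: the class and method names are invented, the counter is modelled as a plain incrementing value that resets on reaching its limit, and the exact cycle on which the real chip compares and resets is not modelled.

```python
WRAP = 0x8000  # assumed wrap point of the 15-bit rate counter

class RateCounterTracker:
    """Shadow copy of the rate counter, kept in step by counting cycles."""

    def __init__(self, rc=0, limit=9):
        self.rc = rc
        self.limit = limit

    def tick(self, cycles=1):
        """Advance the model by a number of CPU cycles."""
        for _ in range(cycles):
            self.rc += 1
            if self.rc == self.limit or self.rc == WRAP:
                self.rc = 0

    def can_switch(self, new_limit):
        """Safe only while the counter is still below the new limit."""
        return self.rc < new_limit

    def switch(self, new_limit):
        """Change the limit, refusing if that would trigger the delay bug."""
        if not self.can_switch(new_limit):
            raise ValueError("switching now would trigger the ADSR delay bug")
        self.limit = new_limit

tracker = RateCounterTracker(rc=0, limit=63)
tracker.tick(40)
unsafe_now = not tracker.can_switch(9)    # rc is 40, already past 9
safe_now = tracker.can_switch(95)         # rc is 40, still below 95
```

Triggering the bug on demand is the same check inverted: wait until `can_switch` is false and write the new rate anyway.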

As for the virtuous ADSR at start of note, yes I believe you are correct; it allows one to have a half frame of a given waveform without resorting to using a 2x player.
2016-02-19 10:00

Registered: Jul 2007
Posts: 359
Quoting ChristopherJam

sieve: 248 cycles, highest rc is 252
First attempt at a different approach. The single cycles at Attack=0 should be doable by using an INC instruction, but that relies on SID reading zero. Not sure if this is safe?

No, unfortunately not. Reading the SID will get you the last value that was on the bus, which is a VIC fetch. You could try to time the code so the VIC fetches a zero at the right moment.

However, the real reason I'm responding to this now is that I figured out a different approach that could allow this to be optimised even further: Exploit the bug that selects the decay rate for one cycle when enabling the gate. This allows us to briefly open the bottle, as it were, and in a very clean way allow each possible phase to slip out at the right moment.

So, for instance, if you start with a normal hard restart (clearing ADSR and the control register for two frames) and then do:

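lda #$0f
sta $d405    ; voice 1 attack/decay: attack = 0, decay = 15
ldx #$00

sta $d404    ; control register, bit 0 of A is set: gate on
stx $d404    ; gate off
sta $d404    ; repeat: nine gate-on writes in total,
stx $d404    ; with 4 cycles per store
sta $d404
stx $d404
sta $d404
stx $d404
sta $d404
stx $d404
sta $d404
stx $d404
sta $d404
stx $d404
sta $d404
stx $d404
sta $d404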

...then, in just 76 cycles you've distributed the phases into nine possible locations, exactly eight cycles apart.
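The cycle count is easy to check from the standard 6502 timings (2 cycles for an immediate load, 4 for an absolute store). A small sketch, with the instruction list transcribed from the listing above and names chosen for the example:

```python
# Standard 6502 cycle counts for the addressing modes used in the listing.
CYCLES = {"lda #": 2, "ldx #": 2, "sta abs": 4, "stx abs": 4}

# Setup, then 17 alternating stores to $d404 (sta = gate on, stx = gate off).
program = ["lda #", "sta abs", "ldx #"]
for i in range(17):
    program.append("sta abs" if i % 2 == 0 else "stx abs")

total = sum(CYCLES[op] for op in program)

# Cycle at which each gate-on store finishes, skipping the sta to $d405.
elapsed = 0
gate_on_at = []
for index, op in enumerate(program):
    elapsed += CYCLES[op]
    if op == "sta abs" and index >= 3:
        gate_on_at.append(elapsed)

spacing = {later - earlier for earlier, later in zip(gate_on_at, gate_on_at[1:])}
```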

That's just an illustration, of course, because we need to get them nine cycles apart. But I'm thinking that if we do something similar using rate 1 (32 cycles) as base, we should be able to reach that goal in no more than 32*8 cycles.
2016-02-19 11:40

Registered: Aug 2004
Posts: 1031
Quoting lft
That's just an illustration, of course, because we need to get them nine cycles apart.

No no, this is excellent. Eight cycles apart is exactly what we need, because each hole has to catch a different phase. Should be able to shave a good 200 cycles off the SHR this way. Nice work!

Must be said, I've not yet tested to see if there's a similar nybble selection bug with gate off; that might become pertinent at this point, but at worst that would just necessitate setting sustain to a magic number during the bottling.
2016-02-19 13:13

Registered: Jul 2007
Posts: 359
But the point of having them nine cycles apart is that we can then set the decay rate to 0, make use of the synchronous transition from attack to decay, and in that way bring every phase down to the same value. If they are spaced eight cycles apart, they'll still be in different locations once we bottle them up again.
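The difference between the two spacings shows up as soon as the positions are reduced modulo 9. The sketch below only looks at residues; it does not model the chip, and `residues` is a name made up for the example.

```python
LIMIT = 9  # rate counter limit for decay rate 0

def residues(spacing, count=9):
    """Residues modulo LIMIT of `count` phases placed `spacing` cycles apart."""
    return [(k * spacing) % LIMIT for k in range(count)]

nine_apart = residues(9)    # every phase lands on the same residue
eight_apart = residues(8)   # every phase lands on a different residue
```

With nine cycle spacing the phases are indistinguishable to a 9 cycle limit, so they can all be brought down to the same value. With eight cycle spacing they cover all nine residues and stay apart.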
2016-02-19 13:32

Registered: Aug 2004
Posts: 1031
Right you are. I realised that a few minutes ago over the washing up and came back here to admit I'm an idiot :)

Yes; as you pointed out, the holes need to be in the 32 cycle ceiling.
2020-03-17 12:28

Registered: Aug 2004
Posts: 1031
Quoting lft
Reading the SID will get you the last value that was on the bus, which is a VIC fetch. You could try to time the code so the VIC fetches a zero at the right moment.

Apparently the bus in question is the SID's internal bus (cf. (Ab)use of dummy accesses), not the system bus, so it's much more controllable! But yes, using the single cycle at decay rate is still a saner option.

(also, cheers to the mods for letting me fix the image links at the start of this topic!)
All times are CET.