| |
oziphantom
Registered: Oct 2014 Posts: 490 |
More stable raster problems
OK, so I'm having issues getting super stable interrupts ;)
First I tried the standard inverted cycle counter method from Codebase, which should handle 0-8 cycles of jitter, but it still leaves a jitter of 0-3 clocks. That makes it hard to chain the NMIs: each one will jitter, so after 4 of them you are looking at a jitter of 4-12 clocks per frame, which will then just compound.
So, Ninja method to the rescue: super stable to the clock, nice ;) However, it uses the following:
CIA #1 A
CIA #1 B
CIA #2 A
Which leaves me CIA #2 B, which is annoying. I need an NMI and an IRQ source. The NMI needs to be stable as I need to chain them, but the IRQ just needs to be about every 80 or so clocks, but the NMI needs to "win".
There was also a method that looks up opcodes and dynamically builds some code to eat clocks, but that needs to trash places spread across the memory map, which is not really something I can work around, I feel.
Any trick I'm missing, some other way to get an IRQ every so many clocks? Or another way to mitigate the 1-3 jitter on the normal method? |
|
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
giving advice will be easier if you'd first tell what exactly you are trying to do. the standard inverted timer thingy shouldnt leave you with 0-3 clocks of jitter, for example :=) |
| |
Oswald
Registered: Apr 2002 Posts: 5094 |
debug the standard method, then you will have 3 timers left. it also sounds a bit like you think that if you dejitter one, further nmis will be flicker free too, which is not the case. dejittering must be done for every individual interrupt. |
| |
soci
Registered: Sep 2003 Posts: 480 |
Given the right timing you'll lose NMIs when mixing them with IRQs, unless the interrupts never trigger close to each other or the target is a 65C02 or later.
As a workaround you may clear the CIA NMI request independent of the interrupt handler if a "lockup" was detected. This could work if rare glitches are not a big deal, for example while playing digis. |
| |
ws
Registered: Apr 2012 Posts: 251 |
I wasn't aware, that the badline would also affect NMI RTI return times, but this little experiment probably visualizes that fact. Yuck!
Thought i'd share.
; CBMPRG Studio Syntax / NMI Return Jitter display // WS^G*P
;------ BASIC - bootloader ---------------------------------
*=$0801
byte $0c,$08,$01,$00,$9e,$20,$32,$30,$36,$32,$00,$00,$00; 0 sys2062 ($080e)
*=$080e
;-----------------------------------------------------------
init cld ; usual nmi irq setup
sei
jsr $e544 ; clear screen
lda #$01
sta $d01a
sta $dc0d
lda #$35 ; gibe le ram
sta $01
lda #$1b ; helps
sta $d011
lda #<irq
sta $FFFE
ldy #>irq
sty $FFFF
lda #$00 ; first interrupt line
sta $d012
lda #$7f
sta $dc0d
lda $dc0d
cli
;-----------------------------------------------------------
idle sta $d020 ; Accumulator is fed by irq
sta $d021
jmp idle
;-----------------------------------------------------------
irq inc $d019 ;neurotypical flag stuff
inc raster+1
raster lda #$00 ;this value is being passed
sta $d012 ;to the idle loop
rti
;-----------------------------------------------------------
;-----------------------------------------------------------
;EOF |
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
So, using the code from Codebase, I was using this to stabilise a CIA-based NMI:
pha
lda $dc04
eor #7
sta *+4
bpl *+2
lda #$a9
lda #$a9
lda $eaa5
txa
pha
tya
pha
; the rest processes in the IRQ routine
asl $d019
pla
tay
pla
tax
pla
rti
Where I swapped the asl $d019 for the CIA version. The problem is that $dc04 can be 8, for which eor #7 gives $0f, and you end up with a jsr into no man's land.
Luckily, in the forum thread "shortest CIA-stable raster" (which also contains the rest of the Codebase code), Spider-j points out that he gets the same thing and offers:
pha
lda $dd04
eor #7
sta *+4
bpl *+2
cmp #$c9
cmp #$c9
bit $ea24
bit $ea24
jmp *+8
nop
nop
jmp *+3
txa
pha
tya
pha
On this I see a 0-2 cycle jitter. There was some other code linked in the forum thread above:
clc
adc #$ff-8 ;A=8-A so result will be 7..0 in A
eor #$ff
sta corr+1 ;self-writing code, the bpl jump-address = A
corr bpl *+2 ;the jump to (A) dependent byte (13 cycles so far)
cmp #$c9 ;A=8->A=0->BPL +2
cmp #$c9 ;
cmp #$c9 ;
cmp $ea ;3 =9 (13+9=22 max delay)
Looking at the page where it was originally posted, 6502.org, the writer made this table:
Start Address
$1000     $1001     $1002     $1003     $1004     $1005     $1006     $1007     $1008
--------  --------  --------  --------  --------  --------  --------  --------  --------
cmp #$c9  cmp #$c9  cmp #$c9  cmp #$c9  cmp #$c9  cmp #$c5  cmp $ea   nop
cmp #$c9  cmp #$c9  cmp #$c9  cmp #$c5  cmp $ea   nop
cmp #$c9  cmp #$c5  cmp $ea   nop
cmp $ea   nop
--------  --------  --------  --------  --------  --------  --------  --------  --------
9         8         7         6         5         4         3         2         0
Cycles
which shows it will also have a 0-2 jitter. It did, and I got the same results as with Spider-J's code.
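The cycle row of that table can be checked mechanically. Below is a minimal Python sketch (not from the 6502.org page) that decodes the slide's byte stream from every entry offset and adds up the cycles. The names `SLIDE`, `OPCODES` and `slide_cycles` are made up for illustration, and only the three opcodes the slide can decode to are modelled.

```python
# Byte stream of the slide: cmp #$c9, cmp #$c9, cmp #$c9, cmp $ea
SLIDE = bytes([0xC9, 0xC9, 0xC9, 0xC9, 0xC9, 0xC9, 0xC5, 0xEA])

# opcode -> (length in bytes, cycles); only what this slide can decode to
OPCODES = {
    0xC9: (2, 2),  # cmp #imm
    0xC5: (2, 3),  # cmp zp
    0xEA: (1, 2),  # nop
}

def slide_cycles(offset):
    """Cycles spent when execution enters the slide `offset` bytes in."""
    pc, cycles = offset, 0
    while pc < len(SLIDE):
        length, cost = OPCODES[SLIDE[pc]]
        pc += length
        cycles += cost
    return cycles

# One entry per start address $1000..$1008
cycles_by_offset = [slide_cycles(o) for o in range(len(SLIDE) + 1)]
print(cycles_by_offset)
```

The step from offset 7 to offset 8 drops by two cycles rather than one, because no entry point costs exactly one cycle.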
My test code was an NMI that did inc $d020 / dec $d020, with the timer value set to (312*63)-1, on VICE 3.0 x64sc PALC.
With (312*63) the colour change slid forwards,
and with (312*63)-2 the colour change slid backwards.
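The sliding with the two neighbouring values is what you would expect if a continuous-mode CIA timer underflows every latch + 1 cycles. A small Python sketch of that arithmetic follows; `drift_per_frame` is a hypothetical helper name, and the model ignores everything except the timer period.

```python
CYCLES_PER_FRAME = 312 * 63  # PAL: 312 lines of 63 cycles each

def drift_per_frame(latch):
    """Cycles by which the NMI moves against the raster each frame.

    Assumes the timer period is latch + 1 cycles (continuous mode).
    """
    return (latch + 1) - CYCLES_PER_FRAME

drifts = {
    "312*63":   drift_per_frame(CYCLES_PER_FRAME),      # one cycle later per frame
    "312*63-1": drift_per_frame(CYCLES_PER_FRAME - 1),  # locked to the frame
    "312*63-2": drift_per_frame(CYCLES_PER_FRAME - 2),  # one cycle earlier per frame
}
print(drifts)
```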
The ultimate goal is to have 3~4 "d012 splits" on the screen.
1. opens the border but needs to be on the last cycle of the last line, as having hires bitmap and an open border = black border only
2. closes the border and sets up some other things background colour, multicolour mode, sprites
3. is the optional one, and it restores some sprites after doing a "sprite split" so I can show 4 sprites up top and then 7 below it
4. is a multi to hires split, and a background colour change, however it may have 0,upper 4, lower 4 or 7 sprites overlapping it.
Then there are digis which play every now and then, and which, as you can imagine, wreck the IRQs badly.
My solution to this is to ditch $d012 for the "splits" and make them NMIs from a timer. This way I can set split 4 to start on the first clock of the screen (i.e. just past the border), then time out the 40 clocks for the display, then switch to hires. This should make it agnostic to any sprites.
The digis move to being IRQ-based so that the NMIs take precedence, and if a dog bark or bird quack holds a value for 1 update, oh well. However, seeing Soci's point, oh dear... |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
It feels like you're building something on non-solid ground. When stabilizing using timers it's imperative that you have exact control over when you start the timer in the first place. As a tip, simply put this in your interrupt handler:
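pha
lda $dc04
sta $c000 ; log the timer low byte
inc *-2 ; self-modify the low byte of the sta target, so the log walks through $c0xx
dec $d019
pla
rti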
Then run it for a while, inspect $c0xx and make sure the values are in the correct range, i.e. 0-7. Also make sure they're in the same range on each run of the code (type RUN manually to get some randomness in your start).
1) So if the range is wrong, then adjust the launch of the timer.
2) If the range differs on each run, then make sure you have a stable IRQ in the code that launches the timer.
When the above is correct you can start replacing the sta $c000 with eor #$07 and do the rest.
Also make sure the stabilization routine, specifically the bpl, does NOT cross a page boundary. E.g. have an .align 256 just before the IRQ handler to avoid it. |
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
So it gives 1-8 consistently.
If I switch to a 6526 and use the original code, it works fine. So the issue is that I need to slide by a clock for 6526 vs 6526A. |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: So it gives a 1-8 consistantly.
If I switch to an 6526 and use the original code, it works fine. So the issue is I need slide by a clock for 6526 vs 6526A.
Or don't be so damn optimal. Just do:
sec
lda #$08
sbc $dc04
.
.
. |
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
When do I want to load the timer with 8?
When VIC X Cycle & 7 = 0? |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: When do I want to load the timer with 8?
When VIC X Cycle & 7 = 0?
With one chip u had 0-7 jitter, with the other 1-8. Taking 8 minus the timer value gives you the number of bytes to branch to compensate. Eor #7 is just an optimized 7-x that's independent of carry state. But since u also get 8, u can't use that optimization and must resort to carry-dependent subtraction. |
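To make the difference concrete, here is a tiny Python sketch of the two calculations on 8-bit values. `eor7` and `sub_from_8` are illustrative names only, and the model covers just the arithmetic, not the cycle cost.

```python
def eor7(timer):
    """What `eor #7` leaves in A."""
    return timer ^ 0x07

def sub_from_8(timer):
    """What `sec / lda #$08 / sbc timer` leaves in A (8-bit wraparound)."""
    return (8 - timer) & 0xFF

# Timer range 0-7: eor #7 is exactly 7-x
in_range = [eor7(t) for t in range(8)]

# Timer value 8: eor #7 gives $0f, a branch offset far past the slide
out_of_range = eor7(8)

# Timer range 1-8: the subtraction gives 7..0 as wanted
shifted = [sub_from_8(t) for t in range(1, 9)]
print(in_range, hex(out_of_range), shifted)
```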
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
This doesn't give me arbitrary stabilisation, does it? It only gets it stable from a fixed X location on the screen. So if I want to have something that is on the same X spot but 8 lines down, for example, then it is fine: I tune the 0 of the CIA timer such that it hits 0 when the VIC hits the X I want, and then make sure that the gaps between the points I want to be stable on are perfect multiples of 8 away from said point, as the counter counts in 8s.
So if I want the "delta" to be (63*8)-1 then it works, and is easy to tune with a nop here and a bit $ea there, and I will get perfectly stable lines all the way down the screen. But if I want 63*8, i.e. each line thereafter should be 1 clock over
------
-------
-------
it will fail, as being one over will push the adjust into 1-8 not 0-7 then the next one over will make it 2,3,4,5,6,7,8,8 then the next one will be 3,4,5,6,7,8,8,0 and so it doesn't work |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Set the timer to count on a 63 cycle loop. Most of the line will be avoiding the 2,1,62,62,61 section. Put the dc04 value through a lookup table, or subtract from a magic number. |
| |
Monte Carlos
Registered: Jun 2004 Posts: 359 |
As the shortest opcode takes 2 cycles, it's impossible to use a timer delay of 0-7 cycles.
But you can of course time 2-9 cycles and correct for the two cycles at another place (e.g. by removing a nop there). |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
Quoting JackAsser[...]
EOR #$07 is just an optimized 7-x that's independant of carry state. But since u get 8 also u can't use that optimization and must resort to carry-dependant subtraction.
With a suitable timer alignment one could also optimize the case with delay states 0,...,8 by using EOR #$0F, no?? |
| |
lft
Registered: Jul 2007 Posts: 369 |
Yes. A similar approach is to set the timer to 63 cycles, and then eor #63. This will give you a positive value that you can write into a branch instruction, and jump into a clockslide. It won't be zero-based, so you have to pad with dummy bytes after the branch. It can be combined with my penalty cycle trick (Improved clock-slide). |
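A quick Python check of that idea, as a sketch only: it assumes "63 cycles" means a latch of 62, so the timer runs 62 down to 0, and `LATCH` and `offsets` are names made up here.

```python
LATCH = 62  # assumption: a 63-cycle period needs a latch value of 62

# For any 6-bit value, eor #63 is the same as 63 - value
assert all((v ^ 0x3F) == 63 - v for v in range(64))

# Branch offsets produced for every timer value 62..0
offsets = [v ^ 0x3F for v in range(LATCH, -1, -1)]

print(min(offsets), max(offsets))
```

The offsets stay below $80, so bpl always branches forward, but the smallest one is 1 rather than 0, which is why dummy bytes are needed after the branch.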
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Hang on a minute - if your interrupt source is a CIA timer running in continuous mode, you should be able to use that very same timer to tell you just how late you are running. The timer will be one of eight values a little below 312*63-1 (or whatever your limit is), so you should be able to just subtract the low byte from a magic number to tell you the required branch into the clock slide :) |
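As a rough Python sketch of the magic-number idea: `BASE_LATENCY` is a hypothetical figure for the fixed number of cycles between the underflow and the `lda` of the timer low byte in the handler, and the model ignores the reload quirk around underflow.

```python
LATCH = 312 * 63 - 1   # timer start value, one frame per underflow
BASE_LATENCY = 20      # hypothetical fixed cycles from underflow to the timer read

def timer_low(jitter):
    """Low byte of the timer as read `jitter` cycles later than the best case."""
    return (LATCH - BASE_LATENCY - jitter) & 0xFF

# The magic number is simply the low byte seen with zero jitter
MAGIC = timer_low(0)

def slide_offset(low_byte):
    """Bytes to skip in the clock slide: the later we are, the more we skip."""
    return (MAGIC - low_byte) & 0xFF

offsets = [slide_offset(timer_low(j)) for j in range(8)]
print(hex(MAGIC), offsets)
```

In practice the magic number would be found by logging the timer values in the handler rather than by calculation.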
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
Yes, you can. It was more just me understanding how the trick works, when you get part of a trick that is the super optimal version with only half the code and none of the caveats, you kind of have to go off the title and then flail around when things don't work. At one point I got it to work but couldn't get it to work again... it just so happened that the amount of code I had before the setting up of the 2nd timer was a right multiple to get the 0-7 range ;) but I didn't know at the time that was what I needed it to be.
The other aspect is that I need to enter the routine, stabilise, set the timer and set a VIC register in 40 clocks, so it's tight; doing a 63 clock slide is not going to work ;) But I now need to work out the points I can hit, and the points that lead to death. It could be every 8, 9 or 10, not sure at the moment; the CIAs are kooky and not well documented, and there is the 6526A to consider as well. Then there is making a PAL version and an NTSC version, which slides everything again ;)
I'm working on modifying VICE so the VIC will render the background colours to show if the CPU is executing
Normal
VIC IRQ
CIA IRQ
CIA NMI
VIC DMA steal
so I can see when the various things fire, and make sure my IRQs don't get so close to the NMIs as to block the NMIs, as per Soci's post. |
| |
lft
Registered: Jul 2007 Posts: 369 |
ChristopherJam: Indeed, but a subtraction is four cycles whereas eor is two.
oziphantom: The overhead from the clockslide depends on the amount of jitter you need to support. Extra dummy-bytes that you always skip don't add to the overhead. |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: Yes. A similar approach is to set the timer to 63 cycles, and then eor #63. This will give you a positive value that you can write into a branch instruction, and jump into a clockslide. It won't be zero-based, so you have to pad with dummy bytes after the branch. It can be combined with my penalty cycle trick (Improved clock-slide).
Nice how 63 cycles on a line happens to be a power of two minus one, making the eor behave as a subtraction. :) |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
If you set up the timer to give $F,$E,...,$7 when read in the IRQ dejitter routine, an EOR #$0F will produce fine branch-compatible bytes while being zero-based.
Of course this also works with EOR #$3F, but you have to take care of the misalignment between the number of timer values (there are 64 values including 0) and the number of cycles per line (one less)... plus the necessity to avoid the "wrong" $3F byte when reading the timer value (as 0 is never read back, but the starting value instead).
Therefore I proposed using EOR #$0F to be on the safe side ;) |
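The EOR #$0F case is easy to verify in a couple of lines of Python. This is a sketch of the arithmetic only; `timer_values` and `offsets` are names invented for the example.

```python
# Timer low byte as read in the dejitter routine: $F, $E, ..., $7 (nine states)
timer_values = list(range(0x0F, 0x06, -1))

# eor #$0f turns them into zero-based branch offsets
offsets = [v ^ 0x0F for v in timer_values]

print([hex(v) for v in timer_values], offsets)
```

Nine states map onto offsets 0 to 8, so no padding bytes are needed after the branch.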