[CSDb] - User Forums - Shortest code for stable raster timer setup

2020-01-20 16:20

Krill

Registered: Apr 2002
Posts: 2980

Shortest code for stable raster timer setup

While working on my ICC 2019 4K entry (now postponed to ICC 2020, but I hope it'll be worth the wait), I came up with this (14 bytes):

initstabilise   lda $d012
                ldx #10          ; 2
-               dex              ;   (10 * 5) + 4
                bpl -            ; 54
                nop              ; 2
                eor $d012 - $ff,x; 5 = 63
                bne initstabilise; 7 = 70

                [...]; timer setup

The idea is to loop until the same current raster line is read at the very beginning (first cycle) and at the very end (last cycle) of a raster line, implying 0 cycles jitter.

With 63 cycles per line on PAL, the delay between the reads must be 63 cycles (and not 62), reading $d012 at cycle 0 and cycle 63 of a video frame's last line (311), which is one cycle longer due to the vertical retrace.
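A rough tally of the cycle comments in the listing, as a Python sketch. The per-instruction costs are the standard documented 6502 timings; the function names are made up for illustration, and the 70-cycle total assumes the final BNE is taken without crossing a page.

```python
# Standard 6502 cycle costs for the instructions used in the loop.
LDA_ABS = 4
LDX_IMM = 2
DEX = 2
BRANCH_TAKEN = 3          # taken, no page crossing (assumed for both branches)
BRANCH_NOT_TAKEN = 2
NOP = 2
EOR_ABS_X_CROSSING = 5    # $d012 - $ff plus X = $ff crosses a page: 4 + 1


def delay_loop_cycles(x_start):
    """Cycles spent in 'dex / bpl -' until X wraps from 0 to $ff."""
    iterations = x_start + 1                       # X = 10 gives 11 passes
    taken = (iterations - 1) * (DEX + BRANCH_TAKEN)
    final = DEX + BRANCH_NOT_TAKEN
    return taken + final


def cycles_between_reads(x_start=10):
    """Cycles from the LDA's read of $d012 to the EOR's read of $d012."""
    return LDX_IMM + delay_loop_cycles(x_start) + NOP + EOR_ABS_X_CROSSING


def cycles_per_pass(x_start=10):
    """Cycles for one full pass when the BNE loops back to the LDA."""
    return cycles_between_reads(x_start) + BRANCH_TAKEN + LDA_ABS


if __name__ == "__main__":
    print(delay_loop_cycles(10), cycles_between_reads(), cycles_per_pass())
```

Both reads happen in the last cycle of their instruction, which is why the LDA's own 4 cycles count towards the pass length but not towards the distance between the reads.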

The downside is that effectively only one line per video frame is attempted, so the loop may take a few frames to terminate, and the worst case is somewhere just beyond 1 second.

The upside is that it always comes out at the same X raster position AND raster line (0), plus it leaves with accu = 0 and X = $ff, which can be economically re-used for further init code.

Now, is there an even shorter approach, or at least a same-size solution without the possibly-long wait drawback?

... 177 posts hidden ...

2020-12-06 19:28

Copyfault

Registered: Dec 2001
Posts: 478

Quoting Rastah Bar

Quoting Copyfault
Well, in large parts it's the same as what I proposed in post#86.

Yes, you are right. I lost track a bit of all the variants.
Quote:

But now that we've entered the territory of over-stretching, here's a version that does it in only 5 bytes (again putting all required register settings on the decruncher's bill):

$fdfc 9E D0 FD   shx $fdd0,y
$fdff D0 FC      bne $fdfd

Comes with all the constraints one could think of: memory location fixed, y=$2f mandatory, x=$d1 mandatory, the setting of vector $fc/$fd influences the number of cycles taken when the loop is left, must be started with z-flag=0, ... maybe more! OK, it's possible to do it with any branch opcode, but this doesn't really make it any better ;)

Very ingenious, but an 8-cycle loop doesn't work, does it? See post #61.

It's actually a 12-cycle loop, because the first branch takes 4 cycles (page break!), the branch in the operand of the SHX takes 3 cycles, and the SHX itself takes 5 -> 12 cycles in total ;)
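The 12-cycle count can be checked with a small Python sketch of the 6502 branch-timing rule: a taken branch costs 3 cycles, plus 1 if the target lies in a different page from the instruction following the branch. The helper names are hypothetical, and SHX abs,y is assumed to take a fixed 5 cycles like the other absolute,Y stores.

```python
SHX_ABS_Y = 5  # assumed fixed cost, as for other absolute,Y store instructions


def taken_branch(addr, offset_byte):
    """Return (target, cycles) for a taken relative branch located at addr."""
    next_pc = (addr + 2) & 0xFFFF
    # The operand is a signed 8-bit offset relative to the next instruction.
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    target = (next_pc + offset) & 0xFFFF
    crossing = (target & 0xFF00) != (next_pc & 0xFF00)
    return target, 3 + (1 if crossing else 0)


def shx_loop_cycles():
    """Cycles for one pass of the 5-byte loop at $fdfc..$fe00."""
    # BNE at $fdff with operand $FC jumps into the SHX operand at $fdfd.
    inner_addr, outer_cycles = taken_branch(0xFDFF, 0xFC)
    # The bytes D0 FD at $fdfd decode as another BNE, back to $fdfc.
    start_addr, inner_cycles = taken_branch(inner_addr, 0xFD)
    assert start_addr == 0xFDFC
    return outer_cycles + inner_cycles + SHX_ABS_Y


if __name__ == "__main__":
    print(shx_loop_cycles())
```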

It could even be done with just 4 bytes (continuing the abuse of the byte-counting):

loop:  sha (vec),y
       bne loop

If this is located at the end of a page such that the BNE incurs a page break, it's a 10-cycle loop in total.

Still, too far-fetched; too many things must be configured correctly. Personally, I think the 7-byte solution (as in post #110), which "only" requires zp values to be set in a special way, is the best compromise between flexibility and byte count!

2020-12-06 19:47

Rastah Bar
Account closed

Registered: Oct 2012
Posts: 336

Quoting Copyfault

Quoting Rastah Bar

Very ingenious, but an 8-cycle loop doesn't work, does it? See post #61.
Quote:
It's actually a 12-cycle loop, because the first branch takes 4 cycles (page break!), the branch in the operand of the SHX takes 3 cycles, and the SHX itself takes 5 -> 12 cycles in total ;)

Yes, I misread the branch. I thought it was to $FDFC.
Quote:

It could even be done with just 4 bytes (continuing the abuse of the byte-counting):

loop:  sha (vec),y
       bne loop

If this is located at the end of a page such that the BNE incurs a page break, it's a 10-cycle loop in total.

Awesome! With SHA (vec),y, even 3 bytes are possible for a 12-cycle loop. One example:

$5f00  SHA (VEC),y
$5f02  RTS

Assume that the decruncher provides the following initial conditions: {A&X} = $EA (opcode of NOP), Y = 2, the ZP addresses VEC and VEC+1 point to $5F00, and the stack is completely filled with the return address $5F00. Without DMA, the SHA writes $EA & {$5F+1} = $60 (opcode for RTS) and repeats that until a DMA makes it write a NOP.
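The value arithmetic can be sketched in Python. This models SHA only as far as the post describes it (A & X & (high byte + 1), with the last AND dropped when a DMA hits); the function name and the choice X = $FF are illustrative assumptions, since any A and X with A & X = $EA would do.

```python
NOP_OPCODE = 0xEA
RTS_OPCODE = 0x60


def sha_written_value(a, x, pointer_high, dma_hit):
    """Value stored by SHA (vec),y under the simplified model above."""
    value = a & x
    if not dma_hit:
        # Normal case: the result is also ANDed with high byte + 1.
        value &= (pointer_high + 1) & 0xFF
    return value


vec = 0x5F00              # pointer held in VEC / VEC+1
y = 2
target = vec + y          # the byte after the 2-byte SHA, i.e. the RTS slot

if __name__ == "__main__":
    print(hex(target))
    print(hex(sha_written_value(0xEA, 0xFF, vec >> 8, dma_hit=False)))
    print(hex(sha_written_value(0xEA, 0xFF, vec >> 8, dma_hit=True)))
```

So the routine keeps rewriting its own third byte with RTS until a DMA changes the stored value to NOP, at which point execution falls through.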
Quote:

Still, too far-fetched; too many things must be configured correctly. Personally, I think the 7-byte solution (as in post #110), which "only" requires zp values to be set in a special way, is the best compromise between flexibility and byte count!

I'll leave that judgement to the people who want to use any of the variants.

2020-12-16 00:07

Copyfault

Registered: Dec 2001
Posts: 478

Quoting Rastah Bar

Awesome! With SHA (vec),y, even 3 bytes are possible for a 12-cycle loop. One example:

$5f00  SHA (VEC),y
$5f02  RTS

Assume that the decruncher provides the following initial conditions: {A&X} = $EA (opcode of NOP), Y = 2, the ZP addresses VEC and VEC+1 point to $5F00, and the stack is completely filled with the return address $5F00. Without DMA, the SHA writes $EA & {$5F+1} = $60 (opcode for RTS) and repeats that until a DMA makes it write a NOP.

Yeah, I already told you that I like this approach for its level of insanity alone :)) Maybe instead of $5f00 one could choose $5f5f as the "start address", so the whole stack can be filled with the same byte, and no matter where the SP is, it will always return to the right spot!

2020-12-16 02:16

ChristopherJam

Registered: Aug 2004
Posts: 1409

Oh this is great.

May I suggest replacing the RTS with a BRK? Then you only need a single vector pointing at the routine start, instead of all of the stack :) A 13-cycle loop works, I think; 13 divides the cycles per frame but not the cycles per character row.

(edit - assuming no issues with cycle stealing from all the stack writes for the BRK, of course. I've not tested this)
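The divisibility claim is easy to check in Python, assuming PAL timing with 63 cycles per line, 312 lines per frame and 8 lines per character row. Whether this property is actually sufficient for the loop to sync is the untested claim above; the sketch only verifies the arithmetic.

```python
CYCLES_PER_LINE = 63       # PAL
LINES_PER_FRAME = 312      # PAL
LINES_PER_CHAR_ROW = 8

CYCLES_PER_FRAME = CYCLES_PER_LINE * LINES_PER_FRAME
CYCLES_PER_CHAR_ROW = CYCLES_PER_LINE * LINES_PER_CHAR_ROW


def divides(loop_length, total):
    """True if a loop of loop_length cycles fits into total a whole number of times."""
    return total % loop_length == 0


if __name__ == "__main__":
    print(CYCLES_PER_FRAME, divides(13, CYCLES_PER_FRAME))
    print(CYCLES_PER_CHAR_ROW, divides(13, CYCLES_PER_CHAR_ROW))
```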

2020-12-16 12:55

chatGPZ

Registered: Dec 2001
Posts: 11386

read the whole thread again and by scientific measures you all turned out pretty insane.

2020-12-16 14:36

ChristopherJam

Registered: Aug 2004
Posts: 1409

Quoting Groepaz

read the whole thread again and by scientific measures you all turned out pretty insane.

Well, I can't really argue with that.

I can suggest replacing the BRK (or rather the contents of A) with the opcode of the next instruction in the init routine. Place the SHA in $ffxx, and it will zero out whatever is written until the time is right. Just saved another byte \o/

2020-12-16 15:24

Copyfault

Registered: Dec 2001
Posts: 478

Quoting ChristopherJam

Oh this is great.

May I suggest replacing the RTS with a BRK? Then you only need a single vector pointing at the routine start, instead of all of the stack :) A 13-cycle loop works, I think; 13 divides the cycles per frame but not the cycles per character row.

(edit - assuming no issues with cycle stealing from all the stack writes for the BRK, of course. I've not tested this)

I also had the idea of doing it with a BRK instead of an RTS (I mean: when there's some ANDing in play, a $00 byte for the "continue loop" case feels tempting ;)), but as far as I understand post #61 by Quiss, 13 cycles does not work. So I guess the complete stack would have to be "configured adequately" :)

Quoting Groepaz

read the whole thread again and by scientific measures you all turned out pretty insane.

Well, I don't have a good argument against this - but insanity is the obvious state when stepping beyond science ;)

2020-12-16 15:30

Copyfault

Registered: Dec 2001
Posts: 478

Quoting ChristopherJam

[...]
(edit - assuming no issues with cycle stealing from all the stack writes for the BRK, of course. I've not tested this)

Oh wait... Quiss' calculation for the loop length was based on R-cycles only... so the BRK *will* change it. I need to work out the permitted loop lengths under this new precondition - maybe it works...

2020-12-16 15:36

ChristopherJam

Registered: Aug 2004
Posts: 1409

I do hope it works. Because then we can use the entire SHA instruction and its opcode as the address operand of a preceding instruction (e.g. placing the zp pointer at an address that doubles as the high byte of an IO address), and then the entire routine can vanish altogether. Zero bytes :D :D

2020-12-16 15:54

Copyfault

Registered: Dec 2001
Posts: 478

Quoting ChristopherJam

I do hope it works. Because then we can use the entire SHA instruction and its opcode as the address operand of a preceding instruction (e.g. placing the zp pointer at an address that doubles as the high byte of an IO address), and then the entire routine can vanish altogether. Zero bytes :D :D

Oh, where are we now??? Insanity^infty /o\ But OK, splendid idea to make SHA (vec),y an operand of a preceding opcode, like STA $d093,y :)) At first I thought it'd rather be a 1-byte solution because of the opcode following that STA $d093,y, but the sync loop always exits once the full value has been written, so this byte is also completely free to choose.
