[CSDb] - User Forums - shortest CIA-stable raster

2009-04-04 12:39

Hermit

Registered: May 2008
Posts: 208

shortest CIA-stable raster

<Post edited by moderator on 4/4-2009 14:47>

Hi, Guys :)

While preparing for a compo I developed what may be the shortest
CIA-type stable raster solution (it fits in 64 bytes, 24 asm rows).
If you can make it even shorter, I'm curious :)

It works fine in practice: you don't have to type novels to achieve
a stable raster, and there is no need for a raster IRQ, the CMP $d012 method is enough.
If you find it useful for fast & short demo writing, we could add it
to codebase64.

;syncing CIA1 timer A to the beam at the start of the program:
-----------------------------------------------------------

     sei           ;we don't want to lose cycles to IRQ calls :)
sync cmp $d012     ;scan for the start rasterline (A=$11 after first return to sync)
     bne *-3       ;wait while rasterline #$11 is not reached yet
     ldy #8        ;the value for the CIA timer latch & for the y-delay loop
     sty $dc04     ;CIA timer will count 8,8 and then down 7,6,5,4,3,2,1
     dey           ;Y=Y-1 (8 iterations: 7,6,5,4,3,2,1,0)
     bne *-1       ;loop needed to pad the poll delay with 39 cycles
     sty $dc05     ;timer hi-byte is not needed, so set it to 0 (or it will mess up)
     sta $dc0e,y   ;force-restart the timer from value 8 (set in $dc04)
     lda #$11      ;value for the $d012 scan and for the timer start in $dc0e
     cmp $d012     ;check if the line ended (new line) or not (same line)
     sty $d015     ;switch off sprites, they eat cycles when fetched
     bne sync      ;if the line changed after 63 cycles, resynchronize!
     .... the rest (this is also a stable-timed point, can be used for something)

;EXAMPLE: using timer A to stabilize the 7-cycle jitter when using CMP $d012:
-----------------------------------------------------------------------
scan ldx #$31    ;a good value that's not badline, in border and 1=white
     cpx $d012   ;scan rasterline
     bne *-3     ;wait until rasterline will be $31
     lda $dc04   ;check timer A, here it jitters between 7...1
     eor #7      ;A=7-A so jitter will be 0...6 in A
     sta corr+1  ;self-writing code, the bpl jump-address = A
corr bpl *+2     ;the jump to timer (A) dependent byte
     cmp #$c9    ;if A=0, cmp #$c9; if A=1, cmp #$c9 again (entered one byte later)
     cmp #$c9    ;if A=2, cmp #$c9; if A=3, cmp #$2c followed by bit $ea
     bit $ea24   ;if A=4, bit $ea24; if A=5, bit $ea; if A=6, only nop

     stx $d020   ;x was 1 so border is white at the stable cycle
     sty $d020   ;y ended in 0 in sync routine, so border black after 4 cycles
     jmp scan    ;go to the raster again (or can go new raster)

-----------------------------------------------------------------------
Opinions?

Hermit Software Hungary
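
To see why the self-modified `bpl` in the example above cancels the jitter, here is a rough Python model of the byte sled that follows it. This is only a sketch: the opcode table covers just the four opcodes that can occur in this sled, and the names `OPCODES`, `SLED` and `sled_cycles` are made up for the illustration.

```python
# (length in bytes, cycles) for each opcode that can occur in the sled.
OPCODES = {
    0xC9: (2, 2),  # CMP #imm
    0x2C: (3, 4),  # BIT abs
    0x24: (2, 3),  # BIT zp
    0xEA: (1, 2),  # NOP
}

# cmp #$c9 / cmp #$c9 / bit $ea24, as assembled right after the BPL.
SLED = [0xC9, 0xC9, 0xC9, 0xC9, 0x2C, 0x24, 0xEA]


def sled_cycles(entry):
    """Cycles spent from byte offset `entry` until execution leaves the sled."""
    pc, cycles = entry, 0
    while pc < len(SLED):
        length, cost = OPCODES[SLED[pc]]
        pc += length
        cycles += cost
    return cycles


# $dc04 reads 7 when the CPU arrives early and 1 when it arrives late.
for timer in range(7, 0, -1):
    offset = timer ^ 7                      # eor #7 -> branch offset 0..6
    print(f"timer={timer} offset={offset} delay={sled_cycles(offset)}")
```

Every cycle of lateness moves the entry point one byte further into the sled and removes exactly one cycle of delay, so lateness plus delay stays constant.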

... 20 posts hidden ...

2017-03-06 22:25

spider-j

Registered: Oct 2004
Posts: 498

Sorry to dig this up, but I also stumbled over the problem that lda $dc04/$dd04 can return $08, so that eor #7 produces $0f. I did some experiments with NMI (CIA2 timer B counting PAL cycles and timer A to "stabilize") and the Krill loader instead of IRQ, and made a "dirty fix" like this to help myself:

Yes, I know it's a lot more bytes & cycles "wasted". I just saw while linking Trafolta that Achim used this code snippet and played around a little bit.
Maybe someone who is a bit more creative than me should update the codebase64 page with a proper solution. Or at least there should be some kind of warning ... or whatsoever...
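
The $08 -> $0f problem described here is easy to reproduce with plain arithmetic. The snippet below is only an illustration and not the fix from this post; it assumes the timer sequence given in the comments of the first post (8,8,7,...,1), and `branch_offset` and `SLED_LEN` are hypothetical names.

```python
SLED_LEN = 7                           # the original sled has entry offsets 0..6
PERIOD = [8, 8, 7, 6, 5, 4, 3, 2, 1]   # assumed $dc04 sequence with latch = 8


def branch_offset(timer_value):
    """What `eor #7` leaves in A, which is then stored as the BPL operand."""
    return timer_value ^ 7


for value in sorted(set(PERIOD), reverse=True):
    offset = branch_offset(value)
    status = "ok" if offset < SLED_LEN else "lands beyond the sled"
    print(f"timer=${value:02x} -> offset=${offset:02x} ({status})")
```

Only the values 7..1 map onto valid entry points; reading $08 sends the branch 15 bytes ahead, well past the end of the delay code.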

2017-03-18 11:22

Repose

Registered: Oct 2010
Posts: 225

lol so I come back exactly 6 years later and this thread is still going.

What I meant was, I wanted to answer two questions, 1) What is a methodical approach to finding the shortest or quickest sync code 2) how to tell if you've found the best possible solution.

Instead of playing around with ideas and guessing, I was methodically going through every opcode to see how they could be combined to make various delays. With that method you can prove which way is the best, and then call it done, forever. I guess no one really understood it, but I still found my own post very interesting; of course, it makes sense to me.

I'll have to look over the latest proposed segments and decide if I feel any of them are probably the last answer.

2017-03-18 15:36

Repose

Registered: Oct 2010
Posts: 225

Ok so the conclusion of my msg is, "we can use just 3 bytes to write a delay between 4 to 12 cycles". I mean two instructions of the right type, a one byte and two byte one, together can add up to any possibility of time from 4 (nop:nop) to 12 (unspecified 4 cycle and 8 cycle instruction) cycles.
If we jump into each segment, that's one approach.
So what I'm saying is, for that approach this is the smallest code you could ever write.
Then I give two other approaches that could be shorter, but not faster overall (if that's important).

2017-03-20 06:12

ChristopherJam

Registered: Aug 2004
Posts: 1408

Repose, so it looks like you were searching for the smallest number of bytes for which there are a set of at least eight routines of that length that between them cover a run of eight different durations?

Nine delay states easily confirmable in Python thusly:

>>> b1={2,3,4}
>>> b2={2,3,4,5,6,8}
>>> set([x+y for x in b1 for y in b2])
{4, 5, 6, 7, 8, 9, 10, 11, 12}

I'm not clear how closely related that is to finding the shortest possible anti jitter routine mind, as multiple fragments would ordinarily introduce the cost of dispatching to them and returning. (also note that the jmp ($dc03) approach doesn't require same-length delay routines)

It does raise the entertaining possibility of this construct, mind:

  ldy $dc04
  lda frag1,y
  sta rna+1
  lda frag2,y
  sta rna+2
rna:
 .byt 0,0,0

24 to 32 cycles, depending on the content of $dc04. A 9 cycle timer wouldn't work because of the duplicated-8 issue, but a 63 cycle timer should be fine if the alignment is appropriate (alternately, avoid undefined opcodes in main).
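
The 24 to 32 figure can be re-derived by adding the fixed setup cost to the nine delay states from the Python snippet above. A small sketch, assuming absolute addressing and no page crossing on the indexed reads; the constant names are invented here.

```python
# Cycle costs, assuming no page crossing on the ",y" table reads.
LDY_ABS = 4      # ldy $dc04
LDA_ABS_Y = 4    # each lda fragN,y
STA_ABS = 4      # each sta into the patched bytes

setup = LDY_ABS + 2 * (LDA_ABS_Y + STA_ABS)

b1 = {2, 3, 4}             # timings of the first patched instruction
b2 = {2, 3, 4, 5, 6, 8}    # timings of the second patched instruction
delays = sorted({x + y for x in b1 for y in b2})

totals = [setup + d for d in delays]
print(f"setup={setup}, total range={totals[0]}..{totals[-1]}")
```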

2017-03-20 08:37

lft

Registered: Jul 2007
Posts: 369

That is an excellent idea, and it can be taken further!

We can do something like this:

        ldy     $dc04
        lda     opcodes,y
        sta     mod
mod     .byt    $00,$1b,$a9,$13,$ea

        ; 18-27 cycles later...

So that's six cycles faster, and one more cycle of jitter supported.

Here is a clarifying table:

y       code                    cycles  trashes

1       a9 1b|a9 13|ea          6
2       a5 1b|a9 13|ea          7
3       b5 1b|a9 13|ea          8
4       06 1b|a9 13|ea          9       1b
5       a1 1b|a9 13|ea          10
6       ea|1b a9 13|ea          11      13a9,y
7       ad 1b a9|13 ea          12      (ea),y
8       99 1b a9|13 ea          13      a91b,y  (ea),y
9       0e 1b a9|13 ea          14      a91b    (ea),y
10      1b 1b a9|13 ea          15      a91b,y  (ea),y

This can be varied according to taste, to trash different memory locations. Note that the value of Y is known, so the exact address of the trashed location is also known.
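
The table can be double-checked by decoding the five bytes at `mod` for each patched opcode. This is a minimal sketch, not a full 6502 model: the opcode table lists only the opcodes used here (with $1b and $13 taken as the undocumented SLO abs,y and SLO (zp),y), and `run`, `FIRST_BYTE`, `TAIL` and `SETUP` are names made up for the example.

```python
# (length in bytes, cycles) for every opcode that appears in the table.
OPCODES = {
    0xA9: (2, 2),  # LDA #imm
    0xA5: (2, 3),  # LDA zp
    0xB5: (2, 4),  # LDA zp,x
    0x06: (2, 5),  # ASL zp
    0xA1: (2, 6),  # LDA (zp,x)
    0xEA: (1, 2),  # NOP
    0xAD: (3, 4),  # LDA abs
    0x99: (3, 5),  # STA abs,y
    0x0E: (3, 6),  # ASL abs
    0x1B: (3, 7),  # SLO abs,y (undocumented)
    0x13: (2, 8),  # SLO (zp),y (undocumented)
}

FIRST_BYTE = [0xA9, 0xA5, 0xB5, 0x06, 0xA1, 0xEA, 0xAD, 0x99, 0x0E, 0x1B]  # y = 1..10
TAIL = [0x1B, 0xA9, 0x13, 0xEA]   # the fixed bytes after `mod`
SETUP = 4 + 4 + 4                 # ldy $dc04 / lda opcodes,y / sta mod


def run(code):
    """Decode `code` from its first byte and return the total cycle count."""
    pc = cycles = 0
    while pc < len(code):
        length, cost = OPCODES[code[pc]]
        pc += length
        cycles += cost
    if pc != len(code):
        raise ValueError("instruction stream runs past the end of the block")
    return cycles


delays = [run([op] + TAIL) for op in FIRST_BYTE]
for y, delay in enumerate(delays, start=1):
    print(f"y={y:2d} delay={delay:2d} total={SETUP + delay}")
```

Each choice of first byte re-frames the remaining four bytes into a different instruction sequence, and all ten framings end exactly at the end of the block.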

2017-03-20 14:13

ChristopherJam

Registered: Aug 2004
Posts: 1408

Ooh, very nice indeed.

I guess the next question is, what's the fewest cycles required for each jitter length; lft's 18 cycle minimum is likely optimal for ten jitter states; if there's only two (eg we know we're interrupting NOPs) one could just

  ldy $dc04
  lda $xxnn,y

(eight or nine cycles, depending whether $dc04 is greater than 255-nn)

I'm not wrapping my brain around the sets of bcX *+n above at the moment; it's been a long day. (oh, and I should have been doing STA to rna+0 and rna+1 two comments ago too, but you've probably guessed that already)

2017-03-23 23:50

Repose

Registered: Oct 2010
Posts: 225

Apparently I worked on this 5 years ago.

http://forum.6502.org/viewtopic.php?p=18148#18148

This does 14+A in the range A=(1,8) or 13 with A=0.

;A=1..8
*=$1000
clc
adc #$ff-8;together with the eor: A=8-A, so result will be 7..0 in A
eor #$ff
sta corr+1 ;self-writing code, the bpl jump-address = A
corr bpl *+2 ;the jump to (A) dependent byte (13 cycles so far)
cmp #$c9 ;A=8->A=0->BPL +2
cmp #$c9 ;
cmp #$c9 ;
cmp $ea ;3 =9  (13+9=22 max delay)

Nothing innovative, just different idea.
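
The "14+A, or 13 with A=0" claim can be checked with the same kind of byte-level model. A sketch only, assuming the branch does not cross a page and covering just the three opcodes in this sled; `total_cycles` and the constants are hypothetical names.

```python
# (length in bytes, cycles) for the opcodes in the sled.
OPCODES = {
    0xC9: (2, 2),  # CMP #imm
    0xC5: (2, 3),  # CMP zp
    0xEA: (1, 2),  # NOP
}

SLED = [0xC9] * 6 + [0xC5, 0xEA]   # cmp #$c9 three times, then cmp $ea
PROLOGUE = 2 + 2 + 2 + 4 + 3        # clc, adc #, eor #, sta abs, bpl (taken)


def total_cycles(a):
    """Cycles for the whole routine when entered with accumulator value `a`."""
    offset = ((a + 0xF7) & 0xFF) ^ 0xFF   # clc / adc #$ff-8 / eor #$ff
    pc, cycles = offset, PROLOGUE
    while pc < len(SLED):
        length, cost = OPCODES[SLED[pc]]
        pc += length
        cycles += cost
    return cycles


for a in range(9):
    print(f"A={a} -> {total_cycles(a)} cycles")
```

With A=0 the computed offset is 8, which skips the sled entirely, which is why that case costs only the 13 prologue cycles.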
