CSDb User Forums


2009-04-04 12:39
Hermit

Registered: May 2008
Posts: 208
shortest CIA-stable raster

<Post edited by moderator on 4/4-2009 14:47>

Hi, Guys :)

While preparing for a compo I've developed what may be the shortest
CIA-type stable raster solution (fits in 64 bytes, 24 asm rows).
If you can do it even shorter, I'm curious :)

It works fine in practice: you don't have to type novels to achieve a
stable raster, and there is no need for a raster IRQ; polling with
CMP $d012 is enough.
If you find it useful for fast & short demo writing, we may add it
to codebase64.

;setting the CIA1-timerA to beam in the program beginning:
-----------------------------------------------------------

     sei                   ;we don't want lost cycles by IRQ calls :)
sync cmp $d012             ;scan for begin rasterline (A=$11 after first return)
     bne *-3       ;wait if not reached rasterline #$11 yet
     ldy #8        ;the value for cia timer fetch & for y-delay loop
     sty $dc04     ;CIA Timer will count from 8,8 down to 7,6,5,4,3,2,1
     dey           ;Y=Y-1 (8 iterations: 7,6,5,4,3,2,1,0)
     bne *-1       ;loop needed to complete the poll-delay with 39 cycles
     sty $dc05     ;no need Hi-byte for timer at all (or it will mess up)
     sta $dc0e,y   ;forced restart of the timer to value 8 (set in dc04)
     lda #$11      ;value for d012 scan and for timerstart in dc0e
     cmp $d012     ;check if line ended (new line) or not (same line)
     sty $d015     ;switch off sprites, they eat cycles when fetched
     bne sync      ;if line changed after 63 cycles, resynchronize it!
     .... the rest (this is also a stable-timed point, can be used for something)

;EXAMPLE: using timerA to stabilize 7-cycle jitter when using CMP $d012:
-----------------------------------------------------------------------
scan ldx #$31    ;a good value that's not badline, in border and 1=white
     cpx $d012   ;scan rasterline
     bne *-3     ;wait until rasterline will be $31
     lda $dc04   ;check timer A, here it jitters between 7...1
     eor #7      ;A=7-A so jitter will be 0...6 in A
     sta corr+1  ;self-writing code, the bpl jump-address = A
corr bpl *+2     ;the jump to timer (A) dependent byte
     cmp #$c9    ;if A=0, cmp#$c9; if A=1, cmp #$c9 again 2 cycles later
     cmp #$c9    ;if A=2, cmp #$c9; if A=3, cmp #$2c 2 cycles later
     bit $ea24   ;if A=4,bit$ea24; if A=5, bit $ea, if A=6, only NOP

     stx $d020   ;x was 1 so border is white at the stable cycle
     sty $d020   ;y ended in 0 in sync routine, so border black after 4 cycles
     jmp scan    ;go to the raster again (or can go new raster)


-----------------------------------------------------------------------
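As a sanity check on the EXAMPLE above, here is a minimal Python sketch (not part of the original post) that walks the bytes following the self-modified bpl and adds up the cycles for each landing offset. It assumes the standard 6510 timings for cmp #imm (2), bit abs (4), bit zp (3) and nop (2), and that the read of $dc04 returns 7..1 as the comments state; `sled_cycles` is a made-up helper name.

```python
# Bytes after the self-modified "bpl": cmp #$c9 / cmp #$c9 / bit $ea24
SLED = [0xC9, 0xC9, 0xC9, 0xC9, 0x2C, 0x24, 0xEA]

# opcode -> (length in bytes, cycles); assumed standard 6510 timings
OPS = {
    0xC9: (2, 2),  # cmp #imm
    0x2C: (3, 4),  # bit abs
    0x24: (2, 3),  # bit zp
    0xEA: (1, 2),  # nop
}

def sled_cycles(offset):
    """Cycles spent in the sled when the branch lands on byte `offset`."""
    pc, cycles = offset, 0
    while pc < len(SLED):
        length, cost = OPS[SLED[pc]]
        pc += length
        cycles += cost
    return cycles

# Timer value read from $dc04 -> cycles burned after "eor #7"
delays = {timer: sled_cycles(timer ^ 7) for timer in range(7, 0, -1)}
print(delays)
```

Under these assumptions every delay comes out as timer value + 1, so a later arrival (lower timer value) is compensated by a correspondingly shorter run through the sled.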
Opinions?

Hermit Software Hungary
 
... 17 posts hidden ...
 
2011-07-20 12:44
ready.

Registered: Feb 2003
Posts: 441
I might be wrong as well, since I based my feedback on the VICE monitor only (VICE 2.2). Still, I can confirm that when my routine runs in VICE I sometimes get $dc04=8. I checked the setup of the code and it is correct.

2011-07-20 20:39
Copyfault

Registered: Dec 2001
Posts: 478
@Frantic: the experiments I did back then showed that $DC04 will never reach value '0'. This was already mentioned in this thread and in the old one (look @some posts above).

@Hermit: this "eor #$07" line in your code must indeed cause problems, due to the fact that $DC04 != 0. But of course you could sync your timer to have e.g. values between $10..$17 at the reading cycle of "lda $DC04"; thus, "eor #$17" should fix your example code.
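
To make the eor point concrete, a small Python sketch (an illustration, not code from the thread). It assumes the timer low byte can show the eight values 1..8 at the read, as reported in the posts above, and compares that with the suggested $10..$17 window.

```python
# Values the timer low byte may show at the read, per the reports above:
# the read never returns 0, so 8 shows up instead (assumption).
observed = range(1, 9)
bad = [v ^ 0x07 for v in observed]       # eor #7
print(bad)

# Suggested alternative: sync the timer so the read sees $10..$17
shifted = range(0x10, 0x18)
good = [v ^ 0x17 for v in shifted]       # eor #$17
print(good)
```

The first list contains 15 (8 eor 7 = $0f), which is the out-of-range branch offset discussed here; the second list stays within 0..7.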
2012-01-09 07:48
ChristopherJam

Registered: Aug 2004
Posts: 1409
Another approach, albeit the same number of lines of code:

;setting the CIA1-timerA to beam in the program beginning:
;-----------------------------------------------------------

     sei           ;we don't want lost cycles by IRQ calls ;)
sync lda #$1c
     cmp $d012     ; scan for line to force DMA
     bne *-3
     sta $d011     ;trigger badline to absorb jitter
     lda #$11
     ldy #8        ;the value for cia timer fetch & for y-delay loop
     sty $dc04     ;CIA Timer will count from 8,8 down to 7,6,5,4,3,2,1
     ldy #0
     sty $dc05     ;no need Hi-byte for timer at all (or it will mess up)
     sta $dc0e,y   ;forced restart of the timer to value 8 (set in dc04)
     dec $d011     ;undo tiny scroll from above
     bmi sync      ;oops, we were in the bottom border
2017-03-06 22:25
spider-j

Registered: Oct 2004
Posts: 498
Sorry to dig this up, but I also stumbled over this lda $dc04/$dd04 returns $08 and therefore eor #7 produces $0f "thingy". I did some experiments with NMI (CIA2 TIMER B counting PAL cycles and TIMER A to "stabilize") and Krill loader instead of IRQ and made a "dirty fix" like this to help myself:
                    pha
                    lda $dd04
                    eor #7
                    sta *+4
                    bpl *+2
                    cmp #$c9
                    cmp #$c9
                    bit $ea24
                    bit $ea24
                    jmp *+8
                    nop
                    nop
                    jmp *+3
                    txa
                    pha
                    tya
                    pha

Yes, I know it's a lot more bytes & cycles "wasted". I just saw while linking Trafolta that Achim used this code snippet and played around a little bit.
Maybe someone who is a bit more creative than me should update the codebase64 page with a proper solution. Or at least there should be some kind of warning ... or whatsoever...
2017-03-18 11:22
Repose

Registered: Oct 2010
Posts: 225
lol so I come back exactly 6 years later and this thread is still going.

What I meant was, I wanted to answer two questions, 1) What is a methodical approach to finding the shortest or quickest sync code 2) how to tell if you've found the best possible solution.

Instead of playing around with ideas and guessing, I was thoroughly going through every opcode to see how they could be combined to make various delays. Using that method, the best way to do this can be proven, and then we can say it's done for good. I guess no one really understood it, but I still found my own post very interesting; of course it makes sense to me.

I'll have to look over the latest proposed segments and decide if I feel any of them are probably the last answer.
2017-03-18 15:36
Repose

Registered: Oct 2010
Posts: 225
Ok so the conclusion of my msg is, "we can use just 3 bytes to write a delay between 4 and 12 cycles". I mean two instructions of the right type, a one-byte and a two-byte one, together can add up to any duration from 4 (nop:nop) to 12 (unspecified 4 cycle and 8 cycle instruction) cycles.
If we jump into each segment, that's one approach.
So what I'm saying is, for that approach this is the smallest code you could ever write.
Then I give two other approaches that could be shorter, but not faster overall (if that's important).
2017-03-20 06:12
ChristopherJam

Registered: Aug 2004
Posts: 1409
Repose, so it looks like you were searching for the smallest number of bytes for which there is a set of at least eight routines of that length that between them cover a run of eight different durations?

Nine delay states easily confirmable in Python thusly:
>>> b1={2,3,4}
>>> b2={2,3,4,5,6,8}
>>> set([x+y for x in b1 for y in b2])
{4, 5, 6, 7, 8, 9, 10, 11, 12}

I'm not clear how closely related that is to finding the shortest possible anti jitter routine mind, as multiple fragments would ordinarily introduce the cost of dispatching to them and returning. (also note that the jmp ($dc03) approach doesn't require same-length delay routines)

It does raise the entertaining possibility of this construct, mind:
  ldy $dc04
  lda frag1,y
  sta rna+1
  lda frag2,y
  sta rna+2
rna:
 .byt 0,0,0

24 to 32 cycles, depending on the content of $dc04. A 9 cycle timer wouldn't work because of the duplicated-8 issue, but a 63 cycle timer should be fine if the alignment is appropriate (alternately, avoid undefined opcodes in main).
2017-03-20 08:37
lft

Registered: Jul 2007
Posts: 369
That is an excellent idea, and it can be taken further!

We can do something like this:

        ldy     $dc04
        lda     opcodes,y
        sta     mod
mod     .byt    $00,$1b,$a9,$13,$ea

        ; 18-27 cycles later...


So that's six cycles faster, and one more cycle of jitter supported.

Here is a clarifying table:

y       code                    cycles  trashes

1       a9 1b|a9 13|ea          6
2       a5 1b|a9 13|ea          7
3       b5 1b|a9 13|ea          8
4       06 1b|a9 13|ea          9       1b
5       a1 1b|a9 13|ea          10
6       ea|1b a9 13|ea          11      13a9,y
7       ad 1b a9|13 ea          12      (ea),y
8       99 1b a9|13 ea          13      a91b,y  (ea),y
9       0e 1b a9|13 ea          14      a91b    (ea),y
10      1b 1b a9|13 ea          15      a91b,y  (ea),y


This can be varied according to taste, to trash different memory locations. Note that the value of Y is known, so the exact address of the trashed location is also known.
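
The cycle column of the table can be cross-checked with a short Python sketch (not from the original post). The opcode lengths and timings below are the commonly published NMOS 6510 values, including the undocumented SLO opcodes $1b and $13, and should be read as assumptions; the 12-cycle setup assumes no page crossing in `lda opcodes,y`.

```python
# opcode -> (length in bytes, cycles); assumed NMOS 6510 values
OPS = {
    0xA9: (2, 2),  # lda #imm
    0xA5: (2, 3),  # lda zp
    0xB5: (2, 4),  # lda zp,x
    0x06: (2, 5),  # asl zp
    0xA1: (2, 6),  # lda (zp,x)
    0xEA: (1, 2),  # nop
    0xAD: (3, 4),  # lda abs
    0x99: (3, 5),  # sta abs,y
    0x0E: (3, 6),  # asl abs
    0x1B: (3, 7),  # slo abs,y  (undocumented)
    0x13: (2, 8),  # slo (zp),y (undocumented)
}

TAIL = [0x1B, 0xA9, 0x13, 0xEA]          # fixed bytes after the patched opcode
FIRST = {1: 0xA9, 2: 0xA5, 3: 0xB5, 4: 0x06, 5: 0xA1,
         6: 0xEA, 7: 0xAD, 8: 0x99, 9: 0x0E, 10: 0x1B}
SETUP = 4 + 4 + 4                        # ldy abs, lda abs,y, sta abs

def delay(first_opcode):
    """Cycles taken by the 5-byte 'mod' area for a given patched opcode."""
    code = [first_opcode] + TAIL
    pc, cycles = 0, 0
    while pc < len(code):
        length, cost = OPS[code[pc]]
        pc += length
        cycles += cost
    assert pc == len(code)               # decoding must end exactly at byte 5
    return cycles

delays = {y: delay(op) for y, op in FIRST.items()}
totals = {y: SETUP + d for y, d in delays.items()}
print(delays)
print(min(totals.values()), max(totals.values()))
```

With these timings the delays run from 6 to 15 cycles, one per timer value, and the totals match the "18-27 cycles later" comment.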
2017-03-20 14:13
ChristopherJam

Registered: Aug 2004
Posts: 1409
Ooh, very nice indeed.

I guess the next question is, what's the fewest cycles required for each jitter length; lft's 18 cycle minimum is likely optimal for ten jitter states; if there's only two (eg we know we're interrupting NOPs) one could just
  ldy $dc04
  lda $xxnn,y

(eight or nine cycles, depending whether $dc04 is greater than 255-nn)
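
A tiny Python sketch of that page-crossing trick (illustrative only). It assumes 4 cycles for `ldy abs` and 4 cycles for `lda abs,y` plus 1 when the indexed address crosses a page; `read_cycles` and the value of `nn` are hypothetical.

```python
def read_cycles(nn, timer):
    """Cycles for `ldy $dc04` + `lda $xxnn,y` with Y = timer value."""
    ldy_abs = 4
    lda_abs_y = 4 + (1 if nn + timer > 0xFF else 0)  # +1 on page crossing
    return ldy_abs + lda_abs_y

nn = 0xFE   # hypothetical low byte of the table address
print(read_cycles(nn, 1), read_cycles(nn, 2))
```

With nn = $fe, a timer value of 1 stays in the page and a value of 2 crosses it, giving the two cases mentioned above.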

I'm not wrapping my brain around the sets of bcX *+n above at the moment; it's been a long day. (oh, and I should have been doing STA to rna+0 and rna+1 two comments ago too, but you've probably guessed that already)
2017-03-23 23:50
Repose

Registered: Oct 2010
Posts: 225
Apparently I worked on this 5 years ago.

http://forum.6502.org/viewtopic.php?p=18148#18148

This does 14+A in the range A=(1,8) or 13 with A=0.
;A=1..8
*=$1000
clc
adc #$ff-8;A=8-A so result will be 7…0 in A
eor #$ff
sta corr+1 ;self-writing code, the bpl jump-address = A
corr bpl *+2 ;the jump to (A) dependent byte (13 cycles so far)
cmp #$c9 ;A=8->A=0->BPL +2
cmp #$c9 ;
cmp #$c9 ;
cmp $ea ;3 =9  (13+9=22 max delay)


Nothing innovative, just a different idea.
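
The "14+A, or 13 with A=0" claim for the routine above can be checked with a rough Python sketch (not from the original post). It assumes standard 6510 timings, a taken bpl costing 3 cycles with no page crossing, and models the 8-bit clc/adc/eor arithmetic directly; `total_cycles` is a made-up name.

```python
# Bytes after the self-modified bpl: cmp #$c9 (x3), cmp $ea
SLED = [0xC9, 0xC9, 0xC9, 0xC9, 0xC9, 0xC9, 0xC5, 0xEA]
OPS = {0xC9: (2, 2), 0xC5: (2, 3), 0xEA: (1, 2)}   # opcode -> (length, cycles)
SETUP = 2 + 2 + 2 + 4 + 3   # clc, adc #imm, eor #imm, sta abs, bpl (taken)

def total_cycles(a):
    """Total cycles of the routine for an input accumulator value 0..8."""
    offset = ((a + 0xF7) & 0xFF) ^ 0xFF   # clc / adc #$ff-8 / eor #$ff
    pc, cycles = offset, SETUP
    while pc < len(SLED):
        length, cost = OPS[SLED[pc]]
        pc += length
        cycles += cost
    return cycles

results = [total_cycles(a) for a in range(9)]
print(results)
```

For A=0 the branch offset becomes 8 and skips the sled entirely, which is where the 13-cycle case comes from.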