CSDb User Forums


Forums > C64 Coding > Improved clock-slide
2017-02-28 07:21
lft

Registered: Jul 2007
Posts: 369
Improved clock-slide

If you use timer-based jitter correction, or just VSP, here's a way to shave off one cycle:

http://codebase64.org/doku.php?id=base:improved_clockslide
 
... 16 posts hidden ...
 
2017-03-02 10:35
ChristopherJam

Registered: Aug 2004
Posts: 1409
Oh that's gorgeous. Nice work, both of you!
2017-03-02 12:08
Frantic

Registered: Mar 2003
Posts: 1648
@Copyfault: Don't hesitate to write about that improvement on Codebase. If lft doesn't mind, perhaps it can be written as an extension of his article?
2017-03-02 22:38
Copyfault

Registered: Dec 2001
Posts: 478
Reading the reactions to my idea makes me happy and smile :))

But to be fair: my ideas are not as much of an optimization as they look at first glance!

Sticking to lft's example, the routine can cope with a jitter of 10 cycles, which means 11 different latencies (or cycle delay states, as I prefer to call them). With my "optimization" applied, the number of delay states drops by one.
;-----------------------
;cycle no. taken from
;lft's example
;-----------------------
                 ;32..41 (31 not poss. with the opt)
                 ;A=0..9 (10 not poss.)
     sta bra+1   
                 ;36..45
bra  bne *       
                 ;38 (only for A=0)
     nop         
                 ;40 (branch taken, nop skipped)
     lda #$a9    
                 ;42
     lda #$a9    
                 ;44
     lda #$a9    
                 ;46
     lda $ea     
                 ;49
;-----------------------------
;page break here
;-----------------------------
code ...

You can easily see that the delay state "A=10" is no longer handled after the optimization (it would most probably lead to a crash, because the branch would land on "code+1"), whereas lft's approach handles it fully.
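
The cycle numbers in the listing above can be cross-checked with a small model. This is only a sketch in Python, not part of the original routine. It assumes the flags at the bne reflect A (Z set only for A=0), that a taken branch costs 3 cycles plus 1 when the target lies behind the page break, and that the bne starts at cycle 36+A as the comments state. The names OPCODES, SLIDE, slide_cycles and exit_cycle are made up for this illustration.

```python
# Byte size and cycle cost of the only opcodes that occur in the slide.
OPCODES = {0xEA: (1, 2),   # nop
           0xA9: (2, 2),   # lda #imm
           0xA5: (2, 3)}   # lda zp

# nop / lda #$a9 / lda #$a9 / lda #$a9 / lda $ea, followed by the page break.
SLIDE = [0xEA, 0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA5, 0xEA]

def slide_cycles(slide, offset):
    """Cycles spent from byte `offset` of the slide until `code` is reached."""
    pc, cycles = offset, 0
    while pc < len(slide):
        size, cost = OPCODES[slide[pc]]
        pc += size
        cycles += cost
    if pc != len(slide):
        raise ValueError("execution does not land on the first byte of code")
    return cycles

def exit_cycle(a):
    """Cycle at which `code` starts for delay state A = a."""
    entry = 36 + a                      # bne starts at 36..45 per the listing
    if a == 0:                          # assumed: Z is set, bne falls through
        return entry + 2 + slide_cycles(SLIDE, 0)
    crosses_page = a >= len(SLIDE)      # target lies behind the page break
    return entry + 3 + crosses_page + slide_cycles(SLIDE, a)

print([exit_cycle(a) for a in range(10)])
```

Under these assumptions every delay state A=0..9 reaches "code" on cycle 49, while exit_cycle(10) raises an error because the branch would land on "code+1".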

So my idea is more of the "cosmetic kind". In order to take full advantage of that extra "branch taken" cycle, one would have to ensure that the very first byte of the clock slide code is reached both by a taken branch (for the "A=0" case, i.e. the one which has to compensate the most cycles) and by simply falling through that branch instruction (usually the "A=1" case), but this would require some more touch-up of the accu before starting the actual dejitter part.

So before feeding the codebase, I'd better ask here whether you still want the idea to be added there.
2017-03-02 23:18
Copyfault

Registered: Dec 2001
Posts: 478
Speaking of that touch-up of the accu in my previous post, one could do it using a table lookup. So the code would be something like
;---------------------------------
;trying to stick to lft's example
;with all the cycle numbers
;---------------------------------
                 ;23..33
     ldx timer   
                 ;27..37
     lda table,x
                 ;31..41
                 ;A=0..10
     sta bra+1   
                 ;35..45
bra  bpl *       
                 ;38 (35+3 for A=0 or 36+2 for A=$ff)
     nop         ;40
     lda #$a9    ;42
     lda #$a9    ;44
     lda #$a9    ;46
     lda $ea     ;49
code ...

table
     !by $09,$08,$07,$06,$05,$04,$03,$02,$01,$ff,$00         
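
As a sanity check of the table-driven listing above, here is a small Python model of it (a sketch, not the original code). It assumes that a larger timer value means an earlier arrival, so the bpl starts at cycle 45 minus the timer value, and it uses the usual branch costs: 2 cycles not taken, 3 taken, 4 taken across the page break. TABLE, SLIDE, slide_cycles and exit_cycle are illustrative names only.

```python
# Byte size and cycle cost of the only opcodes that occur in the slide.
OPCODES = {0xEA: (1, 2),   # nop
           0xA9: (2, 2),   # lda #imm
           0xA5: (2, 3)}   # lda zp

# nop / lda #$a9 / lda #$a9 / lda #$a9 / lda $ea, followed by the page break.
SLIDE = [0xEA, 0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA5, 0xEA]

# The lookup table from the listing, indexed by the timer value.
TABLE = [0x09, 0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0xFF, 0x00]

def slide_cycles(slide, offset):
    """Cycles spent from byte `offset` of the slide until `code` is reached."""
    pc, cycles = offset, 0
    while pc < len(slide):
        size, cost = OPCODES[slide[pc]]
        pc += size
        cycles += cost
    if pc != len(slide):
        raise ValueError("execution does not land on the first byte of code")
    return cycles

def exit_cycle(timer):
    """Cycle at which `code` starts for a given timer value 0..10."""
    a = TABLE[timer]
    entry = 45 - timer                  # assumed: bpl starts at 35..45
    if a & 0x80:                        # N set by lda table,x: bpl not taken
        return entry + 2 + slide_cycles(SLIDE, 0)
    crosses_page = a >= len(SLIDE)      # target lies behind the page break
    return entry + 3 + crosses_page + slide_cycles(SLIDE, a)

print([exit_cycle(t) for t in range(11)])
```

With these assumptions all 11 timer values end on cycle 49: the earliest arrival uses the taken branch with offset 0 (A=0), the next one falls through (A=$ff), and both meet at the nop on cycle 38.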

This way all 11 different delay states can be handled, but at the cost of extra bytes for the table. Now if I want to be smart, I'd align the table to also have a page break for the "A=0" case ;))
;---------------------------------
                 ;23..33
     ldx timer   
                 ;27..37
     lda table,x ;if timer holds the max-val, the table access reads above the page end -> extra cycle!
                 ;32..41
                 ;A=0..9
     sta bra+1   
                 ;36..45
bra  bpl *       
                 ;39 (36+3 for A=0 or 37+2 for A=$ff)
     lda #$a9    ;41
     lda #$a9    ;43
     lda #$a5    ;45
     nop         ;48 (ends one cycle earlier as the first dejitter cycle is the lookup-table penalty cycle)
code ...

table
     !by $08,$07,$06,$05,$04,$03,$02,$01,$ff,$00
;-------------------------------------------------
;page break here
;-------------------------------------------------
     !by $00         

It needs even one byte less for the clock slide part... but of course, any advantage is eaten up by all the drawbacks like page-break requirements (now for that table as well!), the need for an index register, a higher "minimum overhead cost" (lda #const: sbc timer is cheaper in this respect!), etc.

But maybe this idea qualifies a little better for a contribution to the mighty codebase?!??

[Edit]
Oops, that was too optimistic ;)) Of course it must be
;---------------------------------
                 ;23..33
     ldx timer   
                 ;27..37
     lda table,x ;if timer holds the max-val, the table access reads above the page end -> extra cycle!
                 ;32..41
                 ;A=0,0,$ff,1,..,8 (see table)
     sta bra+1   
                 ;36..45
bra  bpl *       
                 ;39 (36+3 for A=0 or 37+2 for A=$ff)
     lda #$a9    ;41
     lda #$a9    ;43
     lda #$a9    ;45
     lda $ea     ;48 (ends one cycle earlier as the first dejitter cycle is the lookup-table penalty cycle)
code ...

table
     !by $08,$07,$06,$05,$04,$03,$02,$01,$ff,$00
;-------------------------------------------------
;page break here
;-------------------------------------------------
     !by $00         

The clock slide part is of course _one_ byte less, not two ;p
2017-03-04 06:11
ChristopherJam

Registered: Aug 2004
Posts: 1409
Mind like a sieve. Look what I found on an old disk image from somewhere around 1989-1992.

Pretty sure I got the BPL from John West, after he independently discovered VSP some time around 1989-90




edit: argh, all this proves is that we *didn't* discover copyfault's improvement. I need some more sleep.

Also, welcome to my horrible source code from before I recanted my 'all cross developing is cheating' stance.
2017-03-04 07:53
oziphantom

Registered: Oct 2014
Posts: 490
6510+?
2017-03-04 08:31
ChristopherJam

Registered: Aug 2004
Posts: 1409
FASSEM
2017-03-04 11:21
Copyfault

Registered: Dec 2001
Posts: 478
Woke up this morning with the thought "Ending up on cycle 48 feels odd somehow...". After taking a shower it became clear to me. Scratch that bullshit with the "ends up one cycle earlier...". It's simply wrong! Instead, the clock slide code part must start with a NOP.
;---------------------------------
                 ;23..33
     ldx timer   
                 ;27..37
     lda table,x ;if timer holds the max-val, the table access reads above the page end -> extra cycle!
                 ;32..41
                 ;A=0,0,$ff,1,..,8 (see table)
     sta bra+1   
                 ;36..45
bra  bpl *       
                 ;39 (36+3 for A=0 or 37+2 for A=$ff)
     nop         ;41
     lda #$a9    ;43
     lda #$a9    ;45
     lda #$a5    ;47
     nop         ;49
;-------
;pb here
;-------
code ...

table
     !by $08,$07,$06,$05,$04,$03,$02,$01,$ff,$00
;-------------------------------------------------
;page break here
;-------------------------------------------------
     !by $00         

@Frantic: codebase-worthy in this state?
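
For anyone who wants to double-check the cycle numbers, here is a rough Python model (a sketch, not the original code) that compares the slide from the earlier [Edit] with the NOP-first version above. It assumes the ldx starts at cycle 33 minus the timer value, that only table index 10 lies behind the table's page break (one penalty cycle on lda table,x), and the usual branch costs of 2, 3, or 4 cycles. All Python names are made up for this illustration.

```python
# Byte size and cycle cost of the only opcodes that occur in the slides.
OPCODES = {0xEA: (1, 2),   # nop
           0xA9: (2, 2),   # lda #imm
           0xA5: (2, 3)}   # lda zp

# Slide from the [Edit]: lda #$a9 / lda #$a9 / lda #$a9 / lda $ea
EDIT_SLIDE = [0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA5, 0xEA]
# Final slide: nop / lda #$a9 / lda #$a9 / lda #$a5 / nop
FINAL_SLIDE = [0xEA, 0xA9, 0xA9, 0xA9, 0xA9, 0xA9, 0xA5, 0xEA]

# Table indexed by the timer value; the last entry sits behind the page break.
TABLE = [0x08, 0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0xFF, 0x00, 0x00]
ENTRIES_BEFORE_BREAK = 10

def slide_cycles(slide, offset):
    """Cycles spent from byte `offset` of the slide until `code` is reached."""
    pc, cycles = offset, 0
    while pc < len(slide):
        size, cost = OPCODES[slide[pc]]
        pc += size
        cycles += cost
    if pc != len(slide):
        raise ValueError("execution does not land on the first byte of code")
    return cycles

def exit_cycle(timer, slide):
    """Cycle at which `code` starts for a given timer value 0..10."""
    start = 33 - timer                            # assumed: ldx starts at 23..33
    penalty = 1 if timer >= ENTRIES_BEFORE_BREAK else 0
    entry = start + 4 + 4 + penalty + 4           # ldx, lda table,x, sta
    a = TABLE[timer]
    if a & 0x80:                                  # N set: bpl not taken
        return entry + 2 + slide_cycles(slide, 0)
    crosses_page = a >= len(slide)                # target behind the page break
    return entry + 3 + crosses_page + slide_cycles(slide, a)

def exit_cycles(slide):
    return sorted({exit_cycle(t, slide) for t in range(11)})

print(exit_cycles(EDIT_SLIDE), exit_cycles(FINAL_SLIDE))
```

Under these assumptions the [Edit] slide leaves on cycle 48 for some timer values and on 49 for others, whereas the NOP-first slide leaves on cycle 49 for all 11 timer values.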
2017-03-06 09:53
ChristopherJam

Registered: Aug 2004
Posts: 1409
Yes, all you need is to start and end with NOP.

Kind of wondering about bit shifting approaches now.. For A in 0..3:
  lsr a
  bcs plus1   ;2 or 3 cycles
plus1
  lsr a
  bcs plus2  ;2 or 4 cycles
plus2     ; <- page boundary


..But do you still get the extra cycle cost when branching to the next instruction?
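
A quick Python model of the branch timing may help with that question. It is a sketch based on the documented NMOS 6502 rule as I understand it: a taken branch always costs the third cycle, even with offset 0, and the page-cross cycle is added only when the target is in a different page than the instruction following the branch. The address $1100 for plus2 and all Python names are hypothetical.

```python
def branch_cycles(branch_addr, target, taken):
    """Cycles of a 6502 relative branch located at `branch_addr`."""
    if not taken:
        return 2
    next_pc = (branch_addr + 2) & 0xFFFF          # address after the branch
    crossed = (next_pc & 0xFF00) != (target & 0xFF00)
    return 4 if crossed else 3

# Hypothetical layout with plus2 on a page boundary, as in the snippet above.
PLUS2 = 0x1100
BCS2 = PLUS2 - 2        # second bcs occupies the last two bytes of the page
PLUS1 = BCS2 - 1        # second lsr a (one byte)
BCS1 = PLUS1 - 2        # first bcs

def delay(a):
    """Total cycles of the four instructions for A = a (0..3)."""
    cycles = 2                                            # lsr a
    cycles += branch_cycles(BCS1, PLUS1, taken=bool(a & 1))
    cycles += 2                                           # lsr a
    cycles += branch_cycles(BCS2, PLUS2, taken=bool(a & 2))
    return cycles

print([delay(a) for a in range(4)])
```

If that rule holds, the extra cycle for a taken branch to the next instruction is still there, but the second bcs never pays the page-cross cycle, because the address after the branch is already plus2. The model gives 8, 9, 9 and 10 cycles for A=0..3, so only three distinct delays rather than four.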
2017-03-06 10:36
HCL

Registered: Feb 2003
Posts: 728
..that's the kind of timing I have seen in some Crest demos, I think. It saves some space, but needs a few more cycles for the LSR and an extra branch.
    lsr
    sta br+1
    bcc br
br: bpl..
    nop
    nop
    ..