[CSDb] - User Forums

2012-03-02 17:31

Bitbreaker

Registered: Oct 2002
Posts: 510

Drivecode

Hi guys,

I finally wanted to give drivecode a try, but the transfer is the bottleneck (and 2 KB of memory sucks as well). I would only push this further if transferring two bytes is seriously faster than 154 cycles, as that is what I need to transform one vertex, which is the work I thought of offloading to the drive. I'd also love to implement backface culling within the drive, but that seems mostly impossible due to the lack of memory (assuming we do more complex stuff than a cube).

So far, this is what I do on the C64 side:

-
     lda $d012
     sbc #$31
     bcc +
     clc
     and #$07
     beq -
+
     lda #%00001011
     sta $dd00
     nop
     eor #%00001000
     sta $dd00
     lda #$ff
     eor $dd00
     lsr
     lsr
     eor $dd00
     lsr
     lsr
     eor $dd00
     lsr
     asr #$fe     ; AND #$fe, then LSR: bit 0 is cleared first, so carry ends up clear
     eor $dd00
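To see which bits each read contributes, the receive sequence above can be modeled outside the C64. The following Python sketch is not part of the original post. It assumes that every read of $dd00 returns DATA IN in bit 7, CLK IN in bit 6 and the constant %000011 in the low bits (the value written during the sync). The names `dd00_read`, `receive` and `reads_for` are made up for illustration.

```python
def dd00_read(data, clk, low_bits=0b000011):
    """Value assumed to be seen at $dd00: DATA IN in bit 7, CLK IN in bit 6."""
    return (data << 7) | (clk << 6) | low_bits


def receive(reads):
    """Mirror the receive sequence for four consecutive $dd00 reads."""
    a = 0xFF ^ reads[0]      # lda #$ff / eor $dd00
    a >>= 2                  # lsr / lsr
    a ^= reads[1]            # eor $dd00
    a >>= 2                  # lsr / lsr
    a ^= reads[2]            # eor $dd00
    a >>= 1                  # lsr
    a = (a & 0xFE) >> 1      # asr #$fe (AND, then shift right)
    a ^= reads[3]            # eor $dd00
    return a


def reads_for(value):
    """Four reads carrying the bit pairs (1,0), (3,2), (5,4), (7,6) of value."""
    return [dd00_read((value >> (2 * k + 1)) & 1, (value >> (2 * k)) & 1)
            for k in range(4)]


if __name__ == "__main__":
    for value in (0x00, 0x5A, 0xFF):
        print(f"${value:02x} -> ${receive(reads_for(value)):02x}")
```

In this model the constant low bits cancel out over the four EORs, and bits 6 and 7 of the byte only arrive with the fourth read, so dropping bit 7 alone does not shorten the sequence.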

And on the 1541 side:

!align 255,0
bin2ser
     !byte %1111, %0111, %1101, %0101, %1011, %0011, %1001, %0001
     !byte %1110, %0110, %1100, %0100, %1010, %0010, %1000, %0000


     ldx #$0f
     sbx #$00
     lsr
     lsr
     lsr
     lsr
     sta .y1+1     ;keep y free
     lda bin2ser,x
-
     ldx $1800
     bpl -
     sta $1800
     asl
     and #$0f
     sta $1800
.y1  lda bin2ser
     sta $1800
     asl
     and #$0f
     sta $1800
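The role of the bin2ser table becomes clearer when the four writes are expanded for a given byte. The Python sketch below is an illustration only, not the author's code. It assumes the usual 1541 wiring: bit 1 of $1800 is DATA OUT, bit 3 is CLK OUT, a written 1 pulls the line low, and the C64 sees a released line as 1. `drive_writes` and `lines_seen` are hypothetical helper names.

```python
BIN2SER = [
    0b1111, 0b0111, 0b1101, 0b0101, 0b1011, 0b0011, 0b1001, 0b0001,
    0b1110, 0b0110, 0b1100, 0b0100, 0b1010, 0b0010, 0b1000, 0b0000,
]


def drive_writes(value):
    """The four values stored to $1800 for one byte, in order."""
    lo = BIN2SER[value & 0x0F]        # ldx #$0f / sbx #$00 / lda bin2ser,x
    hi = BIN2SER[value >> 4]          # lsr x4 / sta .y1+1 / lda bin2ser
    return [lo, (lo << 1) & 0x0F,     # sta / asl / and #$0f / sta
            hi, (hi << 1) & 0x0F]


def lines_seen(write):
    """(DATA, CLK) line states on the C64 side under the assumed wiring."""
    data = 1 - ((write >> 1) & 1)     # DATA OUT is bit 1, inverted on the bus
    clk = 1 - ((write >> 3) & 1)      # CLK OUT is bit 3, inverted on the bus
    return data, clk


if __name__ == "__main__":
    for w in drive_writes(0x5A):
        print(f"{w:04b} -> DATA,CLK = {lines_seen(w)}")
```

Under these assumptions the table pre-inverts and reorders each nibble so that the first write presents bits 1 and 0, the write after the shift presents bits 3 and 2, and the high nibble follows the same pattern.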

Any idea how to get this reasonably faster? I'd also be okay if just bits 0-6 are transferred from each byte, but that does not seem to help much, as bits 6 and 7 are the last in the transfer. I also thought of doing a burst of two bytes per sync, but that somehow did not work, as I then get jitter in the second byte :-(

Bitbreaker

[... 19 posts hidden ...]

2012-03-05 09:36

Bitbreaker

Registered: Oct 2002
Posts: 510

Thanks for talking about the obvious and explaining the sense of drivecode again *sigh*. Now that we have discussed all the irrelevant stuff, I'd be happy to return to the core questions: Is it possible to save cycles within the code? Can we transfer only 7 bits in less time? At least the possibility of bursts is now clarified after the comments from MagerValp, so thanks for that! It seems I have to do more proper syncing, but in return I can burst for quite a while. But still, is there more to optimize?

And: Don't think code, write code. Quickly writing a code uploader for the floppy and the transfer routines I presented here was a piece of cake with all that information and documentation at hand. Now it is about optimizing; that's the fun part.

2012-03-05 13:44

Fresh

Registered: Jan 2005
Posts: 101

I can't think of anything better than a 2-byte burst copy (as you may have already tried). I haven't tested the code, so take it just as a suggestion. IIRC, the 1541's CPU is a bit faster than a PAL C64's, so you may need to wait a cycle to prevent jitter. The instructions commented with (*) can be replaced with sta $1800,y, which takes one cycle longer (provided you put an ldy #$00 somewhere before the routine).
You said you only need bits 0-6, so you may even skip the last 2 bits by rolling one bit of val2 into val1 beforehand.
My humble 2 cents.

(C64)

	 
     ...
     lda #%00001011
     sta $dd00
     nop
     eor #%00001000
     sta $dd00	 
     lda #$ff
     eor $dd00
     lsr
     lsr
     eor $dd00
     lsr
     lsr
     eor $dd00
     lsr
     asr #$fe     ; AND #$fe, then LSR: bit 0 is cleared first, so carry ends up clear
     eor $dd00
     tay
     lda #$ff
     eor $dd00
     lsr
     lsr
     eor $dd00
     lsr
     lsr
     eor $dd00
     lsr
     asr #$fe     ; AND #$fe, then LSR: bit 0 is cleared first, so carry ends up clear
     eor $dd00	 
     ...          ; 1st byte on Y, 2nd byte in A

(1541)

     !align 255,0
bin2ser
     !byte %1111, %0111, %1101, %0101, %1011, %0011, %1001, %0001
     !byte %1110, %0110, %1100, %0100, %1010, %0010, %1000, %0000

     ...
     lda val1
     ldx #$0f
     sbx #$00
     stx .y0+1
     lsr
     lsr
     lsr
     lsr
     sta .y1+1
     lda val2
     ldx #$0f
     sbx #$00
     stx .y2+1
     lsr
     lsr
     lsr
     lsr
     sta .y3+1
.y0
     lda bin2ser
-
     ldx $1800
     bpl -
     sta $1800
     asl
     and #$0f
     sta $1800
.y1  
     lda bin2ser
     sta $1800
     asl
     and #$0f
     sta $1800
.y2
     lda bin2ser
     sta $1800 ; (*)
     asl
     and #$0f
     sta $1800 ; (*)
.y3
     lda bin2ser
     sta $1800 ; (*)
     asl
     and #$0f
     sta $1800
     ...
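The remark about the 1541 running slightly faster than a PAL C64 can be put into numbers. This is a back-of-the-envelope Python sketch, assuming nominal clocks of 985248 Hz for the PAL C64 and 1 MHz for the drive; real machines deviate a little, and the function names are made up for illustration.

```python
C64_PAL_HZ = 985_248      # assumed nominal PAL C64 CPU clock
DRIVE_HZ = 1_000_000      # assumed nominal 1541 CPU clock

# Extra drive cycles gained per elapsed C64 cycle.
DRIFT_PER_C64_CYCLE = DRIVE_HZ / C64_PAL_HZ - 1


def drift_in_drive_cycles(c64_cycles):
    """How far the drive has run ahead after c64_cycles C64 cycles."""
    return c64_cycles * DRIFT_PER_C64_CYCLE


def c64_cycles_until_drift(drive_cycles=1.0):
    """C64 cycles it takes until the drive is drive_cycles ahead."""
    return drive_cycles / DRIFT_PER_C64_CYCLE


if __name__ == "__main__":
    for n in (28, 60):
        print(n, "C64 cycles ->", round(drift_in_drive_cycles(n), 3), "drive cycles ahead")
    print("one cycle of drift after about", round(c64_cycles_until_drift()), "C64 cycles")
```

With these nominal values the drive gains about 1.5% on the C64, so a burst of two bytes after a single sync accumulates close to a full cycle of drift, which is why one padded write on the drive side is suggested.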

2012-03-05 19:16

Bitbreaker

Registered: Oct 2002
Posts: 510

@Freshness79:
I am afraid that this won't work out, as preparing the data on the 1541 side then consumes too many cycles (it needs to prepare 2 bytes, while on the C64 side the sync is only done once; the rest of the transaction is 28 cycles per byte on both sides).
Aggregating bit 6 of both bytes might be an option, however. I'll think about that and see if it is faster.
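The figure of 28 cycles per byte on both sides can be checked with a small tally. The Python sketch below uses standard 6502 cycle counts (2 for immediate and implied instructions, 4 for absolute loads and stores) and ignores the raster wait, cycles stolen by the VIC and the data preparation on the drive. The instruction lists are transcribed from the listings earlier in the thread; the helper names are hypothetical.

```python
CYCLES = {
    "lda #imm": 2, "eor #imm": 2, "and #imm": 2, "asr #imm": 2,
    "lsr": 2, "asl": 2, "nop": 2, "tay": 2,
    "lda abs": 4, "eor abs": 4, "sta abs": 4,
}

C64_SYNC = ["lda #imm", "sta abs", "nop", "eor #imm", "sta abs"]
C64_BYTE = ["lda #imm", "eor abs", "lsr", "lsr", "eor abs",
            "lsr", "lsr", "eor abs", "lsr", "asr #imm", "eor abs"]
DRIVE_BYTE = ["sta abs", "asl", "and #imm", "sta abs",
              "lda abs", "sta abs", "asl", "and #imm", "sta abs"]


def total(seq):
    """Sum of the cycle counts of all instructions in seq."""
    return sum(CYCLES[op] for op in seq)


def access_times(seq, access_op):
    """Cycle offsets (from the start of seq) at which each access_op ends."""
    t, out = 0, []
    for op in seq:
        t += CYCLES[op]
        if op == access_op:
            out.append(t)
    return out


def gaps(times):
    """Distances between consecutive accesses."""
    return [b - a for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print("drive writes per byte:", total(DRIVE_BYTE))
    print("C64 first read to last read:", total(C64_BYTE) - CYCLES["lda #imm"])
    print("read gaps:", gaps(access_times(C64_BYTE, "eor abs")))
    print("write gaps:", gaps(access_times(DRIVE_BYTE, "sta abs")))
    print("two bytes, two syncs:", 2 * (total(C64_SYNC) + total(C64_BYTE)))
    print("two bytes, one sync:", total(C64_SYNC) + 2 * total(C64_BYTE) + CYCLES["tay"])
```

In this tally both sides touch the bus every 8 cycles, and on the C64 a two-byte burst saves only the second sync. The cost that is not counted here, preparing two bytes on the drive before the sync, is the part the post identifies as the problem.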

2012-03-06 01:17

Fresh

Registered: Jan 2005
Posts: 101

Ok, I've worked some more on the problem.
I've found a solution with... ehm... some constraints:
- Only 7 bits are supported; the highest bit MUST be 0
- You have to live with scrambled bits (which can, however, easily be descrambled with a 256-byte table).
Beware, I haven't tested it!

(64)

     ...
     lda #%00001011
     sta $dd00
     nop
     eor #%00001000
     sta $dd00
     nop
     lda $dd00          ; c=x A=hf000011
     lsr                ; c=1 A=0hf00001
     ora $dd00          ; c=1 A=gef00011 (h must be 0!)
     lsr                ; c=1 A=0gef0001
     ldx $dd00          ; X = db000000
     lsr                ; c=1 A=00gef000
     ora $dd00          ; c=0 A=cagef000
     ora table,x        ; Translation table only for moving X on lower bits
     ...

(1541)

	...
	ldx #$0f	; 2 - abcdefgh
	sbx #$00	; 2 X = 0000efgh
	lsr		; 2
	lsr		; 2
	lsr		; 2
	tay		; 2 Y = 000abcde
	txa		; 2	
	asl		; 2
	ldx #%00001010	; 2 A = 000efgh0 => 0000C0D0
loop
	bit $1800
	bpl loop
	sax $1800	; 4 fh
	lsr		; 2
	sax $1800	; 4 eg
	tya		; 2
	sax $1800	; 4 bd
	lsr		; 2
	sax $1800	; 4 ac
	...

2012-03-06 07:20

Bitbreaker

Registered: Oct 2002
Posts: 510

Kudos for the nice brainfuck :-) And nice to see sax in action, but it still takes more cycles on the C64 than my first proposal :-) The additional

ora table,x
tax
lda descramble,x

adds 10 cycles (though three lsr are saved).
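The comparison can be tallied instruction by instruction. This Python sketch counts the C64 receive path after the second sta $dd00 for both variants, using standard cycle counts and assuming the indexed table reads do not cross a page boundary. The instruction lists are transcribed from the listings in this thread; the descramble step is the three-instruction sequence quoted above.

```python
CYCLES = {"imm": 2, "implied": 2, "abs": 4, "abs,x": 4}  # abs,x: no page crossing assumed

ORIGINAL = [
    ("lda #$ff", "imm"), ("eor $dd00", "abs"),
    ("lsr", "implied"), ("lsr", "implied"), ("eor $dd00", "abs"),
    ("lsr", "implied"), ("lsr", "implied"), ("eor $dd00", "abs"),
    ("lsr", "implied"), ("asr #$fe", "imm"), ("eor $dd00", "abs"),
]

SCRAMBLED = [
    ("nop", "implied"), ("lda $dd00", "abs"),
    ("lsr", "implied"), ("ora $dd00", "abs"),
    ("lsr", "implied"), ("ldx $dd00", "abs"),
    ("lsr", "implied"), ("ora $dd00", "abs"),
    ("ora table,x", "abs,x"), ("tax", "implied"), ("lda descramble,x", "abs,x"),
]


def total(seq):
    """Sum the cycles of a list of (mnemonic, addressing mode) pairs."""
    return sum(CYCLES[mode] for _, mode in seq)


if __name__ == "__main__":
    print("original: ", total(ORIGINAL))
    print("scrambled:", total(SCRAMBLED))
    print("difference:", total(SCRAMBLED) - total(ORIGINAL))
```

Under these assumptions the scrambled variant comes out 4 cycles slower per byte: 10 cycles added by the two table accesses and the tax, 6 saved by the three dropped shifts.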

2012-03-06 12:50

Fresh

Registered: Jan 2005
Posts: 101

Yep, ending with a scrambled byte may be considered cheating, I guess. :)
Anyway, I'm posting a corrected version just for completeness:

(1541)

     ...
     ldx #$0f           ; 2 - abcdefgh
     sbx #$00           ; 2 X = 0000efgh
     lsr                ; 2
     lsr                ; 2
     lsr                ; 2
     tay                ; 2 Y = 000abcde
     txa                ; 2 A = 0000efgh
     ldx #%00001010     ; 2 X mask 0000C0D0
loop
     bit $1800
     bpl loop
     sax $1800          ; 4 save (eg)
     asl                ; 2 A=000efgh0
     sax $1800          ; 4 save (fh)
     tya                ; 2 A=000abcde
     sax $1800          ; 4 save (bd)
     lsr                ; 2 A=0000abcd
     sax $1800          ; 4 save (ac)
     ...

(64)

     ...
     lda #%00001011
     sta $dd00
     nop
     eor #%00001000
     sta $dd00
     nop
     lda $dd00          ;4 A=ge000011
     lsr                ;2 A=0ge00001
     ldx $dd00          ;4 X=hf000011
     lsr                ;2 A=00ge0000
     ora $dd00          ;4 A=dbge0011
     lsr                ;2 A=0dbge001
     ora $dd00          ;4 A=cdbge011
     ora table,x        ;4 A=cdbgehf1 (table move bits 7,6 to 2,^1)
     ...

Partially OT comment: this may not be useful in your case, but there are at least two situations in which this solution may become interesting:
- if the transmitted 7 bits are used as an index, it's just a matter of scrambling the indexed table.
- if the transmitted 7 bits need further calculations anyway, you can build a table that descrambles and does that calculation in one go.
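Both cases come down to building a table offline that is indexed by the scrambled byte. The Python sketch below illustrates this, taking the bit layout `cdbgehf1` from the last comment of the C64 listing as the intended result. That layout is an assumption here, since the routine is described as untested. `scramble`, `DESCRAMBLE` and `prescramble` are hypothetical names.

```python
# Received byte, bit 7 down to bit 0, in terms of the source bits a..h
# (a = bit 7, h = bit 0). Bit 0 is a constant 1 and source bit a must be 0.
LAYOUT = "cdbgehf1"
SOURCE_BIT = {name: 7 - i for i, name in enumerate("abcdefgh")}


def scramble(value):
    """Byte the C64 would end up with for a 7-bit source value (assumed layout)."""
    if not 0 <= value < 128:
        raise ValueError("only 7-bit values are supported")
    out = 0
    for pos, name in zip(range(7, -1, -1), LAYOUT):
        bit = 1 if name == "1" else (value >> SOURCE_BIT[name]) & 1
        out |= bit << pos
    return out


# Plain descramble table: indexed by the received byte, yields the source value.
DESCRAMBLE = [0] * 256
for v in range(128):
    DESCRAMBLE[scramble(v)] = v


def prescramble(table):
    """Reorder a 128-entry lookup table so it can be indexed by the received byte."""
    out = [0] * 256
    for i, entry in enumerate(table):
        out[scramble(i)] = entry
    return out


if __name__ == "__main__":
    squares = [(i * i) & 0xFF for i in range(128)]   # stand-in for any lookup table
    fast = prescramble(squares)
    received = scramble(100)
    print(fast[received] == squares[100])
```

Either way the cost on the C64 is one indexed read, and only 128 of the 256 table slots are used because bit 0 of the received byte is constant.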

2012-03-07 08:33

Dano

Registered: Jul 2004
Posts: 242

don't know if this helps a little?

http://www.pagetable.com/?p=568

2012-03-07 10:23

The Human Code Machine

Registered: Sep 2005
Posts: 114

Scrambling the bytes in the floppy before sending the data should do the trick. Everything saving cycles on the c64 side should be highest priority.

2012-03-07 16:24

Fresh

Registered: Jan 2005
Posts: 101

Two different proposals for 1541 side, both of them include scrambling.
In the former the only gain is an iny 'inside' the transfer. In the latter you gain more cycles but it's quite expensive in terms of memory.
Can't imagine anything faster on c64 side: those 2 adjacent bits are a nightmare.
I post a link to avoid flooding the thread.

http://pastebin.com/ZS8kUNKb

2012-03-07 17:31

The Human Code Machine

Registered: Sep 2005
Posts: 114

I wouldn't do the scrambling inside the transfer loop. Just calculate your stuff using the floppy, scramble the data, sync with the c64 and then burst all data to the c64 as fast as possible.
