| |
Krill
Registered: Apr 2002 Posts: 3070 |
Alt-history no-cost design changes with great value
Which things in the C-64 could have been implemented or connected differently without conceivable extra cost, for coding advantages?
Thinking of things like shuffling the chip register bits like VIC's $d011 and $d016 differently (such that some effects can be achieved with fewer register writes or less twiddling).
Or putting some IO register at $01 (and moving the memory configuration somewhere else, somehow).
Maybe also having different PLA memory configurations (not necessarily more).
Or connecting external signals to the CIA port pins in a different order.
Discuss! =) |
| |
Krill
Registered: Apr 2002 Posts: 3070 |
Quoting Oswald: "I dont get how CJ's HW version would work with only a 8 bit store previous result byte and byte wise eor."
Not so sure either. While VIC does have a line buffer, it only buffers screen data worth 40 bytes (whose update you can prevent by disabling badlines).
For vertical filling, the entire width (320 pixels) would have to be buffered for XORing a line later.
Edit: Hmm well, maybe he meant repurposing the line buffer for actual bit/pixel storage. Then the additional bits would only be needed with destructive shifting-out, for buffering that shifted-out byte. But then that extra buffer could well be only one bit, i guess. :)
And colours/screen data would be lost, so... each line a badline? =) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1423 |
Why so complicated you lot? I was pretty clear I only need one additional byte of state.
Here's an extract from the fill code from Effluvium:
rqZapLine0
lda rqkLBuf +$00
sta rqkCSet0+$00*64,y
eor rqkLBuf +$01
sta rqkCSet0+$01*64,y
eor rqkLBuf +$02
sta rqkCSet0+$02*64,y
eor rqkLBuf +$03
sta rqkCSet0+$03*64,y
eor rqkLBuf +$04
sta rqkCSet0+$04*64,y
eor rqkLBuf +$05
sta rqkCSet0+$05*64,y
eor rqkLBuf +$06
sta rqkCSet0+$06*64,y
eor rqkLBuf +$07
sta rqkCSet0+$07*64,y
Obviously in this instance I was using a contiguous eorbuffer rather than plotting into the charset itself, but you can see how few operations the fill uses. |
| |
Krill
Registered: Apr 2002 Posts: 3070 |
Not quite immediately obvious how this translates to hardware.
Can you elaborate? |
| |
Oswald
Registered: Apr 2002 Posts: 5118 |
so how does that fill a line horizontally ? if I have
%00001100 %00000000 %00110000
this will fill with your code to:
%00001100 %00001100 %00111100
and not to:
%00001111 %11111111 %11110000 |
| |
Krill
Registered: Apr 2002 Posts: 3070 |
Okay, so... horizontal pattern fill in hardware, with one additional state byte.
The fill itself would be simple bitwise XOR from one pixel to the next, and that additional state byte is the pattern mask, applied (XOR/AND/OR) to the output with an 8-pixel granularity? |
| |
oziphantom
Registered: Oct 2014 Posts: 502 |
I want cart memory without kernal. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1423 |
No, the state byte is the result of the most recent xor. Perhaps an example would help.
Start by filling buffer with this:
(base 4, using _ for 0 to make it a bit easier to read)
____ __12 12__ ____ __11 11__ ____
____ _121 2___ ____ __12 12__ ____
____ 1212 ____ ____ _111 1___ ____
____ 2121 ____ ____ _212 1___ ____
___2 121_ ____ ____ 1111 ____ ____
__21 21__ ____ ____ 1212 ____ ____
Note that for each of the two edges in the desired output, we're writing two bytes to each line; one updating the pattern in the right side of a byte, the next updating the pattern in the left.
Then, scanning from left to right, replace each input byte with the XOR of itself with the last byte output:
____ __12 1212 1212 12_3 _3_3 _3_3
____ _121 2121 2121 2133 3333 3333
____ 1212 1212 1212 13_3 _3_3 _3_3
____ 2121 2121 2121 2333 3333 3333
___2 1212 1212 1212 _3_3 _3_3 _3_3
__21 2121 2121 2121 3333 3333 3333
Now we have three regions, black on the left, a 1/2 checker in the middle, and a 03/33 stipple on the right.
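The worked example above can be checked mechanically. What follows is a minimal Python sketch, assuming each group of four base-4 digits is one byte of multicolour pixels. The helper names (`parse_row`, `xor_fill`, `format_row`) are made up for illustration and are not taken from Effluvium.

```python
def parse_row(text):
    """Turn '____ __12 12__' into a list of bytes (4 base-4 digits per byte)."""
    return [int(group.replace("_", "0"), 4) for group in text.split()]


def xor_fill(row):
    """Replace each byte with the XOR of itself and the previous output byte."""
    out, state = [], 0
    for byte in row:
        state ^= byte          # state is the single extra byte being carried along
        out.append(state)
    return out


def format_row(row):
    """Render bytes back as base-4 digit groups, with '_' standing for 0."""
    groups = []
    for byte in row:
        digits = "".join(str((byte >> shift) & 3) for shift in (6, 4, 2, 0))
        groups.append(digits.replace("0", "_"))
    return " ".join(groups)


# First row of the edge buffer from the example above.
edges = "____ __12 12__ ____ __11 11__ ____"
print(format_row(xor_fill(parse_row(edges))))
```

Run on the first buffer row, this should print the first filled row shown above; the other rows can be pasted in the same way.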
Here's the eorfiller, assuming the data is in a charset
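ldx #127 ; one pass per pixel row, 127 down to 0
loop:
lda #0 ; running XOR starts empty at the left edge
eor cset+128*0,x
sta cset+128*0,x ; column 0 keeps its own data
eor cset+128*1,x
sta cset+128*1,x ; each column = itself XOR the previous output
eor cset+128*2,x
sta cset+128*2,x
eor cset+128*3,x
sta cset+128*3,x
...
dex
bpl loop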
Widening Oswald's example a little, you do this:
%00001111 %11110000 %00000000 %00101010 %10000000
this will fill with my code to:
%00001111 %11111111 %11111111 %11010101 %01010101 |
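One way to picture the hardware variant is a small Python model, assuming a hypothetical fetch stage that keeps a single latch, XORs every fetched bitmap byte into it, and displays the latch contents. The class name `XorLatch` and the reset at the start of each line are assumptions for illustration only; nothing in the thread specifies them.

```python
class XorLatch:
    """Hypothetical display-side filter holding one byte of state."""

    def __init__(self):
        self.state = 0

    def start_line(self):
        # Assumption: the latch is cleared at the start of each raster line.
        self.state = 0

    def fetch(self, byte):
        # The displayed byte is the fetched byte XORed with the last output.
        self.state ^= byte & 0xFF
        return self.state


def show_line(latch, memory_bytes):
    """Return the displayed bytes for one line, as binary strings."""
    latch.start_line()
    return [format(latch.fetch(b), "08b") for b in memory_bytes]


latch = XorLatch()
line = [0b00001111, 0b11110000, 0b00000000, 0b00101010, 0b10000000]
print(" ".join(show_line(latch, line)))
```

Feeding it the five bytes from the widened example should give the filled bytes listed above. Memory stays untouched in this model; only the displayed bytes change.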
| |
Oswald
Registered: Apr 2002 Posts: 5118 |
this is fuckin genius man, and why is this better for Effluvium than vertical eorfill ? |
| |
Oswald
Registered: Apr 2002 Posts: 5118 |
but there is extra overhead of having to make 2 bytes per edge, so why not just per pixel horizontal HW fill ? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1423 |
Cheers, Oswald.
Well, for vertical fill you still need two writes if you're going to get a pattern fill - and tracking the pattern phase for that one hurt my brain too much haha.
Per pixel HW fill would be fine if you just want solid colour, but why not go the extra six bits and get patterns :)
The extra cost of the second write is fairly low compared to calculating the X offset anyway. |