| |
lft
Registered: Jul 2007 Posts: 369 |
GCR decoding on the fly
Here's how to do it:
http://linusakesson.net/programming/gcr-decoding/index.php |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting HCL: i would say the computer side is the bottle neck, at least i have NOPs in my transfer loop on the computer side.
If you have NOPs on the computer side to make up for drive slowness, how does that make the computer side the bottle neck? :) |
| |
lft
Registered: Jul 2007 Posts: 369 |
Quoting doynax: I think I've managed 16 cycles actually (66.5 per byte in practice with 2x unrolling.)
The trick is to reduce the delay between reading the bits and flipping ATN by combining both in a single RMW instruction (e.g. SLO/SRE.)
Yes, I also had this idea. But I couldn't figure out a way to do it without restricting the user to vicbank 0 (and maybe also 3). Did you find a way that works regardless of vicbank? |
| |
HCL
Registered: Feb 2003 Posts: 728 |
@Krill: The drive loop is faster (of course, else it would not work), it's the computer side that can not suck out the data faster because of that timing issue you just explained.. Even if i reduce the drive loop to 12 or 14 cycles, the computer side still has to be 18 cycles -> the bottle neck. |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quoting Krill: Hmm, how does that speed up the drive side, which is the bottleneck here, as it has to wait for the C-64 and respond to ATN flips asap?
Excellent question.
It seems to work in practice and has done so for a while even under IRQ/DMA heavy conditions, though that doesn't necessarily mean much given how few drives I've tested. At 15 cycles it starts to crap out once in a blue moon.
I suppose that I don't quite buy your 6 + 7 cycle sum for the ATN cost. Presumably if the first ATN is late then you've only got the branch of the first loop left to execute, plus the seven of the second, equals 9 in total.
Still, it's likely I'm just confused. Anyone care to write up a little simulator to generate some sequence diagrams? |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
doynax: Yes, very likely my quickly-thought-out explanation is wrong.
HCL: Well my point was the relevant bottle-neck drive loop is the wait for ATN flip (the branch back to bit $1800), not the no-branch time between two bitpair updates. But maybe we're just saying the same thing in different words.
lft: I also ran into the VIC bank restriction when thinking about RMW opcodes on $dd00, and didn't find a solution either. |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quoting lft: Yes, I also had this idea. But I couldn't figure out a way to do it without restricting the user to vicbank 0 (and maybe also 3). Did you find a way that works regardless of vicbank?
To be honest I never even tried. I'm working on a game with somewhat limited VIC tricks so I've gotten away with using bank 3 almost exclusively. |
| |
lft
Registered: Jul 2007 Posts: 369 |
This is how I understand the timing constraints. On the drive side, here's how you transmit two bit pairs:
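; prepare value in A
bit $1800   ; sample ATN (bit 7 of $1800 goes to the N flag)
bmi *-3     ; loop back to the bit while bit 7 is still set
sta $1800   ; put the first bit pair on the bus
; prepare value in A
bit $1800   ; sample ATN again
bpl *-3     ; loop back to the bit while bit 7 is still clear
sta $1800   ; put the second bit pair on the bus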
It is clear that a bit pair cannot be guaranteed to be on the serial bus earlier than 13 cycles after ATN changes, because if ATN changes just after it was sampled during the last cycle of a bit instruction, we need 3 (bpl) + 4 (bit) + 2 (bpl) + 4 (sta) = 13 cycles to put the new value into the VIA.
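For this reason, we can use up to 7 cycles to prepare each bit pair. The C64 will not toggle ATN earlier than 4 cycles after reading out the last bit pair, so at least 4 of the 7 preparation cycles have already passed by then. From that point on, the drive needs at most 3 (remaining preparation) + 4 (bit) + 2 (bpl) + 4 (sta) = 13 cycles, the same as the worst case above.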
On the C64 side, after reading a bit pair, we spend 4 cycles writing a new value to ATN. Then we read the new bit pair after 14 cycles. Hence, 18 in total. Why can't we read already after 13 cycles? This is because the clocks of the C64 and the 1541 are almost always out of phase. After updating ATN on a C64 clock tick, it will take on average half a cycle before the next 1541 clock tick. When sending the bits back, there is again a delay before the next C64 clock tick, and the total delay will be one C64 cycle (unless we're really lucky and the 1541 cycles, being a tad shorter, fit perfectly in between the C64 cycles).
C64 1-------2-------3-------4-------
1541 ----1------2------3------4------
(not to scale)
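
The cycle counting above can be checked with a small model. This is only a sketch, not a real bus simulator: it treats C64 and 1541 cycles as equally long and folds the clock phase mismatch into one cycle of slack, as estimated above. The function names and the `offset`, `atn_write` and `phase_slack` parameters are made up for illustration.

```python
# Cycle counts taken from the post above (6502 timings on the 1541).
BIT = 4               # bit $1800, ATN is sampled on its last cycle
BRANCH_TAKEN = 3      # bpl/bmi looping back while ATN is unchanged
BRANCH_NOT_TAKEN = 2  # bpl/bmi falling through once ATN has changed
STA = 4               # sta $1800, the bit pair reaches the VIA on its last cycle

POLL_LOOP = BRANCH_TAKEN + BIT  # 7 cycles between two ATN samples


def drive_latency(offset):
    """Drive cycles from an ATN change until the bit pair is in the VIA.

    offset: whole drive cycles elapsed since the most recent ATN sample,
    0 <= offset < POLL_LOOP. An offset of 0 means the change arrived just
    after a sample and was missed.
    """
    if not 0 <= offset < POLL_LOOP:
        raise ValueError("offset must lie within one poll loop")
    cycles_until_next_sample = POLL_LOOP - offset
    return cycles_until_next_sample + BRANCH_NOT_TAKEN + STA


def worst_case_drive_latency():
    """Largest latency over all positions of the ATN change in the poll loop."""
    return max(drive_latency(offset) for offset in range(POLL_LOOP))


def c64_cycles_per_bit_pair(atn_write=4, phase_slack=1):
    """C64 cycles per bit pair: ATN write, worst-case drive latency, and
    one cycle of slack for the out-of-phase clocks (an assumption)."""
    return atn_write + worst_case_drive_latency() + phase_slack


if __name__ == "__main__":
    for offset in range(POLL_LOOP):
        print("ATN change", offset, "cycles after sample:",
              drive_latency(offset), "cycles")
    print("C64 cycles per bit pair:", c64_cycles_per_bit_pair())
```

Changing `phase_slack` or the instruction timings shows how much headroom a shorter loop would leave under the same assumptions.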
|
| |
HCL
Registered: Feb 2003 Posts: 728 |
Ah, for once i think i understand :). LFT, what is that book you have? everyone should have it ;). |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Yes, this explains everything. :) |
| |
tlr
Registered: Sep 2003 Posts: 1791 |
Quoting lft:
Quoting doynax: I think I've managed 16 cycles actually (66.5 per byte in practice with 2x unrolling.)
The trick is to reduce the delay between reading the bits and flipping ATN by combining both in a single RMW instruction (e.g. SLO/SRE.)
Yes, I also had this idea. But I couldn't figure out a way to do it without restricting the user to vicbank 0 (and maybe also 3). Did you find a way that works regardless of vicbank?
Couldn't the $dd00 bank bits just be kept 00? Then switching can be done via $dd02. |
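
A rough model of why this could work, as a sketch only. It assumes that a CIA port pin configured as input is pulled high, and that the VIC bank number is the inverted level of port A pins 0 and 1. The function names are made up for illustration, and nothing here has been tested against real hardware.

```python
def port_a_pins(pra, ddra):
    """Levels on the CIA2 port A pins.

    Output pins (DDR bit = 1) drive the data register bit.
    Input pins (DDR bit = 0) are assumed to float high via pull-ups.
    """
    return ((pra & ddra) | (~ddra & 0xFF)) & 0xFF


def vic_bank(pra, ddra):
    """VIC bank 0..3, selected by the inverted levels of pins 0 and 1."""
    return (port_a_pins(pra, ddra) & 0b11) ^ 0b11


if __name__ == "__main__":
    # Keep the bank bits in $dd00 at 00 and switch banks through $dd02 only.
    for bank_bits in range(4):
        ddra = 0x3C | bank_bits  # hypothetical DDR value, bits 2-5 as outputs
        print("$dd02 = $%02x -> VIC bank %d" % (ddra, vic_bank(0x00, ddra)))
```

Under these assumptions the low two bits of $dd02 select the bank directly while the data bits in $dd00 stay 00, so an RMW instruction on $dd00 would not disturb the bank.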