| |
Krill
Registered: Apr 2002 Posts: 2982 |
Release id #197710 : Transwarp v0.64
General Q&A thread, also report problems and error logs here. |
|
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting BiGFooT: Bytes 2&3 contain the load address, 4&5 the end address.
Worth noting here, imho, is that byte 1 contains a CRC8 checksum over the entire file. Because it is slow to compute, it is only computed and checked when a file load ultimately succeeded but needed retries due to block checksum errors.
It does happen that funky drives or semi-broken disks load successfully eventually, and this second line of defence decreases the likelihood of false positives (good checksum despite bad data) quite a bit, even with the first line being somewhat more sophisticated than simple byte-wise EOR. :)
Ah, and byte 0 is the file's starting track. :D |
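For illustration, the header layout described above could be modelled like this. It is only a sketch under assumptions: the little-endian byte order follows the usual 6502 convention, and the CRC8 polynomial ($07), the initial value ($00) and the name TranswarpHeader are made up for the example, since the post does not state which CRC8 variant Transwarp uses or exactly which bytes it covers.

```python
from dataclasses import dataclass


@dataclass
class TranswarpHeader:  # hypothetical name, for illustration only
    start_track: int    # byte 0
    crc8: int           # byte 1
    load_address: int   # bytes 2&3
    end_address: int    # bytes 4&5


def parse_header(raw: bytes) -> TranswarpHeader:
    """Split the first six bytes into the fields named in the post."""
    if len(raw) < 6:
        raise ValueError("need at least 6 header bytes")
    return TranswarpHeader(
        start_track=raw[0],
        crc8=raw[1],
        load_address=raw[2] | (raw[3] << 8),  # assumed low byte first
        end_address=raw[4] | (raw[5] << 8),   # assumed low byte first
    )


def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Bitwise CRC-8. Polynomial and init value are assumptions."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def eor_checksum(data: bytes) -> int:
    """Simple byte-wise EOR, shown only for contrast."""
    result = 0
    for byte in data:
        result ^= byte
    return result
```

A byte-wise EOR cannot tell two swapped bytes apart, while a CRC can, which is one reason a whole-file CRC is a useful second line of defence: eor_checksum(b"\x01\x02") equals eor_checksum(b"\x02\x01"), but the crc8 values differ.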
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Another trick: JMP ($DD00) for less jitter when reacting to drive signals.
Normally you'd do things like
bit $dd00
bpl/bmi/bvc/bvs
to wait for a state change on the serial bus. This has a jitter of 7 cycles on PAL (8 cycles on NTSC/PAL-N, because there the C-64's CPU is clocked faster than the drive's CPU).
Using JMP ($DD00), this is reduced to 5 (6) cycles of jitter.
$DD01 (the RS232/parallel port bits connected to the user port) needs to be set to output and initialised to the high-byte of a jump table.
So it's trading size for speed, as this jump table is slightly more than $c0 bytes big, due to the incoming bus state bits sitting on bits 7 and 6. (Can of course stuff unrelated stuff into the gaps of this sparse table.)
Using this trick, it is possible to have a minimum of 6 cycles between serial bus signal updates (instead of 8 normally) without requiring further jitter reduction (using the half-variance technique), such that things like
ldx #$0a ; CLKOUT + DATOUT
sax $1800 ; 2 data bits
asl
sax $1800 ; 2 more data bits
(sending multiple bitpairs over serial) can be done quicker. |
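To make the table size concrete, here is a small sketch of where JMP ($DD00) can end up. The CPU takes the low byte of the target from $DD00 and the high byte from $DD01, so only bits 7 (DATA IN) and 6 (CLK IN) of the low byte vary. The page $10 and the fixed low bits $03 are hypothetical values picked for the example; the cycle counts are the standard 6502 timings for BIT absolute, a taken branch without page crossing, and JMP indirect.

```python
CLK_IN_BIT = 6    # serial CLK input sits on bit 6 of $DD00
DATA_IN_BIT = 7   # serial DATA input sits on bit 7 of $DD00

POLL_LOOP_CYCLES = 4 + 3   # bit $dd00 (4) + taken branch (3)
JMP_INDIRECT_CYCLES = 5    # jmp ($dd00)


def jump_targets(table_page: int, fixed_low_bits: int) -> dict:
    """Map each (DATA IN, CLK IN) state to the address JMP ($DD00) reaches.

    table_page is the value written to $DD01, fixed_low_bits are the
    bits 0-5 of $DD00 that the program keeps constant while waiting.
    """
    if not 0 <= table_page <= 0xFF:
        raise ValueError("table_page must fit in one byte")
    if fixed_low_bits & ~0x3F:
        raise ValueError("fixed_low_bits may only use bits 0-5")
    targets = {}
    for data_in in (0, 1):
        for clk_in in (0, 1):
            low = fixed_low_bits | (data_in << DATA_IN_BIT) | (clk_in << CLK_IN_BIT)
            targets[(data_in, clk_in)] = (table_page << 8) | low
    return targets


targets = jump_targets(0x10, 0x03)   # hypothetical page and low bits
for state, address in sorted(targets.items()):
    print(state, hex(address))
```

The four entry points are $40 bytes apart, so first to last spans $c0 bytes, plus whatever code sits at the last entry. That matches the "slightly more than $c0 bytes" above, and the gaps in between are free for other things.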
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: Another trick: JMP ($DD00) for less jitter when reacting to drive signals.
Neat trick. Love it! Also like how you use the word ’stuff’ both as a verb and a noun in the same sentence! ❤️ Almost like the Smurfs smurfing smurfs up. |
| |
Richard
Registered: Dec 2001 Posts: 621 |
Transwarp's speed is amazing. It would probably make a great C64 filebrowser/menu system ;) |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting ThunderBlade (user comment submitted on 27 November 2020): I've waited for this since Heureka Sprint. The 64'er magazine had an article on it concluding with "In theory it should be possible to get even faster than that". Proven now! :)
Hmm, but (slightly) faster systems than Heureka Sprint have existed since before it was invented*. =)
It also seems like some conscious concessions were made for Heureka Sprint, paying with slightly slower speed for standard format compatibility. It uses a 70% encoding, not 75% as the original Vorpal (and its knock-off Action Replay Warp*25) does.
Transwarp does the same. If it were not for the intended d64 compatibility, it could use a slightly denser encoding and thus slightly more speed. But that decision was easy, nobody really wants to transfer weirdo g64 images to real disks. :)
* https://www.lemon64.com/forum/viewtopic.php?t=41482 |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting Richard: Transwarp's speed is amazing. It would probably make a great C64 filebrowser/menu system ;)
Not before there is a built-in 2-rev fallback fastloader for standard format. And that one will probably be somewhere slightly above half the speed of the custom format. |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
Quoting Krill: Another trick: JMP ($DD00) for less jitter when reacting to drive signals. [...]
This is so nice!!! And personally I don't care if this requires a jump-table, usually there's enough mem space on c64-side.
Thanks for sharing! |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting Krill: Ok, so who'd be in for a second challenge, then, with a high-entropy plaintext file that is still sensible (and rewarding to decrypt, not just random bitsalad)? :)
Quoting BiGFooT: I'm in with https://en.wikipedia.org/wiki/Frequency_analysis ;)
New challenge image added, also containing some bonus material. =)
(But i'm really not so sure how much frequency analysis would help with this kind of scrambling.) |
| |
BiGFooT
Registered: Mar 2002 Posts: 33 |
My understanding of "plaintext" was "plain text", so freq. analysis came from that. Anyway, I don't want to play alone. |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Another technique: using interrupts to break out of the block read loop without stop condition checks.
In a loader, the loop to read data from disk normally terminates after having collected the block's data.
This is usually done by things like "inx : bne loop" or "tsx : bne loop" when storing block data on the stack using pha.
This requires at least an index register, and 2 cycles to increase it or to retrieve the stack pointer, both then setting the Z flag for loop termination or continuation.
This overhead can be avoided, or rather shifted out of the read loop (which needs to be extra-tight, and every cycle comes at a premium).
Since block read and transfer is tightly coupled in a fastloader, the C-64 can signal breaking out of the read/transfer loop, and it can do so by triggering an interrupt on the drive side via asserting the ATN line.
This causes the drive ROM's interrupt handler to be executed. However, one cannot simply install an interrupt hook or overwrite the IRQ vector, as is possible on the C-64. (1571 has an interrupt hook, though.)
With some preparations (the disk controller VIA timer needs to have underrun), the ROM interrupt handler will then check the job code table at $00 for active jobs (job codes $80+).
A jump job code ($d0) for $0700 would then execute the block read user code at $0700 to handle the next block rolling by, effectively breaking out of the read loop which was executing before the ATN interrupt was triggered.
Note that the job table is checked backwards, starting with the RAM-less job for $0800, then going down to $0300. So having the code to execute at $0700 spends the least cycles in the ROM interrupt handler. |
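The backwards scan can be pictured with a very simplified model. This is only a sketch of the order of checks as described above, not of what the ROM handler actually does in full: it assumes six job slots at $00-$05 mapping to buffers $0300-$0800 in $0100 steps, and the names find_active_job and buffer_address are invented for the example.

```python
JOB_SLOTS = 6          # job codes at $00..$05 (assumed layout)
BUFFER_BASE = 0x0300   # slot 0 -> $0300, slot 4 -> $0700, slot 5 -> $0800 (no RAM)
JUMP_JOB = 0xD0        # jump job code from the post


def buffer_address(slot: int) -> int:
    """Buffer address belonging to a job slot."""
    return BUFFER_BASE + slot * 0x100


def find_active_job(zero_page: bytes):
    """Scan the job slots from the highest down to the lowest.

    Returns (slot, job_code, slots_checked) for the first job code
    with bit 7 set, or None if no job is active.
    """
    checked = 0
    for slot in range(JOB_SLOTS - 1, -1, -1):
        checked += 1
        code = zero_page[slot]
        if code & 0x80:   # job codes $80+ are active jobs
            return slot, code, checked
    return None


jobs = bytearray(JOB_SLOTS)
jobs[4] = JUMP_JOB   # jump job for the buffer at $0700
slot, code, checked = find_active_job(jobs)
print(hex(buffer_address(slot)), hex(code), checked)
```

With the job at $0700, only the $0800 slot is checked before it. The same job placed at $0300 would be found last, after all six slots have been looked at.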