| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Release id #139503 : Spindle 2.0
So with the spin mode it was now easy to quickly do a speedtest with the files i usually test with (most of the files from cl13 side1).
It turns out that spindle loads nearly as fast as bitfire with on-the-fly depacking. While bitfire chews in the tracks a tad faster, it has to take breaks to finalize the depacking. So data arrives a bit too fast at first and blocks pile up waiting to be decrunched. Spindle manages to keep a continuous flow here thanks to its blockwise packing scheme.
With spindle the 18 files used get squeezed down to 491 blocks, with bitfire down to 391 blocks. So spindle leeches an additional 100 blocks in about the time bitfire requires for the additional depacking.
However, under load the speed of spindle drops rapidly: with 25% cpu load it is no faster than krill's loader, and with 75% load it takes eons to leech the 491 blocks in :-( What's happening there?!
When is the 50x version from Krill done? :-D HCL, what's the penis length of your loader? :-D
Results here. |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
fixed order loading is the key for all those loaders beating krill's one. you dont need extra time to wait for sectors or find out their order beforehand. |
| |
Burglar
Registered: Dec 2004 Posts: 1105 |
Quoting Oswald: fixed order loading is the key for all those loaders beating krill's one. you dont need extra time to wait for sectors or find out their order beforehand.
no, yes, you just described out of order loading, and yes thats why spindle+bitfire are on top, and they beat krill cause they don't need to scan the track first to figure out whats what. but my question remains, why does HCL beat Krill? bd-loader does not load out of order, it must miss a sector every once in a while, so whats the deal? ;) |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Burglar, mind the note i added for BD-loader: i chose the best interleave per load. In real life of course you need to decide on one interleave and then certain loads will suck in performance. The old tradeoff. HCL also gave me a version where the motor stays on and it nearly reaches spindle/bitfire speeds, when cheating the benchmarks like that :-D So with the OOO loaders performance decreases linearly under load, while with in-order loading it degrades exorbitantly. |
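To make the "linear vs. exorbitant" point concrete, here is a toy model, not the code of any loader in the benchmark. The assumptions are mine: one track of 21 sectors, the disk advances one sector per time unit, reading a sector takes one unit, and the host is then busy for `busy` whole units before it can take the next sector. The function names are hypothetical.

```python
def layout(n_sectors, interleave):
    """Physical sector for each logical block, using a fixed interleave."""
    used = [False] * n_sectors
    order = []
    pos = 0
    for _ in range(n_sectors):
        while used[pos]:                 # collision: slide to the next free sector
            pos = (pos + 1) % n_sectors
        used[pos] = True
        order.append(pos)
        pos = (pos + interleave) % n_sectors
    return order


def in_order_time(n_sectors, interleave, busy):
    """Sector-times needed to read a whole track strictly in logical order."""
    t = 0
    for sector in layout(n_sectors, interleave):
        t += (sector - t) % n_sectors    # wait until that sector is under the head
        t += 1 + busy                    # read it, then stay busy
    return t


def out_of_order_time(n_sectors, busy):
    """Sector-times needed when any not-yet-loaded sector is accepted."""
    t = 0
    pending = set(range(n_sectors))
    while pending:
        # distance to the next sector under the head that is still wanted
        d = next(d for d in range(n_sectors) if (t + d) % n_sectors in pending)
        pending.discard((t + d) % n_sectors)
        t += d + 1 + busy
    return t


if __name__ == "__main__":
    for busy in (3, 4):
        print(busy, in_order_time(21, 4, busy), out_of_order_time(21, busy))
```

In this model interleave 4 is a perfect fit for `busy=3`, and both approaches read the track in 84 sector-times. With `busy=4` the out-of-order reader needs 105, while the in-order reader just misses every sector and needs 505. Real drives and loaders have many more moving parts, so only the shape of the curve is meant to carry over.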
| |
HCL
Registered: Feb 2003 Posts: 728 |
I don't know why setting up the sector order manually is cheating, it must be some pro-ooo-loading propaganda. Also i will not get bad performance if cpu-load varies since i can change the interleave for each file. Of course ooo solves that problem nicely, but it's not always faster.
Some words about the motor on/off thing. It became clear to me after seeing my loader/decruncher in Bitbreaker's benchmark that i'm losing helluvalot at the beginning of each file since the motor is turned off rapidly after each load. I just made a small patch and let it run instead just like all other loaders in the benchmark (i suppose), and voila that gave me ~25% lower loading times. That is of course because this benchmark has quite many small files, but please, am i cheating the benchmark!?!?! I just wonder why you put my slowest numbers in that chart, is it because the better numbers invalidate the belief in ooo? You don't gain any 30% or such, it's merely at around 5% or what?! The saga continues :).. |
| |
Ksubi Account closed
Registered: Nov 2007 Posts: 87 |
Out of curiosity... how does Mr Wegi's loader compare to these?
Bongo Linking Engine |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Quote: I don't know why setting up the sector order manually is cheating, it must be some pro-ooo-loading propaganda. Also i will not get bad performance if cpu-load varies since i can change the interleave for each file. Of course ooo solves that problem nicely, but it's not always faster.
Some words about the motor on/off thing. It became clear to me after seeing my loader/decruncher in Bitbreaker's benchmark that i'm losing helluvalot at the beginning of each file since the motor is turned off rapidly after each load. I just made a small patch and let it run instead just like all other loaders in the benchmark (i suppose), and voila that gave me ~25% lower loading times. That is of course because this benchmark has quite many small files, but please, am i cheating the benchmark!?!?! I just wonder why you put my slowest numbers in that chart, is it because the better numbers invalidate the belief in ooo? You don't gain any 30% or such, it's merely at around 5% or what?! The saga continues :)..
You drama queen! :-D Turning the motor off so fast of course gives a penalty in your case and leaving it on is valid, but the different interleaves are kind of cheating. Of course you can write each file with a different interleave, but you won't be able to place 2 files with different interleaves on the same track. All unnecessary pain in the ass that is easily covered with OOO. Even i had to learn that, by first loading in order on my first prototypes. Now i just compile a demo and know that speeds will always behave well, no fiddling with interleaves, more time to draw boobs and do bitching :-D I'll update the numbers with motor on soonish, need to try some new stolen code first on my loader :-) |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Just a thought that's emerging in my mind, a thought I'll never implement, hence I'll just share it and see where it leads: Wouldn't it be possible to devise a compression method that guarantees no two consecutive zero-bits in a row, thus removing the need for GCR-encoding completely. I.e. each nibble read from the disk has a direct meaning to the decompressor? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Quote: Just a thought that's emerging in my mind, a thought I'll never implement, hence I'll just share it and see where it leads: Wouldn't it be possible to devise a compression method that guarantees no two consecutive zero-bits in a row, thus removing the need for GCR-encoding completely. I.e. each nibble read from the disk has a direct meaning to the decompressor?
Absolutely! However, as there would be less information per bit, you would then need to transfer more bits from the drive. Whether that would be a win or not probably depends on CPU load; if you're just trickling a little data per frame then the disk reads and GCR decode are effectively free, so you'd likely gain nothing. However, if you're screen blanked and hammering the sector reads, it might actually help :) |
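The "less information per bit" part can be put into rough numbers. The sketch below counts how many n-bit strings satisfy a zero-run constraint and turns that into an upper bound on payload bits per disk bit. `max_zero_run=1` is the constraint exactly as worded in the post (no two zero-bits in a row); the function names are made up for this example, and limits on runs of one-bits are ignored here.

```python
import math


def count_sequences(n_bits, max_zero_run):
    """Count n-bit strings with at most max_zero_run consecutive zero-bits."""
    # state[k] = number of valid strings so far that end in exactly k zeros
    state = [1] + [0] * max_zero_run
    for _ in range(n_bits):
        total = sum(state)
        # appending a 1 resets the zero run; appending a 0 extends it by one
        state = [total] + state[:-1]
    return sum(state)


def payload_bits_per_disk_bit(n_bits, max_zero_run):
    """Upper bound on information carried per raw bit for n-bit strings."""
    return math.log2(count_sequences(n_bits, max_zero_run)) / n_bits


if __name__ == "__main__":
    print(payload_bits_per_disk_bit(1000, 1))
    print(4 / 5)  # standard GCR stores 4 data bits in 5 disk bits
```

For long strings the bound with `max_zero_run=1` settles near 0.69 payload bits per disk bit (the counts are Fibonacci numbers), against 0.8 for 4-to-5 GCR. So under that wording any such code has to move at least about 15% more raw bits for the same payload, before counting whatever the packer itself loses.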
| |
HCL
Registered: Feb 2003 Posts: 728 |
I understand that you are proud of your loader Bitboy, i would be as well. But when it comes to speed, it seems like old-school demo-shit works just about as well as your modern stuff.
Then i admit it must be a bugger for you to run my shit in the benchmark since you have to transfer 4 disks with different interleave if you want to run it on real HW :P, i'm sorry for that, but is it cheating? Normally you re-generate your .d64 all the time anyway. It's not that you generate it just once and then start thinking of how to load it (or avoid thinking of it in your case).. Now i have to think one minute on how the files are placed on the disk if i want maximum performance (which i rarely do btw, but that's another story :P). Btw, you are already cheating then because you're using a special d64-tool that places your files in track-order(!).
Quote: need to try some new stolen code first on my loader :-)
(!) :O /o\.. and you call *me* a drama queen ;). |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Isn't it about time to settle on a standard loader test corpus?
Expecting a single developer to accurately compare his or her own carefully-tuned work to semi-documented off-the-shelf libraries seems dubious, whether in the C64 scene or in academia.
Admittedly no data set is ever going to be perfectly representative, and certainly reliability, whether in the form of hardware compatibility or insensitivity to variance, would be penalized by a pure speed metric. It would still serve as an interesting challenge though.
The conditions may also be varied slightly, say by introducing a random CPU load and a range of rotational speeds to accommodate.
Quoting JackAsser: Wouldn't it be possible to devise a compression method that guarantees no two consecutive zero-bits in a row, thus removing the need for GCR-encoding completely. I.e. each nibble read from the disk has a direct meaning to the decompressor?
Possibly. You could certainly devise a Huffman-esque binary prefix code avoiding unrepresentable sequences, but it may be difficult to rival the performance of traditional LZ with literal bytes, especially in terms of RAM for tables/buffers. The real kicker is that you'd be shifting load from the drive to the main CPU, both due to additional decode work and transfer overhead.
Plus the 10x one-bit limit is surprisingly annoying.
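As a small illustration of what "avoiding unrepresentable sequences" demands from a prefix code, the sketch below checks a candidate codeword set: it must be prefix-free, and no concatenation of codewords may produce a forbidden run. The limits are parameters and the defaults are assumptions: `max_zero_run=1` follows the wording of the quoted idea, and `max_one_run=9` simply reads "10x one-bit limit" as "fewer than ten ones in a row". All names are hypothetical.

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a)
                   for a in codewords for b in codewords)


def max_run(bits, symbol):
    """Longest run of `symbol` inside one bit string."""
    longest = current = 0
    for b in bits:
        current = current + 1 if b == symbol else 0
        longest = max(longest, current)
    return longest


def worst_run(codewords, symbol):
    """Longest run of `symbol` over all concatenations of codewords."""
    # a codeword made only of `symbol` can repeat forever
    if any(set(w) == {symbol} for w in codewords):
        return float("inf")
    inside = max(max_run(w, symbol) for w in codewords)
    lead = max(len(w) - len(w.lstrip(symbol)) for w in codewords)
    trail = max(len(w) - len(w.rstrip(symbol)) for w in codewords)
    # worst case: longest trailing run followed by longest leading run
    return max(inside, lead + trail)


def is_disk_safe(codewords, max_zero_run=1, max_one_run=9):
    """Prefix-free and within both run limits for any symbol sequence."""
    return (is_prefix_free(codewords)
            and worst_run(codewords, "0") <= max_zero_run
            and worst_run(codewords, "1") <= max_one_run)


if __name__ == "__main__":
    print(is_disk_safe(["10", "110", "1110", "11110"]))
    print(is_disk_safe(["10", "011"]))
```

The first set passes because every codeword ends in a single zero and starts with a one. The second fails because "10" followed by "011" yields two zeros in a row. A real design would also have to weigh code lengths against symbol frequencies, which this check says nothing about.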
Quoting Bitbreaker: Of course you can write each file with a different interleave, but you won't be able to place 2 files with different interleave on a same track.
I don't see why not. Admittedly loaders without CBM linking or custom formatting would require per-track interleave tables.
The results of mixed strides will naturally be worse but a decent optimizer can still do a reasonable job of it. After all things never did line up perfectly even with a fixed interleave.
The ideal in-order loader would use a profiling pass with a dummy loader (e.g. a cartridge build) to measure the elapsed time between sector transfers, data which would then be fed back to the layout optimizer. This could perhaps be refined with a programmable safety margin and priorities or deadline measurements to weigh non-critical files. |
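A minimal sketch of the feedback step, under heavy assumptions: a single track, measured gaps already converted to sector-times, and a greedy placement that puts each block at the first free sector the head can reach in time. `place_blocks`, `gaps` and `margin` are names invented for this example, not part of any existing tool.

```python
import math


def place_blocks(n_sectors, gaps, margin=0.0):
    """Assign consecutive blocks of a file to sectors of one track.

    gaps[i] is the profiled time, in sector-times, between the end of the
    transfer of block i and the moment the loader can accept block i + 1.
    margin is an extra safety allowance added to every gap.
    """
    if len(gaps) + 1 > n_sectors:
        raise ValueError("more blocks than sectors on the track")
    free = [True] * n_sectors
    placement = []
    pos = 0
    for i in range(len(gaps) + 1):
        while not free[pos]:             # taken already: slide forward
            pos = (pos + 1) % n_sectors
        free[pos] = False
        placement.append(pos)
        if i < len(gaps):
            # skip the sector just read plus the rounded-up busy time
            pos = (pos + 1 + math.ceil(gaps[i] + margin)) % n_sectors
    return placement


if __name__ == "__main__":
    print(place_blocks(21, [3, 3, 7, 1]))
    print(place_blocks(21, [3, 3, 7, 1], margin=0.5))
```

With gaps of 3, 3, 7 and 1 sector-times on a 21-sector track, the blocks land on sectors 0, 4, 8, 16 and 18, so the stride follows the measured load instead of one fixed interleave. Priorities, deadlines and files spanning several tracks are left out on purpose.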