| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Release id #139503 : Spindle 2.0
So with the spin mode it was now easy to quickly do a speedtest with the files i usually test with (most of the files from cl13 side1).
It turns out that spindle loads nearly as fast as bitfire with on-the-fly depacking. While bitfire chews in the tracks a tad faster, it has to take breaks to finish the depacking. So data arrives a bit too fast at first and blocks pile up waiting to be decrunched. Spindle manages to keep a continuous flow here thanks to its blockwise packing scheme.
The 18 files used get squeezed down to 491 blocks with spindle, versus 391 blocks with bitfire. So Spindle leeches an additional 100 blocks in about the time bitfire requires for additional depacking.
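To put the block counts above in proportion, here is a small back-of-the-envelope calculation. The 256 bytes per block figure is an assumption (the raw 1541 sector size); the payload per block actually used by either loader is not stated in this thread.

```python
# Block counts as quoted in the post above.
SPINDLE_BLOCKS = 491
BITFIRE_BLOCKS = 391

# Assumption: raw sector size of a 1541 disk, not a figure from the post.
RAW_BYTES_PER_BLOCK = 256

extra_blocks = SPINDLE_BLOCKS - BITFIRE_BLOCKS
extra_percent = round(100 * extra_blocks / BITFIRE_BLOCKS, 1)
extra_bytes = extra_blocks * RAW_BYTES_PER_BLOCK

print(f"extra blocks read by spindle: {extra_blocks}")
print(f"relative to bitfire: +{extra_percent}%")
print(f"extra raw bytes transferred: {extra_bytes}")
```

So under that assumption spindle moves roughly a quarter more raw data over the bus in about the same wall-clock time.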
However, under load the speed of spindle drops rapidly: with 25% cpu load it is no faster than krill's loader, and with 75% load it takes eons to leech the 491 blocks in :-( What's happening there?!
When is the 50x version from Krill done? :-D HCL, what's the penis length of your loader? :-D
Results here. |
|
... 91 posts hidden ... |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
out of order ?:)
I don't understand when you use the term fixed and when out of order loading. I thought fixed means a fixed interleave that the loader expects, and out of order means a loader which can deal with whatever sector order. but you corrected me, so how is it? |
| |
Jammer
Registered: Nov 2002 Posts: 1336 |
What Ksubi asked - what about Wegi's solution? His system seems not that popular and maybe it should? ;) |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
ah, guess it's about sector order. |
| |
lft
Registered: Jul 2007 Posts: 369 |
Gentlemen.
While you were entertaining yourselves here, I took the liberty of improving the performance of Spindle.
Details to follow. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11391 |
Quote: @Groepaz: Are you serious, or did i just mis-interpret that smiley!?
nope. your loader/(some of your) demos are the only ones that will not work using chameleons external IEC feature (see here) - and the only sane explanation for that is that the bus timing is very much on the edge (the difference in timing between internal and external is less than a c64 cycle). it might even fail on real drive/c64 when one of the crystals is worn out enough to result in such subtle difference (i have not tested that). (jiffy dos on pal has a similar problem - and there the bus timing is so much on the edge that it barely works even on the real thing) |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting Glasnost:
- do all loaders in the example have similar timing safety, eg. in stepping tracks or inner read loop?
I'd assume so. Some intolerances only show up randomly after years with certain computer/drive combinations, but i guess all of those loaders have been tested with a reasonable number of device combinations.
Quoting Glasnost:
- Why are the testfiles so small on average? That gives ooo load (scatterload) a disadvantage in loading while decompressing.
Indeed. It's a good idea to load a demo part in one single file, which minimises all sorts of overhead. Lots of small files isn't a realistic test setup for demo performance. |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Summarizing again:
Out of order loading: Leech in any block that belongs to the file and place it in the right position in c64 mem.
In order: Fetch the blocks in the right order and place them one after another in mem.
Fixed interleave:
When relying on the standard sector layout with a track/sector link at the beginning of each block, you can't determine which other blocks belong to your file, unless you first scan a whole track and find out, or load one block after another in a serial manner (in order loading again). When you store the files with a fixed interleave on disk you can predict which blocks belong to your file by having the start sector/track + blocklength/filelength at hand. Then, no scanning is necessary. While you are at it you can throw away the track/sector link and use the whole 256 bytes of a sector for payload. This also gets rid of the annoying $fe sector size.
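The definitions above can be sketched in a few lines of Python. This is only an illustration, not code from any of the loaders discussed: the function names, the collision rule (step to the next free sector) and the single-track restriction are all assumptions made for the example.

```python
SECTOR_SIZE = 256  # whole sector used as payload, no track/sector link


def predict_sectors(start_sector, block_count, interleave, sectors_per_track):
    """Predict which sectors of ONE track hold a file stored with a fixed
    interleave. Hypothetical rule: if the next sector is already taken,
    step forward to the next free one."""
    if block_count > sectors_per_track:
        raise ValueError("this sketch only handles files within one track")
    used = [False] * sectors_per_track
    order = []
    sector = start_sector % sectors_per_track
    for _ in range(block_count):
        while used[sector]:
            sector = (sector + 1) % sectors_per_track
        used[sector] = True
        order.append(sector)
        sector = (sector + interleave) % sectors_per_track
    return order


def load_out_of_order(memory, load_address, file_sectors, arrivals):
    """Place each arriving block at its final position in memory,
    whatever order the blocks pass under the head."""
    index_of = {sector: i for i, sector in enumerate(file_sectors)}
    for sector, payload in arrivals:
        if sector not in index_of:
            continue  # block belongs to some other file, skip it
        dest = load_address + index_of[sector] * SECTOR_SIZE
        memory[dest:dest + len(payload)] = payload


if __name__ == "__main__":
    sectors = predict_sectors(0, 5, 4, 21)
    print(sectors)
```

Because the sector list is computed from the start sector and block count alone, no track scan and no link bytes are needed, and the destination of every block is known the moment its sector number is read.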
I might also give Bongo a try, but am a bit scared by the windows tool coming along with it :-D
As for the fileset i test with, it is most of the files from Coma Light 13 side 1. First of all, it is demodata, so it should be typical data being used in my preferred scenario. I only threw out a few files that went under io, as not all loaders handle that. Yes, there's small files, and i find that realistic for a demo, not necessarily for a benchmark. Why realistic in a demo setting? Because we tend to fill up gaps beforehand while an effect is on the screen to have less loading during a transition and to keep pace high in that example. I can of course also slap together a bunch of extremely huge Bob parts that fill all mem.
Lft: Good to hear of improvements! :-D |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
thanks! |
| |
Krill
Registered: Apr 2002 Posts: 2982 |
Quoting Bitbreaker:
Yes, there's small files, and i find that realistic for a demo, not necessarily for a benchmark. Why realistic in a demo setting? Because we tend to fill up gaps beforehand while an effect is on the screen to have less loading during a transition and to keep pace high in that example.
Okay, i'm sceptical about those gaps, but never mind. But then, would it be possible to have the compressor handle these gaps? Such that you'd link all the small unpacked files and compress to one big file in the end, with all the advantages of minimising loader overhead and maximising pack ratio. (Maybe, at first, exceeding your magical $0200 bytes resident code size limit, as the decompressor will have to add an offset to its output pointer when skipping a gap.) |
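The gap idea above could look roughly like this. It is a sketch under assumptions, not how any existing packer works: `link_segments` and `unpack_linked` are made-up names, the (gap, data) record format is invented for illustration, and the actual compression step is left out so only the output-pointer handling is shown.

```python
def link_segments(files):
    """Link several small files into one record list.

    files: list of (load_address, data) tuples.
    Returns (start_address, records), where each record is (gap, data) and
    gap is the number of bytes to skip before writing data.
    """
    ordered = sorted(files, key=lambda item: item[0])
    start = ordered[0][0]
    pointer = start
    records = []
    for address, data in ordered:
        if address < pointer:
            raise ValueError("segments overlap")
        records.append((address - pointer, data))
        pointer = address + len(data)
    return start, records


def unpack_linked(memory, start, records):
    """Write all segments, adding the gap to the output pointer each time."""
    pointer = start
    for gap, data in records:
        pointer += gap  # skip the gap, leaving memory there untouched
        memory[pointer:pointer + len(data)] = data
        pointer += len(data)
    return pointer


if __name__ == "__main__":
    memory = bytearray([0xEA]) * 65536
    start, records = link_segments([(0x4000, b"BB"), (0x2000, b"AAAA")])
    end = unpack_linked(memory, start, records)
    print(hex(start), hex(end), [(gap, len(data)) for gap, data in records])
```

The extra addition per segment is the cost mentioned in the post: the resident decompressor grows a little, but the loader only ever sees one big file.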
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Not in the current version yet, but that is what i do in the stream version prototype: the disk is filled with one huge blob, no partly filled sectors, every byte used. Then each block with its destination address is fetched from that stream and placed in its respective place in mem. It is still bound to in-order loading, as no dir and no additional info is needed then. Therefore loading under IO is possible at no extra cost. Size is $131 + $100 buffer.
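A rough model of such a stream, for illustration only. The record layout here (2-byte destination address, 2-byte length, payload, terminated by a zero-length record) is an assumption; the real prototype's format is not described in the post.

```python
import struct

SECTOR_SIZE = 256


def build_stream(chunks):
    """Concatenate (destination_address, data) chunks into one blob.
    Hypothetical layout: little-endian address, little-endian length, payload."""
    blob = b"".join(struct.pack("<HH", addr, len(data)) + data
                    for addr, data in chunks)
    return blob + struct.pack("<HH", 0, 0)  # zero length marks the end


def to_sectors(blob):
    """Cut the blob into full sectors; only the very last one gets padded."""
    padding = (-len(blob)) % SECTOR_SIZE
    blob += bytes(padding)
    return [blob[i:i + SECTOR_SIZE] for i in range(0, len(blob), SECTOR_SIZE)]


def load_stream(memory, sectors):
    """Read the sectors strictly in order and place every chunk at its
    destination address. No directory is consulted."""
    blob = b"".join(sectors)
    position = 0
    while True:
        addr, length = struct.unpack_from("<HH", blob, position)
        position += 4
        if length == 0:
            break
        memory[addr:addr + length] = blob[position:position + length]
        position += length


if __name__ == "__main__":
    chunks = [(0x0801, b"hello"), (0xD000, bytes(range(256)) + bytes(44))]
    sectors = to_sectors(build_stream(chunks))
    memory = bytearray(65536)
    load_stream(memory, sectors)
    print(len(sectors), bytes(memory[0x0801:0x0806]))
```

Since every chunk carries its own destination, chunk boundaries do not have to line up with sector boundaries, which is what allows every byte of the disk to be used.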
I btw. updated the numbers for spindle 2.1
Also did a test with huge files. Bitfire then performs at 9.10 kb/s, so it is true that huge files help. Loadraw performance is then at 5.67 kb/s. |