| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Release id #139503 : Spindle 2.0
So with the spin mode it was now easy to quickly do a speedtest with the files i usually test with (most of the files from cl13 side1).
It turns out that spindle nearly loads as fast as bitfire with on the fly depacking. While bitfire chews in the tracks a tad faster, it has to make breaks to finalize the depacking. So data arrives a bit too fast first and blocks pile up to be decrunched. Spindle manages to have a continuous flow due to its blockwise packing scheme here.
With Spindle the 18 files used get squeezed down to 491 blocks, whereas bitfire gets them down to 391 blocks. So Spindle leeches an additional 100 blocks in about the time bitfire requires for additional depacking.
However, under load the speed of spindle drops rapidly: with 25% cpu load it is no faster than krill's loader, and with 75% load it takes eons to leech the 491 blocks in :-( What's happening there?!
When is the 50x version from Krill done? :-D HCL, what's the penis length of your loader? :-D
Results here. |
|
... 91 posts hidden ... |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Alright, updated the numbers for BD-loader including viagra, penis-enlargement surgery and cock-ring :-)
And the Frames just mean that i have a frame counter running, increasing each vsync, so Frames / 50 is the loading time in seconds. Then i block the cpu in the irq by wasting another 25/50/75% of the raster time. The screen is on, however, and of course the badlines are not spread evenly over the 4 blocks, but that much accuracy is not needed, as real drives vary more anyway. E.g. that of THCM gets faster when it is warm, but is usually a bit below the benchmarks. |
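To get a feel for what the 25/50/75% settings leave for the loader, here is a rough sketch of the cycle budget per frame. It assumes a PAL machine with 312 raster lines of 63 cycles each and ignores badlines and sprites completely, so the real numbers with the screen on are somewhat lower. The function name `cycles_left` is made up for this example.

```python
# Assumed PAL C64 timing; badline and sprite DMA are deliberately ignored.
LINES_PER_FRAME = 312
CYCLES_PER_LINE = 63
CYCLES_PER_FRAME = LINES_PER_FRAME * CYCLES_PER_LINE


def cycles_left(load_percent):
    """CPU cycles per frame that remain when load_percent of the frame is wasted."""
    if not 0 <= load_percent < 100:
        raise ValueError("load_percent must be in the range 0..99")
    return CYCLES_PER_FRAME * (100 - load_percent) // 100


if __name__ == "__main__":
    for load in (0, 25, 50, 75):
        print(f"{load:2d}% load: {cycles_left(load)} cycles per frame left")
```

At 75% load only a quarter of the frame remains, which is why any per-block overhead on the C64 side weighs so heavily there.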
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
On the contrary, removing the checksumming just brings about 5 frames gain in my case, just tried it. But i understand that it is a huge pain in the ass to calculate it with the data scattered over 8 locations (including different offsets and shifts for the decode tables) as with the BD-loader. No fun. Then however, not pulling the whole sector through at once and doing no offline decoding should give an additional 2 frames only. So maybe other things are worth the effort, like serial transfer. |
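For reference, the checksum of a standard 1541 data block is simply the XOR (EOR) of its 256 decoded data bytes. The sketch below shows only that arithmetic in Python. On the drive it is done in 6502 code on partially decoded GCR data, which is where the pain with scattered tables comes from, and whether a particular loader checks it exactly this way is an assumption. The function names are hypothetical.

```python
from functools import reduce
from operator import xor


def block_checksum(data):
    """XOR of all bytes of one decoded 256-byte data block."""
    if len(data) != 256:
        raise ValueError("a data block has exactly 256 bytes")
    return reduce(xor, data, 0)


def block_is_valid(data, stored_checksum):
    """Compare the computed checksum with the one read from disk."""
    return block_checksum(data) == stored_checksum


if __name__ == "__main__":
    good = bytes(range(256))
    stored = block_checksum(good)
    damaged = bytes([good[0] ^ 0x01]) + good[1:]  # flip one bit
    print(block_is_valid(good, stored), block_is_valid(damaged, stored))
```

A single flipped bit changes the XOR, so a block read while the disk is being pulled out is very likely to be rejected and simply read again.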
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
I remember when I was running the LCP 2004 compo and was in the midst of showing the borderline demo by Triad in front of the crowd. Somehow I misread the notes I had on my papers, and I hadn't had the time to check the demo before the actual compo, so I was under the (drunken) impression that this demo was a onefiler. Therefore I thought that I could safely remove the disk from the drive while the demo was running and start preparing for the next demo to show, only to notice how Iopop's face turned white, then green, and then raster-bar-striped, and he told me to insert that disk again immediately. So I did, and that fancy loader didn't complain, although I had removed the disk in the middle of actual loading (should have checked the led on the diskdrive eh..). According to the CSDb credits that loader was done by 6R6, so maybe that loader deserves some credit too, for being absolutely, literally foolproof. :)
Speaking of checksumming and such, that is.. |
| |
HCL
Registered: Feb 2003 Posts: 728 |
@Frantic: Haha.. i can never imagine you doing such a thing ;)
@Bitbreaker: Wtf, according to latest results, my shit is even a fraction faster than all the other magic loaders when some CPU-load comes into play!??! Can hardly trust those numbers.. I'm not doing anything to be faster than anyone else, except skipping that checksum perhaps :P. I'm using the kernel as much as possible, only optimizing the read-loop and the transfer. On the computer-side i have the new ByteBoozer2, which is on par, but not faster than bitfire. Any explanation!?
One idea i had about ooo was.. to get optimal fetching of sectors perhaps you should prioritize the next sector in the file, since the decruncher might be waiting for it.. Then grabbing any sector on the way there that doesn't cause the high-prio sector to be missed. But perhaps you already do something like that!? |
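The prioritizing idea above can be sketched as a small selection rule. This is only a toy model with made-up names: time is counted in sector slots, reading a sector takes one slot, and `busy` is the number of slots the drive is blocked by the transfer afterwards. It also assumes `busy` is known in advance, which is exactly the hard part when the load on the C64 side varies.

```python
def pick_sector(head, n_sectors, wanted, priority, busy):
    """Pick the next sector to read.

    head      sector that arrives under the head next
    wanted    set of sectors of the file that are not loaded yet
    priority  sector holding the block the decruncher waits for
    busy      slots the drive is blocked after reading one sector
    """
    def dist(sector):
        # Slots until the given sector arrives under the head.
        return (sector - head) % n_sectors

    deadline = dist(priority)
    # A detour is allowed only if read plus transfer end before the
    # high-prio sector comes around.
    detours = [s for s in wanted
               if s != priority and dist(s) + 1 + busy <= deadline]
    return min(detours, key=dist) if detours else priority


if __name__ == "__main__":
    # Sector 3 fits in before sector 15 arrives, sector 10 does not.
    print(pick_sector(0, 21, {3, 10, 15}, 15, 6))
```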
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Of course yours is faster that way, as all sectors arrive just in time and in order with the right interleave, but this only happens under such straight conditions with a fixed load. Imagine a variable load: you will never be able to have the right interleave and thus miss sectors, giving you an extra revolution penalty in the worst case. Also have fun calculating the right offset per file and mixing those offsets on tracks that are occupied by different files. See it as the theoretical limit that you never reach under demo conditions, whereas with ooo it will reflect real demo conditions. That is why i say it is a bit of cheating. Let me see if i can do a test situation with a variable load and then let's try again :-D
And yes, i already made attempts to prioritize the first sector. It performs better if you force the first sector to be block 0 of the file, but under load it turns out to be worse again :-D Also, increasing the allowed distance to the next block during loading does not help. Played around a lot with the barriers, to no avail. All i can do is force in-order loading by keeping the max allowed distance between blocks at 1. |
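The missed-sector penalty is easy to reproduce in a toy model. The sketch below is not how any of the loaders in this thread actually work. It assumes one track of 21 sectors (as on tracks 1-17), counts time in sector slots, reads one sector per slot and then blocks the drive for a given number of slots per block. The names `layout` and `load_track` are invented for the example.

```python
def layout(n_sectors, interleave):
    """Physical sector of each file block, skipping to the next free sector on a clash."""
    free = [True] * n_sectors
    order = []
    pos = 0
    for _ in range(n_sectors):
        while not free[pos]:
            pos = (pos + 1) % n_sectors
        free[pos] = False
        order.append(pos)
        pos = (pos + interleave) % n_sectors
    return order


def load_track(n_sectors, interleave, busy_times, in_order):
    """Slots needed to load a full track; busy_times holds one entry per block."""
    order = layout(n_sectors, interleave)
    wanted = set(order)
    t = 0  # the head is over sector t % n_sectors
    for block, busy in enumerate(busy_times):
        if in_order:
            target = order[block]
            wait = (target - t) % n_sectors
        else:
            # Out of order: take whichever missing sector arrives first.
            wait = min((s - t) % n_sectors for s in wanted)
            target = (t + wait) % n_sectors
        wanted.discard(target)
        t += wait + 1 + busy
    return t


if __name__ == "__main__":
    for busy in (6, 7):
        print(busy,
              load_track(21, 7, [busy] * 21, in_order=True),
              load_track(21, 7, [busy] * 21, in_order=False))
```

With interleave 7 and a transfer that takes exactly 6 slots, both variants need 153 slots in this model. As soon as the transfer takes 7 slots, the in-order loader misses its sector most of the time and needs 448 slots, while the out-of-order variant finishes in 168.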
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Ok, quick test: increasing load per frame by one rasterline, best result i can get for your loader is $0a4c frames with interleave 7, whereas bitfire manages that in $09bc frames. That is where ooo kicks in. Hard to predict a perfect interleave for in order under such conditions.
And yes, with checksumming it is no problem to just remove the disk and put it back in; bitfire, and surely others, will continue loading, as long as you don't rely on data that should have been loaded meanwhile :-D Floppy fuck! |
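For those who do not read hex frame counts fluently, a small sketch that converts the two results of the quick test using the Frames / 50 rule from earlier in the thread. The helper names are made up, and the 50 frames per second figure is the one used in this thread, not an exact PAL refresh rate.

```python
def parse_frames(text):
    """Turn a frame count written like '$0a4c' into an integer."""
    return int(text.lstrip("$"), 16)


def frames_to_seconds(frames, frames_per_second=50):
    """Loading time in seconds, following the Frames / 50 rule."""
    return frames / frames_per_second


if __name__ == "__main__":
    in_order = parse_frames("$0a4c")      # in-order loader, interleave 7
    out_of_order = parse_frames("$09bc")  # bitfire
    for name, frames in (("in order", in_order), ("ooo", out_of_order)):
        print(f"{name}: {frames} frames = {frames_to_seconds(frames):.2f} s")
    print(f"difference: {in_order - out_of_order} frames")
```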
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
ooo ? |
| |
Burglar
Registered: Dec 2004 Posts: 1105 |
Quoting Oswald: ooo ?
Out of Oswald loading of course |
| |
HCL
Registered: Feb 2003 Posts: 728 |
Ok, great, now we know the pros and cons.. Thanx Bitbreaker for all the time you put into testing. Think i will settle down now after 5 intense days of csdb posting, and finish that loader/packer system :).
Then sure i'll check out Spindle 2.0 (!) |
| |
Glasnost Account closed
Registered: Aug 2011 Posts: 26 |
Reading this discussion, i wonder...
- do all loaders in the example have similar timing safety, e.g. in stepping tracks, or in the inner read loop?
- Why are the testfiles so small on average? That gives ooo load (scatterload) a disadvantage in loading while decompressing. |