cadaver
Registered: Feb 2002 Posts: 1160
Exomizer on-the-fly loading/decompressing
Hey,
Anyone want to share what's the lowest disk interleave you've managed to use with on-the-fly Exomizer decompression while loading?
I'm currently at 11, using 2-bit transfer and a lame drivecode (jobcodes only) + 1 sector buffering. However, I don't think the drivecode is the problem; if I try to decrease to IL 10, the C64 often doesn't have to wait for the drive at all for the next sector's data, but occasionally the depack takes too long, resulting in a missed revolution.
I've already done some optimization to the depack routine, including inlining the single-bit fetch (the literal/sequence decision, and reading the gamma code).
Don't think I would switch to another packer just for speed, but nevertheless interested in any battle stories.
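For anyone who wants to put rough numbers on that budget, here's a quick back-of-the-envelope (assuming the usual 300 rpm, i.e. 200 ms per revolution, and the standard 1541 sectors-per-track zones; it ignores sector gaps, GCR decoding and head stepping, so treat it as order-of-magnitude only):

REV_MS = 200.0     # one revolution at 300 rpm

ZONES = [          # (first track, last track, sectors per track)
    (1, 17, 21),
    (18, 24, 19),
    (25, 30, 18),
    (31, 35, 17),
]

def sectors_on_track(track):
    for lo, hi, count in ZONES:
        if lo <= track <= hi:
            return count
    raise ValueError("no such track on a 35-track disk")

def budget_ms(track, interleave):
    """Approximate time until the next sector of the file passes the head."""
    return REV_MS * interleave / sectors_on_track(track)

for il in (10, 11):
    for track in (1, 18, 25, 31):
        print(f"track {track:2d}, interleave {il}: {budget_ms(track, il):5.1f} ms per sector")

On the 21-sector tracks, going from interleave 11 to 10 shrinks the per-sector window from roughly 105 ms to 95 ms, and that window is what the transfer plus depack of one block has to fit into to avoid a missed revolution.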
cadaver
Registered: Feb 2002 Posts: 1160
Ok. Got it down to 10, not 100% reliably but reliably enough to result in a definite speed increase compared to 11.
Final changes were further inlining in the gamma/decision handling (no jsr to a "refill bitbuf" routine; instead jsr directly to getbyte) and not saving the X register except when needed. I'm also purposefully disallowing literal sequences, since at least for my use case they don't result in any actual disk blocks being saved.
For those interested, the current loader code (may not be useful for general use): https://github.com/cadaver/hessian/blob/master/loader.s
(I'm checking for a sprite Y-coordinate range in the sector transfer; that could simply be removed to save approx. 40 lines if sprites were always off.)
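To get a ballpark of what the inlining buys, a quick calculation (the bit-read count per packed byte is a pure assumption here, the real figure depends entirely on the data; the jsr/rts cost is the usual 6+6 cycles):

JSR_RTS_CYCLES = 12              # 6 (jsr) + 6 (rts) on the 6502
PACKED_BYTES_PER_SECTOR = 254    # payload bytes in a standard chained block
BIT_READS_PER_PACKED_BYTE = 6    # assumed average, NOT a measured figure
PAL_CLOCK = 985248               # C64 PAL cycles per second

saved = JSR_RTS_CYCLES * PACKED_BYTES_PER_SECTOR * BIT_READS_PER_PACKED_BYTE
print(f"~{saved} cycles saved per sector, ~{saved / PAL_CLOCK * 1000:.1f} ms at PAL clock")

Depending on the data this lands anywhere from a few ms per sector upwards, which is the same order of magnitude as the ~10 ms that the step from interleave 11 to 10 removes from the per-sector budget on the fastest tracks.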
Oswald
Registered: Apr 2002 Posts: 5094
Can't help, but I hope this means a cool cadaver game :)
tlr
Registered: Sep 2003 Posts: 1790
Perhaps you should try limiting copy lengths to 256 with -M256 as well. Then you can rewrite the copy loop and possibly even the bits/base decoder to optimize for that. Shouldn't reduce compression much.
cadaver
Registered: Feb 2002 Posts: 1160
Actually I just did that :)
Though it seemed that it would still generate longer RLE sequences, so I patched Exomizer2 a bit to honor the -M parameters also for RLE.
tlr
Registered: Sep 2003 Posts: 1790
So did you get any gain from doing that? It's going to generate a few more primary units, so perhaps the increased number of bits used for encoding those eats up the gain in the loop?
ChristopherJam
Registered: Aug 2004 Posts: 1409
Surely the 'perfect' interleave would depend on the data being decompressed? I'd expect a bunch of 'copy substrings' would consume a lot more cycles per byte of input than a chunk of literals.
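To illustrate, a toy calculation with completely invented cycle costs (not taken from Exomizer or any real depacker), just to show how much the per-block depack time can swing with the literal/copy mix:

CYCLES_LITERAL_PER_INPUT_BYTE = 60   # assumed: decision bit + store one byte
CYCLES_COPY_UNIT = 400               # assumed: decode offset/length + copy loop
INPUT_BYTES_PER_COPY_UNIT = 2.5      # assumed average packed size of a copy unit

PACKED_BYTES = 254                   # payload of one block
PAL_CLOCK = 985248                   # C64 PAL cycles per second

def depack_ms(copy_fraction):
    """Estimated ms to consume one block, given the fraction of packed
    bytes that belong to copy units (the rest are literal bytes)."""
    copy_bytes = PACKED_BYTES * copy_fraction
    literal_bytes = PACKED_BYTES - copy_bytes
    cycles = (literal_bytes * CYCLES_LITERAL_PER_INPUT_BYTE
              + (copy_bytes / INPUT_BYTES_PER_COPY_UNIT) * CYCLES_COPY_UNIT)
    return cycles / PAL_CLOCK * 1000

for frac in (0.0, 0.5, 1.0):
    print(f"{int(frac * 100):3d}% copy units: ~{depack_ms(frac):.1f} ms per block")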
lft
Registered: Jul 2007 Posts: 369
Next up: Make a cruncher that simulates disk access times, and weighs that into the cost function that determines when to produce a copy or a literal unit. Too bad access times aren't really predictable.
cadaver
Registered: Feb 2002 Posts: 1160
tlr: I first checked how often it was generating those longer sequences; in my case it wasn't often (a few times in the longer files).
The changes have resulted in data that is a few bytes longer here and there, but at least not in any more blocks being used.
I haven't been able to bump the interleave down from 10, but the optimizations have allowed a larger "safety factor", i.e. in-game loading will be slower as there are raster IRQs and possibly sprites on, and having a faster decompressor reduces the chances of missed-revolution hiccups.
EDIT: lft: rather than crippling the compressed output for speed, it should be possible to build the disk image dynamically, emulating the CPU during loading/depacking to guarantee optimal interleave. Don't think I'll be going there, but it's a possibility.
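A very rough sketch of what that dynamic layout could look like, purely hypothetical; the per-block depack times would have to come from emulating the actual load, and the read/transfer/depack phases are serialised here for simplicity, whereas a real loader overlaps the drive-side read with depacking:

REV_MS = 200.0            # one revolution at 300 rpm
SECTORS = 21              # sectors on tracks 1-17
SECTOR_MS = REV_MS / SECTORS

def place_blocks(depack_ms, transfer_ms=6.0):
    """Greedy layout: give each block the free sector that comes under the
    head first after the previous block has been transferred and depacked."""
    free = set(range(SECTORS))
    t = 0.0                        # running time in ms
    layout = []                    # (logical block, physical sector)
    for i, d in enumerate(depack_ms):
        # pick the free sector with the shortest wait from the current moment
        best = min(free, key=lambda s: (s * SECTOR_MS - t) % REV_MS)
        wait = (best * SECTOR_MS - t) % REV_MS
        t += wait + transfer_ms + d
        free.remove(best)
        layout.append((i, best))
    return layout

# fake per-block depack times, stand-ins for emulator measurements
print(place_blocks([20.0, 35.0, 15.0, 40.0, 25.0]))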
Krill
Registered: Apr 2002 Posts: 2980
I'm with ChristopherJam here, and I don't see an actual benefit unless you really have constant depacking time on every block. That is pretty unlikely to be achieved, except maybe with prohibitive degradation of overall performance.
The goal is the shortest combined loading and depacking time, and given that you want to download a block from the drive just when it is ready, you do that and prioritise this over depacking. Now, it may happen that time is wasted when the blocks arrive in a non-linear order, such that the depacker is doing nothing between new blocks arriving, until the next block in the linear packed data stream is ready.
However, it is fairly simple to have an interleave that makes all blocks arrive in linear order (even when the loader/depacker supports out-of-order loading for the occasional hiccup). Then you can depack as much as possible until the next block is ready, and after downloading it from the drive, go on depacking where you left off. No time is wasted, and the rest of the file will be depacked when all blocks have been loaded.
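In miniature, that scenario looks something like this (all per-block figures are invented placeholders; the model also assumes the drive can always buffer the next block, so missed revolutions aren't modelled):

REV_MS = 200.0            # one revolution at 300 rpm
SECTORS = 19              # e.g. tracks 18-24
SECTOR_MS = REV_MS / SECTORS
INTERLEAVE = 10

def simulate(depack_ms, transfer_ms=6.0):
    """Blocks arrive every INTERLEAVE sectors; the C64 either waits for the
    drive (idle) or carries leftover depacking into the next slot (backlog)."""
    gap = INTERLEAVE * SECTOR_MS
    idle = 0.0
    backlog = 0.0
    for d in depack_ms:
        work = transfer_ms + d + backlog
        if work <= gap:
            idle += gap - work     # depacker ran dry before the next block
            backlog = 0.0
        else:
            backlog = work - gap   # keep depacking after fetching the next block
    return idle, backlog

idle, backlog = simulate([20.0, 35.0, 90.0, 15.0, 25.0])   # fake figures
print(f"waited for the drive: {idle:.1f} ms, depacking left at the end: {backlog:.1f} ms")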
Krill
Registered: Apr 2002 Posts: 2980
lft, cadaver: IMHO, the biggest problem when factoring in actual run-time data to optimise combined loading and depacking is that you have at least one big source of error: the inter-track skew.
As conventional tools to transfer disk images to physical disks (just like most copy programs) do not line up the blocks on every track to the same respective sectors (blocks and sectors are different things here, as the 1541 does not have an index hole sensor), individual copies will start reading a different block when going from one track to the next.
Thus, you will see different timings from one disk to the next, even with the same drive.
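A tiny illustration of that variance, with made-up numbers; the only point is that the wait after a track step can be anything from almost nothing to a full revolution, differently on every copy:

import random

REV_MS = 200.0        # one revolution at 300 rpm
SECTORS = 19
SECTOR_MS = REV_MS / SECTORS
STEP_MS = 20.0        # assumed head-step plus settle time

def wait_for_sector(wanted, skew_ms):
    """Time from starting the track step until 'wanted' passes the head,
    for one particular copy's track-to-track skew."""
    arrival = (wanted * SECTOR_MS + skew_ms) % REV_MS
    return STEP_MS + (arrival - STEP_MS) % REV_MS

random.seed(1)
print([f"{wait_for_sector(0, random.uniform(0.0, REV_MS)):.1f} ms" for _ in range(5)])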