Kabuto Account closed
Registered: Sep 2004 Posts: 58 |
Theoretically optimal 1541 GCR encoding scheme
I did some research on GCR encoding and ended up with a scheme that stores 7 data bits in every 8 raw bits without violating the GCR constraints (no 3 "0" bits in a row and no 10 "1" bits in a row).
This is barely below the theoretical limit.
And interestingly, decoding is very fast using this scheme - but encoding is a bit more difficult.
This scheme uses arithmetic coding similar to what arithmetic packers use, just the other way round (storing constant-entropy 7-bit values in variable-entropy GCR bytes).
Here's a simple javascript implementation to show the idea:
http://kabuto.untergrund.net/files/c64stuff/gcrTest.js
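(As a quick sanity check on the "barely below the theoretical limit" claim, here's a small dynamic-programming capacity estimate for the two run-length constraints - just an illustrative sketch, independent of the linked implementation:)
// Count bit strings of length n that never contain three "0"s or ten "1"s in a row,
// and estimate how many payload bits each raw bit can carry.
function capacityEstimate(n) {
  // state key "zeroRun,oneRun": length of the current run of 0s resp. 1s
  let counts = new Map([["0,0", 1]]);
  const bump = (map, key, c) => map.set(key, (map.get(key) || 0) + c);
  for (let i = 0; i < n; i++) {
    const next = new Map();
    for (const [key, c] of counts) {
      const [z, o] = key.split(",").map(Number);
      if (z < 2) bump(next, (z + 1) + ",0", c); // appending a "0" keeps the zero run below 3
      if (o < 9) bump(next, "0," + (o + 1), c); // appending a "1" keeps the one run below 10
    }
    counts = next;
  }
  let total = 0;
  for (const c of counts.values()) total += c;
  return Math.log2(total) / n;                  // payload bits per raw bit
}
console.log(capacityEstimate(64));              // roughly 0.88, so 7/8 = 0.875 is just below the limit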
EDIT: here's a sketched decoder loop (still lacking the drive sync handling, though) that reads and decodes 8 raw bytes into 7 data bytes.
; initially A contains the arithmetical decoder state (up to 7 bits) and C might contain a carry to be added to it
loop:
LDX databyte
ADC ari_lo, X ; these table entries contain 7 bits each, too - erroneous entries contain $FF and thus might set C
ROL ; since we used bits 0..6 so far, bit 7 contains the carry. Shift it into the C flag.
; (And shift the C flag, which is hopefully set upon errors due to adding $FF (TODO change offsets to ensure that's actually the case), into the now-unused bit 0 of tmp.)
STA tmp ; Don't store the first decoded 7 bits in the result - use them to supply the extra 8th bit of each of the following 7 output bytes
LDA ari_hi, X ; Fetch upper 7 arith bits
; repeat the following block 7 times
LDX databyte
ADC ari_lo, X ; Same here - add 7 lower bits of table entry for GCR byte
ROL tmp ; Get a data bit and store error bit
; in the last iteration tmp contains the error flags from all 8 steps, so we could BNE to an error handler
ROL ; Combine the 7 decoded data bits with the additional bit, also push the carry from bit 7 to flag C
STA result+i, Y ; and store decoded byte
LDA ari_hi, X ; Get 7 upper bits of table entry
; end repeat
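For readers who don't speak 6502, here's a rough JavaScript transliteration of the data flow in that loop (the table contents and the $FF error detection are omitted; ariLo/ariHi just stand for the two 7-bit halves of the per-byte table):
// Rough sketch of the decoder loop above - not the real tables, no error handling.
function decodeBlock(raw, ariLo, ariHi, state = 0, carry = 0) {
  // First raw byte: its 7 decoded bits are held back ("tmp" in the assembly)
  // and dealt out as the extra 8th bit of each of the following output bytes.
  let sum = state + ariLo[raw[0]] + carry;
  let held = sum & 0x7f;
  carry = (sum >> 7) & 1;                       // arithmetic carry into the next addition
  let hi = ariHi[raw[0]];
  const out = [];
  for (let i = 1; i < 8; i++) {                 // remaining 7 raw bytes -> 7 data bytes
    sum = hi + ariLo[raw[i]] + carry;           // ADC ari_lo,X
    const extraBit = (held >> 6) & 1;           // ROL tmp: fetch the next held-back bit
    held = (held << 1) & 0x7f;
    out.push(((sum & 0x7f) << 1) | extraBit);   // ROL A + STA result
    carry = (sum >> 7) & 1;
    hi = ariHi[raw[i]];                         // LDA ari_hi,X
  }
  return out;
}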
... 20 posts hidden ... |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Thank you! Now I finally "grok" it :)
Somehow I could never quite wrap my head around the fact that a suffix state (the number of trailing zeroes/ones) linearly adjoins the next prefix state, so there is no need to carry over any explicit state. Brilliant!
The claim to being an optimal GCR encoding scheme is certainly false, however, as it is in fact not a grouped code at all. Well, not unless anything with a fixed word length like FM/MFM is to be considered technically "grouped".
Presumably another tricky bit was scaling it such that the width of every code range rounds down, so that no input ends up being unrepresentable, despite the limited precision and the minimal extra entropy available as slack.
The "brute-force" encoder doesn't appear to be a practical problem for off-line encoding. There may be degenerate inputs, but in practice it always seems to terminate after a shallow search - much like carries in traditional arithmetic coding, I should imagine.
Incidentally I can confirm that the coding works just fine on hardware, at least on the few drives I have access to. Admittedly the space and speed savings from dropping standard CBM formatting (at most 40% or so) don't quite make up for the inconvenience of forgoing standard tools and the general compatibility nightmare, but it is still a lot of fun to tinker with. |
Kabuto Account closed
Registered: Sep 2004 Posts: 58 |
Just rewrote the encoder - it no longer uses recursion and runs much faster; you could even implement it directly on a C64 (though it might still be slightly too large to fit into drive RAM):
http://kabuto.untergrund.net/files/c64stuff/gcrTest2.js
doynax: haha, totally overlooked that this is no longer GCR! |
Oswald
Registered: Apr 2002 Posts: 5094 |
how much more data does this allow on a disk? |
Hoogo
Registered: Jun 2002 Posts: 105 |
If you keep the current sector layout with its 320 raw (encoded) bytes per data block, then you will get 280 usable bytes instead of 256, about 10% more. |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
(7/8)/(4/5) = 1.09375, so about 9%?
If you kept the header blocks the same and just replaced the 320 raw bytes in the data blocks, each block should grow from 256 to 280 data bytes.
(edit: @hoogo, snap!) |
Kabuto Account closed
Registered: Sep 2004 Posts: 58 |
Assuming you could keep the overhead small, you could squeeze 25/23/21/20 sectors into each track depending on the speed zone in use, as opposed to the original encoding's 21/19/18/17, so you'd get about 18% more disk space (see the quick calculation below). Use tracks 36-40 or use higher speed zones than Commodore did and you'd get even more space, but I think there was a reason Commodore chose the speed zones the way they did.
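For reference, plugging those per-zone sector counts into the standard 35-track layout gives the figure above (only the 25/23/21/20 counts come from this post; the stock counts and track ranges are the normal 1541 values):
// Tracks per speed zone on a standard 35-track disk: 1-17, 18-24, 25-30, 31-35
const tracksPerZone   = [17, 7, 6, 5];
const stockSectors    = [21, 19, 18, 17];   // standard 1541 sectors per track
const proposedSectors = [25, 23, 21, 20];   // with the 7-bits-in-8 encoding (from the post)
const total = perZone => perZone.reduce((sum, n, i) => sum + n * tracksPerZone[i], 0);
console.log(total(stockSectors));           // 683 sectors
console.log(total(proposedSectors));        // 812 sectors
console.log((total(proposedSectors) / total(stockSectors) - 1) * 100); // ~18.9% more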
Still, I wonder how useful this will really be. Potential benefits:
* Fewer disk sides for your demo. But OTOH there doesn't seem to be a "maximum number of disk sides" limit anymore. People often run demos from 1541 Ultimate & Co, so having 3 disk sides instead of 2 doesn't make handling more difficult (like it does when you have to work with actual disks).
* "Flip disk" 15% to 20% less often.
* Load speed will be just a tiny bit faster than for GCR-based loaders. Comparing my proposed loader with LFT's existing loader: both have comparable sizes; the only benefit of mine is that it loads a sector faster simply because the sector is shorter on disk, and that it needs to change tracks less often. But those are small factors - in real life the advantage will be pretty small.
Compare that with the disadvantages:
* .gcr is more difficult to handle than .d64. Both are handled equally well by emulators these days, but transferring .gcr to a standard disk drive requires a custom-made write/copy program (unless you have a drive with a parallel cable). It's not a big hurdle, but something that might be annoying when your favourite copier fails to copy a disk. |