[CSDb] - User Forums - Native crunch/decrunch code.

2024-04-20 05:56

Flavioweb

Registered: Nov 2011
Posts: 466

Native crunch/decrunch code.

Do you know if the source code of a native crunch/decrunch routine for C64 exists somewhere?
Not a complete utility with interface etc...
just the compression/decompression code.

(13 posts hidden.)

2024-04-21 12:47

Krill

Registered: Apr 2002
Posts: 3098

Quoting Flavioweb

The "pure" RLE compression didn't convince me very much because, in practice, it doubled the space required for the code to be compressed, but it works well for the data part.

It seems like you've misunderstood how the algorithm/encoding works.

Actual RLE emits a counter/type byte to encode a run, with up to 128 bytes of repeating or literal bytes. (Ignore the inferior escape byte approach that was so popular on C-64.)

So, worst case would be one extra byte for every 128 input bytes.
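A minimal sketch of the control-byte scheme described above, written in Python for readability rather than as 6502 code. The bit layout is an assumption, not something stated in the post: the top bit of the control byte selects run (1) or literal block (0), and the low 7 bits hold the length minus one, giving lengths of 1 to 128. The names `rle_encode`, `rle_decode` and `MIN_RUN` are hypothetical.

```python
MIN_RUN = 3  # assumption: shorter repeats are left inside literal blocks


def run_length(data, i, cap=128):
    """Length of the run of identical bytes starting at i, capped at cap."""
    j = i + 1
    while j < len(data) and j - i < cap and data[j] == data[i]:
        j += 1
    return j - i


def rle_encode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        run = run_length(data, i)
        if run >= MIN_RUN:
            # Run block: control byte with top bit set, then the repeated value.
            out += bytes([0x80 | (run - 1), data[i]])
            i += run
        else:
            # Literal block: gather bytes until a codable run starts or 128 is reached.
            start = i
            while i < len(data) and i - start < 128 and run_length(data, i) < MIN_RUN:
                i += 1
            out.append(i - start - 1)
            out += data[start:i]
    return bytes(out)


def rle_decode(packed):
    out = bytearray()
    i = 0
    while i < len(packed):
        control = packed[i]
        count = (control & 0x7F) + 1  # length is known before touching the payload
        if control & 0x80:
            out += bytes([packed[i + 1]]) * count
            i += 2
        else:
            out += packed[i + 1:i + 1 + count]
            i += 1 + count
    return bytes(out)
```

With this layout, 256 bytes with no repeats (`bytes(range(256))`) encode to 258 bytes, which matches the "one extra byte for every 128 input bytes" figure, and 128 identical bytes encode to 2 bytes.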

2024-04-21 13:26

tlr

Registered: Sep 2003
Posts: 1814

Quoting Krill

Actual RLE emits a counter/type byte to encode a run, with up to 128 bytes of repeating or literal bytes. (Ignore the inferior escape byte approach that was so popular on C-64.)

Not sure I agree the escape byte approach is inferior, but it's more messy.

Though as you point out, some way to differentiate literals vs runs is required for RLE. There should be little or no expansion of uncompressible sections.

2024-04-21 13:30

Krill

Registered: Apr 2002
Posts: 3098

Quoting tlr

Not sure I agree the escape byte approach is inferior, but it's more messy.

It's inferior in that the decompressor needs to look at every input byte to decide what to do,
while the 1+7-bit control byte approach (1 type bit, 7 count bits) allows for loops or nice Duff's devices that don't need to branch for every input byte, as the length of both types of run is known beforehand.

(It's a bit like C-strings vs Pascal-strings. =D)
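For contrast, here is a rough Python sketch of an escape-byte decoder. The format is an assumption chosen for illustration (escape value, then count, then the byte to repeat, with a literal escape value stored as a run of length 1); actual C-64 packers differ in the details. `ESC` and `escape_rle_decode` are hypothetical names.

```python
ESC = 0xFE  # hypothetical escape value; packers often pick a rarely used byte


def escape_rle_decode(packed, esc=ESC):
    out = bytearray()
    i = 0
    while i < len(packed):
        byte = packed[i]
        if byte == esc:
            # Run: escape, count, value.
            count, value = packed[i + 1], packed[i + 2]
            out += bytes([value]) * count
            i += 3
        else:
            # Plain literal: copied one at a time, after the comparison above.
            out.append(byte)
            i += 1
    return bytes(out)
```

The comparison against `esc` executes once per input byte, which is the per-byte decision the post refers to. A control-byte format would instead read a length once and then copy or fill that many bytes in a tight loop.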

2024-04-21 14:21

tlr

Registered: Sep 2003
Posts: 1814

Quoting Krill

It's inferior in that the decompressor needs to look at every input byte to decide what to do,
while the 1+7-bit control byte approach (1 type bit, 7 count bits) allows for loops or nice Duff's devices that don't need to branch for every input byte, as the length of both types of run is known beforehand.

(It's a bit like C-strings vs Pascal-strings. =D)

Yeah, but you have to count instead. In the RLE case it's also usually less efficient, although RLE is very inefficient to begin with.

2024-04-21 14:49

ChristopherJam

Registered: Aug 2004
Posts: 1424

As far as RLE is concerned, alternatively you can just output a count of "how many more repeats" after each adjacent pair of identical values, and potentially Huffman-encode the counts to make them smaller.

(e.g. abbcdddeffffg becomes abb{0}cdd{1}eff{2}g)
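The pair-then-count idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the count is stored as one plain byte (so a single pair covers at most 257 repeats), and the Huffman coding of counts mentioned in the post is left out. `pair_rle_encode` and `pair_rle_decode` are hypothetical names.

```python
def pair_rle_encode(data):
    out = bytearray()
    i = 0
    while i < len(data):
        value = data[i]
        # Measure the run, capped at 2 + 255 so the count fits in one byte.
        run = 1
        while i + run < len(data) and run < 257 and data[i + run] == value:
            run += 1
        if run == 1:
            out.append(value)
        else:
            # Two identical bytes signal a run; the count says how many more follow.
            out += bytes([value, value, run - 2])
        i += run
    return bytes(out)


def pair_rle_decode(packed):
    out = bytearray()
    i = 0
    while i < len(packed):
        value = packed[i]
        out.append(value)
        i += 1
        if i < len(packed) and packed[i] == value:
            # Second byte of a pair: the next byte is the extra repeat count.
            count = packed[i + 1]
            out += bytes([value]) * (1 + count)
            i += 2
    return bytes(out)
```

Run on the example from the post, `pair_rle_encode(b"abbcdddeffffg")` gives `b"abb\x00cdd\x01eff\x02g"`, i.e. abb{0}cdd{1}eff{2}g. Note that in this byte-count variant a run of exactly two bytes takes three bytes of output, which is presumably where smaller, entropy-coded counts would help.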

But anyway, I'm kind of tempted to push a simple MTF+uneven symbol size codec out the door - a quick draft crunches zorro from 54105 bytes to 45892, and would be fairly simple to implement in 6502. It's not as good a ratio as even tinycrunch (38962 bytes for the same test file), but it'd compress in a matter of seconds, and would outperform RLE unless you're crunching quite a bit of empty space. Of interest?

2024-04-21 15:51

Martin Piper

Registered: Nov 2007
Posts: 739

(number of literals)(output literals)(number of repeated bytes, repeat the previous byte)...

Since the number of repeats always follows a number of literals, there is no need for a control flag.
The "number of" can use some form of Elias gamma coding, for up to 16-bit values for the whole memory range, to keep things short.
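A rough Python sketch of this alternating layout, using a string of '0'/'1' characters as the bit stream to keep it readable. Several details are assumptions rather than part of the post: literal bytes are written as 8 plain bits, the repeat count is stored as count + 1 because Elias gamma cannot represent zero, decoding stops when the bits run out, and nothing enforces the 16-bit limit. All function names are hypothetical.

```python
def gamma(n):
    """Elias gamma code of n >= 1: (bit length - 1) zeros, then n in binary."""
    bits = bin(n)[2:]
    return "0" * (len(bits) - 1) + bits


def read_gamma(bits, pos):
    zeros = 0
    while bits[pos] == "0":
        zeros += 1
        pos += 1
    end = pos + zeros + 1
    return int(bits[pos:end], 2), end


def alt_rle_encode(data):
    out = []
    i = 0
    while i < len(data):
        # Literal block: at least one byte, extended until a byte repeats its predecessor.
        start = i
        i += 1
        while i < len(data) and data[i] != data[i - 1]:
            i += 1
        literals = data[start:i]
        # Repeat block: further copies of the last literal (may be zero at end of data).
        repeats = 0
        while i < len(data) and data[i] == literals[-1]:
            repeats += 1
            i += 1
        out.append(gamma(len(literals)))
        out += [format(b, "08b") for b in literals]
        out.append(gamma(repeats + 1))
    return "".join(out)


def alt_rle_decode(bits):
    out = bytearray()
    pos = 0
    while pos < len(bits):
        count, pos = read_gamma(bits, pos)
        for _ in range(count):
            out.append(int(bits[pos:pos + 8], 2))
            pos += 8
        repeats, pos = read_gamma(bits, pos)
        out += bytes([out[-1]]) * (repeats - 1)
    return bytes(out)
```

Because the two kinds of count strictly alternate, the stream carries no type flag. As an example of how short the counts get, 1000 zero bytes encode to 28 bits here: 1 bit for "one literal", 8 bits for the byte, and 19 bits for the gamma code of 1000.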

2024-04-21 16:06

Krill

Registered: Apr 2002
Posts: 3098

Let's not let this degrade to Code Golf, shall we.

2024-04-21 16:40

Raistlin

Registered: Mar 2007
Posts: 771

Quoting Krill

Let's not let this degrade to Code Golf, shall we.

Why? I think the thread becomes more interesting that way… if “Code Golf” means what I think it means..? Or did you mean “Code Tennis”?

I like CJam’s idea, I never thought of doing RLE in that way and, annoyingly, it’s so damn obvious now that he’s suggested it.

2024-04-21 16:44

Krill

Registered: Apr 2002
Posts: 3098

Quoting Raistlin

Why? I think the thread becomes more interesting that way… if “Code Golf” means what I think it means..? Or did you mean “Code Tennis”?

No idea what the difference is, but I suggest another thread to discuss the relative pros and cons of various RLE schemes. =)

2024-04-22 11:18

ChristopherJam

Registered: Aug 2004
Posts: 1424

(RLE discussion continued in the thread "On the relative pros and cons of various RLE schemes".)

All times are CET.
