| |
Urban Space Cowboy
Registered: Nov 2004 Posts: 45 |
Release id #115758 : Bongo Cruncher
I'm happy to say that after some experimentation, I've got Exomizer + a straightforward delta coder/decoder to beat Bongo Cruncher!
How does this work? The uncrunched program contains a ton of unrolled loops repeating every 3 or 6 bytes. I got the best results with a delta encoding of addresses six apart: each byte is replaced with its difference from the byte six addresses earlier (the byte at $0807 becomes $0807 minus $0801, $0808 becomes $0808 minus $0802, $0809 becomes $0809 minus $0803, and so on). The routine at $07e1 reverses the delta encoding, then starts the demo. The demo is a part from "I Love the Cube" that uses drive code, so you'll need a 1541 drive attached.
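As a rough illustration of the stride-6 delta coding described above, here is a minimal Python sketch. It is not the actual 6502 routine at $07e1; the function names and the sample bytes are made up for this example, and it assumes differences are taken modulo 256.

```python
STRIDE = 6  # distance between paired bytes, as described in the post


def delta_encode(data: bytes, stride: int = STRIDE) -> bytes:
    """Replace each byte with its difference from the byte `stride` earlier."""
    out = bytearray(data)  # the first `stride` bytes stay unchanged
    for i in range(stride, len(data)):
        # Read from the ORIGINAL data so earlier results don't interfere.
        out[i] = (data[i] - data[i - stride]) & 0xFF
    return bytes(out)


def delta_decode(data: bytes, stride: int = STRIDE) -> bytes:
    """Undo delta_encode by walking forward and adding back the earlier byte."""
    out = bytearray(data)
    for i in range(stride, len(out)):
        # out[i - stride] has already been restored at this point.
        out[i] = (out[i] + out[i - stride]) & 0xFF
    return bytes(out)


# Illustrative input: three unrolled "LDA abs / STA abs" pairs (6 bytes each)
# whose low address bytes count upwards.
SAMPLE = bytes([
    0xAD, 0x00, 0x20, 0x8D, 0x00, 0x04,
    0xAD, 0x01, 0x20, 0x8D, 0x01, 0x04,
    0xAD, 0x02, 0x20, 0x8D, 0x02, 0x04,
])

if __name__ == "__main__":
    print(delta_encode(SAMPLE).hex(" "))
```

After encoding, every 6-byte group past the first turns into the same short pattern of small values, which is the kind of repetition a cruncher like Exomizer handles well.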
How well does this improve on Bongo Cruncher's "lovely golden sequences"? A few bytes that don't save any blocks? Nope. A block or two? Nope. 19 BLOCKS. And the exomizer-crunched version actually decrunches a bit faster -- about 8 seconds versus bongo2passgolden's 12 seconds.
29297 bytes (116 blocks) bongo2passgolden
24581 bytes (97 blocks) demodatad6.exo
Check it out:
demodatad6.zip - 53.4 Kb |
|
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
You say that as if it's some sort of achievement.
No cruncher is universal and a bit of data-specific preprocessing to help out is the generally accepted method of achieving decent compression ratios at reasonable speeds.
I'd be more interested to hear how Bongo Cruncher with delta coding compares. Or any others for that matter.
In fairness I should say that good preprocessing techniques rely on deep knowledge of how the compression algorithms work, which I should think may be more of a problem with Bongo Cruncher. |
| |
Burglar
Registered: Dec 2004 Posts: 1102 |
Quote: The uncrunched program contains a ton of unrolled loops repeating every 3 or 6 bytes
I'd guess that you'll get an even better result if you write code generators for the unrolled loops. |
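To illustrate the code generator idea: an unrolled loop is usually a fixed instruction template plus a few parameters, so a small generator can stand in for kilobytes of stored code. The sketch below is in Python for readability only (on the C-64 the generator would itself be 6502 code), and the LDA/STA copy pattern is an assumed example, not the demo's actual loops.

```python
LDA_ABS = 0xAD  # 6502 opcode: LDA absolute
STA_ABS = 0x8D  # 6502 opcode: STA absolute


def gen_unrolled_copy(src: int, dst: int, count: int) -> bytes:
    """Emit `count` repetitions of LDA src+n / STA dst+n as machine code.

    Each repetition is 6 bytes: opcode, address low, address high, twice.
    Addresses are assumed to stay within the 16-bit range.
    """
    code = bytearray()
    for n in range(count):
        s = src + n
        d = dst + n
        code += bytes([LDA_ABS, s & 0xFF, s >> 8,
                       STA_ABS, d & 0xFF, d >> 8])
    return bytes(code)


if __name__ == "__main__":
    speedcode = gen_unrolled_copy(0x2000, 0x0400, 100)
    print(len(speedcode), "bytes generated from three parameters")
```

The generated code repeats every 6 bytes, which is also why a delta coder with a stride of 6 works well on this kind of data.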
| |
Krill
Registered: Apr 2002 Posts: 2980 |
Plus code compresses better when you separate op-code and operand streams.
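A minimal Python sketch of what separating the streams could look like. The opcode length table here is a tiny hand-picked subset for illustration (a real splitter needs the full 6502 table), and the function names are hypothetical.

```python
# Instruction lengths in bytes for a small SUBSET of 6502 opcodes.
# A real tool would need all opcodes (and a policy for data bytes).
LENGTHS = {
    0xA9: 2,  # LDA #imm
    0xAD: 3,  # LDA abs
    0x8D: 3,  # STA abs
    0xE8: 1,  # INX
    0x60: 1,  # RTS
}


def split_streams(code: bytes) -> tuple[bytes, bytes]:
    """Split well-formed machine code into an opcode and an operand stream."""
    opcodes = bytearray()
    operands = bytearray()
    i = 0
    while i < len(code):
        op = code[i]
        size = LENGTHS[op]  # raises KeyError for opcodes outside the subset
        opcodes.append(op)
        operands += code[i + 1:i + size]
        i += size
    return bytes(opcodes), bytes(operands)


def join_streams(opcodes: bytes, operands: bytes) -> bytes:
    """Rebuild the original code by interleaving the two streams again."""
    code = bytearray()
    j = 0
    for op in opcodes:
        n = LENGTHS[op] - 1  # number of operand bytes for this opcode
        code.append(op)
        code += operands[j:j + n]
        j += n
    return bytes(code)
```

The two streams would then be crunched one after the other, with a small routine on the target machine interleaving them again after decrunching.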
Also i'd say that the gold standard to compare pack ratios against is still Exomizer. |
| |
Urban Space Cowboy
Registered: Nov 2004 Posts: 45 |
Quoting doynax: I'd be more interested to hear how Bongo Cruncher with delta coding compares. Or any others for that matter.
Well, that was motivation enough to learn how to compile Delphi Pascal. (In cruncher.dpr, add {$MODE DELPHI} above {$APPTYPE CONSOLE} -- then compile using Free Pascal: fpc -ocruncher cruncher.dpr)
24958 bytes (99 blocks) cruncher -i demodatad6.o -DepackAdr '$07dd' -o d6test1p.prg -prgfile -JMP '$07e1' -Value01 '$38' -nogoldenseq
24875 bytes (98 blocks) pucrunch -x0x7e1 +f -g0x38 -i0 -r59126 -fdelta demodatad6.o demodatad6.pu
For comparison, results on the original uncrunched file without delta encoding:
40003 bytes (158 blocks) pucrunch -x0x801 +f -r23107 -p1 -m6 -fdelta demodata.prg demodata.pu
34124 bytes (135 blocks) exomizer sfx 0x801 -q -t64 -Di_irq_exit=0 -Di_ram_exit='$38' -m26365 -M174 -n -o "$FILE" demodata.prg
29297 bytes (116 blocks) cruncher -i demodata.prg -DepackAdr $0801 -o bongo2passwithgoldenseq.prg -prgfile -JMP '$080D' -Value01 '$37' -iterate |
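The block counts quoted in this thread follow from the 1541 disk format, where each 256-byte sector carries 254 bytes of file data (the other two bytes link to the next sector). A quick Python check, assuming the byte counts listed are the file sizes as stored on disk:

```python
BLOCK_PAYLOAD = 254  # data bytes per 1541 sector (256 minus the 2-byte link)


def blocks(size_in_bytes: int) -> int:
    """Number of disk blocks needed for a file of the given size."""
    # Integer ceiling division: a partly filled last block still counts.
    return (size_in_bytes + BLOCK_PAYLOAD - 1) // BLOCK_PAYLOAD


if __name__ == "__main__":
    for size in (29297, 24581, 24958, 24875, 40003, 34124):
        print(size, "bytes ->", blocks(size), "blocks")
    print("saved:", blocks(29297) - blocks(24581), "blocks")
```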
| |
iAN CooG
Registered: May 2002 Posts: 3198 |
why -m26365 -M174 in exomizer? don't those switches always lead to worse compression? what is the result omitting those switches?
anyway, if you used a simple charpacker like IDIOTS Fx Bytepacker V2.1 then Rowscruncher V1.0 and finally exomize it you get 22283 bytes instead of 24581. Of course it takes a bit more to unpack. |
| |
Urban Space Cowboy
Registered: Nov 2004 Posts: 45 |
Quoting iAN CooG: why -m26365 -M174 in exomizer? don't those switches always lead to worse compression? what is the result omitting those switches?
No, not always. Sometimes there're edge cases where the "best" options give you slightly worse results, even after the optimization passes. I used a shell script to brute-force try different -m and -M options, same with -r for pucrunch.
34138 bytes (135 blocks) exomizer sfx 0x801 -q -t64 -Di_irq_exit=0 -Di_ram_exit='$38' -n -o blargh.exo demodata.prg
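The brute-force search mentioned above was done with a shell script that is not shown in the thread. As a hedged stand-in, here is a Python sketch of the same idea: try every combination of two settings and keep the one that gives the smallest output. The function names, the default file names, and the trimmed-down exomizer command line are assumptions for illustration; the -m and -M switches are written in the fused form used in the posts.

```python
import os
import subprocess
from itertools import product


def brute_force(crunch, m_values, big_m_values):
    """Return (size, m, M) for the smallest output that crunch(m, M) reports."""
    best = None
    for m, big_m in product(m_values, big_m_values):
        size = crunch(m, big_m)
        if best is None or size < best[0]:
            best = (size, m, big_m)
    return best


def exomizer_size(m, big_m, infile="demodata.prg", outfile="try.exo"):
    """Hypothetical runner: crunch once with exomizer, return the output size.

    Assumes an `exomizer` binary on the PATH; the argument list is a
    shortened version of the command lines quoted in this thread.
    """
    subprocess.run(
        ["exomizer", "sfx", "0x801", "-q",
         f"-m{m}", f"-M{big_m}", "-n", "-o", outfile, infile],
        check=True,
    )
    return os.path.getsize(outfile)
```

A run might look like `brute_force(exomizer_size, range(26000, 27000, 5), range(160, 200))`. Each trial is a full crunch, so a wide range takes a while.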
Quote: anyway, if you used a simple charpacker like IDIOTS Fx Bytepacker V2.1 then Rowscruncher V1.0 and finally exomize it you get 22283 bytes instead of 24581. Of course it takes a bit more to unpack.
Thanks for the pointer, it's nice to know other people've experimented along similar lines before. More ideas to pillage! |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
lzma squeezes the demodata.prg down to 24776 bytes, but i guess it is a PITA to write a fast and small decruncher for that? :-) One might start with PackFire 1.2h or such. Here are some decompressors, but none for 6510 yet: http://jiggawatt.org/badc0de/decrunch/ |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Quote: Plus code compresses better when you separate op-code and operand streams.
Also i'd say that the gold standard to compare pack ratios against is still Exomizer.
Most likely yes, but also consider that there is sometimes speedcode that is better generated on PC-side, as it is too complex or memory/time intensive to generate on the C-64. In the sphere mapper i did for coma light 13 i incorporated the mapping as well as register reuse and such into the speedcode. That is already a mess in C, there's no urge to work that out in 6502, despite my autistic affection :-) |
| |
Krill
Registered: Apr 2002 Posts: 2980 |
Quoting Bitbreaker: In the sphere mapper i did for coma light 13 i incorporated the mapping as well as register reuse and such into the speedcode. That is already a mess in C, there's no urge to work that out in 6502, despite my autistic affection :-)
Yes, i know what you mean. :)
The lens effect in +H2K also loads pre-generated speedcode from disk, also with optimized mapping and register re-use.
At some point generating it before running the effect took way too long on C-64, so i simply saved it for the final version rather than generating it. |