Log inRegister an accountBrowse CSDbHelp & documentationFacts & StatisticsThe forumsAvailable RSS-feeds on CSDbSupport CSDb Commodore 64 Scene Database
You are not logged in - nap
CSDb User Forums


Forums > C64 Coding > Nucrunch 0.1
2016-02-04 10:02
ChristopherJam

Registered: Aug 2004
Posts: 1409
Nucrunch 0.1

Continuing from the benchmarks WVL posted in Doynamite 1.x:

I dusted off my unfinished nucrunch in December to pack just enough of the second page of Reutastic to give me some workspace for some precalculations. Pity I didn't schedule enough time to pack the entire demo, else it would have been ~90 blocks instead of 190, but I digress. I've spent bits of the past month cleaning up the code, optimizing the packer (mostly by porting it from python to rust :P), and adding reverse direction support.

It's still no more than a component, with an commandline packer and asm decrunch subroutine, but no tools yet for generating an executable from a single commandline. It does at least now support multiple input segments that are unpacked to their destination addresses, and it's also now useable enough to for me to do some benchmarking.

In short, doynamite's ratio looks pretty unbeatable for anything lz based; my ratio's almost identical despite a somewhat different encoding.

Where I can win though is speed at that ratio; nucrunch is usually ten to twenty percent faster. The one exception in the test corpus is 6.bin, where it's 20% slower; not sure why yet.

I've added the times for pucrunch -ffast below for to complete the comparison. Last two columns are nucrunch, and nucrunch -r (the latter decodes in reverse; should be a more useful component for single filers)

If anyone wants to have a play at this stage, poke me and I'll upload some source. Failing that I'll hold off until I at least have something that can make onefilers without any faffing about with relocating the last couple of pages by hand.

filesizes
#   bin   rle wvl-f wvl-s    tc    bb  pu-f doyna nucru rnucr
- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1 11008  8020  4529  4151  4329  3383  3711  3265  3225  3230
2  4973  4314  3532  3309  3423  2648  3005  2512  2498  2490
3  3949  3498  2991  2617  2972  2187  2530  2108  2091  2093
4  7016  6456  4242  4085  4225  3681  3924  3617  3622  3614
5 34760 27647 25781 24895 25210 21306 21182 20405 20447 20516
6 31605 12511 11283 10923 11614  9194  9203  8904  8915  8894
7 20392 17295 12108 11285 11445  9627  9789  9289  9140  9144
8  5713  5407  4179  3916  3936  3251  3656  3132  3165  3187
9  8960  7986  6914  6896  6572  5586  6000  5430  5502  5486

filesize in %
#   bin   rle wvl-f wvl-s    tc    bb  pu-f doyna nucru rnucr
- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1   100  72.9  41.1  37.7  39.3  30.7  33.7  29.7  29.3  29.3
2   100  86.7  71.0  66.5  68.8  53.2  60.4  50.5  50.2  50.1
3   100  88.6  75.7  66.3  75.3  55.4  64.1  53.4  53.0  53.0
4   100  92.0  60.5  58.2  60.2  52.5  55.9  51.6  51.6  51.5
5   100  79.5  74.2  71.6  72.5  61.3  60.9  58.7  58.8  59.0
6   100  39.6  35.7  34.6  36.7  29.1  29.1  28.2  28.2  28.1
7   100  84.8  59.4  55.3  56.1  47.2  48.0  45.6  44.8  44.8
8   100  94.6  73.1  68.5  68.9  56.9  64.0  54.8  55.4  55.8
9   100  89.1  77.2  77.0  73.3  62.3  67.0  60.6  61.4  61.2

number of frames to depack
#   bin   rle wvl-f wvl-s    tc    bb  pu-f doyna nucru rnucr
- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1     0    11    13    14    15    58    54    27    22    22
2     0     5     7     7     9    38    39    17    14    14
3     0     4     6     6     7    28    31    12    10    10
4     0     8     9     9    10    43    51    20    17    18
5     0    36    39    42    59   300   298   119   104   107
6     0    20    25    25    37   126   152    49    59    59
7     0    22    25    26    32   138   139    60    51    52
8     0     6     8     8    10    43    47    18    16    17
9     0     9    12    12    16    73    81    32    28    29

kilobytes output per second
#   bin   rle wvl-f wvl-s    tc    bb  pu-f doyna nucru rnucr
- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1        49.0  41.4  38.5  35.9   9.3  10.0  20.0  24.5  24.5
2        48.7  34.8  34.8  27.0   6.4   6.2  14.3  17.4  17.4
3        48.3  32.2  32.2  27.6   6.9   6.2  16.1  19.3  19.3
4        42.9  38.2  38.2  34.3   8.0   6.7  17.2  20.2  19.1
5        47.3  43.6  40.5  28.8   5.7   5.7  14.3  16.4  15.9
6        77.4  61.9  61.9  41.8  12.3  10.2  31.6  26.2  26.2
7        45.4  39.9  38.4  31.2   7.2   7.2  16.6  19.6  19.2
8        46.6  35.0  35.0  28.0   6.5   6.0  15.5  17.5  16.5
9        48.7  36.5  36.5  27.4   6.0   5.4  13.7  15.7  15.1

cycles per byte consumed
#   bin   rle wvl-f wvl-s    tc    bb  pu-f doyna nucru rnucr
- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1     0    27    56    66    68   337   286   163   134   134
2     0    23    39    42    52   282   255   133   110   111
3     0    22    39    45    46   252   241   112    94    94
4     0    24    42    43    47   230   255   109    92    98
5     0    26    30    33    46   277   277   115   100   103
6     0    31    44    45    63   269   325   108   130   130
7     0    25    41    45    55   282   279   127   110   112
8     0    22    38    40    50   260   253   113    99   105
9     0    22    34    34    48   257   265   116   100   104

decrunch time for nucrunch/rnucrunch relative to doynamite
1:  81.5% (-18.5%)  81.5% (-18.5%)
2:  82.4% (-17.6%)  82.4% (-17.6%)
3:  83.3% (-16.7%)  83.3% (-16.7%)
4:  85.0% (-15.0%)  90.0% (-10.0%)
5:  87.4% (-12.6%)  89.9% (-10.1%)
6: 120.4% ( 20.4%) 120.4% ( 20.4%)
7:  85.0% (-15.0%)  86.7% (-13.3%)
8:  88.9% (-11.1%)  94.4% ( -5.6%)
9:  87.5% (-12.5%)  90.6% ( -9.4%)
 
... 95 posts hidden. Click here to view all posts....
 
2016-02-12 18:51
JackAsser

Registered: Jun 2002
Posts: 2014
Quote: Is this thread just being hijacked!?

Oswald 'The Hijacker'
2016-02-12 22:55
ChristopherJam

Registered: Aug 2004
Posts: 1409
Quoting algorithm
Actually should be able to do a delta every 4 lines..

Set up d800-d827 before the first badline..

1
when the badline occurs, do the delta update for the next 4 lines

force second badline 4 pixels down. at this stage, the updates to d800-d827 will occur

then write the delta to change back the values to how they were originally at d828-d84f

repeat from 1


Exactly what I did. Still waiting on art from someone, or you'd have seen a release by now.
2016-02-13 09:16
MagerValp

Registered: Dec 2001
Posts: 1078
Quote: I suppose you have a scratchpad somewhere in memory, like in the decruncher code to handle it then?

No, no extra memory, that's the beauty of it, only the end address or a counter. I'll sketch something out to show how it works, there might be a corner case that I'm not thinking of.
2016-02-13 10:47
Bitbreaker

Registered: Oct 2002
Posts: 508
But wouldn't that only work if we end with a match then? As long as the last thing to do is copying a literal this would still fail as with packers like Doynax, BB and similar, as one would have some control stream bits/bytes before that to signal a literal and give a length. If the following literal bytes would not overlap, the control bits would have been overwritten by the preceeding action and things would fail. In case things end with a match, all would be fine, only the copying of the match would then kill the already read control bits and the decruncher would end afterwards.
2016-02-13 10:52
ChristopherJam

Registered: Aug 2004
Posts: 1409
Update time!!
All results from above combined, as well as some new ones

Thanks MagerValp for LZMV* & exo*, and Bitbreaker for Bitfire.
Also thanks to HCL for his patience, I've finally switched to the fast decompressor for BB.

Columns sorted by output rate (guessed positions for LZMPi and exoraw, the only ones I don't yet have times for):
filesizes
#    bin    rle  wvl-f  wvl-s LZMV25     tc LZMV4k rnucru nucrun  bb2.0 bitfir doynax  LZMPi exomem   pu-f exoraw
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
1  11008   8020   4529   4151   4539   4329   4205   3230   3225   3322   3324   3265   3184   2988   3711   2988
2   4973   4314   3532   3309   3575   3423   3183   2490   2498   2513   2515   2512   2410   2225   3005   2241
3   3949   3498   2991   2617   3018   2972   2551   2093   2091   2098   2097   2108   1931   1808   2530   1817
4   7016   6456   4242   4085   4314   4225   4343   3614   3622   3682   3682   3617   3571   3442   3924   3454
5  34760  27647  25781  24895  26116  25210  23845  20516  20447  20530  20531  20405  20362  19715  21182  19631
6  31605  12511  11283  10923  11352  11614  10619   8896   8915   8998   9004   8904   8719   8322   9203   8337
7  20392  17295  12108  11285  12188  11445  11154   9145   9140   9241   9242   9289   9256   8765   9789   8751
8   5713   5407   4179   3916   3987   3936   3959   3187   3166   3165   3162   3132   3048   3081   3656   3059
9   8960   7986   6914   6896   6943   6572   6505   5487   5502   5491   5491   5430   5563   5304   6000   5295

filesize in %
#    bin    rle  wvl-f  wvl-s LZMV25     tc LZMV4k rnucru nucrun  bb2.0 bitfir doynax  LZMPi exomem   pu-f exoraw
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
1  100.0   72.9   41.1   37.7   41.2   39.3   38.2   29.3   29.3   30.2   30.2   29.7   28.9   27.1   33.7   27.1
2  100.0   86.7   71.0   66.5   71.9   68.8   64.0   50.1   50.2   50.5   50.6   50.5   48.5   44.7   60.4   45.1
3  100.0   88.6   75.7   66.3   76.4   75.3   64.6   53.0   53.0   53.1   53.1   53.4   48.9   45.8   64.1   46.0
4  100.0   92.0   60.5   58.2   61.5   60.2   61.9   51.5   51.6   52.5   52.5   51.6   50.9   49.1   55.9   49.2
5  100.0   79.5   74.2   71.6   75.1   72.5   68.6   59.0   58.8   59.1   59.1   58.7   58.6   56.7   60.9   56.5
6  100.0   39.6   35.7   34.6   35.9   36.7   33.6   28.1   28.2   28.5   28.5   28.2   27.6   26.3   29.1   26.4
7  100.0   84.8   59.4   55.3   59.8   56.1   54.7   44.8   44.8   45.3   45.3   45.6   45.4   43.0   48.0   42.9
8  100.0   94.6   73.1   68.5   69.8   68.9   69.3   55.8   55.4   55.4   55.3   54.8   53.4   53.9   64.0   53.5
9  100.0   89.1   77.2   77.0   77.5   73.3   72.6   61.2   61.4   61.3   61.3   60.6   62.1   59.2   67.0   59.1
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
~  100.0   80.9   63.1   59.5   63.2   61.3   58.6   48.1   48.1   48.4   48.4   48.1   47.1   45.1   53.7   45.1
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------

number of frames to depack
#    bin    rle  wvl-f  wvl-s LZMV25     tc LZMV4k rnucru nucrun  bb2.0 bitfir doynax  LZMPi exomem   pu-f exoraw
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
1    0.0   11.5   13.5   14.5   13.7   15.5   16.9   21.3   22.0   24.3   24.4   27.5    0.0   57.9   54.7    0.0
2    0.0    5.5    7.5    7.5    8.4    9.5    9.6   13.6   13.7   15.1   15.1   17.5    0.0   38.5   39.6    0.0
3    0.0    4.5    6.5    6.5    6.2    7.5    7.1   10.1   10.0   10.9   10.9   12.5    0.0   28.5   31.9    0.0
4    0.0    8.5    9.5    9.5    9.4   10.5   11.5   17.1   16.6   18.1   18.1   20.5    0.0   53.1   52.0    0.0
5    0.0   36.5   39.5   42.5   46.6   59.5   58.4  100.6  101.4  107.7  107.4  119.5    0.0  295.9  298.6    0.0
6    0.0   20.5   25.5   25.5   38.2   37.5   35.8   56.6   57.9   61.5   62.0   49.5    0.0  142.3  152.8    0.0
7    0.0   22.5   25.5   26.5   29.1   32.5   35.0   50.0   50.3   54.5   54.5   60.5    0.0  139.2  139.8    0.0
8    0.0    6.5    8.5    8.5    8.9   10.5   10.3   15.8   15.6   16.9   17.0   18.5    0.0   44.8   47.7    0.0
9    0.0    9.5   12.5   12.5   14.2   16.5   16.6   27.2   27.1   29.6   29.4   32.5    0.0   78.9   81.7    0.0

kilobytes output per second
#    bin    rle  wvl-f  wvl-s LZMV25     tc LZMV4k rnucru nucrun  bb2.0 bitfir doynax  LZMPi exomem   pu-f exoraw
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
1          46.9   39.9   37.2   39.3   34.8   31.9   25.3   24.5   22.2   22.1   19.6           9.3    9.9       
2          44.3   32.5   32.5   29.0   25.6   25.4   17.9   17.7   16.1   16.1   13.9           6.3    6.1       
3          43.0   29.7   29.7   31.2   25.8   27.2   19.2   19.4   17.7   17.8   15.5           6.8    6.1       
4          40.4   36.2   36.2   36.5   32.7   29.9   20.1   20.6   18.9   19.0   16.8           6.5    6.6       
5          46.6   43.1   40.0   36.5   28.6   29.1   16.9   16.8   15.8   15.8   14.2           5.8    5.7       
6          75.5   60.7   60.7   40.5   41.3   43.2   27.3   26.7   25.2   25.0   31.3          10.9   10.1       
7          44.4   39.1   37.7   34.3   30.7   28.5   20.0   19.8   18.3   18.3   16.5           7.2    7.1       
8          43.0   32.9   32.9   31.4   26.6   27.2   17.7   18.0   16.5   16.4   15.1           6.2    5.9       
9          46.2   35.1   35.1   30.9   26.6   26.4   16.1   16.2   14.8   14.9   13.5           5.6    5.4       
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
~    0.0   47.8   38.8   38.0   34.4   30.3   29.9   20.1   20.0   18.4   18.4   17.4    0.0    7.2    7.0    0.0
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------

cycles per byte consumed
#    bin    rle  wvl-f  wvl-s LZMV25     tc LZMV4k rnucru nucrun  bb2.0 bitfir doynax  LZMPi exomem   pu-f exoraw
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
1    0.0   28.2   58.6   68.7   59.3   70.4   79.0  129.4  134.0  143.6  144.1  165.6    0.0  380.9  289.7    0.0
2    0.0   25.1   41.7   44.6   46.2   54.6   59.3  107.5  108.0  118.4  118.4  136.9    0.0  340.1  259.3    0.0
3    0.0   25.3   42.7   48.8   40.4   49.6   54.7   94.6   93.8  102.4  102.0  116.6    0.0  309.8  247.9    0.0
4    0.0   25.9   44.0   45.7   42.8   48.8   52.0   93.1   90.3   96.8   96.4  111.4    0.0  303.2  260.4    0.0
5    0.0   26.0   30.1   33.6   35.1   46.4   48.1   96.4   97.5  103.1  102.8  115.1    0.0  295.0  277.1    0.0
6    0.0   32.2   44.4   45.9   66.1   63.5   66.3  125.1  127.7  134.3  135.4  109.3    0.0  336.1  326.3    0.0
7    0.0   25.6   41.4   46.2   46.9   55.8   61.7  107.4  108.2  115.9  116.0  128.0    0.0  312.2  280.7    0.0
8    0.0   23.6   40.0   42.7   43.9   52.4   51.1   97.2   96.6  105.2  105.9  116.1    0.0  285.8  256.4    0.0
9    0.0   23.4   35.5   35.6   40.2   49.3   50.2   97.4   96.9  105.9  105.4  117.6    0.0  292.4  267.7    0.0
- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
2016-02-13 11:11
MagerValp

Registered: Dec 2001
Posts: 1078
Quote: But wouldn't that only work if we end with a match then? As long as the last thing to do is copying a literal this would still fail as with packers like Doynax, BB and similar, as one would have some control stream bits/bytes before that to signal a literal and give a length. If the following literal bytes would not overlap, the control bits would have been overwritten by the preceeding action and things would fail. In case things end with a match, all would be fine, only the copying of the match would then kill the already read control bits and the decruncher would end afterwards.

Yes, that's exactly the case. The trick is to ensure that the compressed stream ends with a match. If it ends with a literal, include the literal data without any control bytes at the correct location. Illustration here:

https://www.dropbox.com/s/9zttmj3avjm6krl/LZMV%20in%20place%20u..

Note that it packs backwards, so the start is at the top of memory and the ending literal "!" in the second example is at the bottom. Unpacking backwards enables dey/bpl copy loops for speed and lets you keep the load address. Speed for this should be on par with WVL's packers.
2016-02-13 11:24
MagerValp

Registered: Dec 2001
Posts: 1078
@ChristopherJam: Lovely! The pattern I see here is that there are three speed classes. You have the fast LZSS variants (WVL, LZMV), you have the efficient but slow LZ + Huffman (exo, pu), and in the middle you have nu/bb2/bitfire/doynax.

It also makes it obvious that LZMV (and maybe tc) needs speed optimization to be competitive :)
2016-02-13 11:37
Bitbreaker

Registered: Oct 2002
Posts: 508
Thanks MagerValp, that was the tiny bit of info that was missing, in deed one can leave the last literal just there without any more control bits :-D As an extra, when we are guaranteed to end with a match, it is also sufficient to do the end check on matches only, as before with that maximum run as an eof marker. It will however add a few more bytes to the decruncher, but speedwise it would be mostly as fast as a tay + beq as eof detection by having a cmp + beq (in most cases that is, as we would require to check on 16 bit in case)
I'll give this a try the next days :-)
2016-02-13 12:05
MagerValp

Registered: Dec 2001
Posts: 1078
Go for it! :)

And Excel just crashed on me while I was trying to make a pretty scatter plot of the results :/
2016-02-13 12:31
Martin Piper

Registered: Nov 2007
Posts: 726
Very slow LZMPi decompression cycles:
1.bin 1558777
2.bin 1044272
3.bin 813537
4.bin 1482657
5.bin 8471668
6.bin 4323352
7.bin 4044852
8.bin 1278691
9.bin 2307100

Hey, it's optimised for space not speed. :D
Previous - 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 - Next
RefreshSubscribe to this thread:

You need to be logged in to post in the forum.

Search the forum:
Search   for   in  
All times are CET.
Search CSDb
Advanced
Users Online
iAN CooG/HVSC
Twoflower/ΤRIΛD
Higgie/Kraze/Slackers
Dano/Padua
CA$H/TRiAD
Worluk
Airwolf/F4CG
Guests online: 124
Top Demos
1 Next Level  (9.7)
2 13:37  (9.7)
3 Coma Light 13  (9.6)
4 Edge of Disgrace  (9.6)
5 Mojo  (9.6)
6 Uncensored  (9.6)
7 The Demo Coder  (9.6)
8 Comaland 100%  (9.6)
9 Wonderland XIV  (9.6)
10 What Is The Matrix 2  (9.6)
Top onefile Demos
1 Layers  (9.7)
2 Cubic Dream  (9.6)
3 Party Elk 2  (9.6)
4 Copper Booze  (9.6)
5 Rainbow Connection  (9.5)
6 Morph  (9.5)
7 Dawnfall V1.1  (9.5)
8 Libertongo  (9.5)
9 Katzen-Video.mp4  (9.5)
10 Onscreen 5k  (9.5)
Top Groups
1 Booze Design  (9.3)
2 Oxyron  (9.3)
3 Performers  (9.3)
4 Fairlight  (9.3)
5 Triad  (9.3)
Top Graphicians
1 Hend  (9.8)
2 Mirage  (9.7)
3 Archmage  (9.7)
4 Pal  (9.6)
5 Carrion  (9.6)

Home - Disclaimer
Copyright © No Name 2001-2025
Page generated in: 0.065 sec.