| |
oziphantom
Registered: Oct 2014 Posts: 490 |
Packer vs Code VM vs Code Generator
Has anybody experimented with, or does anybody know of, code VMs or generators?
Just thinking that, to get code size down, they might perform better than Exomizer et al.
I know of Sweet16 by Woz; any others? |
|
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
I wonder if IO register setting code is common enough to benefit from a pseudo-op that moves an eight-bit immediate into $Dxxx, where xxx has a number of implicit zeros, or possibly even a series of N such operations.
|
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
Well you can get
STA SSRRRRRR Value
where SS is
00 VIC
01 SID
10 CIA 1
11 CIA 2
And R is the register
allowing you to pack any IO address into a byte. In a game context I don't think that will save you much. Typically the screen mode and charset registers are set and forget. The multiplexer will usually deal with the sprites in one function, and then you have a few $D012/$D019/$D01A accesses for the IRQs. |
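As a rough illustration of the SSRRRRRR packing described above, here is a minimal Python sketch. The function names `pack_io_address` and `unpack_io_address` are made up for this example; the chip base addresses are the standard C64 ones, and the sketch assumes every register of interest sits within the first 64 bytes of its chip's base.

```python
# Base addresses of the four C64 I/O chips, indexed by the two SS bits.
IO_BASES = (0xD000, 0xD400, 0xDC00, 0xDD00)  # VIC, SID, CIA 1, CIA 2


def pack_io_address(addr):
    """Pack a chip register address into one SSRRRRRR byte (hypothetical helper)."""
    for select, base in enumerate(IO_BASES):
        reg = addr - base
        if 0 <= reg < 0x40:  # six R bits give register offsets 0..63
            return (select << 6) | reg
    raise ValueError("address %04X is not a packable IO register" % addr)


def unpack_io_address(packed):
    """Expand an SSRRRRRR byte back into the full 16-bit address."""
    return IO_BASES[packed >> 6] + (packed & 0x3F)


if __name__ == "__main__":
    for addr in (0xD012, 0xD015, 0xD418, 0xDC0D, 0xDD00):
        packed = pack_io_address(addr)
        print("%04X -> %02X -> %04X" % (addr, packed, unpack_io_address(packed)))
```

Each packed store then costs two bytes (selector plus value) instead of the five bytes of an LDA #imm / STA abs pair, before counting the pseudo-op byte and the interpreter itself.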
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
I've been working under the impression that packing all the data near each other and the code together would help with compression.
so
CODE
DATA
Is this misguided, do you think? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Grouping is generally good.
Most crunchers encode "copy recent substring" with fewer bits the more recent the substring, so grouping all your code separately from your data means the cruncher doesn't have to look as far to find prior examples, hence can use fewer bits for the copy offset. |
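To make the "fewer bits for nearer matches" point concrete, here is a small Python sketch that prices a copy offset with an Elias-gamma code. This is only an assumed stand-in: real crunchers such as Exomizer use their own offset encodings, so the exact numbers will differ, but the trend (cost grows with distance) is the same.

```python
def gamma_bits(value):
    """Number of bits an Elias-gamma code needs for a positive integer."""
    if value < 1:
        raise ValueError("gamma codes are defined for values >= 1")
    # floor(log2(value)) zero bits as a prefix, then the value itself.
    return 2 * (value.bit_length() - 1) + 1


if __name__ == "__main__":
    # Hypothetical distances back to the previous occurrence of a code snippet.
    for offset in (12, 200, 3000):
        print("offset %5d costs %2d bits" % (offset, gamma_bits(offset)))
```

With this particular code, a match 12 bytes back costs 7 bits for the offset, while one 3000 bytes back costs 23, which is why keeping similar material close together tends to pay off.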
| |
Krill
Registered: Apr 2002 Posts: 2980 |
Quoting oziphantom:
"it would actually consume more memory than taking the whole lot as a custom stream"
The two separate streams take exactly the same amount of memory as the conventional 6502 counterpart. Your example would translate to
$a9,$8d,$a9,$8d,$a9,$8d
$7f,$15,$d0,$00,$84,$27,$d0,$84,$e7,$43
and it's plain to see that the op-codes would compress well, and data arguments too at some later point when using the same literals and addresses many times.
Of course, these two chunks will have to be re-combined after decompression, which adds an initial cost of about $0180 or so bytes (which in my case amortised for the 4K, so overall i saved a few pages).
But please, do look at the Artefacts source code at some point to see what i did. |
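For readers who want to play with the idea of separate op-code and operand streams, here is a minimal Python sketch. It is not the Artefacts implementation: the length table is a tiny made-up subset of the 6502 opcode map, and the sample program is invented for illustration rather than taken from the thread.

```python
# Hypothetical subset of the 6502 opcode map: opcode byte -> instruction size.
OPCODE_LENGTH = {
    0xA9: 2,  # LDA #imm
    0x8D: 3,  # STA abs
    0xA2: 2,  # LDX #imm
    0x8E: 3,  # STX abs
    0x60: 1,  # RTS
}


def split_streams(code):
    """Split machine code into an op-code stream and an operand stream."""
    opcodes, operands = bytearray(), bytearray()
    pc = 0
    while pc < len(code):
        size = OPCODE_LENGTH[code[pc]]
        opcodes.append(code[pc])
        operands += code[pc + 1:pc + size]
        pc += size
    return bytes(opcodes), bytes(operands)


def merge_streams(opcodes, operands):
    """Re-interleave the two streams into executable machine code."""
    code, pos = bytearray(), 0
    for op in opcodes:
        count = OPCODE_LENGTH[op] - 1  # operand bytes following this opcode
        code.append(op)
        code += operands[pos:pos + count]
        pos += count
    return bytes(code)


if __name__ == "__main__":
    # LDA #$7F : STA $D015 : LDA #$00 : STA $D020 : RTS (invented example)
    program = bytes([0xA9, 0x7F, 0x8D, 0x15, 0xD0,
                     0xA9, 0x00, 0x8D, 0x20, 0xD0, 0x60])
    ops, args = split_streams(program)
    print(ops.hex(), args.hex())
    print(merge_streams(ops, args) == program)
```

The two streams together are exactly as long as the original code; the gain only appears once a cruncher sees the repetitive op-code stream on its own.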
| |
soci
Registered: Sep 2003 Posts: 480 |
Storing opcode lengths in a 64 byte bit-array, what a waste of space ;) But the use of (,x) to switch streams is smart. |
| |
Krill
Registered: Apr 2002 Posts: 2980 |
:D One of the few occasions where (ZP,X) actually comes in handy. Which is mostly whenever you have a number of "buckets" to select from AND need to push or pull stuff to and fro.
As for op-code lengths, there didn't seem to be a nice enough pattern to store that more efficiently. Talk about a non-orthogonal instruction set. So i went for a solution that somehow suited exomizer best. :) |
| |
soci
Registered: Sep 2003 Posts: 480 |
From the processor's viewpoint the least significant 5 bits of the opcode fully determine the length of the addressing mode.
Still it seems that BRK/JSR/RTI/RTS are exceptions. But that's only because the modification of PC is covering up that they're actually 2 bytes long.
I think if you reduce the table size the result will be shorter even with the special handling added in for these opcodes. |
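A sketch of what such a reduced table could look like, in Python. The 32-entry table is indexed by the low five bits of the opcode, with a small exception map for $00/$20/$40/$60. Two assumptions to flag: the exception values use the assembled sizes (BRK counted as 1 byte, JSR as 3), and the entries that only undocumented opcodes reach are filled in on a best-effort basis.

```python
# Instruction size indexed by the low five bits of the opcode.
# Entries reached only by undocumented opcodes are best-effort assumptions.
LENGTH_BY_LOW5 = (
    2, 2, 2, 2, 2, 2, 2, 2,  # $00-$07: #imm, (zp,x), #imm, -, zp, zp, zp, -
    1, 2, 1, 2, 3, 3, 3, 3,  # $08-$0F: implied, #imm, accumulator, -, abs x4
    2, 2, 1, 2, 2, 2, 2, 2,  # $10-$17: branch, (zp),y, -, -, zp,x x4
    1, 3, 1, 3, 3, 3, 3, 3,  # $18-$1F: implied, abs,y, implied, -, abs,x x4
)

# Opcodes that break the pattern (assembled sizes, an assumption of this sketch).
EXCEPTIONS = {0x00: 1, 0x20: 3, 0x40: 1, 0x60: 1}  # BRK, JSR, RTI, RTS


def instruction_length(opcode):
    """Return the size in bytes of the instruction starting with `opcode`."""
    if opcode in EXCEPTIONS:
        return EXCEPTIONS[opcode]
    return LENGTH_BY_LOW5[opcode & 0x1F]


if __name__ == "__main__":
    for op in (0xA9, 0x8D, 0x20, 0x60, 0xD0, 0xEA, 0x4C):
        print("$%02X -> %d" % (op, instruction_length(op)))
```

Whether this beats a flat bit-array in practice depends on how the extra lookup code and the table compress, which the sketch does not measure.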
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: From the processor's viewpoint the least significant 5 bits of the opcode fully determine the length of the addressing mode.
Still it seems that BRK/JSR/RTI/RTS are exceptions. But that's only because the modification of PC is covering up that they're actually 2 bytes long.
I think if you reduce the table size the result will be shorter even with the special handling added in for these opcodes.
It would be very slow, but couldn't you depack that stream without a table? :) Imagine placing a BRK after the opcode in the stream, then using the IRQ handler to determine the opcode length by checking the pushed PC on the stack! Sneaky! |
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
Quoting Krill:
Quoting oziphantom:
"it would actually consume more memory than taking the whole lot as a custom stream"
The two separate streams take exactly the same amount of memory as the conventional 6502 counterpart. Your example would translate to
$a9,$8d,$a9,$8d,$a9,$8d
$7f,$15,$d0,$00,$84,$27,$d0,$84,$e7,$43
and it's plain to see that the op-codes would compress well, and data arguments too at some later point when using the same literals and addresses many times.
Of course, these two chunks will have to be re-combined after decompression, which adds an initial cost of about $0180 or so bytes (which in my case amortised for the 4K, so overall i saved a few pages).
But please, do look at the Artefacts source code at some point to see what i did.
My line is referencing a normal packer like exo, not your split system. |