oziphantom
Registered: Oct 2014 Posts: 490
Modify Exomizer compressor to blacklist some memory locations
Does anybody know a way to modify the Exomizer compression algorithm to blacklist the FF0X memory locations so they are never read from, i.e. don't allow them to be used as part of a sequence?
I guess a post-process would also work, if there is a simple way to convert a sequence back into literal bytes..
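For what it's worth, the post-process idea can be sketched like this (Python; the token format and all names here are made up for illustration, this is not Exomizer's actual internal representation): walk the token stream, and expand any sequence whose source span would be read from a blacklisted address into plain literals.

```python
# Hypothetical token stream: ("lit", byte) or ("match", offset, length).
# Any match whose SOURCE span would read a blacklisted address gets
# expanded into literal tokens; the decoded output bytes are unchanged.
BLACKLIST = range(0xFF00, 0xFF09)   # assumption: $FF00-$FF08 must never be read

def expand_blacklisted_matches(tokens, load_addr):
    out = []
    data = bytearray()                         # uncompressed output so far
    for tok in tokens:
        if tok[0] == "lit":
            out.append(tok)
            data.append(tok[1])
            continue
        _, offset, length = tok
        src = load_addr + len(data) - offset   # address the copy reads from
        bad = any(a in BLACKLIST for a in range(src, src + length))
        for _ in range(length):                # decode the match either way
            b = data[len(data) - offset]
            data.append(b)
            if bad:
                out.append(("lit", b))         # rewrite as literals
        if not bad:
            out.append(tok)                    # keep the cheap match
    return out
```

Trading those few matches for literals costs a little ratio, but only for sequences that actually cross the blacklisted window.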
Krill
Registered: Apr 2002 Posts: 2980
In order to avoid an XY problem here, what do you actually want to achieve? :)
oziphantom
Registered: Oct 2014 Posts: 490
Decompressing above FF00 on a 128.
I have the system modified so it does a faster decompression that can copy the data to the other bank, which lets me avoid any overlap issues and just makes life easier on the whole. But FF00-5 are the MMU registers, and if you write to them it swaps the MMU config - this is not a problem, as I have a method to write under the MMU. However, there is still a problem when the SRC ptr tries to read from the MMU registers, as the MMU also steals the reads, so it won't get the actual value it wants. If I can get the compressed data to only ever write to FF0X (not 100% sure if it's only 0-5 that has the issue, or 0-8), then it will work fine. If not, I will need to cut my input files into two, X-FEFF and then FF08-FFFF, and patch the bytes in the middle - a bit of a hassle, and making the compressed file only write to and never read from that range would be a lot simpler in the long run, as having the VIC bank at C000-FFFF is a really common thing 64 games do..
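The cut-into-two fallback amounts to some simple bookkeeping; a rough sketch (Python; the FF00-FF08 window and every name here are assumptions, just to show the split):

```python
MMU_LO, MMU_HI = 0xFF00, 0xFF08     # assumed problem window

def split_image(image, load_addr):
    """Cut a raw memory image into crunchable segments that skip the MMU
    window, plus the raw bytes to patch into $FF00-$FF08 afterwards."""
    end = load_addr + len(image)
    segs, patch = [], b""
    if load_addr < MMU_LO:                      # part below the window
        segs.append((load_addr, bytes(image[:min(end, MMU_LO) - load_addr])))
    lo, hi = max(load_addr, MMU_LO), min(end, MMU_HI + 1)
    if lo < hi:                                 # bytes inside the window
        patch = bytes(image[lo - load_addr:hi - load_addr])
    if end > MMU_HI + 1:                        # part above the window
        s = max(load_addr, MMU_HI + 1)
        segs.append((s, bytes(image[s - load_addr:])))
    return segs, patch
```

Each segment gets crunched on its own and the patch bytes are poked in under the MMU after decrunching.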
Krill
Registered: Apr 2002 Posts: 2980
If you have modified the decompressor for faster operation on a C-128, can you use stack relocation for reading from any RAM location? It's 4 cycles per byte with built-in auto-increment... :)
Oswald
Registered: Apr 2002 Posts: 5094
why not just cut off the files at feff :P
ChristopherJam
Registered: Aug 2004 Posts: 1409
nucrunch as-is supports decrunching a set of segments with a single call. The ratio's not as good as exo, mind.
Each segment's compressed separately at the moment (i.e. it doesn't refer back to substrings from earlier segments). Upsides and downsides for your use case.
Imma add that blacklist thing to the desiderata for my next cruncher (may be some time until release :P)
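The blacklist can also live inside the match finder itself rather than as a post-process: a candidate match is simply rejected (or clamped shorter) when its source span would cross a blacklisted address. A naive sketch (Python; everything here is illustrative, not nucrunch's or Exomizer's actual search):

```python
BLACKLIST = range(0xFF00, 0xFF09)   # assumption: $FF00-$FF08 must never be read

def longest_allowed_match(data, pos, load_addr, max_len=255):
    """Brute-force longest match ending search that clamps any match so
    its source addresses stay clear of the blacklist."""
    best_off, best_len = 0, 0
    for off in range(1, pos + 1):
        l = 0
        while (pos + l < len(data) and l < max_len
               and data[pos + l - off] == data[pos + l]):
            l += 1
        src = load_addr + pos - off
        for i in range(l):          # clamp at the first blacklisted address
            if (src + i) in BLACKLIST:
                l = i
                break
        if l > best_len:
            best_off, best_len = off, l
    return best_off, best_len
```

A real cruncher would do this inside its optimal-parse cost model, but the principle is the same: the forbidden sources just never become candidates.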
oziphantom
Registered: Oct 2014 Posts: 490
Converting rdecrunch to be 128-optimal/bank-agnostic looks like it would be quite a challenge ;) Exomizer's "Y always decreases" makes it easy ;)
oziphantom
Registered: Oct 2014 Posts: 490
Quote: If you have modified the decompressor for faster operation on a C-128, can you use stack relocation for reading from any RAM location? It's 4 cycles per byte with built-in auto-increment... :)
No:
a.) You will still need a counter to know when to inc the page byte, as the MMU doesn't extend the stack to a 16-bit pointer for you, so you would need to keep a ZP location or a register to detect the 0 overflow case.
b.) Because I use the stack relocation to write the bytes. This lets me write to any bank, so I can read from bank 0, then write to bank 1 with it. It also always hits RAM, so it will write under FF0X and under IO while IO is banked in, so I don't even have to modify the banking.
And this turns a sta (XX),y into a pha <- 3 clocks ;) and then 2MHz mode, so 1.5..
The catch is that the "read sequence" reads from the uncompressed data, not the compressed data, so I have to make that read more expensive, in that it needs a bank change. So I have to enable shared mem so my read code doesn't magic itself away (one could just put the code into both places to avoid this, however), switch banks & disable IO (if I keep it on normally) - good ole PCR registers to the rescue - lda (xx),y, then switch the bank back, PCR again, and disable shared memory. So the load eats 26 more clocks. I figure that since all the code has to be inlined (no stack operations are allowed in the decompressor) and a write happens for every byte, this works out a net win. Then 2MHz on top..
Since I don't have to worry about any overlap, I can now compress 2-FF00 (would be nice if it was 2-FFFF); I can technically do 0-, but those 2 bytes are kind of useless. And I don't have to move the data down at the start for "overlap safety", so I also pick up a massive boost from not having to shuffle any data around.
I also move the ZP into $1XX. This allows me to put the variables and the 156-byte buffer into "ZP" while leaving the actual ZP untouched, for even more speed.
Now if I could just get it to not read from FF0X, it would be perfect..
Krill
Registered: Apr 2002 Posts: 2980
My loader has an option to allow loading under the IO space at $d000..$dfff.
It is implemented to favour speed over size: there are two getblock routines - one regular, and one with all the slow bank switching in place. Depending on where the incoming block will go, either the fast-RAM or the slow-IO incarnation is picked.
Could such a setup work for you? Technically, it is indeed possible to both read from and write to memory under the MMU space at $ff0X, isn't it?
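The dispatch itself is just an interval-overlap test per block; a sketch (Python; the ranges and names are placeholders for illustration, not the loader's actual code):

```python
# Assumed "slow" windows: the IO space at $D000-$DFFF and the MMU
# registers at $FF00-$FF04 (inclusive bounds).
SLOW_RANGES = [(0xD000, 0xDFFF), (0xFF00, 0xFF04)]

def pick_getblock(dest, length):
    """Return which copy routine a block landing at dest should use."""
    end = dest + length - 1
    for lo, hi in SLOW_RANGES:
        if dest <= hi and end >= lo:     # interval overlap test
            return "getblock_slow"       # bank-switching variant
    return "getblock_fast"               # plain-RAM variant
```

Since almost all blocks land in plain RAM, the slow variant's overhead only ever applies to the handful of blocks that actually touch a problem window.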
oziphantom
Registered: Oct 2014 Posts: 490
You can switch the stack to SRC-1, pla, swap the stack back to DEST, pha, to read/write under the FF0X block, but it's a lot of overhead for 8 bytes ;) You can't use the ZP relocation trick, as you can't read 00/01 with it.
For a loader-based solution, I write the exomizer file backwards to the disk, then the get-byte just does:
toggle clock for next byte
read byte from SSR
Bank switching on the 128 is not that bad, as you can set the Pre-Config registers, so one has Bank 0 RAM + IO and two has Bank 0 RAM;
then
STA $ff02
STA DEST
STA $ff01
only adds 8 clocks. Really handy for the pesky IRQs as well: now the IRQ can just switch to its PCR and then switch back to the "main thread's" one - no more lda $1 pha lda #value sta $1 .... pla sta $1 :D, or living on the edge with inc/dec tricks.
How does your loader handle banks? It would be nice to have a loader that handles Tass 3-byte PRGs so I can load into both banks; being able to make one large file that loads from disk and unpacks into both banks would be great.
However, in this case I'm looking for RAM -> RAM based solutions. It's not needed for this particular case - I could just stream-unpack from disk - but for future ideas I want to get it working. Under IO is not an issue; it's reading from FF0X that is the issue.
Krill
Registered: Apr 2002 Posts: 2980
I was not suggesting any loader/IO-based solution, I was proposing to use the same approach as with loading under IO to solve your problem.
That is, use your normal sequence copy loop everywhere except for the problematic $ff0X range, where you use the overhead-ridden bank switching solution.
The overall performance impact should be minimal, but it eats a bit of memory to have two alternative routines for different memory ranges.
Quoting oziphantom: "How does your loader handle banks? It would be nice to have a loader that handles Tass 3byte PRGs so I can load into both banks, just make one large file that will load from disk and unpack into both banks would be great."
I have added native C-128 support just recently, mainly to implement burst support (and then found that it's in fact not faster than the standard 2bit+ATN approach).
There is no support for generic loading across banks or 3-byte load addresses so far. The loader does not perform any bank-switching itself in the non-IO/non-KERNAL-fallback default variant, so it uses whatever bank it resides in by default, with or without common areas to load to the other bank. I have thought about adding something for full 128K support, but decided to ignore the problem for now, as I've only come up with cumbersome bank-switching thunk solutions so far, much like the OS does.