[CSDb] - User Forums - modify Exomiser compressor to black list some memory locations

2019-04-28 18:30

oziphantom

Registered: Oct 2014
Posts: 502

modify Exomiser compressor to black list some memory locations

Does anybody know a way to modify the exomizer compression algorithm to blacklist the FF0X memory locations so they are never read from, i.e. don't allow them to be used as the source of a sequence?

I guess a post-process would also work, if there is a simple way to convert a sequence into literal bytes.

... 16 posts hidden ...

2019-04-29 07:42

oziphantom

Registered: Oct 2014
Posts: 502

Quote: If you have modified the decompressor for faster operation on a C-128, can you use stack relocation for reading from any RAM location? It's 4 cycles per byte with built-in auto-increment... :)

No:
a.) you will still need a counter to know when to inc the page byte, as the MMU doesn't extend the stack to a 16-bit pointer for you, so you would need to keep a ZP location or a register to detect the overflow to 0.
b.) I already use the stack relocation to write the bytes. This lets me write to any bank, so I can read from bank 0, then write to bank 1 with it. It also always hits RAM, such that it will write under FF0X and IO while IO is banked in, so I don't even have to modify banking.

And this turns a sta (XX),y into a pha <- 3 clocks ;) and then 2 MHz mode, so 1.5..

The catch is that a "read sequence" reads from uncompressed data, not compressed data, so I have to make the read more expensive. It needs a bank change, so I have to enable shared memory so that my read code doesn't magic itself away (one could just put the code into both banks to avoid this, however), switch banks and disable IO (if I keep it on normally; good ole PCR registers to the rescue), do the lda (xx),y, then switch the bank back (PCR again) and disable shared memory. So the load eats 26 more clocks. I figure that since all the code has to be inlined (no stack operations are allowed in the decompressor) and a write happens for every byte, this works out as a net win. Then 2 MHz..
Since I don't have to worry about any overlap, I can now compress 2-ff00 (it would be nice if it was 2-ffff). I can technically do 0-, but those 2 bytes are kind of useless. I also don't have to move the data down at the start for "overlap safety", so I pick up a massive boost from not having to shuffle any data around.
I also move the ZP into $1XX. This allows me to put the variables and the 156-byte buffer into 'ZP' while leaving the actual ZP untouched, for even more speed.

Now if I could just get it to not want to read from FF0X, it would be perfect..

2019-04-29 08:34

Krill

Registered: Apr 2002
Posts: 3083

My loader has an option to allow loading under the IO space at $d000..$dfff.

It is implemented to favour speed over size: there are two getblock routines - one regular and one with all the slow bank switching in place. Depending on where the incoming block will go to, either the fast-RAM or the slow-IO incarnation is picked.

Could such a setup work for you? Technically, it is indeed possible to both read from and write to memory under MMU space at $ff0X, isn't it?

2019-04-29 10:11

oziphantom

Registered: Oct 2014
Posts: 502

You can switch the stack to src - 1, pla, swap the stack back to dest, pha to read/write under the ff0X block, but it's a lot of overhead for 8 bytes ;) You can't use the ZP relocation trick, as you can't read 00/01 with it.

For a loader-based solution, I write the exomizer file backwards to the disk, then the get byte just does:
toggle clock for next byte
read byte from SSR

Bank switching on the 128 is not that bad, as you can set the Pre-Configuration Registers (PCRs), so that 1 has Bank 0 RAM + IO and 2 has Bank 0 RAM.
Then
STA $ff02 ; any write here loads PCR 2 (Bank 0 RAM); the value written is ignored
STA DEST  ; store the byte into RAM
STA $ff01 ; any write here loads PCR 1 again (Bank 0 RAM + IO)
only adds 8 clocks. It's really handy for the pesky IRQs as well: the IRQ can just switch to its own PCR and then switch back to the "main thread's" one. No more lda $1 pha lda #value sta $1 .... pla sta $1 :D or living on the edge with inc/dec tricks.

How does your loader handle banks? It would be nice to have a loader that handles Tass 3-byte PRGs so I can load into both banks; just making one large file that loads from disk and unpacks into both banks would be great.

However, in this case I'm looking for RAM -> RAM based solutions. It's not needed for this particular case, as I could just stream-unpack from disk, but I want to get it working for future ideas. Under IO is not an issue; it's reading from FF0X that is the issue.

2019-04-29 20:21

Krill

Registered: Apr 2002
Posts: 3083

I was not suggesting any loader/IO-based solution, i was proposing to use the same approach as with loading under IO to solve your problem.

That is, use your normal sequence copy loop everywhere except for the problematic $ff0X range, where you use the overhead-ridden bank switching solution.

The overall performance impact should be minimal, but it eats a bit of memory to have two alternative routines for different memory ranges.

Quoting oziphantom

How does your loader handle banks? It would be nice to have a loader that handles Tass 3-byte PRGs so I can load into both banks; just making one large file that loads from disk and unpacks into both banks would be great.

I have added native C-128 support just recently, mainly to implement burst support (and then see it's in fact not faster than the standard 2bit+ATN approach).

There is no support for generic loading across banks or 3-byte load addresses so far. The loader does not perform any bank-switching itself in the non-IO/non-KERNAL-fallback default variant, so it uses whatever bank it resides in by default, with or without common areas to load to the other bank. I have thought about adding something for full 128K support, but decided to ignore the problem for now, as i've only come up with cumbersome bank-switching thunk solutions so far, much like the OS does.

2019-04-30 14:24

oziphantom

Registered: Oct 2014
Posts: 502

Exomizer by default has 16-bit offsets. I guess I could force it to offsets of at most 256, then once I get below fe00 jump to another copy.. or change the get byte routine in ZP to point to different code...

At the moment it works in X128, but fails on hardware and Z64K so I'm trying to work out what the magic combo is...

2019-04-30 18:18

Krill

Registered: Apr 2002
Posts: 3083

Limiting the offsets worsens the pack ratio, i'd try to avoid that.

You can add code to check if the sequence copy read range overlaps $ff00..$ff04 and then select the appropriate routine. Obviously the selection code should be highly optimised.

But if your trick to read RAM in that range only works in VICE and there is no way to achieve that on the real thing, this is all moot, of course.

2019-04-30 19:02

Oswald

Registered: Apr 2002
Posts: 5126

Just use different code for the whole ffxx page; then it's just one cmp on the src hi byte.

2019-05-01 09:16

oziphantom

Registered: Oct 2014
Posts: 502

Well, it would have to be both the fe and ff pages: with y = ff and the pointer at fe01, lda (pointer),y still hits ff00. And you need to preserve C and V, I think, and I can't use the stack... I'll have to double check.
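
The address arithmetic behind that objection can be checked with a small sketch. This is Python used as a calculator, not decompressor code; the helper names are made up for illustration, and the $ff00..$ff04 range is the one mentioned elsewhere in this thread (widen the constants if your FF0X window is larger).

```python
# Assumed protected range; taken from the $ff00..$ff04 figure in this thread.
MMU_FIRST = 0xFF00
MMU_LAST = 0xFF04


def indirect_indexed(pointer: int, y: int) -> int:
    """Effective address of `lda (pointer),y`: 16-bit pointer value plus Y."""
    return (pointer + y) & 0xFFFF


def pages_needing_special_code() -> list:
    """Pointer hi bytes for which some lo byte and some Y reach the range.

    A pointer in page `hi` with Y in 0..255 covers hi*256 .. hi*256 + $1fe
    (wrap-around past $ffff is ignored, it cannot land in the range).
    """
    pages = []
    for hi in range(256):
        lowest = hi << 8
        highest = lowest + 0xFF + 0xFF
        if lowest <= MMU_LAST and highest >= MMU_FIRST:
            pages.append(hi)
    return pages


if __name__ == "__main__":
    print(hex(indirect_indexed(0xFE01, 0xFF)))
    print([hex(p) for p in pages_needing_special_code()])
```

So a test on the source hi byte alone has to catch two pages, not one, unless Y is known to be small.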

2019-05-01 10:14

tlr

Registered: Sep 2003
Posts: 1807

I think the original idea of modifying the compressor not to reference certain areas would be quite doable. I've considered that for subsizer but haven't really seen an actual use case for it.

If it is only reading you are concerned about, a modification to your favourite compressor's match algorithm should do it, e.g. you could have an extra bit (or byte) per data byte that says whether it may be included in a match or not. The whole match database will then contain no references to that data, so the later steps always generate output without it.
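
A minimal sketch of the per-byte flag idea, assuming a naive greedy LZ77-style matcher written in Python. This is not Exomizer's or subsizer's actual match code, and every name here (`allowed_mask`, `find_match`, the token tuples) is hypothetical. It also works front to back, which may not match the direction your decruncher runs in; only the filtering step is the point.

```python
from typing import Container, List, Tuple


def allowed_mask(load_addr: int, size: int, forbidden: Container[int]) -> List[bool]:
    """One flag per data byte: may this byte be the source of a match?"""
    return [(load_addr + i) not in forbidden for i in range(size)]


def find_match(data: bytes, pos: int, allowed: List[bool],
               min_len: int = 2, max_len: int = 255) -> Tuple[int, int]:
    """Longest earlier match for data[pos:] whose source bytes are all allowed.

    Returns (length, offset), or (0, 0) if nothing usable was found.
    """
    best_len, best_off = 0, 0
    for start in range(pos):
        length = 0
        while (length < max_len and pos + length < len(data)
               and allowed[start + length]          # the extra check
               and data[start + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_len, best_off = length, pos - start
    return (best_len, best_off) if best_len >= min_len else (0, 0)


def compress(data: bytes, allowed: List[bool]) -> list:
    """Greedy tokenizer: ("lit", byte) or ("copy", offset, length)."""
    tokens, pos = [], 0
    while pos < len(data):
        length, offset = find_match(data, pos, allowed)
        if length:
            tokens.append(("copy", offset, length))
            pos += length
        else:
            tokens.append(("lit", data[pos]))
            pos += 1
    return tokens


def decompress(tokens: list) -> bytes:
    """Reference decoder, used only to check the round trip."""
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, offset, length = token
            for _ in range(length):
                out.append(out[-offset])
    return bytes(out)
```

Forbidden bytes can still be written, as literals or as the destination of a copy; they just never show up as the source of one, so the pack ratio only suffers for matches that would have read from that range.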

If writing is a problem, you would need to add some way of doing skips in the output.

2019-05-01 10:47

Krill

Registered: Apr 2002
Posts: 3083

Quoting oziphantom

Well, it would have to be both the fe and ff pages: with y = ff and the pointer at fe01, lda (pointer),y still hits ff00. And you need to preserve C and V, I think, and I can't use the stack... I'll have to double check.

I think it's most sensible to check the read range in flat memory space to decide on plain copy or $ff0X copy, then apply all those banking/stack relocation shenanigans to actually copy bytes around. :)

The decision should be made per sequence copy, not per source byte.
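
As a rough illustration of that per-sequence decision, here is the range test in Python. It is only a model of the logic, not 6502 code; the function names and the "plain" / "mmu_safe" labels are invented, `src` is assumed to be the lowest address the copy reads, and `length` is assumed to be at least 1.

```python
# Assumed protected range, per the $ff00..$ff04 figure above.
MMU_FIRST = 0xFF00
MMU_LAST = 0xFF04


def sequence_touches_mmu(src: int, length: int) -> bool:
    """True if the flat read range src .. src+length-1 overlaps the MMU range."""
    return src <= MMU_LAST and src + length - 1 >= MMU_FIRST


def pick_copy_routine(src: int, length: int) -> str:
    """Decide once per sequence copy which loop to run."""
    return "mmu_safe" if sequence_touches_mmu(src, length) else "plain"


if __name__ == "__main__":
    for src, length in [(0x1000, 0x20), (0xFE01, 0xFF), (0xFE01, 0x100), (0xFF05, 10)]:
        print(hex(src), length, pick_copy_routine(src, length))
```

On the real machine this boils down to two 16-bit compares per sequence, which is where the "highly optimised" selection code would go.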

All times are CET.