| |
Digger
Registered: Mar 2005 Posts: 437 |
setting nibbles for chunky FLI
A classic one, but I couldn't find any relevant threads in this forum:
what's the fastest way of merging color nibbles for the screen RAM (chunky FLI, plasmas, whatever) that you've come across?
The non-optimised being:
lda #colorL
ldx #colorH
ora nibbleLUT,x
sta $0400
* = $80
nibbleLUT: .byte $00, $10, $20, $30
.byte $40, $50, $60, $70
.byte $80, $90, $a0, $b0
.byte $c0, $d0, $e0, $f0
|
|
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
the one thing i remember.... every other line, use the colorram for the second color, so you can avoid merging completely |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: the one thing i remember.... every other line, use the colorram for the second color, so you can avoid merging completely
That would require two STAs. One can do:
lda #colorL
ldx #colorH_eor_FF
sax $0400
|
| |
Oswald
Registered: Apr 2002 Posts: 5094 |
maybe you will find this helpful:
lda 0x0y,x
sta screen
where 0x and 0y are two lo nibbles to be merged.
,x helps to push the look up tables away from $0000-$0001.
A $10-byte table is needed in each page, throughout 4k of pages. |
| |
Digger
Registered: Mar 2005 Posts: 437 |
Quote: That would require two STAs. One can do:
lda #colorL
ldx #colorH_eor_FF
sax $0400
Very neat! So, since SAX stores the bitwise AND of A and X, it needs the opposite nibbles filled with ones.
lda #$f(colorL)
ldx #$(colorH)f
sax $0400
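As a quick sanity check of the bit arithmetic, here is a small Python model (not C64 code; the function name `sax_merge` is made up for illustration). It assumes only what is stated above, namely that SAX stores A AND X, and loops over all 256 colour pairs:

```python
def sax_merge(color_l: int, color_h: int) -> int:
    """Model of lda #$f(colorL) / ldx #$(colorH)f / sax: the stored byte is A AND X."""
    a = 0xF0 | color_l          # A = $f(colorL): high nibble padded with ones
    x = (color_h << 4) | 0x0F   # X = $(colorH)f: low nibble padded with ones
    return a & x                # the padded nibbles pass the other operand through


# Every pair of 4-bit colours merges to (colorH << 4) | colorL.
for h in range(16):
    for l in range(16):
        assert sax_merge(l, h) == (h << 4) | l
```

The ones in the padding nibbles are what make the AND behave like a merge: each nibble of the result comes from exactly one of the two registers.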
|
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
or
lda $8x8y
sta $0400
if you plot with colours $80 to $8f instead of $00 to $0f
|
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
That said, plotting into a routine with a stride of six bytes is not ideal.
Perhaps
lda (buf+n*2),y
sta $400+n
where each entry of buf contains $8x? It's two cycles slower than my previous submission, but you save two cycles on writing the buffer, the memory arrangement is saner, and the buffer filling code would be shorter too. |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
Quoting Digger: Very neat! So, since SAX stores the bitwise AND of A and X, it needs the opposite nibbles filled with ones.
lda #$f(colorL)
ldx #$(colorH)f
sax $0400
But this requires that you know beforehand which colour belongs to the low and which to the high nibble. Plus, your algorithm also has to render those %1111 nibbles accordingly.
Then again, it would be the same doing
lda #$0(colorL)
ora #$(colorH)0
sta $0400
where the algo has to take care of the %0000 nibbles instead of the %1111 ones.
If you only have the colour information as low-nibble-bytes ($00,..., $0f), maybe one could do
lda #$0(colorL)
ora $0(colorH) -> bytes in zp with swapped nibbles, i.e. $0f -> $f0 etc.
sta $0400
But I guess the zp-addresses $00 and $01 will cause trouble here, so we again have to tweak the byte with the Hi-Colour-information (something like "ora $1(colourH)").
Maybe I missed the central advantage of that SAX-approach... |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
If you have the colours as $fx, x=0..$f, then maybe you could do
lda #$f(colorL) -> for example $f6 for blue.
and $f(colorH) -> zp-addresses $f0 .. $ff contain $0f, $1f, ..., $ff
sta $0400
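To see why this works, here is a rough Python model of the idea (not C64 code; `zero_page` and `and_merge` are illustrative names). It assumes, as described above, that both colours are kept as bytes $f0..$ff and that zero page $f0..$ff holds $0f, $1f, ..., $ff:

```python
# Zero-page model: addresses $f0..$ff hold $0f, $1f, ..., $ff.
zero_page = [0] * 256
for h in range(16):
    zero_page[0xF0 + h] = (h << 4) | 0x0F


def and_merge(color_l: int, color_h: int) -> int:
    """Model of lda #$f(colorL) / and $f(colorH).

    Both arguments are full bytes in $f0..$ff, so color_h doubles as the
    zero-page address of its own high-nibble mask.
    """
    return color_l & zero_page[color_h]


# Every pair of $fx colours merges to (H << 4) | L.
for h in range(16):
    for l in range(16):
        assert and_merge(0xF0 | l, 0xF0 | h) == (h << 4) | l
```

For example, `and_merge(0xF6, 0xFE)` reads $ef from address $fe and returns $f6 AND $ef = $e6.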
|
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Quote: If you have the colours as $fx, x=0..$f, then maybe you could do
lda #$f(colorL) -> for example $f6 for blue.
and $f(colorH) -> zp-addresses $f0 .. $ff contain $0f, $1f, ..., $ff
sta $0400
Ooh, I like that one! |
| |
Oswald
Registered: Apr 2002 Posts: 5094 |
Quote: That said, plotting into a routine with a stride of six bytes is not ideal.
Perhaps
lda (buf+n*2),y
sta $400+n
where each entry of buf contains $8x? It's two cycles slower than my previous submission, but you save two cycles on writing the buffer, the memory arrangement is saner, and the buffer filling code would be shorter too.
I don't get this one: where are the lo/hi pixels merged? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Oswald, it accesses your sparse table, just relocated to $8L8H
The y register is zero, and a row of pixels is stored in a buffer in zero page. Each pair of pixels is treated as a pointer. |
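A minimal Python model of this pointer trick might look as follows (not C64 code). The buffer address `BUF`, the sample `pixels` row and the helper `lda_indirect_y` are assumptions made up for illustration; the table layout follows the $8L8H description above, with y = 0:

```python
memory = bytearray(0x10000)  # model of the 64K address space

# Sparse table: address $8L8H holds the merged byte $HL.
for l in range(16):
    for h in range(16):
        memory[((0x80 | l) << 8) | (0x80 | h)] = (h << 4) | l


def lda_indirect_y(zp_addr: int, y: int = 0) -> int:
    """Model of lda (zp),y: read a little-endian pointer from zero page, add y."""
    pointer = memory[zp_addr] | (memory[(zp_addr + 1) & 0xFF] << 8)
    return memory[(pointer + y) & 0xFFFF]


BUF = 0x10                     # hypothetical zero-page buffer address
pixels = [0x3, 0xA, 0x0, 0xF]  # hypothetical row of 4-bit pixels
for i, p in enumerate(pixels):
    memory[BUF + i] = 0x80 | p  # each buffer entry is stored as $8x

# One lookup per pixel pair, as in lda (buf+n*2),y / sta $400+n.
merged = [lda_indirect_y(BUF + n * 2) for n in range(len(pixels) // 2)]
assert merged == [0x3A, 0x0F]
```

With this layout the first pixel of each pair becomes the pointer's low byte ($8H) and therefore ends up in the high nibble of the merged byte.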
| |
Oswald
Registered: Apr 2002 Posts: 5094 |
ah I get it.
actually it's on par with the ABS method, since the writes to zp are faster by 1 cycle each.
in some cases it might be the better choice over the abs one. usually the sta's are fixed in 4x4 per pixel style effects; imho the benefit would come from the writes all going to the same zp buffer and then getting translated to fullscreen. |