| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Faster charmap scrolling
Some interesting asides over in the Pixeling forum about speeding up charmaps (cf Graphician for intense EF game). As Oswald pointed out, switching from tiles to a straight unpacked charmap doesn't really save you much, as you can avoid dealing with tile indices for most of the screen by just copying most of the chars from within VM. Besides, even tile index reads can be amortised over multiple VM writes.
However, there are other possibilities. If you've got a little RAM to spare (eg because all your level data is in EF), then why not unroll the update loop into one hardcoded routine per column?
Could easily dedicate 5k to
lda#$xx
sta vm,x
lda#$xx
sta vm+40,x
lda#$xx
sta vm+2*40,x
..
lda#$xx
sta vm+24*40,x
which gets you down to 7 cycles per char (14 if you also do video ram)
You only need to update one column of source each time you scroll one char, and call the columns in sequence with increasing values of X.
Might have to divide the update into upper and lower halves of the screen to avoid tearing.
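To make the size and cycle figures concrete, here is a rough Python sketch (not from the original post) that assembles one such column routine as raw 6502 bytes. The function name `column_routine` and the screen base `vm=0x0400` are assumptions for illustration only; the opcodes are the standard ones for LDA immediate, STA absolute,X and RTS.

```python
ROWS, COLS = 25, 40

def column_routine(chars, vm=0x0400):
    """Assemble one column routine: LDA #char / STA vm+row*40,X per row, then RTS.

    vm is an assumed screen base; the routine is called with X = screen column.
    """
    assert len(chars) == ROWS
    code = bytearray()
    for row, ch in enumerate(chars):
        addr = vm + row * COLS
        code += bytes([0xA9, ch & 0xFF])               # LDA #imm  (2 cycles)
        code += bytes([0x9D, addr & 0xFF, addr >> 8])  # STA abs,X (5 cycles)
    code.append(0x60)                                  # RTS
    return bytes(code)

routine = column_routine(list(range(ROWS)))
bytes_per_column = len(routine)        # 25 * 5 + 1
total_bytes = bytes_per_column * COLS  # one routine per screen column
cycles_per_char = 2 + 5                # LDA #imm + STA abs,X
```

With 40 routines of 126 bytes each this comes to 5040 bytes, which matches the "5k" estimate. Scrolling by one char then only means rewriting the 25 immediate operands of one routine and calling the routines in rotated order.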
Of course, if you want to be really extravagant, you could generate a routine per column of level data, and skip any redundant loads by grouping identical indices; kind of like compiled sprites on PC.
That would eat shedloads of flash if you stored them all in advance of course (a tad less with duplicate removal), or you could try generating them on the fly
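The "compiled sprite" variant could be generated along these lines. This is only a sketch under assumptions: `compiled_column`, `compiled_cycles` and the screen base `vm=0x0400` are hypothetical names and values, and the grouping strategy (one LDA per distinct char, followed by all its stores) is just one way to skip redundant loads.

```python
from collections import defaultdict

def compiled_column(chars, vm=0x0400, cols=40):
    """Emit one LDA #imm per distinct char, then a STA abs,X for each row using it."""
    rows_by_char = defaultdict(list)
    for row, ch in enumerate(chars):
        rows_by_char[ch & 0xFF].append(row)
    code = bytearray()
    for ch, rows in rows_by_char.items():
        code += bytes([0xA9, ch])                          # LDA #imm, once per value
        for row in rows:
            addr = vm + row * cols
            code += bytes([0x9D, addr & 0xFF, addr >> 8])  # STA abs,X
    code.append(0x60)                                      # RTS
    return bytes(code)

def compiled_cycles(chars):
    """Cycles for the loads and stores only (JSR/RTS overhead not counted)."""
    return 2 * len(set(c & 0xFF for c in chars)) + 5 * len(chars)
```

A column of 25 identical chars shrinks from 126 to 78 bytes and from 175 to 127 cycles; a column with 25 different chars gains nothing.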
|
|
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
My latest full screen scroll code involves:
stx SCREEN1+$0000
stx SCREEN2+$0000
inx
stx SCREEN1+$0028
stx SCREEN2+$0028
inx
stx SCREEN1+$0050
stx SCREEN2+$0050
inx
stx SCREEN1+$0078
stx SCREEN2+$0078
inx
stx SCREEN1+$00a0
stx SCREEN2+$00a0
stx SCREEN1+$0001
stx SCREEN2+$0001
inx
stx SCREEN1+$00c8
stx SCREEN2+$00c8
stx SCREEN1+$0029
stx SCREEN2+$0029
inx
stx SCREEN1+$00f0
stx SCREEN2+$00f0
stx SCREEN1+$0051
stx SCREEN2+$0051
inx
stx SCREEN1+$0118
stx SCREEN2+$0118
stx SCREEN1+$0079
stx SCREEN2+$0079
inx
.
.
.
Segment (start, stop, size):
SCROLLER 007C00 0094CF 0018D0
What it's for and how it's used is a secret and will be revealed at a future demo party. :) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Haha, nice. I can think of a few things you could do with that.. |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
jackie, twister with half chars ? :) |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
CJ, took me a minute until I got it, thats an awesome idea :) tho the usage is limited to horizontal scrollin. |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: CJ, took me a minute until I got it, thats an awesome idea :) tho the usage is limited to horizontal scrollin.
Free directional (mine that is). CJ's I dunno. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Quoting Oswald: CJ, took me a minute until I got it, thats an awesome idea :) tho the usage is limited to horizontal scrollin.
Thanks! Yes, I probably should have mentioned that limitation. |
| |
oziphantom
Registered: Oct 2014 Posts: 490 |
Given you have your chars mapped out like so
$8000 row 1 of chars
$8100 row 2 of chars
$8200 row 3 of chars
....
$a000 row 1 of colours
$a100 row 2 of colours
.....
And you are double buffered. Screen 1 at $4400 and Screen 2 at $4800
You have a window defined as Lindex and Rindex which start at 0 and 39.
Lets also assume that you are viewing Screen 1 to start with
When you scroll left, such that it appears the player has moved right, you use an unrolled loop to copy
4401 -> 4800
4402 -> 4801
.....
You then need to plot your new column, in this case in the right edge of screen 2.
so
ldx Rindex
lda $8000,x
sta $4827
lda $8100,x
sta $484f
lda $8200,x
...
then you need to do the CRAM so again unrolled loop
d801 -> d800
d802 -> d801
.....
and plot the CRAM side
ldx Rindex
lda $A000,x
sta $d827
lda $A100,x
sta $d84f
lda $A200,x
...
So you need 4 unrolled Screen copy routines
Screen1 to Screen2 forwards
Screen2 to Screen1 forwards
Screen1 to Screen2 backwards
Screen2 to Screen1 backwards
and 2 CRAM copy routines
CRAM+1 to CRAM
CRAM to CRAM+1
And 4 column routines
LeftEdge Screen1
LeftEdge Screen2
RightEdge Screen1
RightEdge Screen2
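As a cross-check of the scheme above, here is a small Python model (not 6502 code, and not from the original post) of one "Screen1 to Screen2 forwards" step followed by the RightEdge plot. The name `scroll_left` and the argument layout are assumptions: `map_rows[row]` stands in for the 256-byte row at $8000 + row*$100, and `rindex` is taken to be the already advanced Rindex.

```python
ROWS, COLS = 25, 40

def scroll_left(src, dst, map_rows, rindex):
    """Copy src shifted one char left into dst, then plot the new right-edge column.

    src, dst: 1000-byte screen buffers (e.g. Screen 1 and Screen 2).
    map_rows: one 256-byte sequence per screen row (assumed layout).
    rindex:   map column that becomes visible at the right edge.
    """
    for row in range(ROWS):
        base = row * COLS
        # Equivalent of the unrolled copy 4401 -> 4800, 4402 -> 4801, ...
        dst[base:base + COLS - 1] = src[base + 1:base + COLS]
        # Equivalent of lda $8000+row*$100,x / sta <right edge of this row>
        dst[base + COLS - 1] = map_rows[row][rindex]
```

The same function with the buffers swapped models "Screen2 to Screen1 forwards"; the backwards routines and the colour RAM copy follow the same pattern with the shift direction or the target reversed.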
Now, to move, you move your "window": to get to the next char column you inc Rindex and Lindex, to go back you dec them.
But this only gets you a 256 char wide map, so you need to add a RindexBank and LindexBank as well. Once an index rolls over, you inc/dec its bank, or even look up the next bank in a bank map, which lets you repeat banks for extra length or wrap maps around. Since the banks are always at $8000 you don't need to use any pointers or indexes. Your unrolled loops can now service up to 1MB worth of map. It eats a good 24~5K, but with all the map data and other code able to be stored in ROM, I don't think it is going to matter, and it gives top speed with free range "boundless" map size, with a unique colour per 1x1 map tile, and you don't need any timing critical VIC tricks to worry about patching to support NTSC. However, you can't modify the map, so dynamic parts of the map must become "entities". It could also be modified to support up/down and possibly even 8 way scrolling versions. |
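The index and bank bookkeeping described above might look like this in a Python model. Everything here is an assumption for illustration: `step_edge`, `bank_map` and `bank_pos` are hypothetical names, and wrapping around the end of the bank map is just one of the options mentioned (repeating or wrapping maps).

```python
def step_edge(index, bank_pos, bank_map, delta):
    """Move one window edge (Lindex or Rindex) by delta = +1 or -1 columns.

    index:    8-bit column index within the current 256-column bank.
    bank_pos: position in bank_map; bank_map[bank_pos] is the bank to map in.
    Returns the new (index, bank_pos).
    """
    new_index = (index + delta) & 0xFF
    if delta > 0 and new_index == 0x00:      # rolled over 255 -> 0
        bank_pos = (bank_pos + 1) % len(bank_map)
    elif delta < 0 and new_index == 0xFF:    # rolled under 0 -> 255
        bank_pos = (bank_pos - 1) % len(bank_map)
    return new_index, bank_pos
```

For example, with `bank_map = [3, 4, 3]` bank 3 appears twice in the level, which is the "repeat banks for extra length" idea. Lindex and Rindex each need their own bank position, since the window can straddle two banks.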
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
you're better off with reu, esp. since 1541u supports it so big user base. now reu can copy memory at 1 cycle / byte speed for scrolling. |
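For a rough sense of scale, the arithmetic behind that comparison can be sketched as below. This is a back-of-envelope estimate only: it assumes PAL timing, a fully unrolled LDA abs / STA abs copy, and it ignores bad lines, sprite DMA and REU setup cost. The constant names are made up for this sketch.

```python
CYCLES_PER_LINE = 63    # PAL C64
LINES_PER_FRAME = 312   # PAL C64
FRAME_CYCLES = CYCLES_PER_LINE * LINES_PER_FRAME

SCREEN_BYTES = 25 * 40

# Fully unrolled CPU copy: LDA abs (4 cycles) + STA abs (4 cycles) per byte
cpu_copy_cycles = SCREEN_BYTES * (4 + 4)

# REU DMA at the quoted 1 cycle per byte (setup not counted)
reu_copy_cycles = SCREEN_BYTES * 1

cpu_share = cpu_copy_cycles / FRAME_CYCLES
reu_share = reu_copy_cycles / FRAME_CYCLES
```

Under these assumptions one screen copy costs about 8000 cycles on the CPU versus about 1000 with the REU, out of 19656 cycles in a PAL frame, and colour RAM doubles both figures.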
| |
cadaver
Registered: Feb 2002 Posts: 1160 |
Oziphantom: Agreed with Oswald, I believe you can not get appreciably faster whole screen or color-RAM update with just the CPU no matter what you're doing.
Unrolling will help, as will writing the same value to both screen & color-RAM (basically you could have 16 chars of the same color, Quod Init Exit IIm does this), but still the load/store operations dominate the load.
However you're probably not going to do the screen update every frame, so use that to your advantage, e.g. instead of waiting, you can have the main program always calculate at least 1 frame ahead, while IRQs perform the screen update. This means that the large chunk of CPU time taken by it isn't as devastating, as your next frame may already be half ready when interrupted by the screen update, which hopefully leaves enough time to finish. |
| |
Compyx
Registered: Jan 2005 Posts: 631 |
Seems like VSP would not be a bad idea, seeing how all scrolling is only horizontal. Saves a shitload of raster time, except for the one time you have to scroll colorram up.
But like Cadaver said, you can 'cheat' updating the colorram by carefully timing when it happens. But that might screw with any multiplexer, so perhaps move $d800-updating out of IRQ. |