| |
johncl
Registered: Aug 2007 Posts: 37 |
Screen copy CPU intensive
I am working on a little game for the C64 and have stumbled upon a slight problem. A smooth scroller is darn CPU intensive! It's a simple sidescroller where I want to scroll the upper 18 lines. The copy required to move the character and color RAM is very expensive! I use KickAssembler and at first I tried this:
.macro ScrollLines(screen,from,to) {
    .var half = round([to-from]/2)

    // First half of the lines: from .. from+half-1
    ldx #0
    jmp !loop+
    .align $100                         // keep the loop inside one page
!loop:
    .for (var i=from;i<from+half;i++) {
        lda screen+1+[i*40],x           // move the character one column left
        sta screen+[i*40],x
        lda SCREEN_COLOR+1+[i*40],x     // move its color along with it
        sta SCREEN_COLOR+[i*40],x
    }
    inx
    cpx #39                             // columns 0..38 are copied
    bne !loop-

    // Second half of the lines: from+half .. to
    ldx #0
    jmp !loop+
    .align $100
!loop:
    .for (var i=from+half;i<=to;i++) {
        lda screen+1+[i*40],x
        sta screen+[i*40],x
        lda SCREEN_COLOR+1+[i*40],x
        sta SCREEN_COLOR+[i*40],x
    }
    inx
    cpx #39
    bne !loop-
}
So here I unroll a full column copy in two parts (because bne won't reach back over the loop if I unroll the whole column; a branch can only span about 128 bytes). This even aligns the two loops so that the bne doesn't cross a page boundary and end up costing another cycle. But still, this routine eats up my CPU big time.
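A rough check of why the full column cannot be closed with a single bne, and what the split version costs. This is a back-of-the-envelope sketch in Python, not part of the original post: the function names are made up for illustration, the byte and cycle counts are the standard 6502 figures for the instructions used above, and the extra cycle that lda abs,x takes on a page crossing is ignored.

```python
# Standard 6502 figures:
#   lda abs,x = 3 bytes, 4 cycles (+1 on a page crossing, ignored here)
#   sta abs,x = 3 bytes, 5 cycles
#   inx = 1 byte / 2 cycles, cpx #imm = 2 / 2, bne = 2 bytes / 3 cycles when taken
BYTES_PER_LINE = 4 * 3            # lda/sta for the screen + lda/sta for color RAM
CYCLES_PER_LINE = 4 + 5 + 4 + 5
LOOP_TAIL_BYTES = 1 + 2 + 2       # inx, cpx #39, bne
LOOP_TAIL_CYCLES = 2 + 2 + 3


def loop_bytes(lines):
    """Size of one loop, from the !loop label to the end of the bne."""
    return lines * BYTES_PER_LINE + LOOP_TAIL_BYTES


def bne_reaches(lines):
    """A backward branch spans at most 128 bytes, counted from the byte after the bne."""
    return loop_bytes(lines) <= 128


def loop_cycles(lines, columns=39):
    """Cycles for one complete loop; the last bne falls through and saves one cycle."""
    return columns * (lines * CYCLES_PER_LINE + LOOP_TAIL_CYCLES) - 1
```

With these figures an 18-line loop is 221 bytes, which is out of reach for bne, while a 9-line loop is 113 bytes. Two 9-line loops come to about 13,000 cycles per character step.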
I then tried a complete unroll of a pure copy:
.macro ScrollLines2(screen,from,to) {
    // Flat copy, no index register. Note that i<=to*40 ends at the first
    // character of line "to".
    .for (var i=from*40;i<=to*40;i++) {
        lda screen+1+i
        sta screen+i
        lda SCREEN_COLOR+1+i
        sta SCREEN_COLOR+i
    }
}
This also copies the unnecessary last column, the one I need to fill from my map data anyway. But this complete unroll uses almost 2/3 of the screen raster time! Even with this I have to split up my copy so the top part is copied during the bottom of the screen redraw and the bottom part at the start of the next screen refresh.
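The "almost 2/3" figure can be checked with a little arithmetic. The sketch below is an illustration only: it assumes a PAL machine (312 raster lines of 63 cycles), assumes each of the 25 badlines costs the CPU roughly 40 cycles, ignores sprites, and uses made-up function names.

```python
# PAL C64: 312 raster lines of 63 cycles each.
PAL_FRAME_CYCLES = 312 * 63
# Assumption: roughly 40 cycles lost to the VIC on each of the 25 badlines.
BADLINE_LOSS = 25 * 40

# lda abs (4 cycles) + sta abs (4 cycles) for every byte moved.
CYCLES_PER_BYTE = 8


def unrolled_copy_cycles(lines, areas=2, columns=40):
    """Cycles to move `lines` rows in `areas` memory areas (screen + color RAM)."""
    return lines * columns * areas * CYCLES_PER_BYTE


def frame_share(cycles):
    """Fraction of the CPU time that is left in one PAL frame after badlines."""
    return cycles / (PAL_FRAME_CYCLES - BADLINE_LOSS)
```

Under these assumptions 18 lines of screen plus color RAM cost 11520 cycles, a bit over 60% of the frame, so the routine itself is not doing anything wrong; the work is simply that large.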
Am I doing something wrong here? |
|
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: Thanks for the tips. I managed to make it work in my game with 17 lines scrolling by splitting the copy in two, top half at the bottom of the raster redraw and the bottom half in the beginning of next screen. But it eats a lot of the available CPU time indeed.
Just before I read the answers here I also thought about what you have written here, to split up the copy in two halves and do double buffering by being a scroll screen ahead in time (depending on the direction the player is moving). Since I'll do at most 4-pixel scrolls (more likely 2-3 pixels) I should be fine.
Oswald, thanks for the code reference too, I must study that in detail to understand it as I am not very educated in the tricks that are used here. Does it do anything special to increase the number of available cycles to the CPU for the copy?
The trick is called VSP. Both VSP and normal double-buffered scrolling need a double buffer.
But for normal scrolling you have 8 pixels to complete the other buffer, that is, you need to paste 1000/8 chars every frame (assuming 1 pixel / frame scrolling).
With VSP you only need to update one column per 8 pixels. So you only need to paste 25/8 chars. This has to be done twice because you must update the other screen which will replace the current one as soon as you have scrolled 40 chars. So all in all 25/8*2 chars per frame.
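The two per-frame workloads above, worked out as a small Python sketch. The function names are made up for illustration, and 1 pixel per frame is the default as in the text.

```python
SCREEN_CHARS = 40 * 25      # 1000 character cells
ROWS = 25
PIXELS_PER_CHAR = 8


def chars_per_frame_double_buffer(pixels_per_frame=1):
    """The whole hidden screen must be ready once 8 pixels have been scrolled."""
    return SCREEN_CHARS * pixels_per_frame / PIXELS_PER_CHAR


def chars_per_frame_vsp(pixels_per_frame=1):
    """One new column per 8 pixels, written to both screens."""
    return ROWS * 2 * pixels_per_frame / PIXELS_PER_CHAR
```

That gives 125 chars per frame for the normal double-buffered scroll against 6.25 for VSP, before the color RAM update is counted.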
STILL! None of these techniques eliminates the update of the $d800 area. At some point all 1000 $d800 nybbles have to be updated in just one frame.
If you code for the C128 you may double buffer $d800 as well though, which significantly increases the scrolling speed.
If you wanna learn more about VSP, read about the bad lines and the VSP trick in general in the VIC-article. It's quite well explained there and not as mysterious as many try to make it.
/Andreas |
| |
Graham Account closed
Registered: Dec 2002 Posts: 990 |
VSP is a bad idea since it crashes on 50% of the machines. |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: VSP is a bad idea since it crashes on 50% of the machines.
@Graham: U have to really believe in it otherwise VSP will never work. It's like it knows u. |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
The standard approach for games is to define color attributes in large blocks instead of on a per-tile basis. So if you've got 4x4 blocks of equal color values for instance then you'd only have to update every fourth column when shifting the screen horizontally. Double buffered video matrix scrolling is important as well so you can split the work across several frames.
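A small Python sketch of the block-color idea, assuming the map's color blocks are aligned to multiples of the block width. The function name and the `scroll` parameter (the map column shown in screen column 0) are made up for illustration.

```python
def columns_needing_color_update(scroll, block_width=4, screen_width=40):
    """Screen columns whose color can change when the view advances by one character.

    Screen column c shows map column scroll + c before the step and
    scroll + c + 1 after it. Those two map columns lie in different color
    blocks only when scroll + c is the last column of a block.
    """
    return [c for c in range(screen_width)
            if (scroll + c) % block_width == block_width - 1]
```

With 4-wide blocks this always yields 10 of the 40 columns, so a one-character step touches at most a quarter of the color RAM cells instead of all of them.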
There are certain alternative hardware tricks too with better compatibility than VSP. I hacked together a charmode with 16-pixel tall tiles for a game project, which then only requires half as much effort to scroll.
All things considered though efficient scrolling is damned hard on the C64, especially when you want to combine both horizontal and vertical scrolling. |
| |
PopMilo
Registered: Mar 2004 Posts: 146 |
@doynax: "There are certain alternative hardware tricks..."
Could you be more specific? :)
|
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: @doynax: "There are certain alternative hardware tricks..."
Could you be more specific? :)
Well.. To be honest I can't think of anything else that'd be useful off hand. But the hack I was talking about ought to qualify.
Basically I abort every other badline and switch between a pair of charsets every eight lines, which essentially creates a charmode with twice the normal tile height (i.e. half as many char indices/colors to manage and twice as much graphics data). By using the NMI timers kind of like in Ninja's FLI display it costs no more than about 500 cycles per frame, what with the time regained from the badlines. And with the IRQ vector free for sprite multiplexing it's actually quite usable in a game, although screen splits and other cycle-exact effects are a mess.
A neat point is that the missed badlines still increase the video matrix pointer so by starting the display a row late you get color ram double buffering for vertical scrolling :)
Anyway, here's the proof-of-concept demo: http://www.minoan.ath.cx/~doynax/6502/Balls%20of%20the%20Scroll..
(I know it isn't all that fancy but I'm a novice when it comes to VIC hacking so I'd like to pretend I invented a new graphics mode ;) |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: Well.. To be honest I can't think of anything else that'd be useful off hand. But the hack I was talking about ought to qualify.
Basically I abort every other badline and switch between a pair of charsets every eight lines, which essentially creates a charmode with twice the normal tile height (i.e. half as many char indices/colors to manage and twice as much graphics data). By using the NMI timers kind of like in Ninja's FLI display it costs no more than about 500 cycles per frame, what with the time regained from the badlines. And with the IRQ vector free for sprite multiplexing it's actually quite usable in a game, although screen splits and other cycle-exact effects are a mess.
A neat point is that the missed badlines still increase the video matrix pointer so by starting the display a row late you get color ram double buffering for vertical scrolling :)
Anyway, here's the proof-of-concept demo: http://www.minoan.ath.cx/~doynax/6502/Balls%20of%20the%20Scroll..
(I know it isn't all that fancy but I'm a novice when it comes to VIC hacking so I'd like to pretend I invented a new graphics mode ;)
IMO that IS a perfect use of the C64 HW capabilities. Good work! I'm impressed. |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: IMO that IS a perfect use of the C64 HW capabilities. Good work! I'm impressed.
Thank you :)
Hopefully I'll manage to finish that game sometime.. |
| |
johncl
Registered: Aug 2007 Posts: 37 |
That is a pretty nice demo of C64 trickery indeed! But for this game I will stick with more simple coding atm.
Even though my horizontal scroller works fine with 17 lines now, it seems I have to go for a double-buffered solution simply because unrolling screen and color RAM copies in both directions is just eating up all my memory. Double buffering only requires me to unroll a color RAM copy (6 bytes per char * 39 * 17 lines * 2 directions = 7956 bytes), and chances are that I don't need a full unroll but can do one using absolute,x indexing and still have enough CPU time for the rest. For the screen char data I will be fine since I will scroll at most 4 pixels per frame, leaving a complete frame free for the char copy (where I don't have to worry about where the raster is).
It's strange that this thing turned out to be a challenge after all. Which makes it all the more interesting a project. :) |
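The memory figure can be reproduced, and compared with an indexed loop, in a few lines of Python. This is only a sketch: the function names are made up, and the loop estimate assumes the same loop shape as the first macro in this thread (ldx #0, one lda/sta abs,x pair per line, then inx, cpx, bne).

```python
def unrolled_copy_bytes(lines, columns=39, directions=2):
    """lda abs (3 bytes) + sta abs (3 bytes) for every character copied."""
    return 6 * columns * lines * directions


def looped_copy_bytes(lines, directions=2):
    """ldx #0 (2) + one lda/sta abs,x pair per line (6) + inx, cpx #imm, bne (5)."""
    return directions * (2 + 6 * lines + 5)
```

For 17 lines the full unroll takes 7956 bytes, while the indexed loops take roughly 218 bytes at the price of one extra cycle per sta and the loop overhead.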
| |
cadaver
Registered: Feb 2002 Posts: 1160 |
I don't know how much you need leftover rastertime, but I just want to mention that games like Turrican I/II/III do up-to-4-pixels-per-frame scrolling without unrolling either the screen or color copy routine. An example:
frame 1: shift screen data in hidden buffer
frame 2: shift color data, split in two halves
frame 3: repeat above...
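The frame schedule above can be sketched as a small simulation. This Python snippet is only an illustration under assumptions: the names are made up, the scroll speed is constant, and the hidden buffer is assumed to be shown in the same frame the color RAM is shifted, which the post does not spell out.

```python
def scroll_schedule(speed, frames):
    """Return (frame, fine_scroll, job) tuples for `speed` pixels per frame (1..4)."""
    fine = 0            # pixels scrolled inside the current character, 0..7
    prepared = False    # hidden buffer already shifted for the next character step?
    plan = []
    for frame in range(frames):
        fine += speed
        if fine >= 8:
            # Character boundary crossed: the expensive color RAM copy happens now.
            fine -= 8
            job = "shift color data in two halves, show hidden buffer"
            prepared = False
        elif not prepared:
            job = "shift screen data in hidden buffer"
            prepared = True
        else:
            job = "nothing to copy"
        plan.append((frame, fine, job))
    return plan
```

At 4 pixels per frame the two jobs simply alternate. At lower speeds there are idle frames in between, and above 4 pixels per frame two boundary crossings can follow each other with no frame left to prepare the hidden buffer.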
|