CSDb User Forums


2007-09-21 20:17
johncl

Registered: Aug 2007
Posts: 37
Screen copy CPU intensive

I am working on a little game for the C64 and have stumbled upon a slight problem. A smooth scroller is darn CPU intensive! It's a simple sidescroller where I want to scroll the upper 18 lines. The copy required to move the character and color RAM is very expensive! I use KickAssembler and at first I tried this:

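// Scrolls rows from..to one character to the left (screen and color RAM).
// The row unroll is split into two loops so each bne stays within branch range.
.macro ScrollLines(screen,from,to) {
    .var half = round([to-from]/2)
    ldx #0
    jmp !loop+
    .align $100                     // keep the loop inside one page (no extra bne cycle)
!loop:
    .for (var i=from;i<from+half;i++) {
        lda screen+1+[i*40],x       // char to the right...
        sta screen+[i*40],x         // ...moves one position left
        lda SCREEN_COLOR+1+[i*40],x
        sta SCREEN_COLOR+[i*40],x
    }
    inx
    cpx #39                         // 39 columns are shifted; column 39 comes from the map
    bne !loop-
    ldx #0
    jmp !loop+
    .align $100
!loop:
    // second half of the rows, same copy
    .for (var i=from+half;i<=to;i++) {
        lda screen+1+[i*40],x
        sta screen+[i*40],x
        lda SCREEN_COLOR+1+[i*40],x
        sta SCREEN_COLOR+[i*40],x
    }
    inx
    cpx #39
    bne !loop-
}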

So here I unroll a full column copy in two parts (because the bne can't reach back to the start of the loop if I unroll the whole column in one go). I even align the two loops so that the bne doesn't cross a page boundary and end up costing another cycle. But still, this routine eats up my CPU big time.

I then tried a complete unroll of a pure copy:

.macro ScrollLines2(screen,from,to) {
.for (var i=from*40;i<=to*40;i++) {
lda screen+1+i
sta screen+i
lda SCREEN_COLOR+1+i
sta SCREEN_COLOR+i
}
}

This also copies the last column, which is unnecessary since I fill it from my map data anyway. But this complete unroll uses almost 2/3 of the screen raster time! Even with this I have to split up my copy so the top part is copied at the bottom of the screen refresh and the bottom part at the start of the next screen refresh.
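
A rough cycle count agrees with that figure. The following is only a back-of-the-envelope model in Python, not something from the original post. It assumes a PAL machine (312 raster lines of 63 cycles), roughly 40 cycles lost to the VIC on each of the 25 bad lines, 4 cycles each for lda and sta with absolute addressing, and a call with from=0 and to=18. Sprites would take away more time.

```python
# Back-of-the-envelope cost of the fully unrolled copy (ScrollLines2).
# Assumptions (not from the post): PAL timing, ~40 cycles stolen per bad line,
# macro called with from=0, to=18.
PAL_LINES = 312
CYCLES_PER_LINE = 63
BAD_LINES = 25
STOLEN_PER_BAD_LINE = 40

LDA_ABS = 4                # cycles
STA_ABS = 4                # cycles
BYTES_PER_INSTRUCTION = 3  # opcode + 16-bit address

def unrolled_copy_cost(first_line, last_line):
    """Cycles and code bytes for the loop i = first_line*40 .. last_line*40."""
    positions = last_line * 40 - first_line * 40 + 1
    cycles = positions * 2 * (LDA_ABS + STA_ABS)        # screen + color RAM
    code_bytes = positions * 4 * BYTES_PER_INSTRUCTION  # four instructions each
    return cycles, code_bytes

frame_cycles = PAL_LINES * CYCLES_PER_LINE
usable_cycles = frame_cycles - BAD_LINES * STOLEN_PER_BAD_LINE

cycles, code_bytes = unrolled_copy_cost(0, 18)
print(cycles, code_bytes, round(cycles / usable_cycles, 2))
```

Under these assumptions the copy takes a bit over 60% of the CPU time left in a frame, and the unrolled code alone occupies more than 8 KB.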

Am I doing something wrong here?
2007-09-21 20:21
tlr

Registered: Sep 2003
Posts: 1790
You need to ask yourself how fast this has to scroll.
Usually this is double buffered, and the copy process is split into several frames.
2007-09-21 20:24
chatGPZ

Registered: Dec 2001
Posts: 11386
for a simple sidescroller (always scrolling in the same direction at the same speed) you might want to use doublebuffering. you only need an unrolled loop for the colorram then; for the screenram you'll have 7 frames (if you are scrolling 1 pixel per frame).
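
To illustrate the double-buffering idea, here is a small Python model (a sketch, not C64 code). The hidden screen is filled with a copy of the visible one, shifted one character left, in slices spread over 7 frames, and the two screens are swapped on the eighth frame. The function names and the way rows are divided between frames are made up for this example.

```python
# Model of a double-buffered left scroll at 1 pixel per frame.
# All names are illustrative; a real routine would copy bytes in 6502 code.
COLS, ROWS = 40, 25
FRAMES_FOR_COPY = 7  # frames available before the swap

def rows_for_frame(frame):
    """Rows copied in a given frame (0..6), spreading 25 rows over 7 frames."""
    start = frame * ROWS // FRAMES_FOR_COPY
    end = (frame + 1) * ROWS // FRAMES_FOR_COPY
    return range(start, end)

def scroll_one_char(front, new_column):
    """Return the buffer that becomes visible after one full character step."""
    back = [[None] * COLS for _ in range(ROWS)]
    for frame in range(FRAMES_FOR_COPY):
        for row in rows_for_frame(frame):
            # Shift the row one character left into the hidden buffer.
            back[row][:COLS - 1] = front[row][1:]
            # The rightmost column comes from the map data.
            back[row][COLS - 1] = new_column[row]
    # Eighth frame: the fine scroll wraps and the hidden buffer is shown.
    return back

front = [[(row, col) for col in range(COLS)] for row in range(ROWS)]
new_column = [(row, COLS) for row in range(ROWS)]
front = scroll_one_char(front, new_column)
print(front[0][0], front[24][39])
```

Each frame only touches three or four rows of screen RAM here; the color RAM has no second buffer, so it still has to be moved in one go at the swap.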

and yes, scrolling is cpu intense, it's one of the major problems with c64 game programming :=)
2007-09-22 08:13
Oswald

Registered: Apr 2002
Posts: 5094
here is how to move the screen horizontally using HW only.

http://codebase64.org/doku.php?id=base:horizontal_screen_positi..
2007-09-22 08:35
johncl

Registered: Aug 2007
Posts: 37
Thanks for the tips. I managed to make it work in my game with 17 lines scrolling by splitting the copy in two, top half at the bottom of the raster redraw and the bottom half in the beginning of next screen. But it eats a lot of the available CPU time indeed.

Just before I read the answers here I also thought about what you have written, to split up the copy in two halves and do double buffering by being a scroll screen ahead in time (depending on the direction the player is moving). Since I'll do 4 pixel scrolls at most (more likely 2-3 pixels) I should be fine.

Oswald, thanks for the code reference too, I must study that in detail to understand it as I am not very educated in the tricks that are used here. Does it do anything special to increase the number of available cycles to the CPU for the copy?
2007-09-22 08:44
Oswald

Registered: Apr 2002
Posts: 5094
johncl, it is possible to move the screen by 40 chars without copying anything.

downsides:
- this process will also move the screen 1 char up (or down? I don't remember now), meaning you will need to work with 2 screens.
- needs stable raster, cycle exact timing
- doesn't work on some c64 revisions (this trick changes the memory refresh timing, and some RAM chips do not tolerate that)

experiment with fungus' code if you're interested. turn all 65 cycle delays where commented into 63.
2007-09-22 09:46
JackAsser

Registered: Jun 2002
Posts: 2014
Quote: Thanks for the tips. I managed to make it work in my game with 17 lines scrolling by splitting the copy in two, top half at the bottom of the raster redraw and the bottom half in the beginning of next screen. But it eats a lot of the available CPU time indeed.

Just before I read the answers here I also thought about what you have written, to split up the copy in two halves and do double buffering by being a scroll screen ahead in time (depending on the direction the player is moving). Since I'll do 4 pixel scrolls at most (more likely 2-3 pixels) I should be fine.

Oswald, thanks for the code reference too, I must study that in detail to understand it as I am not very educated in the tricks that are used here. Does it do anything special to increase the number of available cycles to the CPU for the copy?


The trick is called VSP. Both VSP and normal double buffered scrolling need a double buffer.

But for normal scrolling you have 8 pixels to complete the other buffer, that is, you need to paste 1000/8 chars every frame (assuming 1 pixel / frame scrolling).

With VSP you only need to update one column per 8 pixels. So you only need to paste 25/8 chars. This has to be done twice because you must update the other screen which will replace the current one as soon as you have scrolled 40 chars. So all in all 25/8*2 chars per frame.

STILL! None of these techniques eliminates the update of the $d800 area. At some point all 1000 $d800 nybbles have to be updated in just one frame.
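
The per-frame figures above can be written out as plain arithmetic. This is only a Python restatement of the numbers in the post (1 pixel per frame, full 40x25 screen); the variable names are made up.

```python
# Average character writes per frame at 1 pixel per frame (figures from the post).
SCREEN_CHARS = 40 * 25   # 1000 screen codes
COLUMN_CHARS = 25        # one new column per 8 pixels
PIXELS_PER_CHAR = 8

# Normal double buffering: the whole hidden screen is rebuilt within 8 frames.
normal_per_frame = SCREEN_CHARS / PIXELS_PER_CHAR

# VSP: only the new column is written, once for each of the two screens.
vsp_per_frame = COLUMN_CHARS / PIXELS_PER_CHAR * 2

# Color RAM at $d800 has no second buffer, so per the post it is done in one frame.
color_burst = SCREEN_CHARS

print(normal_per_frame, vsp_per_frame, color_burst)
```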

If you code for C128 you may double buffer $d800 as well though, which significantly increases the scrolling speed.

If you wanna learn more about VSP, read about the bad lines and the VSP trick in general in the VIC-article. It's quite well explained there and not as mysterious as many try to make it.

/Andreas
2007-09-22 11:50
Graham
Account closed

Registered: Dec 2002
Posts: 990
VSP is a bad idea since it crashes on 50% of the machines.
2007-09-22 12:03
JackAsser

Registered: Jun 2002
Posts: 2014
Quote: VSP is a bad idea since it crashes on 50% of the machines.

@Graham: U have to really believe in it otherwise VSP will never work. It's like it knows u.
2007-09-22 13:45
doynax
Account closed

Registered: Oct 2004
Posts: 212
The standard approach for games is to define color attributes in large blocks instead of on a per-tile basis. So if you've got 4x4 blocks of equal color values for instance then you'd only have to update every fourth column when shifting the screen horizontally. Double buffered video matrix scrolling is important as well so you can split the work across several frames.
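
A small Python model of why block-sized color attributes help (an illustration only; it assumes 4-character-wide blocks and that neighbouring blocks always differ in color, which is the worst case). When a row shifts one character left, a color RAM cell only needs a write if its new value differs from the old one, and that only happens at block edges.

```python
# Which color RAM columns change when a row scrolls one char to the left?
# Illustrative model: colors are constant within 4-char-wide blocks and
# neighbouring blocks are assumed to differ.
COLS = 40
BLOCK_WIDTH = 4

def changed_columns(old_row):
    """Columns 0..38 whose color differs after shifting the row left by one."""
    return [x for x in range(COLS - 1) if old_row[x + 1] != old_row[x]]

row = [x // BLOCK_WIDTH for x in range(COLS)]  # block index stands in for a color
changed = changed_columns(row)
print(changed)
```

Only every fourth column shows up in the list, so about a quarter of the color RAM writes remain, plus the incoming column on the right. Which columns they are moves along with the scroll position.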

There are certain alternative hardware tricks too with better compatibility than VSP. I hacked together a charmode with 16-pixel tall tiles for a game project, which then only requires half as much effort to scroll.

All things considered though efficient scrolling is damned hard on the C64, especially when you want to combine both horizontal and vertical scrolling.
2007-09-22 18:09
PopMilo

Registered: Mar 2004
Posts: 146
@doynax: "There are certain alternative hardware tricks..."

Could you be more specific? :)
2007-09-22 18:33
doynax
Account closed

Registered: Oct 2004
Posts: 212
Quote: @doynax: "There are certain alternative hardware tricks..."

Could you be more specific? :)


Well.. To be honest I can't think of anything else that'd be useful off hand. But the hack I was talking about ought to qualify.

Basically I abort every other badline and switch between a pair of charsets every eight lines, which essentially creates a charmode with twice the normal tile height (i.e. half as many char indices/colors to manage and twice as much graphics data). By using the NMI timers kind of like in Ninja's FLI display it cost more than about 500 cycles per frame, what with the time regained from the badlines. And with the IRQ vector free for sprite multiplexing it's actually quite usable in a game, although screen splits and other cycle-exact effects are a mess.
A neat point is that the missed badlines still increase the video matrix pointer so by starting the display a row late you get color ram double buffering for vertical scrolling :)

Anyway, here's the proof-of-concept demo: http://www.minoan.ath.cx/~doynax/6502/Balls%20of%20the%20Scroll..
(I know it isn't all that fancy but I'm a novice when it comes to VIC hacking so I'd like to pretend I invented a new graphics mode ;)
2007-09-22 18:35
JackAsser

Registered: Jun 2002
Posts: 2014
IMO that IS a perfect use of the C64 HW capabilities. Good work! I'm impressed.
2007-09-22 19:13
doynax
Account closed

Registered: Oct 2004
Posts: 212
Quote: IMO that IS a perfect use of the C64 HW capabilities. Good work! I'm impressed.

Thank you :)

Hopefully I'll manage to finish that game sometime..
2007-09-22 20:26
johncl

Registered: Aug 2007
Posts: 37
That is a pretty nice demo of C64 trickery indeed! But for this game I will stick with more simple coding atm.

Even though my horizontal scroller works fine with 17 lines now, it seems I have to go for a double buffered solution simply because unrolling screen and color RAM copies in both directions is just eating up all my memory. Double buffering only requires me to unroll a color RAM copy (6 bytes per char * 39 * 17 lines * 2 directions = 7956 bytes), and chances are that I don't need a full unroll but can do one using absolute,x indexing and still have enough CPU time for the rest. For the screen char data I will be fine, since I will have 4 pixel scrolls at most, leaving a complete frame free for the char copy (where I don't have to worry about where the raster is).
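
The memory figure checks out. Here is the same arithmetic in Python, with a rough comparison against unrolling only over the lines and looping over the columns with absolute,x. The second number ignores the few bytes of loop overhead and is an estimate, not a figure from the post.

```python
# Code size of a fully unrolled color RAM copy, using the figures from the post.
BYTES_PER_CHAR = 6   # lda abs (3 bytes) + sta abs (3 bytes)
COLUMNS = 39
LINES = 17
DIRECTIONS = 2

unrolled_bytes = BYTES_PER_CHAR * COLUMNS * LINES * DIRECTIONS

# Unrolling only over the lines and indexing the column with absolute,x needs
# one lda/sta pair per line and direction (loop overhead not counted).
indexed_bytes = BYTES_PER_CHAR * LINES * DIRECTIONS

print(unrolled_bytes, indexed_bytes)
```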

It's strange that this thing turned out to be a challenge after all. Which makes it all the more interesting a project. :)
2007-09-22 21:03
cadaver

Registered: Feb 2002
Posts: 1160
I don't know how much you need leftover rastertime, but I just want to mention that games like Turrican I/II/III do up-to-4-pixels-per-frame scrolling without unrolling either the screen or color copy routine. An example:

frame 1: shift screen data in hidden buffer
frame 2: shift color data, split in two halves
frame 3: repeat above...
2007-09-22 21:22
doynax
Account closed

Registered: Oct 2004
Posts: 212
I'm amazed they had raster time left to do anything useful after that. Having to waste almost half the frame just on scrolling isn't exactly encouraging.

Of course they weren't full screen but still..
2007-09-22 21:55
raven
Account closed

Registered: Jan 2002
Posts: 137
I still haven't seen any machine crash by running VSP, be it C64, C64C, C128 or SX64.

Is it verified that RAM refresh causes it?
2007-09-22 22:03
chatGPZ

Registered: Dec 2001
Posts: 11386
no, the actual cause is pretty much unknown. ram-refresh glitches are just the most popular guess :)
2007-09-22 22:20
cadaver

Registered: Feb 2002
Posts: 1160
Doynax: also notice the special "hell elevator" scene in Turrican II where you drop 8 pixels per frame :) *That* at least must consume all rastertime available..
2007-09-22 22:30
Hein

Registered: Apr 2004
Posts: 954
Quote: no, the actual cause is pretty much unknown. ram-refresh glitches is just the most popular guess :)

Guess??? wtf.. Graham guesses his way through life?
2007-09-23 00:31
TDJ

Registered: Dec 2001
Posts: 1879
Quote: Guess??? wtf.. Graham guesses his way through life?

Suddenly all those demos he did lose a lot of their meaning :(

"He never meant that effect to happen, he just typed some code and hoped for the best!"
2007-09-23 00:37
chatGPZ

Registered: Dec 2001
Posts: 11386
that's how a lot of demos were made probably =)
2007-09-23 07:05
AüMTRöN

Registered: Sep 2003
Posts: 44
Quoting doynax
Anyway, here's the proof-of-concept demo: http://www.minoan.ath.cx/~doynax/6502/Balls%20of%20the%20Scroll..
(I know it isn't all that fancy but I'm a novice when it comes to VIC hacking so I'd like to pretend I invented a new graphics mode ;)


Nice demo, I liked it a lot. I guess your enemy formations lend themselves to "simple" multiplexing, but I still like a lot of moving objects no matter how it's done. Good work! :D
2007-09-23 08:41
doynax
Account closed

Registered: Oct 2004
Posts: 212
Quoting cadaver
Doynax: also notice the special "hell elevator" scene in Turrican II where you drop 8 pixels per frame :) *That* at least must consume all rastertime available..
Meh.. It seems like that whole game is built around a huge number of special case hacks.

Quoting MTR1975
I guess your enemy formations lend themselves to "simple" multiplexing, but I still like a lot of moving objects no matter how its done.
To be honest that's more of a symptom of lazy level design than a special purpose multiplexer. There should be room for a few sprites left to play with actually, especially if you synchronize the players' bullet streams and allow the occasional glitch.
2007-09-23 11:28
johncl

Registered: Aug 2007
Posts: 37
Quote: I don't know how much you need leftover rastertime, but I just want to mention that games like Turrican I/II/III do up-to-4-pixels-per-frame scrolling without unrolling either the screen or color copy routine. An example:

frame 1: shift screen data in hidden buffer
frame 2: shift color data, split in two halves
frame 3: repeat above...


This is reassuring to read. I see that unrolling the loops takes a lot of memory which I really want to use for compressed level data instead. I'll probably unroll whole columns at least to free as much raster time as possible. Since I need a different loop per frame and one per direction, a big unroll would have left me with scraps.

While I am at it (and sorry for the noobish questions here as I am quite new to the C64 after all these years): I have been able to turn off the BASIC ROM, since it is not needed, to free 8 KB of memory. But is there more ROM I can "get under"? I seem to recall that I should be able to write to the char ROM area at $d000-$dfff, but I guess reading it back requires me to turn off that ROM somehow? And the RAM under the KERNAL ROM at $e000-$ffff: is there any way I can use that? I guess I just need to avoid using KERNAL functions, or will turning it off ruin stuff like raster interrupts etc.? Again, sorry for the noob questions.

Today I basically use the KERNAL for reading the keyboard, e.g. to start the game or enter a name in the hiscore list. But maybe if I could store compressed level data there, turn it off, unpack to the current level RAM and turn it on again I will be ok? It would be nice to use all the RAM I can.

Also one final question. Zero page addresses: which can I use for my own programs without messing anything up? I guess I can use any spot that is typically used by BASIC, at least, in addition to the few bytes at the end that are reserved for applications?
2007-09-23 12:10
Radiant

Registered: Sep 2004
Posts: 639
johncl: Put $35 in memory location $01 to swap out both the BASIC and KERNAL ROM areas. You'll have to use the hardware interrupt vectors at $FFFE/$FFFF etc.

Put $34 there to also swap out the IO registers at $D000-$DFFF. You have to enable them again once you are going to do any SID, VIC or CIA operations, of course.
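
For reference, the effect of the low three bits of $01 can be tabulated with a short Python sketch. It models only what the CPU reads with no cartridge attached. The function name is made up, and the other bits of $01 (cassette lines) are ignored.

```python
# What the CPU sees for the low three bits of $01 (no cartridge attached).
def memory_map(port_value):
    loram = bool(port_value & 1)    # bit 0
    hiram = bool(port_value & 2)    # bit 1
    charen = bool(port_value & 4)   # bit 2

    # BASIC is only visible when both LORAM and HIRAM are set.
    basic = "BASIC ROM" if (loram and hiram) else "RAM"
    kernal = "KERNAL ROM" if hiram else "RAM"
    # With LORAM and HIRAM both clear, $D000-$DFFF is RAM regardless of CHAREN.
    if not (loram or hiram):
        d000 = "RAM"
    else:
        d000 = "I/O" if charen else "CHAR ROM"
    return {"$A000-$BFFF": basic, "$D000-$DFFF": d000, "$E000-$FFFF": kernal}

for value in (0x37, 0x36, 0x35, 0x34):
    print(hex(value), memory_map(value))
```

$37 is the power-on default with everything mapped in. $35 leaves only the I/O registers, and $34 gives RAM everywhere.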
2007-09-23 12:23
Steppe

Registered: Jan 2002
Posts: 1510
Johncl, I think a very useful site for you is here:
http://www.the-dreams.de/aay.html
At least the zero page issue is answered there.
2007-09-23 14:11
Radiant

Registered: Sep 2004
Posts: 639
Have to commend doynax on the EXCELLENT technique as well! Well done indeed!
2007-09-23 14:55
Stainless Steel

Registered: Mar 2003
Posts: 966
Looks great, except for the gfx.
2007-09-23 16:01
Oswald

Registered: Apr 2002
Posts: 5094
$01 usage for beginners:

http://codebase64.org/doku.php?id=base:memory_management&s=memo..

raster irq with kernal & basic off:

http://codebase64.org/doku.php?id=base:introduction_to_raster_i..


if you still want to use kernal you can turn it back on IN the irq, and turn it off (or restore state) before exiting. If you use the kernal for reading the keys, make sure to call SCNKEY once every screen refresh, since this routine scans the keyboard and fills the keyboard buffer for GETIN, which you normally call from the main loop to read the keys.

Make sure that your irq saves the state of $01 before changing it, and restores it before exiting. This ensures that your code outside the irq can safely turn the roms on and off as well.

This way you can turn off even the $d000 area, but that will cost you time since in all irqs you will have to turn it back on and off again.
2007-09-24 19:36
johncl

Registered: Aug 2007
Posts: 37
Quote: johncl: Put $35 in memory location $01 to swap out both the BASIC and KERNAL ROM areas. You'll have to use the $FFFE/$FFFF etc interrupt vectors.

Put $34 there to also swap out the IO registers at $D000-$E000. You have to enable them again once you are going to do any SID, VIC or CIA operations, of course.


Thanks for the tips. Oswald also posted some good links that explain it all. It is nice to be able to use as much memory as possible. That kernal keyboard scan is really only needed during the non-game screens so I can swap in the kernal again for those screens. The game would benefit from the extra memory so that I can store some compressed level data there. The big challenge at the end here is to see how many levels I can squeeze into the game without any loading. :)
2007-09-25 07:13
Radiant

Registered: Sep 2004
Posts: 639
johncl: And don't forget loading isn't a big deal nowadays, with several very fast and easy to use IRQ loaders around. Almost like already having the data in RAM. ;-)
2007-09-25 09:19
Oswald

Registered: Apr 2002
Posts: 5094
yeah, you should rather ask for a loader source here instead of squeezing the memory and limiting the game code.
2007-09-25 21:26
johncl

Registered: Aug 2007
Posts: 37
Yeah, I might try out a loader too if the game can't pack enough levels into memory. For this game project I really wanted to make a tape version and all. It's like an advanced joke where I am making a brand new version of a game I did in BASIC back in the 80s (which was in a terrible state). :)

Maybe the disk version will have lots of extra levels. :)
2007-09-26 12:23
cbmeeks

Registered: Oct 2005
Posts: 78
@johncl

I think you and I are in the same boat. I am also working on developing a simple scroller/platformer (Metroid clone).

Last night, I was able to get one column drawn from map data. Big freaking deal...lol

Well, it has been a LONG time since I was a 6502 coder.

Anyway, my email is cbmeeks AT gmail DOT com
if you want to share ideas or whatever. Of course since you are ahead of me I will probably just be an anchor stealing ideas from you. hahaha

Seriously, I am finding the KickAssembler a real joy to use. I think that is how I was able to do it in about 20 minutes last night! But, I did unroll everything. So, I am going to figure out a cleaner/better way soon.


METROID
http://www.metroidclassic.com
All times are CET.