| | jamiefuller Account closed
Registered: Mar 2018 Posts: 25 |
(Vertical) Scroller
Help!
I've been banging my head against the wall for some time on this. I've read through so many forum posts but am really struggling, so I'm turning to you kind souls for advice.
I am trying to write a very simple scroller that scrolls the screen downwards (think SEUCK).
The scroller will be 4-directional (only ever one direction at a time) but needs to...
1) Scroll the full screen (1 char) every frame
2) not use double buffering (screen switching)
This scroller has to be full screen but doesn't have to worry about colour.
I have 3 of the directions working simply by using some speedcode and racing the raster. It still uses a lot of CPU but works fine for left, right and scrolling data upwards.
On the downward one I cannot race the raster, as I need to copy from the bottom up.
So instead of dragging the screen data around I've been thinking about alternatives...
My best idea so far is to use an off-screen memory buffer (80*50) and simply copy 40*25 chars from it onto the screen, top to bottom. This works, but I am only just ahead of the raster, leaving no CPU for anything else, including adding the new chars.
My code looks like this:
scroll_test2
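lda #$c0
sta $d1 ; source pointer ($d0/$d1) = $c000, the off-screen buffer
lda #$00
sta $d0
lda #$04
sta $d3 ; destination pointer ($d2/$d3) = $0400, the screen
lda #$00
sta $d2
ldx #$19 ; 25 rows to copy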
@loop
ldy #$00
lda ($d0),y
sta ($d2),y
iny
... the above three lines are repeated 39 times
lda ($d0),y
sta ($d2),y
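clc
lda $d0
adc #$50 ; next buffer row (the buffer is 80 bytes wide)
sta $d0
bcc @lp1
inc $d1
clc ; the carry was set by the overflow, clear it for the next adc
@lp1
lda $d2
adc #$28 ; next screen row (40 bytes wide)
sta $d2
bcc @lp2
inc $d3
@lp2
dex
beq @loop2
jmp @loop ; the unrolled body is too long for a relative branch back
@loop2
rts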
I'm happy to waste 6k of RAM and completely unroll this loop if need be.
Please help me speed this up, or give me some advice on how I can do it better.
Thanks |
|
| | Mixer
Registered: Apr 2008 Posts: 452 |
Moving 1000 bytes completely unrolled takes a minimum of a 4-cycle lda plus a 4-cycle sta, times 1000 = 8000 cycles, which is nearly half of the cycles of a PAL frame (312 lines * 63 cycles = 19656). Partially unrolling, for instance 25 lines in the loop body and 40 passes through it, adds only 200 cycles of loop overhead (ldx #39, start: lda line1,x sta line0,x ... lda line24,x sta line23,x, dex, bpl start), but saves much memory.
Also, table lookup is a faster way to fetch screen addresses than 16-bit addition.
If the scroll speed is less than one char line per frame, then one can also split the copying over multiple frames.
Without double buffering the copying effect will very likely be visible. |
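For the downward case asked about at the top of the thread, the partially unrolled loop might look like the sketch below. It is only one possible reading of the suggestion and assumes the default screen at $0400; the label scroll_down_columns is made up for illustration. Each pass moves one column down by a row, working from the bottom row upwards so nothing is overwritten before it has been read. Because sta abs,x costs 5 cycles rather than 4 (and lda abs,x costs an extra cycle when the index crosses a page), expect roughly 9 cycles per byte plus the 200 cycles of loop overhead. The top row still has to be filled with new data afterwards, and with a single buffer the raster will pass through the copy, so tearing is likely.

```asm
; Sketch only: scroll the $0400 screen down one row, column by column.
scroll_down_columns
        ldx #39                 ; columns 39..0
@loop
        lda $0798,x             ; row 23 -> row 24
        sta $07C0,x
        lda $0770,x             ; row 22 -> row 23
        sta $0798,x
        lda $0748,x
        sta $0770,x
        lda $0720,x
        sta $0748,x
        lda $06F8,x
        sta $0720,x
        lda $06D0,x
        sta $06F8,x
        lda $06A8,x
        sta $06D0,x
        lda $0680,x
        sta $06A8,x
        lda $0658,x
        sta $0680,x
        lda $0630,x
        sta $0658,x
        lda $0608,x
        sta $0630,x
        lda $05E0,x
        sta $0608,x
        lda $05B8,x
        sta $05E0,x
        lda $0590,x
        sta $05B8,x
        lda $0568,x
        sta $0590,x
        lda $0540,x
        sta $0568,x
        lda $0518,x
        sta $0540,x
        lda $04F0,x
        sta $0518,x
        lda $04C8,x
        sta $04F0,x
        lda $04A0,x
        sta $04C8,x
        lda $0478,x
        sta $04A0,x
        lda $0450,x
        sta $0478,x
        lda $0428,x
        sta $0450,x
        lda $0400,x             ; row 0 -> row 1
        sta $0428,x
        dex
        bpl @loop
        rts                     ; row 0 now needs the new line of data
```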
| | oziphantom
Registered: Oct 2014 Posts: 490 |
Basically you have to cut it in half.
Cache the middle line.
While you are in the bottom half of the visible screen, shift up from middle to top. Then, when you get out of the visible screen, shift the bottom->middle area. |
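The timing side of this idea can be sketched as below. This is a skeleton under stated assumptions, not a full scroller: it assumes the standard 25-row display with the default vertical scroll, where the text window covers raster lines $33-$FA and row 12 starts at line $93. The routine names are hypothetical, and the two shift routines are empty placeholders for the actual copy code. It also assumes the top-half shift finishes before raster line 256, since $d012 only holds the low 8 bits of the raster position.

```asm
; Sketch only: split the shift around the raster position.
split_scroll
@wait_mid
        lda $d012
        cmp #$93                ; has the raster reached the middle row (line 147)?
        bcc @wait_mid
        jsr shift_top_half      ; the beam is below the rows being changed
@wait_border
        lda $d012
        cmp #$fb                ; first line below the text window (line 251)
        bcc @wait_border
        jsr shift_bottom_half   ; the beam has left the visible screen
        rts

shift_top_half                  ; placeholder: move the top rows, middle row first
        rts
shift_bottom_half               ; placeholder: move the bottom rows, using the cached middle line
        rts
```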
| | mhindsbo
Registered: Dec 2014 Posts: 51 |
I know you know this, but double buffering would solve all your problems. Are you so memory constrained that you can't have another screen mem area? |
| | jamiefuller Account closed
Registered: Mar 2018 Posts: 25 |
Quote: I know you know this, but double buffering would solve all your problems. Are you so memory constrained that you can't have another screen mem area?
No, I just can't seem to get my head around double buffering :) Won't the screen then be a single frame behind the "action"? I know this is likely a minuscule time frame, but as I'm scrolling an entire char per frame, it seems like quite a lot to be behind.
If double buffering is the way forward then I'll redouble my efforts there. |
| | Oswald
Registered: Apr 2002 Posts: 5094 |
https://cadaver.github.io/rants/scroll.html |
| | cadaver
Registered: Feb 2002 Posts: 1160 |
Using a double buffer, you build up what you want to show on the next frame in the hidden screen (and can do that at your own leisure). You're not any more delayed.
Shifting with a single buffer is actually just the same, but your timing is strictly tied to the screen refresh.
Using double buffering and e.g. X register for source offset and Y for destination, you need just 2 copy loops, screen1 to screen2 and vice versa.
Though depending on what your source data is like, you could forget about different shifting directions and just redraw the whole screen from the source data while racing the raster. If you use a tilemap, but format your data cleverly, you could also redraw directly from the tilemap data. |
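A per-frame flow for the two-screen approach could be sketched as follows. It assumes the two screens sit at $0400 and $0800 in the default VIC bank and that the screen nibble of $d018 already holds $1 or $2; the routine names are hypothetical and the two copy routines are empty placeholders. The screen finished during the previous frame is shown first, then the next frame is built in the one that has just become hidden.

```asm
; Sketch only: flip the visible screen, then build the next frame off-screen.
next_frame
        lda $d018
        eor #$30                ; $1x <-> $2x: show the screen finished last frame
        sta $d018
        and #$20                ; non-zero means $0800 is now on display
        bne @build_0400
        jmp copy_0400_to_0800   ; $0400 is visible, so build the next frame in $0800
@build_0400
        jmp copy_0800_to_0400   ; $0800 is visible, so build the next frame in $0400

copy_0400_to_0800               ; placeholder: shifted copy $0400 -> $0800 goes here
        rts
copy_0800_to_0400               ; placeholder: shifted copy $0800 -> $0400 goes here
        rts
```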
| | Raistlin
Registered: Mar 2007 Posts: 680 |
It's too long to write about here but, for a single direction at a time (i.e. the direction won't change for a complete level of the game), I can get ~6825 cycles to fill the screen, plus some additional cycles for bringing the new line in. The code for this is fully unrolled, though, and takes up ~10k. |
| | TWW
Registered: Jul 2009 Posts: 545 |
Also, if you intend to use sprites: the more you add, the fewer cycles you have. |
| | jamiefuller Account closed
Registered: Mar 2018 Posts: 25 |
Thanks everyone,
I decided to go down the double buffer route instead.
Below is some very raw code which seems to work nicely;
the border colour change shows the raster usage.
Thanks again, gang!
P.S. if you notice any bugs feel free to shout :)
frame=$ff
*=$0801
BYTE $0B, $08, $0A, $00, $9E, $33, $30, $37, $32, $00, $00, $00
*=$0c00
init
lda #$7f
sta $DC0D ; "switch off" interrupt signals from CIA-1
and $D011
sta $D011 ;Clear most significant bit in VIC's raster register
lda #$01
sta $D012 ; trigger the raster interrupt at raster line 1
lda #<irq_handler
sta $0314 ; low byte of the IRQ vector
lda #>irq_handler
sta $0315 ; high byte of the IRQ vector
lda #$01
sta $D01A ; enable raster interrupts
@endloop
jmp @endloop
irq_handler
inc $d020
ldy #$29+$7c ; change the $29 to change direction: $00 = up, $50 = down, $27 = left, $29 = right
inc frame
lda frame
and #$01
beq @loopy
lda $d018
and $0f
ora #$10
sta $d018
jsr scroll_btof
jmp @exit
@loopy
lda $d018
and $0f
ora #$20
sta $d018
jsr scroll_ftob
@exit dec $d020
asl $D019
JMP $EA31
scroll_btof
ldx #$7c
@loop1
LDA $0400-$28,y
STA $0800,x
LDA $047D-$28,y
STA $087D,x
LDA $04FA-$28,y
STA $08FA,x
LDA $0577-$28,y
STA $0977,x
LDA $05F4-$28,y
STA $09F4,x
LDA $0671-$28,y
STA $0A71,x
LDA $06EE-$28,y
STA $0AEE,x
LDA $076B-$28,y
STA $0B6B,x
dey
dex
bpl @loop1
rts
scroll_ftob
ldx #$7c
@loop1
LDA $0800-$28,y
STA $0400,x
LDA $087D-$28,y
STA $047D,x
LDA $08FA-$28,y
STA $04FA,x
LDA $0977-$28,y
STA $0577,x
LDA $09F4-$28,y
STA $05F4,x
LDA $0A71-$28,y
STA $0671,x
LDA $0AEE-$28,y
STA $06EE,x
LDA $0B6B-$28,y
STA $076B,x
dey
dex
bpl @loop1
rts
|
| | Mixer
Registered: Apr 2008 Posts: 452 |
Perhaps and #$0f instead of and $0f ? |
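The difference matters because and $0f uses zero-page addressing: it ANDs the accumulator with whatever byte is stored at address $0f, not with the constant $0f. A minimal sketch of the corrected screen switch, assuming the same two screens at $0400 and $0800 (the routine names are made up for illustration):

```asm
; Sketch only: select the visible screen while keeping the character memory bits.
show_0400
        lda $d018
        and #$0f                ; immediate: keep bits 0-3, clear the screen bits
        ora #$10                ; screen memory at $0400
        sta $d018
        rts
show_0800
        lda $d018
        and #$0f
        ora #$20                ; screen memory at $0800
        sta $d018
        rts
```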