| | gregg Account closed
Registered: Apr 2005 Posts: 56 |
Some sort of multithreading.
About a week ago the topic of multithreading came up on #c-64, so today I gave it a try. However, there's something wrong with my code and I can't figure out what it is.
A short description: I have a fixed number of threads running, and a CIA IRQ handles context switching in round-robin fashion. On every IRQ I save the current thread's state (SP, status register, PC, A, X, Y) in a structure and fetch the state of the next thread.
In this example the first thread increments the screen background color (fast), while the second thread changes the border color (slow). However, the wait in the second thread runs too fast every other time, and I have no idea why. It's probably something wrong with the context switch code. Maybe some of you could take a look at it?
Sources are for ACME.
!to "threading.prg",cbm
!cpu 6510
;!source "mylib.a"
!macro basic_header .a, .b, .c, .d {
*= $0801
!byte <.eol,>.eol,0,0,$9e
!text .a, .b, .c, .d
.eol: !byte 0,0,0
}
num_threads = 2
thread_num = $fd ; current thread number
;--------------------------------------------------------------------------
+basic_header "2", "0", "6", "1"
*= $080d
init: sei
; set up context switch IRQ
lda #$35
sta $01
lda #<context_switch
ldx #>context_switch
sta $fffe
stx $ffff
lda #0
sta thread_num
cli
jmp thread1
;--------------------------------------------------------------------------
context_switch:
pha
txa
pha
tya
pha
lda $dc0d
; save current thread
lda thread_num
; *8
asl
asl
asl
tay
; save A,X,Y
pla
sta thread_data+6,y
pla
sta thread_data+5,y
pla
sta thread_data+4,y
; save PSW
pla
sta thread_data+1,y
; save PC
pla
sta thread_data+2,y
pla
sta thread_data+3,y
; save SP
tsx
txa
sta thread_data,y
; next thread, wraparound
ldy thread_num
iny
cpy #num_threads
bne +
ldy #0
+ sty thread_num
; *8
tya
asl
asl
asl
tay
; restore thread data
; stack pointer first
lda thread_data,y
tax
txs
; push PC, PSW for RTI
lda thread_data+3,y
pha
lda thread_data+2,y
pha
lda thread_data+1,y
pha
; push registers
lda thread_data+6,y
pha
lda thread_data+5,y
pha
lda thread_data+4,y
pha
pla
tay
pla
tax
pla
rti
;--------------------------------------------------------------------------
thread1:
inc $d021
ldy #$02
jsr wait2
jmp thread1
thread2:
inc $d020
ldy #$80
jsr wait2
jmp thread2
wait2:
- ldx #0
dex
bne *-1
dey
bne -
rts
;--------------------------------------------------------------------------
thread_data:
!fill 8, 0
!byte $ff-$40, $22, <thread2, >thread2, 0,0,0,0
|
| | gregg Account closed
Registered: Apr 2005 Posts: 56 |
Quote: why not simply keep it directly on the thread's stack?
Damn, that's so obvious and simple that I totally missed it. :) This will get me around a lot of error-prone stack fiddling too. I'll try to adapt it later. |
| | doynax Account closed
Registered: Oct 2004 Posts: 212 |
I think I've got it. The registers seem to be restored in reverse order. Consider y for instance. When it's pushed you store it in thread_data+6, but in the return sequence it gets popped into A instead. |
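A minimal sketch of a corrected restore sequence, assuming the thread_data layout from the first post (+4 = A, +5 = X, +6 = Y) and that Y still holds thread_num * 8 with PC and PSW already pushed for the RTI. Pushing in the order A, X, Y means the pops come back as Y, X, A, which mirrors the order used when the handler saved them on entry:

```asm
; replacement for the "push registers" part of context_switch
; on entry: Y = thread_num * 8, PC and PSW already on the stack
		lda thread_data+4,y	; A goes on first, so it comes off last
		pha
		lda thread_data+5,y	; X
		pha
		lda thread_data+6,y	; Y goes on last, so it comes off first
		pha
		pla
		tay			; Y restored (the index is no longer needed)
		pla
		tax			; X restored
		pla			; A restored
		rti
```

With the original order the values stored for A and Y were swapped on every switch, which would explain why the wait loop in the second thread (which counts in Y) misbehaved.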
| | gregg Account closed
Registered: Apr 2005 Posts: 56 |
Thanks, looks like it really was that. Hey, and I was SO sure it was the correct order. The stack really can be nasty. :) |
| | gregg Account closed
Registered: Apr 2005 Posts: 56 |
Here's a somewhat cleaned-up example of the threading code. Maybe this can be put onto Codebase; any objections?
!to "threading.prg",cbm
!cpu 6510
num_threads = 2
thread_num = $fd ; current thread number
;--------------------------------------------------------------------------
*= $0801
!byte <.eol,>.eol,0,0,$9e
!text "2061"
.eol: !byte 0,0,0
*= $080d
init: sei
lda #<context_switch
ldx #>context_switch
sta $0314
stx $0315
; initialize threads
ldx #0
stx thread_num
; main thread is automatically setup by first irq
; we only need to setup further threads
; split stack
tsx
txa
tay
sec
sbc #$20
tax
txs
; push thread data
; program counter, status register, a, x, y
lda #>thread2
pha
lda #<thread2
pha
lda #0
pha
pha
pha
pha
; save stack pointer
tsx
txa
sta thread_data+1
; restore old stack pointer
tya
tax
txs
cli
; go to main thread
jmp thread1
;--------------------------------------------------------------------------
context_switch:
; save stack pointer
ldy thread_num
tsx
txa
sta thread_data,y
; next thread, wraparound
iny
cpy #num_threads
bne nowrap
ldy #0
nowrap:
sty thread_num
; restore thread
lda thread_data,y
tax
txs
jmp $ea31
;--------------------------------------------------------------------------
thread1:
inc $d020
ldy #$01
jsr wait2
jmp thread1
thread2:
lda #<msg1
ldy #>msg1
jsr $ab1e
ldy #0
jsr wait2
jmp thread2
wait2:
- ldx #0
dex
bne *-1
dey
bne -
rts
;--------------------------------------------------------------------------
msg1: !pet "hello, here is thread 2!",13,0
thread_data:
|
| | Burglar
Registered: Dec 2004 Posts: 1101 |
So what's your goal for C64 'threads'? |
| | doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quote: So what's your goal for C64 'threads'?
They're actually surprisingly useful.
For instance I'm currently working on a NES demo (a 6502-based system BTW) where I want to run a high priority task during vblank, since that's the only time you can access VRAM, and other lower-priority code the rest of the time.
Unfortunately I was forced to redesign the vblank code so that it can never run out of time (which ended up costing a lot of extra time and memory, since anything unpredictable had to be moved out of the blanking period), because that damned console lacks a decent interrupt source.
Similar schemes are often useful in more complex game setups and the like too, and operating systems of course.
As always, the problem with multithreading is working out the synchronization, and this is no less true on the 6502. But I suppose it isn't much different from synchronizing your code with an interrupt handler (hint: use the read-modify-write (RMW) instructions to implement locks and the like). |
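The RMW hint could look roughly like this. It is only a sketch: `lock` and its zero-page address are hypothetical, and the convention (1 = free, 0 = taken) is just one possible choice. It relies on the fact that an IRQ can only be taken between instructions, so a single LSR on memory tests and takes the lock in one uninterruptible step:

```asm
lock		= $fb		; hypothetical zero-page byte: 1 = free, 0 = taken

		lda #1
		sta lock	; mark the lock as free (once, before the threads start)

acquire:	lsr lock	; bit 0 goes to carry and lock becomes 0, in one instruction
		bcc acquire	; carry clear: someone else holds it, spin until it is released
		; ... critical section ...
		inc lock	; release: 0 -> 1
```

A thread that spins here simply burns the rest of its time slice; with round-robin switching the holder eventually runs again and releases the lock.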
| | gregg Account closed
Registered: Apr 2005 Posts: 56 |
Burglar: It's cool to have them. No, seriously, I was thinking of something along the lines of calculating an effect while irq-loading the next part of a demo. This gets very easy with threads. You'd just need more elaborate scheduling here. |
| | trident
Registered: May 2002 Posts: 91 |
Preemptive multithreading is very useful when rastertime is particularly tight and the timing of your routines is not critical, as you don't need to worry about the exact execution time of your code. IIRC, I used a simple form of multithreading for Hellfork, where one thread draws the outlines of the scene and one calculates the movement of the player. The threads are scheduled when the ESCOS routine is not active, and each thread runs about 25 times per second (I think; I never measured it, and I didn't really need to, as the preemptive multithreading took care of the scheduling).
Hellfork also uses another, more elaborate, "static multithreading" technique (essentially a variant of rate monotonic scheduling): the ESCOS and EOR fill routines are interleaved in a cycle-exact way so that the 3D scene can be drawn with all borders open. A precalc routine that keeps track of the cycle count of every instruction produces a cycle-exact mixture of an ESCOS routine and an EOR fill routine. The same precalc function also produces an interleaved ESCOS and sample player routine. |
| | Oswald
Registered: Apr 2002 Posts: 5094 |
Multithreading/tasking is an OS thing. If you're looking for speed you'd better forget about it, as there are smarter ways to solve your problems without the costly context switches.
Loading while an effect runs:
1. Modify the loader to exit after every xth byte.
2. Run the loader in the "main" program and your effect from the "irq". Don't worry if your effect takes more than one frame; do it like this:
irqstart inc $d019
..
..
..
cli
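lda effect_already_runs ; despite the name: 1 = idle, 0 = effect still running
beq yes ; still busy from an earlier frame, so skip it
dec effect_already_runs ; mark the effect as running
jsr effect
inc effect_already_runs ; mark it as idle again
yes rti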
Make sure to use the stack to save the regs, otherwise the "irq reentering" trick won't work.
When your effect finishes, the time left until the next "effect calling" irq will automagically be used by the loader. |
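Fleshed out, the re-entrant IRQ might look roughly like the sketch below. The assumptions are mine, not Oswald's: the handler is installed directly at $fffe/$ffff with the KERNAL switched out, `effect_idle` is a hypothetical flag (1 = idle, 0 = running) that must be set to 1 before the first IRQ, and `effect` is a placeholder. The `cli` is placed after the flag update so that the test and the update cannot be interrupted:

```asm
effect_idle	= $fb		; hypothetical flag: 1 = effect idle, 0 = effect running

irqstart:	pha		; registers go on the stack so that the handler
		txa		; can safely interrupt an earlier copy of itself
		pha
		tya
		pha
		inc $d019	; acknowledge the raster IRQ
		; ... per-frame work that must not be delayed goes here ...
		lda effect_idle
		beq done	; effect from an earlier frame is still running
		dec effect_idle	; mark it as running (IRQs are still disabled here)
		cli		; from now on the next frame's IRQ may interrupt the effect
		jsr effect
		inc effect_idle	; finished, may be started again
done:		pla
		tay
		pla
		tax
		pla
		rti

effect:		rts		; placeholder for the real effect routine
```

An IRQ that arrives while the effect is running only does the short per-frame part and returns, so the stack does not keep growing.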
| | trident
Registered: May 2002 Posts: 91 |
Oswald: you just described preemptive multithreading: the IRQ is one thread that preempts the main thread. "Multithreading" is just the name for it.
Using the stack to save the regs = context switch. Context switching is not expensive on a 6502 since there is no memory protection. |
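To put a rough number on the cost, here is a stand-alone variant of the stack-pointer switch from earlier in the thread with cycle counts per instruction in brackets. It is a sketch under several assumptions: the handler sits directly at $fffe/$ffff, a CIA timer is the IRQ source, `thread_sp` is a hypothetical table name, every additional thread's stack has been pre-seeded as in gregg's init code, and no branch or indexed read crosses a page boundary:

```asm
num_threads	= 2
thread_num	= $fd			; current thread number

switch:		pha			; [3] save A, X, Y on the current thread's stack
		txa			; [2]
		pha			; [3]
		tya			; [2]
		pha			; [3]
		lda $dc0d		; [4] acknowledge the CIA IRQ
		ldy thread_num		; [3]
		tsx			; [2]
		txa			; [2]
		sta thread_sp,y		; [5] remember this thread's stack pointer
		iny			; [2] next thread, wraparound
		cpy #num_threads	; [2]
		bne +			; [3] if taken, [2] if not
		ldy #0			; [2] only on wraparound
+		sty thread_num		; [3]
		ldx thread_sp,y		; [4] switch to the next thread's stack
		txs			; [2]
		pla			; [4] restore Y, X, A of the next thread
		tay			; [2]
		pla			; [4]
		tax			; [2]
		pla			; [4]
		rti			; [6]

thread_sp:	!fill num_threads, 0	; one saved stack pointer per thread
```

That adds up to 67 or 68 cycles, plus the 7 cycles the CPU needs to enter the interrupt, so a full switch costs roughly 75 cycles, a little more than one 63-cycle raster line on a PAL C64.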