[CSDb] - User Forums - Some sort of multithreading.

2008-04-08 22:51

gregg
Account closed

Registered: Apr 2005
Posts: 56

Some sort of multithreading.

About a week ago the topic of multithreading came up on #c-64, so today I gave it a try. However, there's something wrong with my code and I can't really figure out what it is.

A short description: I have a fixed number of threads running and a CIA IRQ deals with context switching in a round-robin fashion. Every IRQ I save all current state data (SP, status register, PC, A, X, Y) in a structure and fetch the state data for the next thread.

In this example the first thread increments the screen background color (fast), while the second thread changes the border color (slow). However, the wait in the second thread runs too fast every other time, and I have no idea why. It's probably something wrong with the context switch code; maybe some of you could take a look at it?

Sources are for ACME.

!to "threading.prg",cbm
!cpu 6510
;!source "mylib.a"
!macro basic_header .a, .b, .c, .d {
        *= $0801
        !byte <.eol,>.eol,0,0,$9e
        !text .a, .b, .c, .d
.eol:   !byte 0,0,0
}

num_threads = 2
thread_num = $fd		; current thread number

;--------------------------------------------------------------------------
+basic_header "2", "0", "6", "1"

*= $080d

init:	sei
		; set up context switch IRQ
		lda #$35
		sta $01

		lda #<context_switch
		ldx #>context_switch
		sta $fffe
		stx $ffff
		
		lda #0
		sta thread_num

		cli
		jmp thread1

;--------------------------------------------------------------------------
context_switch:
		pha
		txa
		pha
		tya
		pha
		lda $dc0d

		; save current thread
		lda thread_num
		; *8
		asl
		asl
		asl
		tay
		; save A,X,Y
		pla
		sta thread_data+6,y
		pla
		sta thread_data+5,y
		pla
		sta thread_data+4,y
		; save PSW
		pla
		sta thread_data+1,y
		; save PC
		pla
		sta thread_data+2,y
		pla
		sta thread_data+3,y
		; save SP
		tsx
		txa
		sta thread_data,y

		; next thread, wraparound
		ldy thread_num
		iny
		cpy #num_threads
		bne +
		ldy #0
+		sty thread_num

		; *8
		tya
		asl
		asl
		asl
		tay

		; restore thread data
		; stack pointer first
		lda thread_data,y
		tax
		txs
		; push PC, PSW for RTI
		lda thread_data+3,y
		pha
		lda thread_data+2,y
		pha
		lda thread_data+1,y
		pha
		; push registers
		lda thread_data+6,y
		pha
		lda thread_data+5,y
		pha
		lda thread_data+4,y
		pha

		pla
		tay
		pla
		tax
		pla
		rti
	
;--------------------------------------------------------------------------
thread1:
		inc $d021
		ldy #$02
		jsr wait2
		jmp thread1


thread2:
		inc $d020
		ldy #$80
		jsr wait2
		jmp thread2
		
wait2:
-		ldx #0
		dex
		bne *-1
		dey
		bne -
		rts

;--------------------------------------------------------------------------
thread_data:
	!fill 8, 0
	!byte $ff-$40, $22, <thread2, >thread2, 0,0,0,0

[... 211 posts hidden ...]

2008-04-08 23:31

gregg
Account closed

Registered: Apr 2005
Posts: 56

Quote:

why not simply keep it directly on the thread's stack?

Damn, that's so obvious and simple that I totally missed it. :) This will get me around a lot of error prone stack fiddling too, I'll try to adapt it later.

2008-04-08 23:48

doynax
Account closed

Registered: Oct 2004
Posts: 212

I think I've got it. The registers seem to be restored in reverse order. Consider y for instance. When it's pushed you store it in thread_data+6, but in the return sequence it gets popped into A instead.
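
A sketch of what the corrected tail of context_switch in the first listing could look like, assuming the save path is left unchanged (so thread_data+4 holds A, thread_data+5 holds X and thread_data+6 holds Y). It is a fragment that relies on the labels of that listing, not a standalone program:

```
		; push registers in the order A, X, Y so that the
		; pla/tay, pla/tax, pla sequence below pops Y, X, A
		lda thread_data+4,y	; saved A
		pha
		lda thread_data+5,y	; saved X
		pha
		lda thread_data+6,y	; saved Y
		pha

		pla
		tay			; Y gets the saved Y
		pla
		tax			; X gets the saved X
		pla			; A gets the saved A
		rti
```

The only change is the push order; the pops stay as they were, mirroring the pha / txa pha / tya pha sequence at the top of the handler.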

2008-04-09 00:27

gregg
Account closed

Registered: Apr 2005
Posts: 56

Thanks, looks like it really was that. Hey, and I was SO sure it was the correct order. The stack really can be nasty. :)

2008-04-12 22:36

gregg
Account closed

Registered: Apr 2005
Posts: 56

Here's a somewhat cleaned up example for the threading code. Maybe this can be put onto codebase, any objections against it?

!to "threading.prg",cbm
!cpu 6510

num_threads = 2
thread_num = $fd		; current thread number

;--------------------------------------------------------------------------
		*= $0801
		!byte <.eol,>.eol,0,0,$9e
		!text "2061"
.eol:	!byte 0,0,0


*= $080d

init:	sei
		
		lda #<context_switch
		ldx #>context_switch
		sta $0314
		stx $0315
		
		; initialize threads
		ldx #0
		stx thread_num
		
		; main thread is automatically setup by first irq
		; we only need to setup further threads
		; split stack
		tsx
		txa
		tay
		sec
		sbc #$20
		tax
		txs
		
		; push thread data
		; program counter, status register, a, x, y
		lda #>thread2
		pha
		lda #<thread2
		pha
		lda #0
		pha
		pha
		pha
		pha
		
		; save stack pointer
		tsx
		txa
		sta thread_data+1
		
		; restore old stack pointer
		tya
		tax
		txs

		cli
		; go to main thread
		jmp thread1

;--------------------------------------------------------------------------
context_switch:
		; save stack pointer
		ldy thread_num
		tsx
		txa
		sta thread_data,y
		
		; next thread, wraparound
		iny
		cpy #num_threads
		bne nowrap
		ldy #0
nowrap:
		sty thread_num
		
		; restore thread
		lda thread_data,y
		tax
		txs

		jmp $ea31

;--------------------------------------------------------------------------
thread1:
		inc $d020
		ldy #$01
		jsr wait2
		jmp thread1


thread2:
		lda #<msg1
		ldy #>msg1
		jsr $ab1e
		ldy #0
		jsr wait2
		jmp thread2


wait2:
-		ldx #0
		dex
		bne *-1
		dey
		bne -
		rts

;--------------------------------------------------------------------------
msg1:	!pet "hello, here is thread 2!",13,0
thread_data:
		!fill num_threads, 0	; one saved stack pointer per thread

2008-04-13 01:02

Burglar

Registered: Dec 2004
Posts: 1101

so whats your goal for c64 'threads'?

2008-04-13 01:38

doynax
Account closed

Registered: Oct 2004
Posts: 212

Quote: so whats your goal for c64 'threads'?

They're actually surprisingly useful.
For instance I'm currently working on a NES demo (a 6502-based system BTW) where I want to run a high priority task during vblank, since that's the only time you can access VRAM, and other lower-priority code the rest of the time.
Unfortunately I was forced to redesign the vblank code so that it can never run out of time (which ended up costing a lot of extra time and memory, since anything unpredictable had to be moved out of the blanking period), because that damned console lacks a decent interrupt source.
Similar schemes are often useful in more complex game setups and the like too, and operating systems of course.

As always the problem with multithreading is working out the synchronization, and this is no less true on the 6502. But I suppose it isn't that much different from synchronizing your code with an interrupt handler (hint: use the RMW instructions to implement locks and the like).
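
A minimal sketch of the RMW hint in ACME syntax, assuming a preemptive scheduler like the one in this thread. The names lock, acquire and release and the zero page address $fb are made up for the example. It relies on the fact that the 6502 only takes an interrupt between instructions, so a single LSR on memory tests and clears the lock in one indivisible step:

```
lock = $fb			; hypothetical lock byte: 1 = free, 0 = taken
				; (must be initialised to 1 before use)

acquire:
		lsr lock	; old bit 0 goes to carry, lock becomes 0
		bcc acquire	; carry clear: somebody else holds it, spin
		rts		; carry set: the lock was free and is now ours

release:
		lda #1
		sta lock	; mark the lock as free again
		rts
```

Spinning like this only makes progress if the holder gets scheduled again by the IRQ, so it should not be used with interrupts disabled.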

2008-04-13 01:59

gregg
Account closed

Registered: Apr 2005
Posts: 56

Burglar: It's cool to have them. No, seriously, I was thinking of something along the lines of calculating an effect while irq-loading the next part of a demo. This gets very easy with threads. You'd just need more elaborate scheduling here.

2008-04-13 07:41

trident

Registered: May 2002
Posts: 91

Preemptive multithreading is very useful when rastertime is particularly tight and when the timing of your routines is not critical, as you don't need to worry about the exact execution time of your code. IIRC, I used a simple form of multithreading for Hellfork where one thread draws the outlines of the scene and one calculates the movement of the player. The threads are scheduled when the ESCOS routine is not active and each thread runs about 25 times per second (I think; I never measured it, and I didn't really need to, as the preemptive multithreading took care of the scheduling).

Hellfork also uses another, more elaborate, "static multithreading" technique (essentially a variant of rate monotonic scheduling): the ESCOS and eor fill routines are interleaved in a cycle-exact way so that the 3D scene can be drawn with all borders open. A precalc routine that keeps track of the cycle count of every instruction produces a cycle-exact mixture of an ESCOS routine and an EOR fill routine. The same precalc function also produces an interleaved ESCOS and sample player routine.

2008-04-13 09:25

Oswald

Registered: Apr 2002
Posts: 5094

multithreading/tasking is an OS thing. if you're looking for speed you'd better forget about it, as there are smarter methods to solve your problems without the costly context switches.

loading while effect runs:

1. modify the loader to exit after every xth byte

2. run the loader in the "main" program and your effect from the "irq". don't worry if your effect takes more than one frame, do it like this:

irqstart inc $d019
..
..
..

cli
lda effect_already_runs
beq yes

dec effect_already_runs
jsr effect
inc effect_already_runs

yes rti

make sure to use the stack to save the regs, otherwise the "irq reentering" trick won't work.

when your effect finishes, the timeframe left until the next "effect calling" irq will be automagically used by the loader.
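
For reference, a fuller sketch of the re-entrant handler outlined above, in ACME syntax. It assumes a raster IRQ vectored through $fffe with the KERNAL banked out; irq, effect_idle, done and the effect stub are names made up for this sketch, and the flag (1 = idle, 0 = running) has to be initialised to 1 before the first interrupt:

```
effect_idle = $fb		; hypothetical flag: 1 = idle, 0 = effect running

irq:		pha		; save registers on the stack so that a
		txa		; nested IRQ cannot overwrite them
		pha
		tya
		pha
		inc $d019	; acknowledge the VIC-II raster interrupt
		cli		; allow this handler to be re-entered
		lda effect_idle
		beq done	; effect from an earlier frame still runs: skip
		dec effect_idle	; mark the effect as running
		jsr effect
		inc effect_idle	; mark the effect as idle again
done:		pla		; restore registers in reverse order
		tay
		pla
		tax
		pla
		rti

effect:		rts		; placeholder for the real effect code
```

A nested interrupt that arrives while effect is running only pays for the register save, the flag test and the restore, then returns into the effect.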

2008-04-13 09:30

trident

Registered: May 2002
Posts: 91

Oswald: you just described preemptive multithreading: the IRQ is one thread that preempts the main thread. "Multithreading" is just the name for it.

Using the stack to save the regs = context switch. Context switching is not expensive on a 6502 since there is no memory protection.
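
To put a rough number on that, here is a back-of-the-envelope cycle count for a stack-based register save and restore, using the standard 6502 instruction timings. The label irq_handler is only illustrative, and scheduler code or VIC-II badlines would add to the total:

```
irq_handler:			; IRQ entry sequence itself: 7 cycles
		pha		; 3
		txa		; 2
		pha		; 3
		tya		; 2
		pha		; 3  -> 13 cycles to save A, X, Y
		; ... handler body ...
		pla		; 4
		tay		; 2
		pla		; 4
		tax		; 2
		pla		; 4  -> 16 cycles to restore Y, X, A
		rti		; 6
				; total: 7 + 13 + 16 + 6 = 42 cycles
```

At the C64's clock of roughly 1 MHz that is on the order of 42 microseconds per switch.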

All times are CET.
