| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Sorting
Are sorters really so slow in games? :) I have made my unrolled version for my theoretical game (;), and it takes 132 raster lines to sort 32 numbers o_O; worst case is ~200 lines. Though when only 4 numbers have to be swapped, it does the job in ~10 lines. Wastes a lot of memory, but I like it :) |
| |
lft
Registered: Jul 2007 Posts: 369 |
Quoting ChristopherJam: I'm wondering if writing an RTS into the increment field could be cost effective?
That would imply a JSR out of the increment field, right? That still means you have to put the bucket-emptying routine in four different, fixed places. Meanwhile the execution time increases by a rasterline.
Perhaps you meant the other way around, JSR into the increment field and RTS to the bucket-emptying routine. That would indeed make the memory layout more flexible, but at the cost of three rasterlines.
(Edit: I see now that I misread your post. Of course you mean the second thing.) |
| |
lft
Registered: Jul 2007 Posts: 369 |
(removed) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
(intrigued) |
| |
lft
Registered: Jul 2007 Posts: 369 |
I wrote a post about how JSR/RTS would interfere with pushing the result to the stack. But that is only a problem if you do it the wrong way, putting JSR in the field and RTS in the emptying routine. So it was a non-issue after all. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Ah!
Well, FWIW I forgot the results were being pushed to the stack when I was wondering if a TXS:JMP would be a faster way to return to the field… (i.e., just keep the start of the empty routine at the top of the stack) |
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Just some thoughts...
For multiplexers in game scenarios, is the sorting normally done in the vertical blank, or is it performed in the main loop but for the next frame's sprite setup?
Was just thinking of the ideas of not sorting at all and the fact that the vertical blank period is a bit crowded. A lot of other stuff must be updated there such as the scrolling etc.
What if the multiplexer simply scans the remaining actors for the next entry? This is in total of course a O(n^2/2) algorithm and dead slow. Worst case to handle would be 8 sprites having to be multiplexed directly below. So you have 21/8 raster lines to find the lowest index in the remaining actors.
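To make the cost of that idea concrete, here is a minimal Python sketch of "scan the remaining actors for the next entry". The names (`multiplex_order`, `ys`) are made up for illustration, and a real multiplexer would do one scan per sprite slot inside the raster interrupt rather than all at once.

```python
def multiplex_order(ys):
    """Pick actors top to bottom by rescanning what is left each time.

    ys[i] is the Y coordinate of actor i (hypothetical layout).
    Returns the pick order and the number of comparisons made.
    """
    remaining = list(range(len(ys)))
    order = []
    comparisons = 0
    while remaining:
        best = 0
        for k in range(1, len(remaining)):
            comparisons += 1
            if ys[remaining[k]] < ys[remaining[best]]:
                best = k
        order.append(remaining.pop(best))
    return order, comparisons


ys = [120, 50, 200, 50, 90]
order, comparisons = multiplex_order(ys)
print(order, comparisons)
```

For n actors this makes n(n-1)/2 comparisons in total, which is where the roughly n^2/2 figure comes from.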
Think I'll test this approach some day soon.. |
| |
cadaver
Registered: Feb 2002 Posts: 1160 |
There are games that do both. I remember at least Midnight Resistance *not* doublebuffering the sorted sprites, so it was doing the sort in the vblank / scorepanel area.
Generally I'd recommend not making something timecritical that absolutely doesn't need to be, therefore rather pre-calculate the sorted sprites anywhere when the main program has time.
If you do the sorting "on the fly", you can't take advantage of last frame's sorting result. In a tight sprite formation, you barely have enough time to load the sprite registers from pre-sorted data, so would imagine you would run into trouble with a "find the next sprite" approach, even with unrolled code. |
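As a rough illustration of "taking advantage of last frame's sorting result", here is a Python sketch that keeps an order list between frames and re-sorts it with a plain insertion sort. Insertion sort is my assumption here, not something stated in the post, and `resort`, `order` and `ys` are hypothetical names.

```python
def resort(order, ys):
    """Insertion-sort the index list `order` in place by ys[index].

    Returns the number of element moves, which stays small when
    `order` is already nearly sorted from the previous frame.
    """
    moves = 0
    for i in range(1, len(order)):
        idx = order[i]
        j = i
        while j > 0 and ys[order[j - 1]] > ys[idx]:
            order[j] = order[j - 1]
            j -= 1
            moves += 1
        order[j] = idx
    return moves


ys = [120, 50, 200, 50, 90]
order = list(range(len(ys)))
first = resort(order, ys)    # cold start: nothing to reuse yet
ys[4] = 125                  # next frame: one actor has moved a bit
second = resort(order, ys)   # reuses the previous frame's order
print(first, second, order)
```

The second call does far less work than the first because only one pair is out of place, and that saving is what an on-the-fly scan gives up.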
| |
JackAsser
Registered: Jun 2002 Posts: 2014 |
Quote: There are games that do both. I remember at least Midnight Resistance *not* doublebuffering the sorted sprites, so it was doing the sort in the vblank / scorepanel area.
Generally I'd recommend not making something timecritical that absolutely doesn't need to be, therefore rather pre-calculate the sorted sprites anywhere when the main program has time.
If you do the sorting "on the fly", you can't take advantage of last frame's sorting result. In a tight sprite formation, you barely have enough time to load the sprite registers from pre-sorted data, so would imagine you would run into trouble with a "find the next sprite" approach, even with unrolled code.
Ok, thanks for the insights. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
For applications where border time is a lot scarcer than display time, I suspect I'd lean towards a nice simple linked list based bucket sort; the times it takes a while to find next sprite are precisely when the next sprite is a fair way down the screen, so the time taken wouldn't be so critical. If the next sprite is in the next line or three there'd only be a few links to follow, which shouldn't take too long.. |
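One possible reading of the linked-list bucket sort, sketched in Python: one bucket per raster line, each bucket a linked list of sprite indices. The table names (`head`, `nxt`) and the 256-line range are assumptions for illustration only.

```python
NUM_LINES = 256   # assumed range of Y coordinates
EMPTY = -1        # end-of-list marker


def build_buckets(ys):
    """Link every sprite into the bucket for its Y coordinate."""
    head = [EMPTY] * NUM_LINES
    nxt = [EMPTY] * len(ys)
    for i, y in enumerate(ys):
        nxt[i] = head[y]   # new sprite points at the old list head
        head[y] = i
    return head, nxt


def walk(head, nxt):
    """Visit the buckets top to bottom, following the links in each one."""
    order = []
    for y in range(NUM_LINES):
        i = head[y]
        while i != EMPTY:
            order.append(i)
            i = nxt[i]
    return order


ys = [120, 50, 200, 50, 90]
head, nxt = build_buckets(ys)
order = walk(head, nxt)
print(order)
```

Building is one pass over the sprites. The walk spends its time stepping over empty lines, which only happens when the next sprite is further down the screen anyway. Sprites sharing a line come out in reverse insertion order in this sketch.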
| |
Repose
Registered: Oct 2010 Posts: 225 |
Just wanted to mention an O(n) sort I invented long ago I call Fibonacci sort, for how one step does a running sum.
Now I realize it's been invented long ago and is called the counting sort.
I think it takes about 50 cycles per number.
#Fibonacci Sort
unsorted = [3, 1, 4, 1, 5, 9, 2, 6, 0]
counts = [0] * 10
fib = [0] * (10 + 1)
sortd = [0] * len(unsorted)
#First pass: find the counts
for n in unsorted:
    counts[n] += 1
#Second pass: do the Fibonacci magic
i = 0
total = 0
for n in counts:
    fib[i] = total
    total += n
    i += 1
#Third pass: output sorted
for n in unsorted:
    sortd[fib[n]] = n
    fib[n] += 1
#Results
print(sortd)
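a rough, untested version in assembler (assumes 256 values, unsort page-aligned, counts cleared to zero beforehand):
;count each number
        ldy #0
count   ldx unsort,y    ;x = value (inc has no abs,y mode, so index with x)
        inc counts,x
        iny
        bne count       ;16/ea
;Fibonacci step: start,x = number of values smaller than x
        lda #0
        tax
        clc
fib     sta start,x     ;store the running total before adding this count
        adc counts,x
        inx
        bne fib         ;14/ea
;Copy sorted
sort    ldx unsort      ;x = next value; low byte of the operand is self-modified
        ldy start,x     ;y = output position for this value
        txa
        sta sorted,y
        inc start,x     ;next equal value goes one slot later
        inc sort+1      ;advance to the next unsorted value
        bne sort        ;31/ea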
|