[CSDb] - User Forums

You are not logged in - nap

CSDb User Forums

Forums > C64 Coding > Sorting

2007-10-08 16:08

Oswald

Registered: Apr 2002
Posts: 5126

Sorting

are sorters really so slow in games? :) I have made my unrolled version for my theoretical game (;), and it takes 132 rlines to sort 32 numbers o_O worst case is ~200 lines. tho when its the case of 4 numbers have to be swapped only it does the job in ~10 lines. wastes a lot of memory but I like it :)

... 193 posts hidden. Click here to view all posts....

2017-09-26 08:05

lft

Registered: Jul 2007
Posts: 369

I wrote a post about how JSR/RTS would interfere with pushing the result to the stack. But that is only a problem if you do it the wrong way, putting JSR in the field and RTS in the emptying routine. So it was a non-issue after all.

2017-09-26 08:11

ChristopherJam

Registered: Aug 2004
Posts: 1423

Ah!

Well, FWIW I forgot the results were being pushed to the stack when I was wondering if a TXS:JMP would be a faster way to return to the field (ie, just keep start of empty routine at top of stack)

2017-09-29 09:22

JackAsser

Registered: Jun 2002
Posts: 2038

Just some thoughts...

Multiplexers in game scenarios, is the sorting normally done in the vertical blank or is the sorting performed in the main loop bur for the next frame's sprite setup?

Was just thinking of the ideas of not sorting at all and the fact that the vertical blank period is a bit crowded. A lot of other stuff must be updated there such as the scrolling etc.

What if the multiplexer simply scans the remaining actors for the next entry? This is in total of course a O(n^2/2) algorithm and dead slow. Worst case to handle would be 8 sprites having to be multiplexed directly below. So you have 21/8 raster lines to find the lowest index in the remaining actors.

Think I'll test this approach some day soon..

2017-09-29 10:46

cadaver

Registered: Feb 2002
Posts: 1163

There are games that do both. I remember at least Midnight Resistance *not* doublebuffering the sorted sprites, so it was doing the sort in the vblank / scorepanel area.

Generally I'd recommend not making something timecritical that absolutely doesn't need to be, therefore rather pre-calculate the sorted sprites anywhere when the main program has time.

If you do the sorting "on the fly", you can't take advantage of last frame's sorting result. In a tight sprite formation, you barely have enough time to load the sprite registers from pre-sorted data, so would imagine you would run into trouble with a "find the next sprite" approach, even with unrolled code.

2017-09-29 11:21

JackAsser

Registered: Jun 2002
Posts: 2038

Quote: There are games that do both. I remember at least Midnight Resistance *not* doublebuffering the sorted sprites, so it was doing the sort in the vblank / scorepanel area.

Generally I'd recommend not making something timecritical that absolutely doesn't need to be, therefore rather pre-calculate the sorted sprites anywhere when the main program has time.

If you do the sorting "on the fly", you can't take advantage of last frame's sorting result. In a tight sprite formation, you barely have enough time to load the sprite registers from pre-sorted data, so would imagine you would run into trouble with a "find the next sprite" approach, even with unrolled code.

Ok, thanks for the insights.

2017-09-29 13:35

ChristopherJam

Registered: Aug 2004
Posts: 1423

For applications where border time is a lot scarcer than display time, I suspect I'd lean towards a nice simple linked list based bucket sort; the times it takes a while to find next sprite are precisely when the next sprite is a fair way down the screen, so the time taken wouldn't be so critical. If the next sprite is in the next line or three there'd only be a few links to follow, which shouldn't take too long..

2018-08-06 06:54

Repose

Registered: Oct 2010
Posts: 227

Just wanted to mention an O(n) sort I invented long ago I call Fibonacci sort, for how one step does a running sum.

Now I realize it's been invented long ago and is called the counting sort.

I think it takes about 50 cycles per number.

#Fibonacci Sort
unsorted = [3, 1, 4, 1, 5, 9, 2, 6, 0]
counts = [0] * 10
fib    = [0] * (10 + 1)
sortd    = [0] * len(unsorted)

#First pass: find the counts
for n in unsorted:
    counts[n]+=1

#Second pass: do the Fibonacci magic
i = 0
total = 0
for n in counts:
    fib[i] = total
    total += n
    i+=1
    
#Third pass: output sorted
for n in unsorted:
    sortd[fib[n]] = n
    fib[n] += 1

#Results
print(sortd)

a very buggy version in assembler:

;count each number
ldx #$ff
count ldy unsort,x
inc count,y
dex
bne count;15/ea

;Fibonacci step
clc
lda count
inx
fib adc count,x
sta sta count,x
inx
bne fib;13/ea

;Copy sorted
sort ldy unsort
ldx count,y
sty sort,x
inc count,y
inc sort+1
bne sort/27/ea

2018-08-06 10:06

Oswald

Registered: Apr 2002
Posts: 5126

this is one case for (,x) if you want the usual indice indirection too.

2018-08-07 07:27

ChristopherJam

Registered: Aug 2004
Posts: 1423

Nice work independently discovering counting sort, Repose

It's worth noting that this one is O(n)+O(m), where m is the number of buckets.

Pass 2 for a "perfect" sort of n sprites would take 220 iterations (256 in your example code)

That's around 220*10 cycles if you unroll the loop a little, so about 35 lines of overhead. Of course, you can cut that down considerably if you don't need as much accuracy or range.

Clearing the counts array also takes time, though you can drop that back from O(m) to O(n) by keeping it seperate to your "fib" array, persisting it frame to frame, and only clearing the entries you incremented in the first place.

2019-12-10 21:40

Oswald

Registered: Apr 2002
Posts: 5126

found out this one, posting if someone finds a flaw, somehow I am unsure if I'm overlooking something, because in speed its close to the pha pha method, which feels unbeatable.

phase1 bucketsort, 4bit per buckets:

;this code repeats * numofsprites

ldy yvalues+spriteindex
ldx table,y ;x = y / 16 *2

lda #spriteindex
sta (bucket,x) ;needs some prep on zp
inc bucket,x ;every 2nd zp points to a buckettab

phase2 sorting network:

ldy bucket+z ;19 such compare snippets per bucket
lda yvalues,y ;can sort max 8 sprite per bucket
ldx bucket+q ;z and q is precalculated
cmp yvalues,x
bcc + ; or bcs?

sty bucket+q
stx bucket+z ;26 cycles worst case

with $20 sprites phase1 is 768 cycles, 1976 cycles for phase 2 at worst case (highly unlikely) , overall ~43 rasterlines.

probably will use different sorting networks based on nr of sprites in bucket. that will reduce nr of compare/swap snippets per bucket.

Previous - 1 | ... | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 - Next

Refresh

Subscribe to this thread:

You need to be logged in to post in the forum.

Search the forum:
Search for in
All times are CET.

Search CSDb

Advanced

Users Online

MP Software/Hokuto F..
Alakran_64
Exile/Anubis
Mythus/Delysid
Pixelnacho
Guests online: 381

Top Demos

1 Next Level  (9.7)
2 13:37  (9.7)
3 Codeboys & Endians  (9.7)
4 Mojo  (9.6)
5 Coma Light 13  (9.6)
6 Edge of Disgrace  (9.6)
7 Signal Carnival  (9.6)
8 Uncensored  (9.5)
9 Wonderland XIV  (9.5)
10 No Bounds  (9.5)

Top onefile Demos

1 Nine  (9.7)
2 Layers  (9.6)
3 Cubic Dream  (9.6)
4 Party Elk 2  (9.6)
5 Copper Booze  (9.5)
6 Scan and Spin  (9.5)
7 Onscreen 5k  (9.5)
8 Grey  (9.5)
9 Dawnfall V1.1  (9.5)
10 Rainbow Connection  (9.5)

Top Groups

1 Artline Designs  (9.3)
2 Booze Design  (9.3)
3 Oxyron  (9.3)
4 Performers  (9.3)
5 Censor Design  (9.3)

Top Cover Designers

1 Duce  (9.8)
2 Electric  (9.8)
3 Junkie  (9.6)
4 The Elegance  (9.5)
5 Mermaid  (9.3)

Page generated in: 0.048 sec.