| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Sorting
are sorters really so slow in games? :) I have made my unrolled version for my theoretical game (;), and it takes 132 rlines to sort 32 numbers o_O worst case is ~200 lines. tho when its the case of 4 numbers have to be swapped only it does the job in ~10 lines. wastes a lot of memory but I like it :) |
|
... 193 posts hidden. Click here to view all posts.... |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
this is one case for (,x) if you want the usual indice indirection too. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Nice work independently discovering counting sort, Repose
It's worth noting that this one is O(n)+O(m), where m is the number of buckets.
Pass 2 for a "perfect" sort of n sprites would take 220 iterations (256 in your example code)
That's around 220*10 cycles if you unroll the loop a little, so about 35 lines of overhead. Of course, you can cut that down considerably if you don't need as much accuracy or range.
Clearing the counts array also takes time, though you can drop that back from O(m) to O(n) by keeping it seperate to your "fib" array, persisting it frame to frame, and only clearing the entries you incremented in the first place. |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
found out this one, posting if someone finds a flaw, somehow I am unsure if I'm overlooking something, because in speed its close to the pha pha method, which feels unbeatable.
phase1 bucketsort, 4bit per buckets:
;this code repeats * numofsprites
ldy yvalues+spriteindex
ldx table,y ;x = y / 16 *2
lda #spriteindex
sta (bucket,x) ;needs some prep on zp
inc bucket,x ;every 2nd zp points to a buckettab
phase2 sorting network:
ldy bucket+z ;19 such compare snippets per bucket
lda yvalues,y ;can sort max 8 sprite per bucket
ldx bucket+q ;z and q is precalculated
cmp yvalues,x
bcc + ; or bcs?
sty bucket+q
stx bucket+z ;26 cycles worst case
with $20 sprites phase1 is 768 cycles, 1976 cycles for phase 2 at worst case (highly unlikely) , overall ~43 rasterlines.
probably will use different sorting networks based on nr of sprites in bucket. that will reduce nr of compare/swap snippets per bucket. |
| |
Krill
Registered: Apr 2002 Posts: 2969 |
Something i keep thinking briefly about whenever sprite sorting pops up is... somehow having hardware-assisted constant-time sorting.
Like, when you have buckets of 4 or 8 coordinates, which are quickly generated as seen in Oswald's post, how about sorting the coordinates within a bucket using, say, CIA timer interrupts or VIC collision detection? :)
For timer interrupts, one could set up 4 timers and then sample the interrupt flags, which would give results according to the values to be sorted.
Collision detection could also help for buckets of 8 coordinates, with some smart sprite-char collision scheme, albeit at the expense of some memory.
But... i never came far with these ideas, but then again i haven't really tried so far. :)
Edit: Of course, with 4 timer interrupts and a raster interrupt, one could sort 5 coordinates on the fly and have the multiplexer interrupt handler triggered in sorted order. But i guess this is out of scope for these discussions. =) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Nice one Oswald.
As is, it's a little slower than https://www.codebase64.org/doku.php?id=base:flagged_bucket_sort (38.5 rasters worst case) but you have got me thinking about sorting networks again.
Could perhaps dispatch to the sorting network routines by spacing the zero page pointers slightly further apart, so that one could jmp($xxxx) to one byte earlier, using the low byte of the storage address as the high byte of the sort routine address
Can also combine pairs of sorting network swap macros to avoid some register spilling, saving about five or six cycles off the pair
You've also reminded me that sometime the last few years I finally realised that sorting network based methods might be able to take advantage of seperating out the player sprite so that each bucket then only needs to deal with at most seven sprites per bucket, or six if there's a second player or a multiple in one player mode.
A six element sorting element is only 12 swaps, a good 15% less per sprite than the 19 for an eight element network. |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
thanks! in the meantime I have implemented it, and I am at 46 lines with an avg case, using adaptive sort network. (something off with my speed calcs earlier?)
the swap macro combine is nice I can shave off 8 cycles with that :)
to save memory I'm using another variant than posted:
ldy buckets+q,x
lda yvalues,y
ldy buckets+z,x
cmp yvalues,y
bcs skipswap
lda buckets+q,x
sty buckets+q,x
sta buckets+z,x
.
4 cycles slower but 1/16th mem usage. no need for different code for each bucket.
Gunnar I dont understand how you could do that, not even the timer method :P start the timers and poll the flag register to see which one runs out first, then 2nd etc ?
the player sprite, why would one handle it differently ? it just reduces the flexibility of displayable sprites? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Regarding the player sprite, if it’s free roaming then the multiplexer can only guarantee seven free sprites for antagonists on any given line. So, you get the same capabilities from {putting the player into the actor list and letting the multiplexer drive eight sprites} as you do from {putting the player into sprite zero directly and driving the remaining seven from the multiplexer}.
Same logic applies to a second player, or an orbiting multiple. And if the sorter only needs to deal with at most six sprites per bucket, then the worst case sorting network only has 12 swap macros.
Speaking of buckets, if you’re determining which bucket with a table lookup, you only need 11 of them, each being 21px high. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Reusable sort code sounds good btw. It’d be nice to have a clearer view of code size/speed trade offs. |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
btw one thing I could never grasp, how would sprite plexer work with bucket sort, ie not fully sorted but accurate enough ? order of irq doesnt matter for 8 sprites ? just use 1 irq for all 8 ? |
| |
Burglar
Registered: Dec 2004 Posts: 1088 |
Quoting Oswaldbtw one thing I could never grasp, how would sprite plexer work with bucket sort, ie not fully sorted but accurate enough ? order of irq doesnt matter for 8 sprites ? just use 1 irq for all 8 ? I'd go for buckets of 4 sprites I think. |
Previous - 1 | ... | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 - Next |