| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Sorting
Are sorters really so slow in games? :) I made an unrolled version for my theoretical game (;), and it takes 132 raster lines to sort 32 numbers o_O Worst case is ~200 lines, though when only 4 numbers have to be swapped it does the job in ~10 lines. It wastes a lot of memory, but I like it :) |
|
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
thanks, now I get it :) |
| |
lft
Registered: Jul 2007 Posts: 369 |
I was able to improve the radix sort.
Writeup at codebase64.
This version can sort 32 actors in 1970 cycles. That's 61.6 cycles per actor.
Quoting ChristopherJam: Extraordinarily close to one raster line per actor! Boom! |
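For readers who have not opened the writeup, here is a minimal Python sketch of the general technique the name refers to: a stable two-pass radix sort on nybbles, low nybble first. It is only an illustration of the idea, not lft's 6502 routine; the function name `radix_sort_actors` and the sample Y values are made up for this example.

```python
def radix_sort_actors(y_coords):
    """Return actor indices ordered by Y coordinate.

    Stable least-significant-digit radix sort with 4-bit digits.
    Illustrative only: a real 6502 version would use linked lists
    or index tables rather than Python lists.
    """
    order = list(range(len(y_coords)))
    for shift in (0, 4):  # pass 1: low nybble, pass 2: high nybble
        buckets = [[] for _ in range(16)]
        for actor in order:
            digit = (y_coords[actor] >> shift) & 0x0F
            buckets[digit].append(actor)  # appending keeps the pass stable
        # Concatenate buckets 0..15 to get the order for the next pass.
        order = [actor for bucket in buckets for actor in bucket]
    return order


ys = [0x52, 0x31, 0xF0, 0x31, 0x08]  # hypothetical actor Y positions
print(radix_sort_actors(ys))  # [4, 1, 3, 0, 2]
```

The second pass only yields a correct result because the first one is stable, which is why the two actors sharing Y = $31 keep their original relative order.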
| |
Frantic
Registered: Mar 2003 Posts: 1646 |
@lft: Much appreciated! Thank you! :) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Excellent work, both of you!
I've been meaning to write up a comprehensive comparison of all the approaches mentioned in this post since sometime in 2017, but as that's clearly not going to happen soon, here's a quick table comparing several of the 'sub 2500 cycle worst case' implementations; 32 actors in each instance.
+---------------+------------------+---------+--------+-----------+------+
| author | algorithm | worst | code+ | zeropage | year |
| | | cycles | data | | |
+---------------+------------------+---------+--------+-----------+------+
| cjam | flagged buckets | 2422 | ~3kb | 59b | 2017 |
| lft | field sort | 2200 | sparse | 64b | 2017 |
| colorbar+cjam | inline buckets | 2086 | 19kb | 0b | 2017 |
| bezerk | radix | 2044 | ~2kb | 94/185b | 2020 |
| bezerk+lft | radix | 1970 | ~2kb | 32b | 2020 |
+---------------+------------------+---------+--------+-----------+------+
(cycle counts do not include jsr/rts, or decanting results from index lists in stack)
Bezerk had already just scraped in below my and Colorbar's approach, with a relatively minuscule memory footprint - and now lft has improved that to hit the holy grail of sub-raster performance in the 32 actor category :D |
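To make the "sub-raster" claim concrete, the worst-case figures from the table can be divided by the 32 actors and compared with the length of a raster line. This is a rough sketch that assumes a PAL machine with 63 cycles per line and ignores badlines and sprite DMA; the dictionary keys are just labels for this example.

```python
PAL_CYCLES_PER_LINE = 63  # assumption: PAL C64, no badlines or sprite DMA
ACTORS = 32

# Worst-case cycle counts copied from the table above.
worst_case_cycles = {
    "cjam flagged buckets": 2422,
    "lft field sort": 2200,
    "colorbar+cjam inline buckets": 2086,
    "bezerk radix": 2044,
    "bezerk+lft radix": 1970,
}


def cycles_per_actor(total_cycles, actors=ACTORS):
    """Average worst-case cost of sorting one actor."""
    return total_cycles / actors


for name, cycles in worst_case_cycles.items():
    per_actor = cycles_per_actor(cycles)
    verdict = "under" if per_actor < PAL_CYCLES_PER_LINE else "over"
    print(f"{name:30s} {per_actor:6.2f} cycles/actor ({verdict} one raster line)")
```

Under that assumption the break-even point is 32 × 63 = 2016 cycles, so only the 1970-cycle entry lands below one raster line per actor.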
| |
Bezerk Account closed
Registered: Mar 2020 Posts: 5 |
That's brilliant @Lft, well done!
@ChristopherJam, no need to put me + Lft on this one in the comparison table. |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
Lovely! Question: if one would sort according to high nybble first, what would be the complexity?
And could someone post an equation for the latest approach that shows the number of cycles as a function of Y-range and number of actors, please? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
I should have included an e&oe really :)
OK, the next draft will list just lft for the final radix entry. Any comments on fieldsort memory usage will be integrated too, I just didn't manage to look that up before dinner.
Rastah Bar - Going by lft's writeup at codebase, it's an extra 51 cycles per actor regardless of order.
Flagged bucket sort's a little messier as it depends how many are on the same line or in the same eight line group.
graphs one of these days.. |
| |
lft
Registered: Jul 2007 Posts: 369 |
My approach uses 60 bytes of zero-page, so this should be updated in the table. It can be reduced to 32 bytes at the cost of a few extra cycles (I think 12, but haven't tested).
Adding support for the full range ($00-$ff, adding two more lists for the high nybble) will bump the zero-page size up to 64 bytes, and add 22 cycles. |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Quote: My approach uses 60 bytes of zero-page, so this should be updated in the table. It can be reduced to 32 bytes at the cost of a few extra cycles (I think 12, but haven't tested).
Adding support for the full range ($00-$ff, adding two more lists for the high nybble) will bump the zero-page size up to 64 bytes, and add 22 cycles.
a general solution would be nice to have on codebase, not everyone wants to sort sprites. also can it handle all entries showing up in 1 bucket ? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Quote: a general solution would be nice to have on codebase, not everyone wants to sort sprites. also can it handle all entries showing up in 1 bucket ?
Doesn't need to support more than eight entries per bucket for sprites… |