| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Sorting
are sorters really so slow in games? :) I have made my unrolled version for my theoretical game (;), and it takes 132 rlines to sort 32 numbers o_O worst case is ~200 lines. tho when its the case of 4 numbers have to be swapped only it does the job in ~10 lines. wastes a lot of memory but I like it :) |
|
... 193 posts hidden. Click here to view all posts.... |
| |
lft
Registered: Jul 2007 Posts: 369 |
My approach uses 60 bytes of zero-page, so this should be updated in the table. It can be reduced to 32 bytes at the cost of a few extra cycles (I think 12, but haven't tested).
Adding support for the full range ($00-$ff, adding two more lists for the high nybble) will bump the zero-page size up to 64 bytes, and add 22 cycles. |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Quote: My approach uses 60 bytes of zero-page, so this should be updated in the table. It can be reduced to 32 bytes at the cost of a few extra cycles (I think 12, but haven't tested).
Adding support for the full range ($00-$ff, adding two more lists for the high nybble) will bump the zero-page size up to 64 bytes, and add 22 cycles.
a general solution would be nice to have on codebase, not everyone wants to sort sprites. also can it handle all entries showing up in 1 bucket ? |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Quote: a general solution would be nice to have on codebase, not everyone wants to sort sprites. also can it handle all entries showing up in 1 bucket ?
Doesn't need to support more than eight entries per bucket for sprites… |
| |
lft
Registered: Jul 2007 Posts: 369 |
@Oswald Yes, it handles all cases, always using the same amount of cycles. |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
The costs per actor are lower for inline buckets. I was trying to figure out if it could be beneficial to combine the new radix sort and the inline bucket sort. F.e. one stage of radix sort followed by an inline bucket sort for each bucket of the first stage. |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
<Original post deleted>
I thought I had found an improvement to inline buckets, but ChristopherJam already proposed that. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Ah, sorry about that :)
But also damn, I'd forgotten whether or not I'd already implemented what you wrote, and was looking forward to rummaging through my code to see if I could integrate it..
Interesting idea about combining radix with IBS - after all, a bucket sort is just a one digit radix sort (compared to the two digit radix sorts used elsewhere in this discussion) |
| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
Here is my insertion sort, if of any interest. Takes about 800-1000 cycles for 20 actors/sprites under actual load in game.
I wont be the fastest in worst case, but under realistic game loads where you control the number of new actors per frame (i.e. start with a mostly sorted list) it is pretty fast and compact (w 32 actors I have not had it exceed 2000 cycles).
sprite_sorty ldy sorted_indexz+1 ; first iteration only compares element 0 & 1 in the table, so no loop needed
lda object_yl,y ; y position of object in position 1
ldx sorted_indexz ; get pointer to the object in position 0
cmp object_yl,x ; compare y position of object 0 to y position of object 1 (stored in AR)
bcs @second ; if Obj 1 > Obj 0 we are done
stx sorted_indexz+1 ; otherwise swap them
sty sorted_indexz
@second ldx #2 ; get first position to be compared
@loop stx current_pos ; position in index table we are finding the right place for
ldy sorted_indexz,x ; get pointer (offset) to current object to find the right place for
sty current_index ; store current index number dynamically in code for later storage
lda object_yl,y ; y position of current object into AR
@in_right_place ldy sorted_indexz-1,x ; check if object is already in the right place ... saves time if list is already partially sorted
cmp object_yl,y ; ... if it is we dont need to swap any positions but can move right on
bcs @next_position
sty sorted_indexz,x
dex
@compare ldy sorted_indexz-1,x ; get pointer to the object we are comparing current one with
cmp object_yl,y ; compare y position of this object to y position of current object (stored in AR)
bcs @found_place ; is y of current object (in AR) >= y of compare object
sty sorted_indexz,x ; move the compared to index one right in the table
dex ; loop to check next position
bne @compare
@found_place lda current_index ; index to current object
sta sorted_indexz,x ; store to the right of the one it is >= of
ldx current_pos
@next_position inx
cpx #NUM_SPR_OBJ ; end of table (number of sprite objects)?
bcc @loop
@end rts
I have N bytes (number of actors) in zero page in terms of the sorted index table. If you also place the y-coordinates in zero page it is 2*N bytes. And the actual code is 61 bytes. |
| |
Oswald
Registered: Apr 2002 Posts: 5086 |
ldy sorted_indexz+1
lda object_yl,y
why not use lda (),y for this ? you can do the indirection in one step: zp pointers are the sorted indexes, and Y points to Y coordinate attribute within virtual sprite data |
| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
Good question @Oswald that sent me on a spin incl. considering if I would use (zp,x) instead. I had not really contemplated using the index as an index of 16 bit addresses, rather than a simple list of 8 bit numbers.
However unless I misunderstand you the resulting code including the swapping of indexes would become:
@loop sty current_pos ; position in the index
lda (sorted_indexz),y ; y coordinate of object
ldx sorted_indexz,y ; object number/index
stx current_index
@in_right_place cmp (sorted_indexz-2),y ; compare to y coordinate of next object in index
bcs @next_position
ldx sorted_indexz-2,y ; move compared objects index to the right sorted index list
stx sorted_indexz,y
dey
dey
@compare cmp (sorted_indexz-2),y
bcs @found_place
ldx sorted_indexz-2,y
stx sorted_indexz,y
dey
dey
bne @compare
@found_place ldx current_index
stx sorted_indexz,y
@next_position ldy current_pos
iny
iny
cpy #2*NUM_SPR_OBJ
bcc @loop
which is actually a couple of cycle longer per iteration and the index will be twice as long in zp. Also when I later populate the sprite IRQ's I will have two dey's to iterate through the indexes (small extra cost, but still).
It might be worth noting that all my (virtual) sprite data sits in array's of similar properties. I.e. y1,y2,y3, ... x1,x2,x3, ... and not y1,x1 ... y2,x2. I access all values using the index of the object and not the index of the attribute.
I could have missed something. It is early on a Saturday and I have not had enough coffee yet :-)
PS: I did spot a possible optimization more that I will take a look at now. |
Previous - 1 | ... | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 - Next |