| |
Oswald
Registered: Apr 2002 Posts: 5086 |
Sorting
are sorters really so slow in games? :) I have made my unrolled version for my theoretical game (;), and it takes 132 rlines to sort 32 numbers o_O worst case is ~200 lines. tho when its the case of 4 numbers have to be swapped only it does the job in ~10 lines. wastes a lot of memory but I like it :) |
|
... 193 posts hidden. Click here to view all posts.... |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Yes, nice work! This will take some investigating to unravel.
SHX $FE00,Y ^ is basically stx $fe00,y on account of the mask being $fe+1, yes? |
| |
lft
Registered: Jul 2007 Posts: 369 |
Color Bar, yes I realise that you went through a number of different versions, trading memory for cycles. But I thought it fair to mention the version that beats mine in running time.
There can be any number of sprites on the same line; They are linked. Of course, the multiplexer drops them if there are too many, as is evident from the flickering in the demo.
This is a generic full random sorting routine for 8-bit values, although in the demo it is limited to 220 different y-values to make it comparable with your benchmarks.
Christopher Jam: That is correct. It's nice to find a use for this opcode where it actually improves performance, because it's quite constrained: The masking drops off if the instruction is suspended by DMA, and page crossings cause the wrong value to be written. So the best bet is indeed to stick to the operand $fe00. |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
Quote: Yes, nice work! This will take some investigating to unravel.
SHX $FE00,Y ^ is basically stx $fe00,y on account of the mask being $fe+1, yes?
Yes, check out this document (pages 38 and 41):
No More Secrets v0.91 |
| |
chatGPZ
Registered: Dec 2001 Posts: 11350 |
lft: now it would be very cool if you could expand this routine to handle excess sprites in a way that they will be displayed on alternating frames ("interlaced") - then it would work perfectly fine for a game (MUCH better than sprites disappearing) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Dear lords. So the n-actors snippets leading up to $fe00 insert each actor ID in to a linked list with the head at $aa00+yposition, and the 'next' pointers are at $10+actorID, and also replaces the INY at $fe00+yposition with a JMP to a bucket emptying routine.
Here's a sample snippet:
.C:fb52 A4 90 LDY $90+0 ; Y coordinate of actor 0
.C:fb54 9E 00 FE SHX $FE00,Y ; this is effectively STX $fe00,y
.C:fb57 B9 00 AA LDA $AA00,Y ; bucket head
.C:fb5a 85 10 STA $10+0 ; next pointer for actor 0
.C:fb5c A9 00 LDA #$00 ; actor id 0
.C:fb5e 99 00 AA STA $AA00,Y
The INYs that fill $fe00 to $feff increment Y through all the potential bucket indices, jumping off to a bucket emptying routine wherever an INY has been replaced with a JMP
There are four routines at 4c4c, 4cc8, c84c and c8c8 to handle each of the cases where zero to two of the next two buckets are also nonempty.
The bucket emptying routine takes advantage of the list heads being stored in page $aa to reuse the $aa as a TAX instruction when looping over a bucket that contains more than one actor, and the lists are terminated with $c8, which is both negative and doubles as the INY that needs to be written back to $fexx to clear the bucket for the next frame. (it's also written to $aaxx, just like the flagged bucket sort on codebase).
Each bucket emptying routine returns to the scene of the crime after writing the INY back, which bumps Y the value needed for the next bucket. Hence, the four bucket emptying routines are in fact identical; here's a sample:
.C:4c4c 8C 5E 4C STY $4C5E ; patch return address
.C:4c4f BF 00 AA LAX $AA00,Y ; fetch bucket head
.C:4c52 48 PHA
.C:4c53 B5 10 LDA $10,X ; read 'next' pointer.
.C:4c55 10 FA BPL $4C51 ; Actor ids are positive, INY is not
.C:4c57 99 00 FE STA $FE00,Y ; restore INY
.C:4c5a 99 00 AA STA $AA00,Y ; empty linked list
.C:4c5d 4C 7E FE JMP $FE7E
I think that just about covers it. Lft, that's a mighty fine piece of work. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
Color Bar, thanks for the No More Secrets rec; I wasn't familiar with that one. |
| |
Trash
Registered: Jan 2002 Posts: 122 |
The sorting algorithm implemented by HCL is actually called Counting Sort (I just found that out..) and it is considered to be O(n).
It should be competive to the discussed code both in size and speed. |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
Hilariously constrained addresses indeed. :) And my respect for lft just grew even higher.
Quoting Groepazlft: now it would be very cool if you could expand this routine to handle excess sprites in a way that they will be displayed on alternating frames ("interlaced") - then it would work perfectly fine for a game (MUCH better than sprites disappearing) I would like no interlacing or disappearing sprites even better. In the ideal world, the routine moving the sprites around should avoid situations with >8 sprites on the same line. And the multiplexor should of course strive for being able to display any situation with <=8 sprites per line. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11350 |
thats not really useful for a game - in a demo of course that is always preferred - but in a game it would be SO much better if you could move around a large amount of objects without having to care about multiplexer limitations in the game logic. (if you look at gameboy or NES games for example - thats how a lot of them work) |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1408 |
or then there was Alien Syndrome, where you could never have more than one alien on any given raster 😂 |
Previous - 1 | ... | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | ... | 21 - Next |