| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
Code optimization
I thought I would tap into the creativity here to see if you have optimization suggestions for the following.
For my game I have a list of objects. When an object spawns I set bit 7 in the first byte of the table so I don't spawn it again while it is active (or if it is destroyed). If the object leaves the screen without being destroyed I flip the bit back again.
Each active object stores the address of its location in the spawn table and uses the following code to 'reactivate' itself if/when it leaves the screen.
lda object_d6,x ; lo byte of address
sta tempz+0 ; location in zero page
lda object_d7,x ; hi byte of address
sta tempz+1
ldy #0
lda (tempz),y
and #%01111111 ; clear bit 7
sta (tempz),y
That is 29 cycles +3 if page boundary is crossed. Any optimization ideas?
I could of course just add another byte in the object table so I don't have to set/clear a bit, but that adds potentially hundreds of bytes extra for a given level. |
|
| |
Icon
Registered: Jul 2012 Posts: 2 |
You could preset tempz + 0 to zero (perhaps during initiation of the game) and do the following:
ldy object_d6,x ; lo byte of address
lda object_d7,x ; hi byte of address
sta tempz+1
lda (tempz),y
and #%01111111 ; clear bit 7
sta (tempz),y
Then you are down to 24 cycles and the code is 4 bytes shorter. |
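As a cross-check of the numbers quoted so far, here is a small Python tally (a model, not 6502 code) that adds up instruction sizes and base cycle counts for both routines. It assumes the standard documented NMOS 6502 timings and ignores page-crossing penalties; the names `TIMING`, `ORIGINAL`, `PRESET_LO` and `cost` are made up for this sketch.

```python
# (bytes, base cycles) per instruction form, page-cross penalties excluded.
TIMING = {
    "lda abs,x":  (3, 4),
    "ldy abs,x":  (3, 4),
    "sta zp":     (2, 3),
    "ldy #imm":   (2, 2),
    "and #imm":   (2, 2),
    "lda (zp),y": (2, 5),
    "sta (zp),y": (2, 6),
}

# The routine from the first post.
ORIGINAL = ["lda abs,x", "sta zp", "lda abs,x", "sta zp",
            "ldy #imm", "lda (zp),y", "and #imm", "sta (zp),y"]

# The variant with tempz+0 preset to zero and the lo byte loaded into Y.
PRESET_LO = ["ldy abs,x", "lda abs,x", "sta zp",
             "lda (zp),y", "and #imm", "sta (zp),y"]

def cost(routine):
    """Return (total bytes, total base cycles) for a list of instruction forms."""
    size = sum(TIMING[op][0] for op in routine)
    cycles = sum(TIMING[op][1] for op in routine)
    return size, cycles

print(cost(ORIGINAL))   # (18, 29)
print(cost(PRESET_LO))  # (14, 24)
```

In the second variant the pointer in tempz always starts on a page boundary, so adding Y cannot carry into the next page on the indirect read; only the two abs,x loads can still pick up a page-crossing cycle.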
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
...or you could move that byte out of the object and keep it in a table, and just do:
lda objectflagtable,x
and #%01111111 ; clear bit 7
sta objectflagtable,x
Not sure if that would work in your case, as it may break your approach of using "objects", but generally speaking it seems like a possibility. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
What is the range of values stored in that first byte? (clearly it's at most 0..127 or you couldn't safely trash bit 7)
If you're also not using bit 6 you could store (0x40+value) for an inactive object, (0x80+value*2) for an active one.
Then you can just use ASL and LSR to activate/deactivate. Combining with Frantic's suggestion, your code then becomes
lsr objectflagtable,x ; 7 cycles (LSR has no absolute,Y mode on the 6502)
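To see that this encoding round-trips, here is a short Python model (not 6502 code) of the two shifts. It assumes the stored value fits in 6 bits (0..63), as the suggestion requires; the helper names are hypothetical.

```python
def encode_inactive(value):
    """Inactive form: 0x40 + value, with value in bits 0-5."""
    assert 0 <= value < 0x40
    return 0x40 + value

def asl(byte):
    """8-bit ASL: shift left, the old bit 7 falls out (into carry on a 6502)."""
    return (byte << 1) & 0xFF

def lsr(byte):
    """8-bit LSR: shift right, bit 7 becomes 0."""
    return byte >> 1

def is_active(byte):
    """Bit 7 is the 'active' flag."""
    return bool(byte & 0x80)

def decode(byte):
    """Recover the value from either the active or the inactive form."""
    if is_active(byte):
        return (byte & 0x7F) >> 1   # active form is 0x80 + value*2
    return byte & 0x3F              # inactive form is 0x40 + value

stored = encode_inactive(5)      # 0x45
activated = asl(stored)          # 0x8A
print(hex(stored), hex(activated), hex(lsr(activated)))  # 0x45 0x8a 0x45
```

The marker bit at 0x40 is what ends up in bit 7 after the ASL, so no separate OR/AND is needed.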
|
| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
Thanks. Some good suggestions here. I should perhaps have mentioned that XR refers to the active objects of which there are ~30.
All objects sit in a list linked to the map so they can be spawned when the player is at a specific location. The address I store relates to this table, which contains hundreds of objects.
If the total number of objects ends up less than 256 I can use the object flag table idea. But could end up exceeding it. |
| |
Raistlin
Registered: Mar 2007 Posts: 684 |
Could the information be stored in a bit field? With 30 objects you could squeeze this into just 4 bytes and use something similar to the following code (17 cycles, 64 bytes of data and 12 bytes of code). Untested - but I -think- it should work..?
Div8table:
.fill 32, i/8
NegBitMask:
.fill 32, $ff - (1<<(i&7))
// x is index 0-31
ldy Div8table, x //; 4 cycles
lda AddrTable, y //; 4
and NegBitMask, x //; 4
sta AddrTable, y //; 5 |
| |
Raistlin
Registered: Mar 2007 Posts: 684 |
Worth also saying ... you’d need an additional table with my suggestion of the regular bitmasks:-
BitMask:
.fill 32, 1<<(i&7)
And any other code for setting the bit and testing it becomes much simpler too. |
| |
Hoogo
Registered: Jun 2002 Posts: 105 |
The code looks nice, but all the tables take so much memory, 2 bytes of table for every used bit. In that case it would be better to spend a whole byte for the flag. Such a flag could also be easily reset by lsr, set by sec:ror and tested by bit. |
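A quick Python model (not 6502 code) of the whole-byte flag shows why no masks are needed: only bit 7 carries meaning, so shifting the rest of the byte around is harmless. The function names are hypothetical.

```python
def lsr(byte):
    """LSR: shift right, so bit 7 is always 0 afterwards (flag reset)."""
    return byte >> 1

def sec_ror(byte):
    """SEC then ROR: the set carry rotates into bit 7 (flag set)."""
    return 0x80 | (byte >> 1)

def bit_negative(byte):
    """BIT copies bit 7 of the operand into the N flag (flag test via BMI/BPL)."""
    return bool(byte & 0x80)

flag = 0x00
flag = sec_ror(flag)
print(hex(flag), bit_negative(flag))   # 0x80 True
flag = lsr(flag)
print(hex(flag), bit_negative(flag))   # 0x40 False
```

The lower bits just hold the shifted history of earlier states, which is why the byte cannot be shared with other data.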
| |
Raistlin
Registered: Mar 2007 Posts: 684 |
True... it would only make sense if it helped in other parts of the code as well... if there were more flags other than Activated/Deactivated... |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
"If the total number of objects ends up less than 256 I can use the object flag table idea. But could end up exceeding it."
If it exceeds that, then use 2 object flag tables :)
Also it would be better to separate all object attributes into such tables, and just use ,x to reach any of your objects.
ldx #objectnr
lda xcoordinatetable,x
lda ycoordinatetable,x
lda isobjactivetable,x
etc. |
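The "two object flag tables" idea can be sketched as a Python model (not 6502 code), assuming a hypothetical layout in which object indices 0..511 are split over two 256-entry tables and bit 8 of the index picks the table. All names here are illustrative.

```python
# Hypothetical layout: two 256-entry flag tables cover object indices 0..511.
flags_0x000 = bytearray(256)
flags_0x100 = bytearray(256)

def flag_table(index):
    """Pick the table from bit 8 of the index (a branch in the 6502 version)."""
    return flags_0x100 if index & 0x100 else flags_0x000

def set_active(index):
    flag_table(index)[index & 0xFF] |= 0x80

def clear_active(index):
    flag_table(index)[index & 0xFF] &= 0x7F

def is_active(index):
    return bool(flag_table(index)[index & 0xFF] & 0x80)

set_active(5)
set_active(300)                   # 300 = 0x12C, so entry 0x2C of the second table
print(is_active(5), is_active(300), is_active(44))   # True True False
```

Object 300 and object 44 share the same low byte of the index but live in different tables, so they do not collide.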
| |
Raistlin
Registered: Mar 2007 Posts: 684 |
Yep, Oswald’s right. Having 30 current objects makes this perfect - increase that to 32 (or pad by the extra 2) and align the data so that you never cross pages - that way your reads are always 4 cycles, writes are 5. Try to avoid indirect read/writes where you can. |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
If I get the initial problem right it is not only about those ~30 active objects but about how to cope with information on any object (be it active or not at a specific moment).
Thus, if there will be more than $100 objects in total, the initial code will also fall short, as this would require more than 8 bits for the index register. If 9 bits suffice, this could be done with a case distinction, i.e.
ldx #obj_nr
lda upperbitlist,x
bne OBJECTS0x100
OBJECTS0x00
lda isobjectactive_0x00,x
..etc..
OBJECTS0x100
lda isobjectactive_0x100,x
..etc..
If possible, use the upperbitlist such that it holds the hi_byte for the information fetch in question, i.e.
ldx #obj_nr
lda upperbitlist,x
sta OBJECTS_LD+2
OBJECTS_LD
lda isobjectactive,x
..etc..
(this may have to be combined with an ORA, as there will most probably be several such tables)
Last but not least: is it mandatory to use bit7 for the "isactive"-information? If not, it might be a bit tighter to use bit0 and set/clear the active-information with an INC/DEC.
Depends on a lot of other things, so these are just loose ideas... |
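The INC/DEC variant relies on the flag being in a known state before each operation. A minimal Python model (not 6502 code, hypothetical helper names) makes both the benefit and the hazard visible:

```python
def inc(byte):
    """8-bit INC with wrap-around."""
    return (byte + 1) & 0xFF

def dec(byte):
    """8-bit DEC with wrap-around."""
    return (byte - 1) & 0xFF

# Flag in bit 0, payload in bits 1-7.
inactive = 0x10
active = inc(inactive)       # bit 0 was clear, so only bit 0 changes
print(hex(active))           # 0x11
print(hex(dec(active)))      # 0x10

# Hazard: INC on an already-active byte carries into the payload bits.
print(hex(inc(active)))      # 0x12
```

So INC must only run on inactive entries and DEC only on active ones, which matches the spawn/leave pairing described in the first post.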
| |
cadaver
Registered: Feb 2002 Posts: 1160 |
Is the object activation / deactivation a bottleneck even? Depends on the enemy layouts, but I'd imagine only a few objects leaving the screen per frame (and most often, none)
I have typically done it so that when I put an object onto the screen, I zero the object type byte from the spawn data, meaning "no object." Then, if the object leaves the screen without being killed, I store its type back. Therefore no bit manipulation needed. |
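A minimal Python sketch of this scheme (a model, not 6502 code), assuming type 0 is reserved for "no object" and that each active slot remembers its spawn index and original type. The table contents and all names are made up for illustration.

```python
NO_OBJECT = 0

spawn_types = bytearray([3, 7, 2])   # hypothetical spawn data: one type byte per object
active = {}                          # active slot -> (spawn index, original type)

def spawn(slot, spawn_index):
    """Put an object on screen and blank its spawn entry; False if already taken."""
    obj_type = spawn_types[spawn_index]
    if obj_type == NO_OBJECT:
        return False
    spawn_types[spawn_index] = NO_OBJECT
    active[slot] = (spawn_index, obj_type)
    return True

def leave_screen(slot):
    """Object left the screen alive: store its type back so it can respawn."""
    spawn_index, obj_type = active.pop(slot)
    spawn_types[spawn_index] = obj_type

def destroy(slot):
    """Object was killed: the spawn entry stays at NO_OBJECT."""
    active.pop(slot)

spawn(0, 1)
print(list(spawn_types))   # [3, 0, 2]
leave_screen(0)
print(list(spawn_types))   # [3, 7, 2]
```

The active object has to carry its type anyway, so the spawn table needs no extra flag bit or byte.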
| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
You are right. Activation and deactivation is most likely not a bottleneck as only one or two will happen each frame. It just felt that there should be a faster method. From an object handling point of view the original is actually fairly 'elegant'.
And yes the issue is not the subset of active objects, but the total list with more than 256 objects potentially. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
I'm a bit confused about the number of objects thing.
Surely even the original question presupposes no more than 256 objects, as their addresses are stored in the object_d6/object_d7 tables, which are just indexed by X?
And I'm not seeing how Copyfault's suggestion gets past that, unless the upperbitlist table is used to dynamically select for each object ID which of two possible objects it might refer to.
In either case, moving past 256 objects while still storing object ID in X is going to need some kind of system for managing a smaller subset of potentially active ones.. |
| |
cadaver
Registered: Feb 2002 Posts: 1160 |
In the original code, the X indexed variables are for currently active (onscreen) objects, of which there are much less than 256. The d6 / d7 variables point into an arbitrarily sized object spawn data. |
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
btw it's hard to imagine a c64 game with hundreds (>256) npcs.. :) |
| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
The X-indexed tables are the active object data tables (x, y, image, ...) that are sorted etc. for the sprite multiplexer.
Hundreds of objects is easy ;-) Aviator Arcade has several hundred tanks, planes, etc. per level that get spawned at specific locations on the map. |
| |
mhindsbo
Registered: Dec 2014 Posts: 51 |
Thanks for the input. Discussing here made me restructure my data format.
Each tile row on the map now simply has a start index and an end index of the objects to spawn. If the start and end index are the same number, it means there are no objects to spawn on that row.
I will limit the number of objects for any given level to 256, and use the suggestion to change the object type to 0 once it is spawned.
significantly simpler, smaller and faster.
Thanks again for all the input. |
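A rough Python model of the restructured format (not 6502 code), assuming the end index is exclusive so that start == end means an empty row, and that type 0 marks an object that is already spawned. The table contents and names are invented for the sketch.

```python
# Hypothetical level data: per tile row, the range of object indices to spawn.
row_start = [0, 2, 2, 3]
row_end   = [2, 2, 3, 5]          # exclusive; start == end means no objects

object_type = bytearray([1, 4, 2, 6, 3])   # one type byte per object, 0 = spawned

def spawn_row(row):
    """Spawn every not-yet-spawned object on a row; return (index, type) pairs."""
    spawned = []
    for index in range(row_start[row], row_end[row]):
        obj_type = object_type[index]
        if obj_type != 0:
            object_type[index] = 0          # mark as spawned
            spawned.append((index, obj_type))
    return spawned

print(spawn_row(0))   # [(0, 1), (1, 4)]
print(spawn_row(1))   # []
print(spawn_row(0))   # []
```

Scrolling back over a row is harmless because entries that are on screen or destroyed already read as type 0.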
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Quoting cadaver: In the original code, the X indexed variables are for currently active (onscreen) objects, of which there are much less than 256. The d6 / d7 variables point into an arbitrarily sized object spawn data.
Ah! That makes sense, thank you.
mhindsbo, glad we could help :) |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
Quoting ChristopherJam: [...]
And I'm not seeing how Copyfault's suggestion gets past that, unless the upperbitlist table is used to dynamically select for each object ID which of two possible objects it might refer to.[...]
I did not write out all the details, but you're right, I had some kind of dynamic linked list in mind. Of course this does not really widen the index value (the X register is still in use), but rather gives a possibility to choose between "sub"-objects.
For a proper 9-bit index value, it should work to use another register (e.g. a zp-reg) as a high byte for the index value and use this just like upperbitlist in my former code examples, just without any index! The downside will certainly be the code doubling to choose between the different tables used later on, but with self-modification of the code this can be worked around, I think.
@mhindsbo: glad to read that you found a data structure for your needs :) Looking forward to seeing what kind of game comes out of it!!! |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
Hmm, now even though the initial problem is solved already, the following idea just crossed my mind: you could use x (or y) plus a byte in zeropage as index information, and directly use zp-addressing modes for access. I.e.:
$02 $00 (always!) ;lo-byte for accessing information-table, always =$00
$03 (hi-byte of index) ;hi-byte for information-table "=" hi-byte for index
... ...
ldy #obj_index_lo ;full object-index consists of this y-value plus the value currently in $03
lda ($02),y ;get information from table
The number of different values put in $03 gives a multiplier for the $100, extending the index range accordingly. So no need to initialize the vectors on every access to the object information, only if the index value overflows. Could cause problems because most probably the index-hi-byte cannot just be $00, $01, etc but rather some suitable pointer to a mem page. Plus, one needs even more different hi-byte-values if different table types come into play (isactive, position,...). |
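The address arithmetic behind this can be modelled in Python (not 6502 code). The sketch assumes a hypothetical table placed at $4000-$41FF, so the byte in $03 is the table's page number plus the high byte of the object index; `TABLE_PAGE` and the function names are made up.

```python
memory = bytearray(0x10000)   # flat 64 KB address space

ZP_PTR = 0x02                 # $02 = lo byte (always $00), $03 = hi byte
memory[ZP_PTR] = 0x00

TABLE_PAGE = 0x40             # hypothetical: information table at $4000-$41FF

def select_page(hi):
    """Write the pointer hi byte ($03); only needed when the index hi byte changes."""
    memory[ZP_PTR + 1] = hi

def lda_indirect_y(y):
    """Model of lda ($02),y: read from the 16-bit pointer plus Y."""
    base = memory[ZP_PTR] | (memory[ZP_PTR + 1] << 8)
    return memory[(base + y) & 0xFFFF]

def read_object(index):
    """Read table entry for a 9-bit object index."""
    select_page(TABLE_PAGE + (index >> 8))
    return lda_indirect_y(index & 0xFF)

memory[0x4000 + 300] = 0x55
print(hex(read_object(300)))   # 0x55
```

Because the lo byte of the pointer stays $00, Y alone covers a full page and the pointer never needs a 16-bit add.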
| |
Oswald
Registered: Apr 2002 Posts: 5095 |
I'd rather duplicate code for the next 256 objs, and branch. E.g. the carry flag can hold the 9th bit for that. |
| |
ChristopherJam
Registered: Aug 2004 Posts: 1409 |
Nah, this calls for a complex spooling system - overwrite half the objects as you approach a new area, using a custom decruncher that writes scattered data from a compressed representation that was background-loaded earlier \o/ |