initstabilise
        lda $d012
        ldx #10                 ; 2
-       dex                     ; (10 * 5) + 4
        bpl -                   ; 54
        nop                     ; 2
        eor $d012 - $ff,x       ; 5 = 63
        bne initstabilise       ; 7 = 70
        [...]                   ; timer setup
Quoting Copyfault:
"Well, in large parts it's the same as what I proposed in post #86."

Yes, you are right. I lost track a bit of all the variants.

Quote:
"But now that we entered the territory of over-stretching, here a version that does it in only 5 bytes (again putting all required reg settings on the decruncher's bill):

$fdfc 9E D0 FD  shx $fdd0,y
$fdff D0 FC     bne $fdfd

Comes with all constraints one could think of: mem loc fixed, y=$2f fixed val mandatory, x=$d1 fixed val mandatory, setting of vector $fc/$fd has influence on the number of cycles that are taken when the loop is left, to be started with z-flag=0, ... maybe more! OK, it's possible to do it with any branch opcode, but this doesn't really make it any better ;)"

Very ingenious, but an 8-cycle loop doesn't work, does it? See post #61.
Quoting Rastah Bar:
"Very ingenious, but an 8-cycle loop doesn't work, does it? See post #61."

It's actually a 12-cycle loop, because the first branch is 4 cycles long (page break!), the branch in the operand of the SHX takes 3 cycles, and the SHX itself takes 5 -> 12 cycles in total ;)
It could even be done with just 4 bytes (continuing the abuse of the byte-counting):

loop:   sha (vec),y
        bne loop

If this is located at the end of a page such that the BNE comes with a page break, it's a 10-cycle loop in total.
Still, too far-fetched; too many things must be configured correctly. Personally, I think the 7-byte solution (as in post #110), which "only" comes with requirements on zp values being set in a special way, is the best compromise between flexibility and byte count!
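The cycle counts quoted for the 5-byte and 4-byte loops can be checked with a small Python model. This is only a sketch, not 6502 code: the helper `taken_branch_cycles` is made up for illustration, the per-instruction cycle counts are the ones used in the thread, and the address $10fc for the 4-byte loop is an arbitrary assumption chosen so that the BNE sits at the end of a page.

```python
def taken_branch_cycles(branch_addr: int, target: int) -> int:
    """Cycles for a taken 6502 relative branch: 3, or 4 on a page break.

    The page break is judged between the address of the instruction
    following the branch and the branch target.
    """
    next_pc = (branch_addr + 2) & 0xFFFF
    return 4 if (next_pc & 0xFF00) != (target & 0xFF00) else 3


SHX_ABS_Y_CYCLES = 5   # shx $fdd0,y, as stated in the thread
SHA_IND_Y_CYCLES = 6   # sha (vec),y

# 5-byte version: the bne at $fdff jumps to $fdfd, where the bytes D0 FD
# (the SHX operand) execute as a branch back to $fdfc.
five_byte_loop = (taken_branch_cycles(0xFDFF, 0xFDFD)
                  + taken_branch_cycles(0xFDFD, 0xFDFC)
                  + SHX_ABS_Y_CYCLES)

# 4-byte version at the hypothetical address $10fc, so that the BNE at
# $10fe ends exactly on the page boundary and its target is one page back.
four_byte_loop = SHA_IND_Y_CYCLES + taken_branch_cycles(0x10FE, 0x10FC)

print(five_byte_loop, four_byte_loop)  # prints "12 10"
```

The model says nothing about whether such a loop actually stabilises; it only confirms the 4 + 3 + 5 and 6 + 4 sums.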
Awesome! With SHA (vec),y even 3 bytes is possible for a 12-cycle loop. One example:

$5f00 SHA (VEC),y
$5f02 RTS

Assume that the decruncher provides the following initial conditions: {A&X} = $EA (opcode of NOP), Y = 2, the ZP addresses VEC and VEC+1 point to $5F00, and the stack is completely filled with the return address $5F00. Without DMA the SHA writes $EA & {$5F+1} = $60 (opcode for RTS) and repeats that until a DMA makes it write an NOP.
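The self-modifying trick rests on which byte the store writes. Here is a minimal Python sketch of that rule as it is described in this thread: the register value is normally ANDed with the high byte of the target address plus one, and the AND is dropped when a DMA hits. The function name `sh_store_value` and its `dma` flag are made up for illustration, and real hardware has further quirks that this does not model.

```python
RTS, NOP, BNE = 0x60, 0xEA, 0xD0  # 6502 opcodes


def sh_store_value(reg: int, addr_hi: int, dma: bool = False) -> int:
    """Byte written by an SHA/SHX-style store, per the thread's description.

    reg     : the register term (A & X for SHA, X for SHX)
    addr_hi : high byte of the operand's base address
    dma     : True if a DMA interrupts the instruction (AND is dropped)
    """
    if dma:
        return reg & 0xFF
    return reg & ((addr_hi + 1) & 0xFF)


# 3-byte SHA/RTS loop: A & X = $EA, vector points to $5F00.
print(hex(sh_store_value(0xEA, 0x5F)))            # 0x60, RTS: loop keeps running
print(hex(sh_store_value(0xEA, 0x5F, dma=True)))  # 0xea, NOP: loop falls through

# 5-byte SHX loop from the earlier post: X = $D1, operand $FDD0.
print(hex(sh_store_value(0xD1, 0xFD)))            # 0xd0, BNE rewritten each pass
```

The same rule explains why X = $d1 is mandatory in the SHX variant: $d1 & $fe is the only way that setup keeps producing the BNE opcode.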
Read the whole thread again, and by scientific measures you all turned out pretty insane.
Oh this is great. May I suggest replacing the RTS with a BRK? Then you only need a single vector pointing at the routine start, instead of all of the stack :) A 13-cycle loop works, I think; 13 divides the cycles per frame but not the cycles per character row. (edit - assuming no issues with cycle stealing from all the stack writes for the BRK, of course. I've not tested this)
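The divisibility claim for the 13-cycle loop is easy to check by hand or with a few lines of Python. The sketch below assumes PAL timing (63 cycles per raster line, 312 lines per frame), which the post does not state explicitly, and takes the loop as SHA (6 cycles) plus BRK (7 cycles).

```python
CYCLES_PER_LINE = 63     # assumption: PAL C64
LINES_PER_FRAME = 312    # assumption: PAL C64
LINES_PER_CHAR_ROW = 8

SHA_IND_Y_CYCLES = 6
BRK_CYCLES = 7
loop_cycles = SHA_IND_Y_CYCLES + BRK_CYCLES                 # 13

cycles_per_frame = CYCLES_PER_LINE * LINES_PER_FRAME        # 19656
cycles_per_char_row = CYCLES_PER_LINE * LINES_PER_CHAR_ROW  # 504

print(cycles_per_frame % loop_cycles)     # 0  -> 13 divides a frame
print(cycles_per_char_row % loop_cycles)  # 10 -> 13 does not divide a char row
```

Like the post itself, this ignores any cycles stolen during the BRK's stack writes.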
[...] (edit - assuming no issues with cycle stealing from all the stack writes for the BRK, of course. I've not tested this)
I do hope it works, because then we can use the entire SHA instruction and its opcode as the address operand of a preceding instruction (e.g. by placing the zp pointer at an address that doubles as the high byte of an IO address), and then the entire routine can vanish altogether. Zero bytes :D :D