initstabilise
        lda $d012
        ldx #10           ; 2
-       dex               ; (10 * 5) + 4
        bpl -             ; 54
        nop               ; 2
        eor $d012 - $ff,x ; 5 = 63
        bne initstabilise ; 7 = 70
        [...]; timer setup
This is an idea I got after talking to Copyfault. At least in the cycle-correct version of Vice (i.e., x64sc) this seems to work. Haven't tried on a real machine.

        * = $0f00 ; Some address with (H+1)&1 = 0 and (H+1)&$10 = $10
        ldy #$00
loop:   ldx #$11
        shx cont, y
cont:   bpl loop

It uses the fact that we will AND the written value with H+1 unless a badline pauses the CPU between the third and fourth cycle of shx. The latter then changes the "bpl" into an "ora" and drops us out of the loop at horizontal position 61.
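The masking arithmetic behind that last paragraph can be checked with a small Python sketch. It only models the stored value as described in the post (X AND H+1, or plain X when the AND is dropped), not the CPU or VIC timing. The function name `shx_written_value` and its `and_dropped` flag are made up for illustration; the opcode values $10 (BPL) and $11 (ORA (zp),y) are the standard 6502 encodings.

```python
# Sketch of the value SHX stores, as described in the post above.
# Assumption: target high byte H = $0f, taken from "* = $0f00".
BPL = 0x10        # opcode of BPL rel
ORA_IND_Y = 0x11  # opcode of ORA (zp),y


def shx_written_value(x, target_high, and_dropped):
    """Return X & (H+1), or plain X when the AND step is dropped."""
    if and_dropped:
        return x & 0xFF
    return x & ((target_high + 1) & 0xFF)


normal = shx_written_value(0x11, 0x0F, and_dropped=False)
badline = shx_written_value(0x11, 0x0F, and_dropped=True)
print(f"normal write: ${normal:02X}, badline write: ${badline:02X}")
```

With X = $11 and H+1 = $10, the normal write keeps the byte at cont equal to the BPL opcode, while the unmasked write turns it into an ORA, which is what ends the loop.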
If you want to "overdo" (optimize, erm) this, let's save another 2 bytes:

          * = $0faa ; _a very nice_ address with (H+1)&1 = 0 and (H+1)&$10 = $10
0FAA loop: ldx #$11
0FAC       shx cont, y
0FAF cont: bpl loop

with start address $0FAD (you guess the operand bytes of the SHX ;))
        * = $???? ; magic address that allows us to save 1 byte
        lax ZP    ; another one of those magic addresses
        tay
loop:   shx cont,y
cont:   bpl loop
Quoting Copyfault
If you want to "overdo" (optimize, erm) this, let's save another 2 bytes:

          * = $0faa ; _a very nice_ address with (H+1)&1 = 0 and (H+1)&$10 = $10
0FAA loop: ldx #$11
0FAC       shx cont, y
0FAF cont: bpl loop

with start address $0FAD (you guess the operand bytes of the SHX ;))

Interesting idea, but I do not completely understand it. How does the Y register get the right value? [...]
[...]Another amusing thing to contemplate is how this code could be placed at, say, $08xx. Preferably without messing up the basic upstart.
Also, careful with the loop length. The number of CPU cycles between two badlines is 461, except when the loop's one write cycle (last cycle of SHX) sneaks into the three cycle RDY grace period. Then it's 462 ticks. (Imagine a graph with n nodes, in which node i is connected to node (i+461)%n for 0 < i < n-1 and to (i+462)%n for i = 0. Node n-1 isn't connected to anything. You want that graph to be acyclic.) In the range 5-20, the lengths that do work are 5, 10, 12, 16, 18 and 19. But note that in particular, length 8 (a.k.a. branching directly to the SHX) does not.
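The graph test described above is easy to try out. The following Python sketch is a literal transcription of that description (node i steps by 461, node 0 by 462, node n-1 is the exit) and nothing more; the function name `loop_length_works` and its parameter names are made up for illustration, and the model's verdicts are only as good as this reading of the post.

```python
def loop_length_works(n, period=461, slow_period=462):
    """Check that the graph from the post has no cycle for loop length n.

    Node i (0 < i < n-1) leads to (i + period) % n, node 0 leads to
    (0 + slow_period) % n, and node n-1 has no outgoing edge.
    """
    def successor(i):
        if i == n - 1:
            return None  # exit node: the loop has been left
        step = slow_period if i == 0 else period
        return (i + step) % n

    for start in range(n):
        seen = set()
        node = start
        while node is not None:
            if node in seen:
                return False  # revisited a node: the loop can get stuck
            seen.add(node)
            node = successor(node)
    return True


for length in (5, 8):
    print(length, loop_length_works(length))
```

For length 8 the walk 0, 6, 3, 0 closes a cycle, which is why branching directly to the SHX fails in this model.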
0FAA loop: ldx #$11
0FAC       shx cont, y
0FAF cont: bpl loop
with start address $0FAD (you guess the operand bytes of the SHX ;))
The SHX will be SHX $0fa0,y. Then you start this with a JMP $0fad which is just an LDY #$0f. This is also the reason for that code blob to begin at $0faa.
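To see why entering at $0FAD works, it helps to look at the raw bytes. The Python sketch below uses a byte listing that is my own reconstruction from the addresses in the post, not assembler output; it assumes the commonly documented opcode $9E for SHX abs,y plus the standard encodings for LDX #imm ($A2), LDY #imm ($A0) and BPL ($10).

```python
# Assumed assembled bytes for the blob at $0FAA (hand-reconstructed).
ORIGIN = 0x0FAA
code = bytes([
    0xA2, 0x11,        # $0FAA  ldx #$11
    0x9E, 0xA0, 0x0F,  # $0FAC  shx $0fa0,y
    0x10, 0xF9,        # $0FAF  bpl $0faa
])

# Entering at $0FAD decodes the SHX operand bytes as an instruction.
entry = 0x0FAD
opcode = code[entry - ORIGIN]       # $A0 = LDY #imm
y = code[entry - ORIGIN + 1]        # immediate operand, $0F

# Effective address of the SHX write once Y holds that value.
shx_base = code[3] | (code[4] << 8)  # $0FA0
target = shx_base + y                # should be the address of the BPL

print(f"opcode ${opcode:02X}, Y=${y:02X}, SHX writes to ${target:04X}")
```

The operand bytes A0 0F double as LDY #$0f, and $0fa0 + $0f lands exactly on the BPL at $0FAF, so the entry jump sets up Y for free.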
Yes, yes, it's so true! After I wrote this post, two things haunted me some hours later:
1. Branching to the SHX is not possible when doing the SHX $0fa0,Y trick, so it was confusing to start with it and then write up the branch idea after it.
2. Those 63 cycles only apply to non-badlines, but your approach needs badlines badly (pun intended). So my calculation was wrong.
Thanks for putting this right, with the corresponding cycle calculations included :)
Quoting Copyfault
The SHX will be SHX $0fa0,y. Then you start this with a JMP $0fad which is just an LDY #$0f. This is also the reason for that code blob to begin at $0faa.

Excellent! Thanks for the explanation. I had the same problem as Perplex. The code you posted gives LDA $100F. [...]
        *=$XX?? ; Suitable starting address we have to find.
loop:   tay
uoc:    shx cont-offset,y
cont:   bpl loop