CSDb User Forums


2017-10-24 13:02
Frantic

Registered: Mar 2003
Posts: 1628
ACME macro for delaying X cycles

Anybody got a macro handy for the ACME assembler for delaying X number of cycles? It is OK if it clobbers the A or X register.
 
 
2017-10-25 21:58
doynax
Account closed

Registered: Oct 2004
Posts: 212
@lft: Cute, but beware of inadvertently touching I/O registers that have side effects on read (e.g. $DC0D or $DD0D).
2017-10-25 22:25
chatGPZ

Registered: Dec 2001
Posts: 11135
doynax: another nice source for subtle bugs =)
2017-10-26 04:54
Oswald

Registered: Apr 2002
Posts: 5022
"doesn't clobber any registers or flags. Still requires two bytes of stack."

almost perfect, now one without jsr and 2-63 cycles please for the ultimate macro :)
2017-10-27 20:24
Han

Registered: Apr 2017
Posts: 8
Funny to see this question now when I was writing my own macro last week :)
Maybe this is useful for somebody (KickAssembler):
.macro waitx(Cycles)
{
	// Parameters of fast loop (outside a page boundary)
	.var LC=5 // Cycles per loop iteration (DEX, BNE)
	.var LoopCount=max(1, floor((Cycles-1)/LC)) // Loop counter
	.if((LoopCount>1) && (Cycles - (LoopCount*LC+1)==1)) { .eval LoopCount-- } // Handle only 1 remaining cycle
	.var ExtraCycles=max(0, Cycles - (LoopCount*LC+1)) // Cycles outside the loop
	.var ExtraBytes=max(0, ceil(ExtraCycles/2)) // Bytes required outside the loop

	// Parameters of slow loop (branch over page boundary)
	.var P_LC=6
	.var P_LoopCount=max(1, floor(Cycles/P_LC))
	.if((P_LoopCount>1) && (Cycles - (P_LoopCount*P_LC)==1)) { .eval P_LoopCount-- }
	.var P_ExtraCycles=max(0, Cycles - (P_LoopCount*P_LC))
	.var P_ExtraBytes=max(0, ceil(P_ExtraCycles/2))

	.var Relocate=false
	
	.var IsPageCrossed=(((<*)>=$fb) && ((<*)<=$fd))
	.if(IsPageCrossed)
	{ // Check if fast loop could be relocated to be slow and would also be smaller
		.var adr=*+ExtraBytes
		.if((ExtraBytes<P_ExtraBytes) && (((<adr)<$fb) || ((<adr)>$fd)))
		{
			.eval Relocate=true
		}
		else
		{
			.eval LoopCount=P_LoopCount
			.eval ExtraCycles=P_ExtraCycles
			.eval ExtraBytes=P_ExtraBytes
		}
	}
	else
	{ // Check if slow loop could be relocated to be fast and would also be smaller
		.var adr=*+P_ExtraBytes
		.if((P_ExtraBytes<ExtraBytes) && (((<adr)>=$fb) && ((<adr)<=$fd)))
		{
			.eval LoopCount=P_LoopCount
			.eval ExtraCycles=P_ExtraCycles
			.eval ExtraBytes=P_ExtraBytes
			.eval Relocate=true
		}
	}
	
	.if(ceil(Cycles/2) <= (5+ExtraBytes))
	{ // Loopless wait is smaller than using a loop
		wait(Cycles)
	}
	else
	{ // All that hassle for this small (relocated) loop :)
		.if(Relocate) { wait(ExtraCycles) }
		ldx #LoopCount
		dex
		bne *-1
		.if(!Relocate) { wait(ExtraCycles) }
	}
}

.macro wait(Cycles)
{
	.if(Cycles>0)
	{
		.if(Cycles<2) .error "Can't delay 1 cycle"
		.if((Cycles & 1)==0) { nop } else { bit $00 } // Delay 2 or 3 cycles
		.for(var i=1; i<floor(Cycles/2); i++) { nop } // Remaining even amount
	}
}

What this does is build an optimal loop+NOP+BIT combination that takes page boundaries into account.
Depending on the number of delay cycles and on the location of the loop, the number of extra cycles needed outside the loop varies. So this macro checks whether the extra bytes can be used to move the loop off or onto a page boundary so that the resulting number of bytes is minimal. (Of course it uses a loopless delay if that's even smaller.)

Example: your code starts at $08fd and you want to wait 24 cycles:
$08fd LDX #$04
$08ff DEX
$0900 BNE $08FF // Page crossing

If instead you wanted to wait 28 cycles at this location you could append 2 NOPs. But it's smaller to prepend just one NOP, thus relocating the loop off of the page boundary and adding one iteration:
$08fd NOP
$08fe LDX #$05
$0900 DEX
$0901 BNE $0900 // No page crossing

The wait() macro is just a simple loopless delay that's used inside waitx(). Using pha/pla, the code size could be reduced even further, so maybe I'll add that later.
Please note that I did test this, but it's still a work in progress.
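The cycle counts in the two examples above can be checked with a small model. This is only a Python sketch for verifying the arithmetic, not part of the macro. The function name ldx_dex_bne_cycles is made up for illustration, and it assumes the usual 6502 timings: LDX # and DEX take 2 cycles each, BNE takes 2 cycles when not taken, 3 when taken, and 4 when taken across a page boundary.

```python
def ldx_dex_bne_cycles(start, count):
    """Cycles used by 'ldx #count / dex / bne <dex>' assembled at address start.

    Hypothetical helper for illustration; assumes 1 <= count <= 255.
    """
    dex_addr = start + 2           # branch target (the DEX)
    after_bne = start + 5          # address of the instruction following the BNE
    crosses = (dex_addr >> 8) != (after_bne >> 8)
    taken = 4 if crosses else 3    # a taken branch costs one more cycle across a page
    # LDX #imm (2) + count * DEX (2) + (count - 1) taken branches + final untaken BNE (2)
    return 2 + count * 2 + (count - 1) * taken + 2


# First example: loop at $08fd, 4 iterations, branch crosses the page
print(ldx_dex_bne_cycles(0x08FD, 4))        # 24
# Second example: one NOP (2 cycles), then the loop at $08fe with 5 iterations
print(2 + ldx_dex_bne_cycles(0x08FE, 5))    # 28
```

This reproduces the two formulas the macro relies on: 5*N+1 cycles for the fast loop and 6*N cycles for the page-crossing one.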
2017-11-06 09:53
Cruzer

Registered: Dec 2001
Posts: 1048
Just got a crazy idea for delaying 13 cycles in 1 byte:
pause:
	rti

delay13Cycles:
	brk
Requires that the IRQ/BRK vector is set to the pause label, and no IRQs occur at the same time, which I guess is unlikely anyway when cycle-exact timing is going on. However, after a little test it seems like the PC skips a byte after returning with rti, so in reality it takes two bytes:
pause:
	rti

delay13Cycles:
	brk
	.by 0
2017-11-06 10:07
Krill

Registered: Apr 2002
Posts: 2851
Yes, BRK is a two-byte instruction. The operand byte is meant to be an argument for the software interrupt you're triggering, much like TRAP #<X> or INT <X> on other platforms.

It was intended for OS calls, I think, but I can't come up with an example that actually uses the argument byte.
The 1581 ROM code only has a dummy parameter:
.8:959d  08          PHP
.8:959e  58          CLI
.8:959f  95 02       STA $02,X
.8:95a1  00          BRK
.8:95a2  EA          NOP
2017-11-07 23:01
Cruzer

Registered: Dec 2001
Posts: 1048
Interesting, did not know that. Wonder why BRK isn't usually interpreted as having an argument by assemblers/disassemblers.
2017-11-08 05:27
Oswald

Registered: Apr 2002
Posts: 5022
so byte after brk is loaded into A ? or just thrown away ? isnt it just some kind of side effect from jsr ?
2017-11-08 07:36
ChristopherJam

Registered: Aug 2004
Posts: 1380
Quoting Cruzer
However, after a little test it seems like the PC skips a byte after returning with rti, so in reality it takes two bytes


I guess you could make it a single byte 19 cycle delay by decrementing the return address in the interrupt handler (BRK pushes the address of the BRK plus two), assuming you know the stack depth at the time of execution, and also avoid page boundary crossings in the 'caller'…
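The 13 and 19 cycle figures can be added up as follows. This is a rough Python sketch of the arithmetic only. It assumes the standard NMOS 6502 timings and that the handler adjusts the stacked low byte with a single absolute-addressed INC or DEC (6 cycles either way) before the RTI.

```python
# Standard NMOS 6502 cycle counts for the instructions involved
CYCLES = {"brk": 7, "rti": 6, "inc_abs": 6, "dec_abs": 6}

# Handler is just RTI
plain = CYCLES["brk"] + CYCLES["rti"]
# Handler adjusts the stacked return address first, then RTI
adjusted = CYCLES["brk"] + CYCLES["dec_abs"] + CYCLES["rti"]

print(plain, adjusted)   # 13 19
```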
2017-11-08 09:12
Krill

Registered: Apr 2002
Posts: 2851
Quoting Cruzer
Wonder why BRK isn't usually interpreted as having an argument by assemblers/disassemblers.
Usually, yes. Some assemblers allow an optional argument. Default is without, as usually BRK is used to end a program, discarding any code or data after it.

Quoting Oswald
so byte after brk is loaded into A ? or just thrown away ? isnt it just some kind of side effect from jsr ?
The byte needs to be retrieved manually: find the return address on the stack via TSX, then read the byte just before the address it points to.
It may well be that this is just a side effect of saving gates or reusing some other logic (but probably not JSR with its two argument bytes).

But there was one real-world application which at least mildly suggests it was a conscious decision. The 6502 was designed as a micro-controller for industrial machines, not a general-purpose CPU for home computers. Back then, PROMs were used for custom or low-volume machines, which would be turned on and immediately manipulate physical objects in the real world. The PROMs came with all bits set, and were programmed by blowing fuses to flip bits to 0, but those bits could never be reset to 1.
Now, the BRK opcode is $00, so it could be used to patch code in PROMs. Upon encountering BRK (which had formerly been some other instruction), the interrupt handler could then look up the argument byte (in addition or alternatively to the return address on stack), decide which patch routine for that location (located in a patch area on the PROM) to execute, and then resume operation.
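Why $00 in particular is handy for this can be shown with a tiny check. This is a Python sketch under the assumption stated above (bits can only go from 1 to 0). The function name can_patch is made up for illustration.

```python
def can_patch(old, new):
    """True if byte `old` can be reprogrammed to `new` by only clearing bits (1 -> 0)."""
    return (old & new) == new


BRK = 0x00
# Every possible byte value can be turned into BRK ...
print(all(can_patch(old, BRK) for old in range(256)))   # True
# ... but e.g. NOP ($EA) cannot become JMP ($4C): bit 2 would have to go from 0 to 1
print(can_patch(0xEA, 0x4C))                              # False
```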

Has anybody interviewed Mr Peddle about this? :)
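For reference, the manual lookup of the argument byte can be modelled like this. It is a Python sketch of the stack layout rather than real handler code. The function names are made up, and it assumes the handler has not pushed anything itself before looking at the stack (a handler that saves registers first would have to add that many bytes to the offsets).

```python
def brk(mem, pc, sp, status):
    """Model what BRK at address pc does to the stack; returns the new stack pointer."""
    ret = (pc + 2) & 0xFFFF               # BRK pushes the address of the BRK plus two
    mem[0x0100 + sp] = ret >> 8           # return address, high byte
    sp = (sp - 1) & 0xFF
    mem[0x0100 + sp] = ret & 0xFF         # return address, low byte
    sp = (sp - 1) & 0xFF
    mem[0x0100 + sp] = status | 0x30      # status with B flag set (bit 5 is always 1)
    return (sp - 1) & 0xFF


def brk_argument(mem, sp):
    """What a handler does after TSX: fetch the return address, read the byte before it."""
    lo = mem[0x0100 + ((sp + 2) & 0xFF)]
    hi = mem[0x0100 + ((sp + 3) & 0xFF)]
    ret = lo | (hi << 8)
    return mem[(ret - 1) & 0xFFFF]


mem = bytearray(0x10000)
mem[0x95A1] = 0x00   # BRK
mem[0x95A2] = 0xEA   # the byte after it, as in the 1581 ROM excerpt
sp = brk(mem, 0x95A1, 0xFF, 0x00)
print(hex(brk_argument(mem, sp)))   # 0xea
```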
All times are CET.