| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
useless opcodes riddle - wtf is up with LAS and TAS?
Due to my emulator related activities in the last year(s) i dug deeper into the so called "illegal" opcodes, and the result is a (hopefully) nice PDF that i'll publish soonish (when some of you lazy bastards are done proofreading =D) - which includes a complete state of the art reference to all of these, plus a bunch of real world examples on how to use these (sometimes very weirdly behaving) instructions... (in large part provided by bitbreaker, thumbs up!) It's about time for a comprehensive document on this topic that is suitable for normal people =)
however, two opcodes seem to be completely useless and so far i can't think of what to use them for in real world code:
- LAS abs,y ($BB) (A,X,SP = addr & SP)
- TAS abs,y ($9B) (SP = A & X ; addr = SP & addrhi+1)
if any of you have used these before, or for whatever reason have an idea on what to do with them - let me know please, this is your chance to earn some karma upgrade points :o)
(and as a sidenote, some not so obvious *short* snippets that are using SLO, RLA, RRA or ISC are welcome too)
let's bust some more myths! |
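For anyone who wants to poke at the two opcodes without firing up an emulator, here is a minimal Python sketch of the semantics as written in the list above. The `CPU` class and the function names are made up for illustration. The model assumes no page crossing and ignores the instabilities, so treat it as a thinking aid rather than a reference.

```python
from dataclasses import dataclass, field


@dataclass
class CPU:
    # Hypothetical toy register file, just enough for LAS/TAS.
    a: int = 0
    x: int = 0
    y: int = 0
    sp: int = 0xFF
    mem: bytearray = field(default_factory=lambda: bytearray(0x10000))


def las(cpu, addr):
    """LAS abs,y ($BB): A, X and SP all receive mem[addr+Y] & SP."""
    value = cpu.mem[(addr + cpu.y) & 0xFFFF] & cpu.sp
    cpu.a = cpu.x = cpu.sp = value
    return value


def tas(cpu, addr):
    """TAS abs,y ($9B): SP = A & X, then mem[addr+Y] = SP & (addrhi + 1).

    Assumption: no page crossing, stable behaviour.
    """
    cpu.sp = cpu.a & cpu.x
    stored = cpu.sp & (((addr >> 8) + 1) & 0xFF)
    cpu.mem[(addr + cpu.y) & 0xFFFF] = stored
    return stored


cpu = CPU(a=0xF3, x=0x9F, y=0)
tas(cpu, 0x7E00)        # SP becomes $93, $7E00 receives $93 & $7F = $13
cpu.mem[0x0100] = 0x0F
las(cpu, 0x0100)        # A = X = SP = $0F & $93 = $03
```

Note that in this reading the store in TAS sees the *new* SP, so the value written is effectively A & X & (addrhi + 1).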
|
| |
CRT
Registered: Oct 2012 Posts: 87 |
I have had fun with the same thoughts. It resulted in a challenge for myself to one day use TAS for something. Still failing but I'm sure a thread like this will come up with something.
AND X register with accumulator and store result in stack
pointer, then AND stack pointer with the high byte of the
target address of the argument + 1. Store result in memory. |
| |
The Phantom
Registered: Jan 2004 Posts: 360 |
I recall FOE releasing a messy about undocumented opcodes, perhaps they'll help in your PDF?
Undocumented Opcodes
If not, didn't mean to waste your time, and look forward to the PDF :) |
| |
Peiselulli
Registered: Oct 2006 Posts: 81 |
For ISC, here is a real world example (discussion in German)
http://www.forum64.de/wbb3/board2-c64-alles-rund-um-den-brotkas.. |
| |
soci
Registered: Sep 2003 Posts: 480 |
Here's a real world example from IDEDOS for using SLO:
; A is zero before reaching here
.if R65C02
.9719 06 8e asl $8e asl next_sector+3
.971b 26 8d rol $8d rol next_sector+(2, 1, 0,)
.971d 26 8c rol $8c
.971f 26 8b rol $8b
.9721 a5 8e lda $8e lda next_sector+3
.else
.973b 07 8e slo $8e slo next_sector+3
.973d 26 8d rol $8d rol next_sector+(2, 1, 0,)
.973f 26 8c rol $8c
.9741 26 8b rol $8b
.fi
It spares 2 bytes for leaving out the last LDA when using illegal opcodes ;) |
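A rough Python sketch of why the trailing LDA can go: SLO is ASL on memory followed by ORA of the result into A, so with A = 0 the accumulator ends up holding the shifted low byte anyway. The function names and the 4-byte list layout (index 3 = `next_sector+3`) are assumptions made for this illustration.

```python
def asl(value):
    """8-bit shift left; returns (result, carry_out)."""
    return (value << 1) & 0xFF, value >> 7


def rol(value, carry_in):
    """8-bit rotate left through carry; returns (result, carry_out)."""
    return ((value << 1) | carry_in) & 0xFF, value >> 7


def shift_sector_left(sector, use_slo, a=0):
    """Shift a 4-byte value left by one bit and return (bytes, A).

    sector[3] stands in for next_sector+3 (the byte hit by ASL/SLO).
    """
    s = list(sector)
    s[3], carry = asl(s[3])
    if use_slo:
        a |= s[3]                # the ORA half of SLO
    for i in (2, 1, 0):
        s[i], carry = rol(s[i], carry)
    if not use_slo:
        a = s[3]                 # the explicit LDA in the 65C02 path
    return s, a


sector = [0x00, 0x80, 0x01, 0xC1]
print(shift_sector_left(sector, use_slo=False))
print(shift_sector_left(sector, use_slo=True))
```

Both paths give the same memory and the same A, but only because A was zero on entry. With any other A the SLO path ORs extra bits in.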
| |
soci
Registered: Sep 2003 Posts: 480 |
Another one from IDEDOS for using ISB (ISC):
; A is zero and C=0 before reaching here
.elsif R65C02
.a77e fe 1e 19 inc $191e,x inc buffer_address_hi_exp,x
.a781 bd 1e 19 lda $191e,x lda buffer_address_hi_exp,x
.else
.a792 ff 1e 19 isb $191e,x isb buffer_address_hi_exp,x
.a795 49 ff eor #$ff eor #$ff
.fi
Spares a byte and is faster ;) |
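The ISB trick can be checked with a few lines of Python. ISB is INC on memory followed by SBC of the new value, and with A = 0 and C = 0 the SBC yields 0 - M - 1, which is the bitwise complement of M, so EOR #$ff brings back M. This is only a sketch assuming binary (non-decimal) mode, and the function names are invented here.

```python
def inc_then_lda(mem_value):
    """65C02 path: INC mem, LDA mem. Returns (new_mem, A)."""
    m = (mem_value + 1) & 0xFF
    return m, m


def isb_then_eor(mem_value, a=0, carry=0):
    """Illegal path: ISB mem, EOR #$ff. Returns (new_mem, A).

    Assumes binary mode; carry=0 means SBC subtracts one extra.
    """
    m = (mem_value + 1) & 0xFF              # INC half of ISB
    a = (a - m - (1 - carry)) & 0xFF        # SBC half of ISB
    return m, a ^ 0xFF                      # EOR #$ff


# Both paths agree for every possible byte, given A=0 and C=0 on entry.
assert all(inc_then_lda(v) == isb_then_eor(v) for v in range(256))
```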
| |
Hein
Registered: Apr 2004 Posts: 954 |
No LAS or TAS, but ARR.
To check if the target note frequency has been reached for a 'glide to' function:
;x=$0e,$07,$00
ldy v1_note_target,x ;target note
lda freq_table_lo,y
sec
sbc v1_freq_lo_buffer,x
sta zp_idx_lo
lda freq_table_hi,y
sbc v1_freq_hi_buffer,x
sta zp_idx_hi
ldy v1_note_base,x ;start note
lda freq_table_lo,y
cmp zp_idx_lo
lda freq_table_hi,y
sbc zp_idx_hi
arr #$80
eor v1_porta_lfo_length,x ;bit7 = 0 if glide up, bit7 = 1 if glide down
bpl set_calculated_note ;still not reached
[edit] Ah, you're asking for RRA, not ARR.. :) Only used RRA (lo),y to have a short opcode for cycle wasting.. [/edit] |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
To my shame I must confess that I haven't found any use for LAS or TAS either.
You'd think TAS would be useful for clearing all of the registers and stack pointer during initialization but LXA #$00/TAY/TXS is a byte shorter, plus you probably want to reset S to $FF anyway.
As for SLO/SRE I may have mentioned this before but below is the host side of a two-bit IRQ transfer running in 64 cycles per byte, including loop overhead. An asynchronous IRQ protocol works by toggling ATN, then waiting for the maximum delay period when the drive can be relied on to have the results back before sampling them and toggling ATN for the next bit pair.
By using the SLO/SRE RMW instructions we can both sample DATA/CLK and toggle ATN within two cycles inside the same instruction, instead of four as you would get with separate EOR/STA.
ldy #%00000100
;Assert ATN
loop: arr #%11111010 ;cdab00--
arr #%11111010 ;dcdab00-
ror ;cdcdab00
sre $dd00-4,y ;cefdaB--
;Release ATN
sty $dd00
alr #%11111100 ;0cefdaB0
sta merge+1
slo $dd00 ;hcef1--0
;Assert ATN
and #%10000000 ;h0000000
merge: adc #%00000000 ;hcefdaBg
sta sector,x
sre $dd00-4,y ;0ab001--
;Release ATN
inx
sty $dd00
beq break
and #%01100000 ;0ab00000
slo $dd00 ;dab01---
bne loop ;(BRA) |
| |
lft
Registered: Jul 2007 Posts: 369 |
@doynax: Although it makes some assumptions about which vic bank is active, right? |
| |
Copyfault
Registered: Dec 2001 Posts: 478 |
Hey Groepaz,
Ninjas 6-Sprites-over-FLI-routine makes heavy use of SLO and RRA (maybe also RLA, not 100% sure), but I guess this example is already covered in the pdf ?!?
I've never used LAE/SHS ($bb/$9b) up to now. As you are looking for "real world examples" I guess that you completely found out what this "unstability" (which is usually connected to both opcodes) really means, right?
Looking forward to reading this paper;)
Bye-
CF |
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
ninja contributed that routine, yes :) and yes, the instabilities are covered :) |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quoting lft
Although it makes some assumptions about which vic bank is active, right?
Yes, only banks 0 and 3 work and IRQ-safe switching eats 10 cycles over $dd02 fiddling.
Plus the 14-cycle response is pushing it, famously failing on the 1571DCR in 1 MHz mode. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
forgot: yes, only "real world" examples are useful, i HAVE used these opcodes in test programs (look in VICE repository) |
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
who would have thought =) |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
The only thing I have used TAS for is for being impressed by how much stuff the CPU can do in 5 cycles. Too bad it's useless stuff. |
| |
Martin Piper
Registered: Nov 2007 Posts: 722 |
Maybe it's time for an optimising assembler that suggests replacement code by scanning for known instruction sequences. |
| |
doynax Account closed
Registered: Oct 2004 Posts: 212 |
Quoting Martin Piper
Maybe it's time for an optimising assembler that suggests replacement code by scanning for known instruction sequences.
I'm not convinced such a feature would be terribly useful unless the assembler had a fuller view of the semantics of the program, to separate side-effects from desired results.
Perhaps an emulator could be modified to report feedback on dead-stores/flags and the like. Plus with demo code coverage is largely a non-issue.
A super-optimizer would also be pretty neat, though the amount of complex immediates and tables in optimized 6502 code may put a damper on things. |
| |
lft
Registered: Jul 2007 Posts: 369 |
Quoting Martin Piper
Maybe it's time for an optimising assembler that suggests replacement code by scanning for known instruction sequences.
Depends on the code. Often, the amount of cycles matters, and occasionally the amount of bytes.
But an optimising compiler for a high-level language should make use of illegal opcodes when it makes sense. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
i have played with both, super optimizer and pattern based optimizer some years ago... it didn't make a lot of sense for hand written asm code (ie, it almost never found something to optimize). for code generated by cc65 on the other hand it was somewhat useful =) |
| |
Peacemaker
Registered: Sep 2004 Posts: 275 |
i call for a bruteforce ill. opcode tool that creates billions of combinations of routines and their results ;) |
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
Although I've used quite a few of the illegals I can only confirm what others have said about failing to find a sensible use for LAS and TAS. :) |
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
one thing i could think of... a clever combination of LAS $0100,y and TAS $0100,y with some other illegal(s) involving a shift operation to build a very fast pseudo random sequence generator. perhaps :) |
| |
metalux
Registered: Aug 2011 Posts: 17 |
So, let's publish that paper! |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Quote: one thing i could think of... a clever combination of LAS $0100,y and TAS $0100,y with some other illegal(s) involving a shift operation to build a very fast pseudo random sequence generator. perhaps :)
one can at least do an and operation on an index with TAS (SP that is) and read the table via PLA then. But SBX #$xx + LDA table,x would do just as well then. Interesting is the feedback possibility in that TAS command, as the resulting SP can again be fed back into the next call/lookup. Just as with LAX ($xx,x). Still i could not think of any real world example, either for LAS/TAS or for the mentioned LAX variant. I finally successfully used SAX $xx,y yesterday however. |
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
Would be a nice coding compo. Come up with the best uses for those two opcodes. (I wouldn't spend time on that myself though, but I'd like to see the result of such a compo. :D ) |
| |
Bitbreaker
Registered: Oct 2002 Posts: 508 |
Implement a turing complete stack machine with those two opcodes :-P |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
Quoting Bitbreaker
Implement a turing complete stack machine with those two opcodes :-P
Can't be done since they can't branch. |
| |
Perplex
Registered: Feb 2009 Posts: 255 |
Use TAS to store a branch opcode somewhere after the current program counter? :-) |
| |
Cruzer
Registered: Dec 2001 Posts: 1048 |
Ok, if you're allowed to cheat like that. :) |
| |
Mixer
Registered: Apr 2008 Posts: 452 |
Funny opcodes those LAS and TAS
TAS $fe00,y would let you insert any SP value on that page.
TAS $7e00,y would let you insert any SP value AND #$7f on that page, etc.
Sort of having an extra register. Except that SP gets replaced by A and X.
On the other hand perhaps this is a way to do 2 AND operations at the same time?
TAS ADDR,Y: ADDR=SP and ADDRHI+1, SP=A and X.
; perhaps when calculating and #$f8 and and #$07
ldy #$00
lda #$07
ldx ycoordinate,y
txs
tas $f700,y ; f700,y has Ycoord & F8 , SP has ycoord & 07
; perhaps use prepared stack, PLA here?
tas $fe00,y ; if this is correct, fe00,y now has ycoord & 07
;2+2+4+2+5+5
;$f700=y &f8, SP=Y &07
Same could be written without stackpointer like this.
ldy #$00
lax ycoordinate,y
and #$f8
sta $f700,y
lda #$07
sax $fe00,y
;4+2+4+2+4+2
and with tables
ldy #$00
ldx ycoordinate,y
lda table1,x ; X and f8
sta $f700,y
lda table2,x ; X and 07
sta $fe00,y
;2+4+4+4+4+4 |
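A quick way to settle the "if this is correct" question is to run the first snippet through a small Python model. The sketch below follows the description from the first post (SP = A & X first, then the store of SP & addrhi+1), assumes no page crossing and stable behaviour, and uses a made-up `tas` helper and an arbitrary example coordinate.

```python
def tas(a, x, addr):
    """Return (new_sp, stored_byte) for TAS addr,y.

    The old SP does not appear: it is overwritten by A & X before the store.
    """
    sp = a & x
    stored = sp & (((addr >> 8) + 1) & 0xFF)
    return sp, stored


ycoord = 0xB5                      # arbitrary example value
a, x = 0x07, ycoord                # lda #$07 / ldx ycoordinate,y (txs changes nothing below)

sp, at_f700 = tas(a, x, 0xF700)    # mask is $f8
_, at_fe00 = tas(a, x, 0xFE00)     # mask is $ff

print(hex(sp), hex(at_f700), hex(at_fe00))
```

Under that reading SP and $fe00,y do end up with ycoord & $07, but $f700,y gets ycoord & $07 & $f8, which is always zero, not ycoord & $f8. So the "two ANDs at once" idea seems to need the store to use the old SP, which is not what the description says.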
| |
Martin Piper
Registered: Nov 2007 Posts: 722 |
I agree a full optimiser would be a bit of a challenge since an optimiser would really need some useful context about expected register input ranges, scan line and position it's expected to execute on and possibly zero page context to make good suggestions. Although it's possible to tell the assembler expected input ranges and the like by using some source code directives.
However a simple optimiser would be easily doable.
For example if it spots:
ROR $f00d
ADC $f00d
It could automatically suggest RRA $f00d
If there are stability prerequisites like the instruction needing to be in a particular bank or scan line position then it could simply output the code hint to a separate file (or output a warning) and let the coder make the change or ignore it.
It would help where someone doesn't quite remember the extended opcode set yet. :) |
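The "simple optimiser" could start out as small as the Python sketch below. It only handles the ROR/ADC pair from the example, works on a hypothetical list of `(mnemonic, operand)` tuples rather than real assembler source, and only emits hints, leaving the decision to the coder.

```python
def suggest_rra(instructions):
    """Scan (mnemonic, operand) pairs and hint where ROR m / ADC m could become RRA m.

    Returns a list of (index_of_ROR, suggestion) tuples. It knows nothing about
    labels, branch targets or timing, so every hit still needs a human look.
    """
    hints = []
    for i in range(len(instructions) - 1):
        (m1, op1), (m2, op2) = instructions[i], instructions[i + 1]
        same_memory_operand = op1 == op2 and op1 not in ("", "A", "a")
        if m1.upper() == "ROR" and m2.upper() == "ADC" and same_memory_operand:
            hints.append((i, f"RRA {op1}"))
    return hints


program = [
    ("LDA", "#$01"),
    ("ROR", "$f00d"),
    ("ADC", "$f00d"),
    ("STA", "$d020"),
]
print(suggest_rra(program))   # [(1, 'RRA $f00d')]
```

More patterns (ASL/ORA to SLO, INC/SBC to ISB and so on) would be further entries of the same shape.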
| |
Frantic
Registered: Mar 2003 Posts: 1648 |
Not sure if anybody mentioned something like this already, but the only thing I can think of is that you may use "las TABLE,y" to AND the stack pointer with the contents of the Y register (if pointing to a table with values 00..FF sequentially laid out) and storing the result back into the stack pointer. At TABLE you may also have a different sort of table, that would allow you to specify exactly what sort of values you actually want to AND with the stack pointer, so in a way it would be more flexible than a normal AND instruction as well. As a special case you could fill the entire 256 bytes of that table with the same value, to make the AND operation independent of the contents of the Y register and always AND with the same value.
You would also have the result of the operation in both A and X. Good or bad..
As for what contexts this may be useful.. don't ask me! :) So.. perhaps not that much of a contribution to the discussion after all... At least it is the fastest way I can think of for actually ANDing the stack pointer with something, and storing the result back to the stack pointer. Four cycles, if TABLE is page aligned.
Possibly I misunderstood how LAS works altogether. :) |
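The table idea is easy to sanity-check in Python, going by the A,X,SP = addr & SP description from the first post. The `las` helper and the table names below are invented for this sketch.

```python
def las(sp, table, y):
    """LAS table,y: A, X and SP all become table[y] & SP. Returns (A, X, SP)."""
    value = table[y] & sp
    return value, value, value


identity_table = list(range(256))   # table[y] == y, so the AND mask is Y itself
constant_table = [0x0F] * 256       # same mask whatever Y holds

print(las(0xF7, identity_table, 0x3C))   # SP & Y        -> (0x34, 0x34, 0x34)
print(las(0xF7, constant_table, 0x99))   # SP & $0f      -> (0x07, 0x07, 0x07)
```

So with the identity table the instruction behaves like "SP = SP AND Y", and any other table turns Y into a selector for the mask.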
| |
soci
Registered: Sep 2003 Posts: 480 |
Quoting Martin Piper
For example if it spots:
ROR $f00d
ADC $f00d
It could automatically suggest RRA $f00d
Is there a practical use of this opcode, except on a Z80? |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
I may have found some use for the TAS opcode. Suppose I want to let the sprite image depend on its y position on the screen.
For example make it bigger for larger y to create an illusion of depth. Suppose bank 1 is used and screen memory starts at $7800.
Then the sprite image pointers are at $7bf8-$7bff. The trick I'm thinking about uses the and+store operation of TAS, for example
LDY #spritenumber
LAX #yposition
TAS $7BF8,Y ;TAS performs and #$7C, allowing for 32 different sprite images.
PLA
STA $D027,Y ;Assuming I have color data on the stack. Just as an example of using that TAS modifies SP.
What do you think about this? |
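To see which pointer values this can actually produce, here is a small Python sketch of the LAX/TAS pair. It assumes A = X = yposition after the LAX, takes the mask as high byte of $7BF8 plus one, and ignores the instabilities of both opcodes. The function name is made up for the illustration.

```python
def tas_sprite_pointer(ypos, base=0x7BF8):
    """Return (pointer_byte, new_sp) for LAX #ypos followed by TAS base,Y.

    Assumes A = X = ypos and no page crossing (true for $7BF8 + 0..7).
    """
    a = x = ypos
    sp = a & x                                   # TAS: SP = A & X
    pointer = sp & (((base >> 8) + 1) & 0xFF)    # stored byte = SP & $7C
    return pointer, sp


pointers = sorted({tas_sprite_pointer(y)[0] for y in range(256)})
print(len(pointers), hex(pointers[0]), hex(pointers[-1]))   # 32 0x0 0x7c
```

The pointer only changes every 4th y value and bit 7 is masked off, so the same 32 images repeat in the lower half of the coordinate range, while SP simply follows yposition.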
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
There are also other possibilities. To display an animation, LAX #counter can show a new image every 4th value (since bits 0 and 1 of $7c are 0). For each image, 4 values of data can be put on the stack, e.g., sprite color, X position LSB, X position MSB, Y position.
Or can this be done more efficiently without TAS? |
| |
Rastah Bar Account closed
Registered: Oct 2012 Posts: 336 |
Quoting Color Bar
Suppose bank 1 is used and screen memory starts at $7800.
Then the sprite image pointers are at $7bf8-$7bff. The trick I'm thinking about uses the and+store operation of TAS, for example
LDY #spritenumber
LAX #yposition
TAS $7BF8,Y ;TAS performs and #$7C, allowing for 32 different sprite images.
PLA
STA $D027,Y ;Assuming I have color data on the stack. Just as an example of using that TAS modifies SP.
The use of the stack looks perhaps a bit "constructed", but the sprite image pointer trick may be a good application of SHX, SHY, or even SHA, e.g.:
LDY #spritenumber
LDX #spritecoordinate
SHX $7BF8,Y
|