| |
TWW
Registered: Jul 2009 Posts: 545 |
The big VICE & SuperCPU Thread
I open this thread so coders may share information and issues regarding the Vice SuperCPU Emulator.
I found 2 issues:
The TCD command doesn't work. PHA/PLD works fine. Tried XBA to make sure there was no funny business with the C register being mixed up, but same result.
I did a wipe-mem routine which clears memory. Obviously a 16-bit STZ DP is the way to go, just relocating the ZP for each page you wipe. However, a dumb 16-bit STZ Abs,x (2 cycles more per instruction) is faster than the loop with roughly 20 cycles of overhead. The math doesn't add up, as the DP approach consumes fewer cycles according to the reference material. Could it be RAM refresh and branching that cause additional wait times (I read somewhere that a RAM refresh takes 8 cycles), and hence the deviation? |
|
... 12 posts hidden ... |
| |
TWW
Registered: Jul 2009 Posts: 545 |
Anybody had any luck in getting the IDE64 to work with the SuperCPU Vice? Tried to compile a FW early 2.1 but can't get the HD to load anything. It works in regular vice X64 though. |
| |
blacky
Registered: Sep 2007 Posts: 41 |
Please try the latest nightly build at http://vice.pokefinder.org/
|
| |
TWW
Registered: Jul 2009 Posts: 545 |
Tried it.
#1: The stz ZP vs. stz abs,x timing doesn't seem to have improved (unless there is another cause for it). I can post some example code later to show this.
#2: TCD instead of PHA/PLD (16-bit mode) still messes things up, in my code at least.
#3: A little better with the IDE64 part, but still unreliable, and VICE even refuses to start up with an IDEROM and HD attached.
NewBug: Preview of FDs (1541/1581) doesn't work in the latest version (-r27250)
|
| |
chatGPZ
Registered: Dec 2001 Posts: 11386 |
hint: if you actually want to help with fixing it, the bug tracker is where you should post the bugs, not here =P |
| |
TWW
Registered: Jul 2009 Posts: 545 |
I don't trust my own observational skills enough to determine if it is a bug or not.
Sure I'd like to help fix things based on what I find but I'd like a chance to discuss the bugs before reporting them to verify that we are, indeed, looking at a bug and not my crossed eyes (Has happened before you know (My wife blames Jack Daniels)).
ps. The mouse doesn't work either 8-D It really messes things up on both SCPU and 64SC. Same code on my side, and a brutal crash in VICE once it's ALT-Q'ed in. |
| |
TWW
Registered: Jul 2009 Posts: 545 |
Alright.
The "TCD" does work fine. Stupid mistake by me (see Gpz^^).
Regarding the mem-filler, I still have the same situation. So I'm gonna attach some code this time in case someone knows what this is about (All in native, all regs in 16 bit mode):
Straight Indexed STZ routine (Called with bitmap memory in X):
.for(var i=0;i<8000;i=i+2) {
:STZ $0000+i,x
}
rts
And here is the one using relocatable Direct Page (ZP) (Called with bmpmem in A):
:LDX #$001f // 32 pages needs to be cleared
!: :TCD
.for(var i=0;i<128;i++) {
:STZ $00+[i*2]
}
clc
:ADC #$0100
dex
bmi *+5
jmp !- // Repeat
:LDA #$0000
:TCD
rts
The result is roughly 4 lines of raster time more for the Direct Page approach. The loop overhead shouldn't cost more cycles than the direct page addressing saves, so unless the direct page approach itself is somehow eating extra cycles, there is something amiss in my code or the emulator. |
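For reference, the on-paper expectation behind this question can be tallied roughly as below. This is only a sketch under stated assumptions: the per-instruction costs (6 cycles for a 16-bit STZ abs,x, 4 cycles for a 16-bit STZ dp with a page-aligned direct page) are the nominal figures from standard 65816 timing tables, the 20-cycle loop overhead is the rough number quoted earlier in the thread, and any wait states on memory writes are ignored entirely. The function names are made up for illustration.

```python
# Nominal 65816 cycle counts; wait states on writes to C64 RAM are ignored.
STZ_ABS_X_16 = 6    # STZ abs,x with 16-bit accumulator (5 + 1)
STZ_DP_16 = 4       # STZ dp with 16-bit accumulator, direct page low byte = 0 (3 + 1)
LOOP_OVERHEAD = 20  # rough per-page overhead quoted in the thread (assumption)


def unrolled_indexed_cycles(byte_count=8000):
    """Unrolled STZ abs,x routine: one 16-bit store per two bytes."""
    return (byte_count // 2) * STZ_ABS_X_16


def direct_page_cycles(pages=32, stores_per_page=128):
    """Relocated direct page routine: 128 stores per page plus loop overhead."""
    return pages * (stores_per_page * STZ_DP_16 + LOOP_OVERHEAD)


indexed = unrolled_indexed_cycles()
direct = direct_page_cycles()
print(indexed, direct, indexed - direct)
```

On these nominal numbers the direct page version should come out several thousand cycles ahead, which is why a measurement showing it slower looks wrong at first glance.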
| |
MagerValp
Registered: Dec 2001 Posts: 1078 |
You can't optimize SuperCPU code like that. To write to the graphics bank you need to have memory mirroring turned on, which leaves you with one write every 1 MHz cycle.
The first takes exactly 8000 cycles to execute, since that's how many bytes you clear. The second clears 8192 bytes, so it'll take a smidge over 3 rasterlines extra.
Just keep the code simple instead:
ldy #8000
: stz $0000,x
inx
dey
bne :-
Now you have 8 "free" cycles left in the loop where it's just stalling and waiting for the next available write cycle. |
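The write-bound argument above can be checked with a few lines of arithmetic. This is a rough sketch, not emulator-accurate timing: it assumes every byte written to mirrored C64 RAM costs exactly one 1 MHz cycle, as described above, and it assumes a PAL machine with 63 cycles per raster line (that constant is not stated in the thread).

```python
CYCLES_PER_RASTER_LINE = 63  # assumption: PAL C64, 63 cycles of ~1 MHz per line


def write_bound_cycles(stores, bytes_per_store=2):
    """Each byte written to mirrored C64 RAM costs one 1 MHz cycle."""
    return stores * bytes_per_store


indexed = write_bound_cycles(4000)          # unrolled STZ abs,x: 8000 bytes
direct_page = write_bound_cycles(32 * 128)  # 32 pages of 128 stores: 8192 bytes
extra_cycles = direct_page - indexed
extra_lines = extra_cycles / CYCLES_PER_RASTER_LINE
print(indexed, direct_page, extra_cycles, round(extra_lines, 2))
```

Under these assumptions the instruction choice drops out completely: only the number of bytes written matters, and the 192 extra bytes of the direct page routine account for the extra raster time.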
| |
TWW
Registered: Jul 2009 Posts: 545 |
Of course... Max 1 write per cycle to the 64k base RAM... Explains everything! Thanx.
However, it only takes 4000 writes since it's in 16-bit mode, and you would have to increment the X register twice and set Y to #4000. Tested and works like a charm.
Edit: Too bad though. Would have been cool if it worked ;-) |
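As a sanity check on the corrected counts, the loop can be modelled in a few lines of Python. This is only an illustrative model, not 65816 code: the function name and the $2000 start address are made up, and it simplifies by wrapping addresses within a single 64k bank.

```python
def clear_bitmap_16bit(memory, start, stores=4000):
    """Model of the corrected loop: Y counts 16-bit stores, X steps by 2."""
    x = start
    y = stores
    while y != 0:
        memory[x] = 0                 # low byte of the 16-bit STZ
        memory[(x + 1) & 0xFFFF] = 0  # high byte of the 16-bit STZ
        x = (x + 2) & 0xFFFF          # inx / inx
        y -= 1                        # dey / bne
    return x


memory = bytearray([0xFF]) * 0x10000          # 64k filled with a non-zero marker
end = clear_bitmap_16bit(memory, 0x2000)      # hypothetical bitmap address
print(hex(end), memory.count(0))
```

With Y set to 4000 and X advanced by two per pass, exactly 8000 bytes are cleared and nothing beyond the bitmap is touched.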
| |
Oswald
Registered: Apr 2002 Posts: 5094 |
you can always calculate some other stuff while the write to the c64 ram happens :) Now that there's SCPU emulation I wish I had a lot of free time to create a demo for it, or at least experiment with some code.
for example a texture mapper, I guess the bottleneck would be the GFX mode rather than the speed. At 20 MHz, with 16-bit registers, in 160x200 and with a nicely optimized routine, AGA quality texture mapped objects should be possible. But what gfx mode could display 16 colors nicely ? ;) Or maybe 4 colors would work with heavy ordered dithering. |
| |
MagerValp
Registered: Dec 2001 Posts: 1078 |
Texture mapped 3D objects are boring on the PC, boring on the Amiga, and they will be just as boring on the SuperCPU :)
But you have the ability to write to the VIC every cycle so I bet you can do some wicked crazy stuff there. Vicious SID style audio effects are virtually "free" too, you'd only lose 60 out of the 1260 or so cycles you have each line... |
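The per-line budget quoted above is consistent with a simple multiplication. The sketch below assumes a PAL raster line of 63 1 MHz cycles and a nominal 20 MHz SuperCPU clock; neither constant is spelled out in the post, so treat both as assumptions.

```python
CPU_MHZ = 20               # assumption: nominal SuperCPU clock
CYCLES_PER_LINE_1MHZ = 63  # assumption: PAL C64 raster line length

fast_cycles_per_line = CPU_MHZ * CYCLES_PER_LINE_1MHZ
lost_cycles = 60           # figure quoted in the post
lost_fraction = lost_cycles / fast_cycles_per_line
print(fast_cycles_per_line, f"{lost_fraction:.1%}")
```

On those assumptions the quoted 60 cycles amount to under 5% of each line's budget.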