| | lft
Registered: Jul 2007 Posts: 369 |
New VSP discovery
First off, this is what we already knew: VSP causes the VIC chip to briefly
place a logically undefined value on the DRAM address lines during the
halfcycle following the write to d011. If the undefined value coincides with
the RAS signal, every memory cell with an xxx7 or xxxf address is at risk of
getting corrupted. The relative timing of the undefined value and RAS depends
on several factors including temperature.
We also knew that the undefined value could be delayed slightly if VSP was
triggered by setting the DEN bit instead of modifying YSCROLL. This was enough
to avoid a crash on some machines.
I wanted to investigate whether there were other ways of controlling the timing
of the undefined value. Based on a combination of educated guesswork, luck and
plenty of trial-and-error, I could observe the following: The timing depends on
the specific 3-bit value that is written to YSCROLL, as well as the 3-bit value
that was stored in YSCROLL previously.
This means that we can trigger VSP using one of 56 methods (eight different
YSCROLL values for various rasterlines, seven non-matching YSCROLL values to
switch from), each with slightly different timing.
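The count of 56 can be checked by enumeration. The following is only a sketch of the bookkeeping, not part of any tool mentioned in this post; the names `methods` and `flip_masks` are made up for illustration.

```python
from itertools import product

# Every (old, new) pair of 3-bit YSCROLL values where the write actually
# changes YSCROLL. "old" is the value stored before the VSP write, "new" is
# the value written to trigger it.
methods = [(old, new) for new, old in product(range(8), repeat=2) if old != new]

# The bits that flip between old and new, as a 3-bit mask. Because old and
# new always differ, the mask is never 0.
flip_masks = sorted({old ^ new for old, new in methods})

print(len(methods))   # 56
print(flip_masks)     # [1, 2, 3, 4, 5, 6, 7]
```

Each of the seven possible flip masks is shared by exactly eight of the 56 methods, one per target YSCROLL value.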
Using the techniques from my Safe VSP demo, I created a tool that would trigger
VSP many times, check if memory got corrupted, and keep track of the number of
crashes caused by each of the 56 methods. I then looked for a pattern in these
statistics.
Intriguingly, if I arranged the 56 crash counters in a grid with the vertical
axis corresponding to the rasterline and the horizontal axis corresponding to
the exclusive-or between the rasterline and the dummy value that was stored in
d011 prior to the VSP, then the crashes would tend to occur only in a subset of
the columns. When my crash-prone C64 is powered on, the VIC chip is cold, and
there are no crashes. Within a minute, crashes start to appear in column 7
(meaning that all three bits of YSCROLL were flipped). As the chip heats up,
more crashes begin to appear in columns 3, 5 and 6 (two bits flipped). After
several more minutes, crashes show up also in columns 1, 2 and finally 4 (a
single bit flipped), but by this time, there are no longer any crashes in
columns 5, 6 or 7. Finally, when the VIC chip has reached a stable working
temperature, my machine no longer crashes.
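The grid arrangement described above can be sketched as follows. This is an illustration only, assuming that just the low three bits of the rasterline and of the dummy value matter; the helper names `grid_cell` and `flipped_bits` are hypothetical.

```python
def grid_cell(rasterline, dummy_d011):
    """Return (row, column) of the crash counter for one VSP attempt."""
    row = rasterline & 7                    # low 3 bits of the rasterline
    column = (rasterline ^ dummy_d011) & 7  # which YSCROLL bits get flipped
    return row, column


def flipped_bits(column):
    """Number of YSCROLL bits that change for a given column (1..7)."""
    return bin(column).count("1")


# Group the seven columns by how many bits they flip.
columns_by_flips = {}
for column in range(1, 8):
    columns_by_flips.setdefault(flipped_bits(column), []).append(column)

print(columns_by_flips)  # {1: [1, 2, 4], 2: [3, 5, 6], 3: [7]}
```

The grouping matches the order in which the columns started crashing on the machine described here: three flipped bits first, then two, then one.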
This is what it might look like four minutes after power-on:
[Screenshot: VSP Lab crash grid]
Now, let me stress that I only have one VSP-crashing C64, and these results
might not carry over to other machines. I hope they do, though. I would very
much like you (yes, you!) to run VSP Lab (described below) on your crash-prone
machines and report what happens.
Is this useful? Short answer: Yes, very. But it hinges on whether the behaviour
of my C64 is typical. Even without the mentioned regularity in the columns, it
would be possible to find a few safe combinations for a given machine and a
given temperature. But the regularity makes it so much more practical and also
easier to explain to all C64 users, not just coders.
Let's refer to the seven columns as "VSP channels". For a given machine at a
given temperature, some of these channels are safe, and possibly some of them
are unsafe. It takes about 5-10 minutes for the VIC chip to reach its working
temperature. If you know that e.g. VSP channel #5 is safe on your machine, and
you can somehow tell a demo or game to use that specific channel, then VSP
won't crash.
My measurement tool evolved into a program called VSP Lab, depicted above,
which you can use to find out which VSP channels are safe to use on your
machine. It triggers a lot of VSP operations and visualises the crashes in a
grid, where each column corresponds to a VSP channel. Remember that a cold and
a hot VIC behave differently, so don't trust the measurements until about ten
minutes after power-on. You can reset the grid highlights using F1 to see if
channels which were unsafe before have become safe.
Demos and games could prompt the user for a VSP channel to use, or try to
determine it automatically using the same technique that VSP Lab is based on.
From a coding point of view, all you then have to do to implement crash-free
VSP is prepare the value X that you'll write to d011 to trigger VSP, and the
value Y, which is X ^ vsp_channel (bitwise exclusive-or, so X and Y differ only
in their YSCROLL bits). Then, on the rasterline where you want to do VSP, you
just wait until the time is right and do:
    sty $d011   ; Y register: dummy value, X ^ vsp_channel
    stx $d011   ; X register: the value that triggers VSP
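How the two register values relate can be sketched in a few lines. This is a hedged illustration rather than code from the VSP Lab disk: the function name `vsp_write_pair` is hypothetical, and $1b is used only as an example value for X.

```python
def vsp_write_pair(x, vsp_channel):
    """Return (y, x): the dummy value to store first, then the trigger value."""
    if not 1 <= vsp_channel <= 7:
        raise ValueError("VSP channel must be in 1..7")
    # The channel number only occupies bits 0-2, so Y differs from X
    # only in the YSCROLL bits.
    y = x ^ vsp_channel
    return y, x


# Example: trigger value $1b (an assumed example), VSP channel 5.
y, x = vsp_write_pair(0x1B, 5)
print(f"ldy #${y:02x}")  # ldy #$1e
print(f"ldx #${x:02x}")  # ldx #$1b
```

A program that prompts for a channel would compute this pair once and load the two registers before the timed writes.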
On the VSP Lab disk image, there's a small demo effect that you can run. It
will ask you for a VSP channel to use, and if you give it a safe number, it
should not crash.
This technique is so simple and non-intrusive that it's quite feasible to patch
existing games and demos, VSP-fixing them.
Also, this discovery explains the old wisdom that if you attempt VSP more than
once per frame, the routine will be more likely to crash. Here's why: In a demo
effect, you typically perform VSP on a fixed rasterline, so the value you write
to d011 will be constant. It is reasonable to assume that the old value of
YSCROLL will also be constant. Therefore, a given VSP effect will consistently
end up in the same VSP channel. On a machine with N safe VSP channels, the
probability of survival is therefore p = N / 7. If you do VSP on two different
rasterlines, each VSP will likewise end up in a channel, but not necessarily
the same one. The probability that both end up in a safe channel is p*p. If we
assume that most crash prone machines have at least one safe channel, we have
0 < p < 1 and therefore p*p < p. Q.E.D. To verify this, I patched VICE to
report the channel every time VSP was performed. Sure enough, VSP&IK+
consistently uses VSP channel 1, as does Royal Arte. Krestage 3 uses VSP
channel 2. The intro of Tequila Sunrise, which performs VSP twice per frame,
uses VSP channels 1 and 3, and so does Safe VSP.
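The probability argument can be written out with exact fractions. This is a sketch under the same simplification as the text: each VSP lands in a channel independently of the others, which is an assumption, and `survival_probability` is a made-up name.

```python
from fractions import Fraction


def survival_probability(safe_channels, vsp_per_frame=1):
    """Chance that every VSP in a frame lands in a safe channel.

    Assumes 7 channels and that each VSP picks its channel independently.
    """
    p = Fraction(safe_channels, 7)
    return p ** vsp_per_frame


# A machine with 3 safe channels out of 7:
print(survival_probability(3, 1))  # 3/7
print(survival_probability(3, 2))  # 9/49
```

For any number of safe channels from 1 to 6, the two-VSP figure is strictly smaller than the single-VSP figure, which is the p*p < p inequality above.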
Finally, I will attempt to explain the observed behaviour at the electronic
level. Suppose each bit of YSCROLL is continually compared to the corresponding
bit in the Y raster counter, using XOR gates. The outputs of the XOR gates are
routed to a triple-input NOR gate, the output of which is therefore high if and
only if the three bits match. A triple-input NOR gate in NMOS would consist of
a pull-up resistor and three pull-down transistors. But the output of the NOR
gate is not a perfect boolean signal, because the transistors are not ideal.
When they are closed, they act like small-valued resistors, pulling the output
towards -- but not all the way down to -- ground potential. When YSCROLL
differs from the raster position by three bits, all three transistors
contribute, and the output reaches a low voltage. When the difference is two
bits, only two transistors pull, so the output voltage is slightly higher. For
a one-bit difference, the voltage is even higher (but still a logic zero, of
course). When we trigger VSP, all transistors stop pulling the voltage down,
and because of the resistor, the output voltage will begin to rise. But the
time it needs in order to rise to a logic one depends on the voltage at which
it begins. Thus, the more bits that change in YSCROLL, the longer it takes
until the match signal is asserted.
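The theory can be turned into a toy calculation: a resistive divider gives the starting voltage, and an RC charge curve gives the time to reach a logic one. Every component value below is invented purely for illustration and has nothing to do with the real VIC chip; only the ordering of the results matters.

```python
import math

# All values are hypothetical, chosen only to make the trend visible.
VCC = 5.0          # supply voltage (V)
R_PULLUP = 10e3    # pull-up resistor (ohm)
R_ON = 1e3         # on-resistance of one pull-down transistor (ohm)
C_NODE = 1e-12     # capacitance of the NOR output node (F)
V_THRESHOLD = 2.5  # voltage treated as a logic one (V)


def low_voltage(bits_differing):
    """Output voltage while 1..3 pull-down transistors conduct in parallel."""
    r_down = R_ON / bits_differing
    return VCC * r_down / (R_PULLUP + r_down)


def rise_time(bits_differing):
    """Time for the output to climb from its low level to V_THRESHOLD
    once all pull-down transistors stop conducting."""
    v0 = low_voltage(bits_differing)
    tau = R_PULLUP * C_NODE
    return tau * math.log((VCC - v0) / (VCC - V_THRESHOLD))


for bits in (1, 2, 3):
    print(bits, low_voltage(bits), rise_time(bits))
```

With these made-up numbers, more differing bits give a lower starting voltage and a longer rise time, which is the ordering the explanation above relies on.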
I have a fair amount of confidence in this theory, but need more data to
confirm it. And, once again, this is only of practical use if the average
crash-prone machine has safe channels, like mine does. So please check your
equipment! I'm looking forward to your reports. |
|
... 43 posts hidden ... |
| | Oswald
Registered: Apr 2002 Posts: 5086 |
I would be glad to see the specification which says VSP should work. Truth is, it's just as much a lucky side effect as illegal opcodes, and under normal circumstances no1 would try to manipulate d011 the VSP way, just like normally no1 would try to use non-specified opcodes.
Anyway, emulators have coped well with VSP for ages, and anyone can keep buying C64s until he finds a VSP-safe one, so VSP crashing is rather an artificial problem in my eyes. Or in other words, wanking nerds whining over 30 year old HW problems. |
| | tokra
Registered: Jun 2011 Posts: 9 |
Writing to $d011 should work. Even if you do it just once chances are your machine will crash. While these chances are VERY low, one might discuss if those are acceptable from an engineering standpoint.
What are normal circumstances? Was sprite-multiplexing intended? Or FLI? Or opening the borders?
Sure, one could just use emulators or a different machine, or one could just say "don't use VSP". But "wanking and whining" about these and similar issues 30 years later is half of the fun. Exploring the extreme limits of this machine is what the whole scene is about, isn't it? So if someone were to present a hardware-fix for this issue as part of, say, a Wild-Compo, it would get my thumbs up just as much as the latest demo that uses another quirk of the VIC-chip. |
| | Oswald
Registered: Apr 2002 Posts: 5086 |
I think sprite multiplexing was definitely intended while designing the chip, since the TI99, which the designers had a look at before starting, did it in HW long before :)
I agree with the 2nd part fully. And have to admit I agree with the first too, if one write may crash the machine then it's definitely a bug in the design. |
| | enthusi
Registered: May 2004 Posts: 677 |
Independent of the idea that d011-fiddling should never crash a machine, I consider hw-fixes super-lame.
Let's not get where all the other dead scenes went with their numerous hd-updates/fixes/revisions.
If it can't be handled by all plain c64s, then it sucks.
People did not know, now they do.
Also there ARE workarounds available now, to be considered a new challenge. |
| | tlr
Registered: Sep 2003 Posts: 1787 |
enthusi is spot on with his comment. +1
If you do use a feature like VSP or even a simpler one like the exact pixel timing of a color split then you need to consider what happens on different machines and try to avoid it manifesting in the final result. |
| | tokra
Registered: Jun 2011 Posts: 9 |
I can see both points of view. The software should have avoided this issue in the first place. Since the unwanted effect shows so seldom, it was not confirmed to be an actual underlying hardware-design-issue before lft found out the reason some months ago.
So while I fully agree that from now on coders should take this into account when doing VSP-effects, this still leaves about 30 years of existing software that might crash your computer if it changes $d011 even just once. Fixing all this software would certainly be very cumbersome, if even possible at all. So, if it is possible to create a hardware-fix for this issue, why not go for it? |
| | chatGPZ
Registered: Dec 2001 Posts: 11350 |
Quote: Fixing all this software would certainly be very cumbersome, if even possible at all.
finally a good use for all these hax0rcrax0rs =P |
| | Martin Piper
Registered: Nov 2007 Posts: 718 |
Quote: mmmh then who was it.... i somehow remember someone tried what happens.... *shrug*
Someone did it on the speccy
http://bitcycle.org/retro/spectrum/SRAM_replacement/ |
| | Zer0-X Account closed
Registered: Aug 2008 Posts: 78 |
In theory it should work the same as with speccy.
RAS & CAS still having control over the address loading and PLA taking care of keeping RAM off the bus when accessing IO, etc.
What annoys me is the need for a single damn inverter for the RAS to strobe half of the address to the buffering latch. With a single 64k SRAM chip one could do it using a single extra TTL chip if the RAS didn't have to be inverted. With 2x 32k SRAMs (what I happen to have) yet one more TTL chip needs to be added. The more chips, the more propagation delay, the more problems. Tho it would be for testing. |
| | Zer0-X Account closed
Registered: Aug 2008 Posts: 78 |
Hnngh, stupid me... Of course the latch keeps storing the address and "locks" it in when RAS activates.
Also found out some German guy had already SRAM-modded his C64 two years ago. Tho I doubt he did any VSP testing, which I'm pretty sure would work just fine with the setup. |