Log inRegister an accountBrowse CSDbHelp & documentationFacts & StatisticsThe forumsAvailable RSS-feeds on CSDbSupport CSDb Commodore 64 Scene Database
You are not logged in - nap
CSDb User Forums


Forums > C64 Coding > New VSP discovery
2013-07-19 21:01
lft

Registered: Jul 2007
Posts: 369
New VSP discovery

First off, this is what we already knew: VSP causes the VIC chip to briefly
place a logically undefined value on the DRAM address lines during the
halfcycle following the write to d011. If the undefined value coincides with
the RAS signal, every memory cell with an xxx7 or xxxf address is at risk of
getting corrupted. The relative timing of the undefined value and RAS depends
on several factors including temperature.

We also knew that the undefined value could be delayed slightly if VSP was
triggered by setting the DEN bit instead of modifying YSCROLL. This was enough
to avoid a crash on some machines.

I wanted to investigate whether there were other ways of controlling the timing
of the undefined value. Based on a combination of educated guesswork, luck and
plenty of trial-and-error, I could observe the following: The timing depends on
the specific 3-bit value that is written to YSCROLL, as well as the 3-bit value
that was stored in YSCROLL previously.

This means that we can trigger VSP using one of 56 methods (eight different
YSCROLL values for various rasterlines, seven non-matching YSCROLL values to
switch from), each with slightly different timing.

Using the techniques from my Safe VSP demo, I created a tool that would trigger
VSP many times, check if memory got corrupted, and keep track of the number of
crashes caused by each of the 56 methods. I then looked for a pattern in these
statistics.

Intriguingly, if I arranged the 56 crash counters in a grid with the vertical
axis corresponding to the rasterline and the horizontal axis corresponding to
the exclusive-or between the rasterline and the dummy value that was stored in
d011 prior to the VSP, then the crashes would tend to occur only in a subset of
the columns. When my crash prone c64 is powered on, the VIC chip is cold, and
there are no crashes. Within a minute, crashes start to appear in column 7
(meaning that all three bits of YSCROLL were flipped). As the chip heats up,
more crashes begin to appear in columns 3, 5 and 6 (two bits flipped). After
several more minutes, crashes show up also in columns 1, 2 and finally 4 (a
single bit flipped), but by this time, there are no longer any crashes in
columns 5, 6 or 7. Finally, when the VIC chip has reached a stable working
temperature, my machine no longer crashes.

This is what it might look like four minutes after power-on:



Now, let me stress that I only have one VSP-crashing c64, and these results
might not carry over to other machines. I hope they do, though. I would very
much like you (yes, you!) to run VSP Lab (described below) on your crash prone
machines and report what happens.

Is this useful? Short answer: Yes, very. But it hinges on whether the behaviour
of my c64 is typical. Even without the mentioned regularity in the columns, it
would be possible to find a few safe combinations for a given machine and a
given temperature. But the regularity makes it so much more practical and also
easier to explain to all C64 users, not just coders.

Let's refer to the seven columns as "VSP channels". For a given machine at a
given temperature, some of these channels are safe, and possibly some of them
are unsafe. It takes about 5-10 minutes for the VIC chip to reach its working
temperature. If you know that e.g. VSP channel #5 is safe on your machine, and
you can somehow tell a demo or game to use that specific channel, then VSP
won't crash.

My measurement tool evolved into a program called VSP Lab, depicted above,
which you can use to find out which VSP channels are safe to use on your
machine. It triggers a lot of VSP operations and visualises the crashes in a
grid, where each column corresponds to a VSP channel. Remember that a cold and
a hot VIC behave differently, so don't trust the measurements until about ten
minutes after power-on. You can reset the grid highlights using F1 to see if
channels which were unsafe before have become safe.

Demos and games could prompt the user for a VSP channel to use, or try to
determine it automatically using the same technique that VSP Lab is based on.

From a coding point of view, all you then have to do in order to implement
crash-free VSP, is to prepare the value X that you'll write to d011 to trigger
VSP, and the value Y which is X ^ vsp_channel. Then, on the rasterline where
you want to do VSP, you just wait until the time is right and do:

        sty $d011
        stx $d011


On the VSP Lab disk image, there's a small demo effect that you can run. It
will ask you for a VSP channel to use, and if you give it a safe number, it
should not crash.

This technique is so simple and non-intrusive that it's quite feasible to patch
existing games and demos, VSP-fixing them.

Also, this discovery explains the old wisdom that if you attempt VSP more than
once per frame, the routine will be more likely to crash. Here's why: In a demo
effect, you typically perform VSP on a fixed rasterline, so the value you write
to d011 will be constant. It is reasonable to assume that the old value of
YSCROLL will also be constant. Therefore, a given VSP effect will consistently
end up in the same VSP channel. On a machine with N safe VSP channels, the
probability of survival is therefore p = N / 7. If you do VSP on two different
rasterlines, each VSP will likewise end up in a channel, but not necessarily
the same one. The probability that both end up in a safe channel is p*p. If we
assume that most crash prone machines have at least one safe channel, we have
0 < p < 1 and therefore p*p < p. Q.E.D. To verify this, I patched vice to
report the channel every time VSP was performed. Sure enough, VSP&IK+
consistently uses VSP channel 1, as does Royal Arte. Krestage 3 uses VSP
channel 2. The intro of Tequila Sunrise, which performs VSP twice per frame,
uses VSP channels 1 and 3, and so does Safe VSP.

Finally, I will attempt to explain the observed behaviour at the electronical
level. Suppose each bit of YSCROLL is continually compared to the corresponding
bit in the Y raster counter, using XOR gates. The outputs of the XOR gates are
routed to a triple-input NOR gate, the output of which is therefore high if and
only if the three bits match. A triple-input NOR gate in NMOS would consist of
a pull-up resistor and three pull-down transistors. But the output of the NOR
gate is not a perfect boolean signal, because the transistors are not ideal.
When they are closed, they act like small-valued resistors, pulling the output
towards -- but not all the way down to -- ground potential. When YSCROLL
differs from the raster position by three bits, all three transistors
contribute, and the output reaches a low voltage. When the difference is two
bits, only two transistors pull, so the output voltage is slightly higher. For
a one-bit difference, the voltage is even higher (but still a logic zero, of
course). When we trigger VSP, all transistors stop pulling the voltage down,
and because of the resistor, the output voltage will begin to rise. But the
time it needs in order to rise to a logic one depends on the voltage at which
it begins. Thus, the more bits that change in YSCROLL, the longer it takes
until the match signal is asserted.

I have a fair amount of confidence in this theory, but need more data to
confirm it. And, once again, this is only of practical use if the average crash
prone machine has safe channels, like mine does. So please check your
equipment! I'm looking forward to your reports.
 
... 43 posts hidden. Click here to view all posts....
 
2013-08-05 19:05
Kabuto
Account closed

Registered: Sep 2004
Posts: 58
I agree that the ideal solution would be a hardware modification since this bug can also be triggered by non-VSP code that accidentally creates a VSP condition. Odds are pretty low (re-enabling display or changing vscroll without timing will trigger VSP with a probability of about 0.2%, and then, according to the plots, up to 2% of those might actually crash) but even simple basic games that just modify $D011 for a cheap screen shake effect can crash.
2013-08-05 19:57
tlr

Registered: Sep 2003
Posts: 1723
Sure, it would be nice to have fixed but there are many other machine/temp dependent stuff that is flaky (e.g LAX, ANE). Should we fix those too?

The question is obviously retorical. The difference is that VSP is commonly used.

Still it's definately on the border of what should be considered safe to use. It seems to me that LAX #$00 is safer than vsp for instance.

I'd rather we found a safe/safer way to use the unmodified hardware, which I believe is exactly what lft is trying to do.
2013-08-05 20:21
HCL

Registered: Feb 2003
Posts: 717
Hmm.. My c64 and c128 ran the test for an hour+ at BFP, but they were both clean (!). This is puzzeling since i know that at least the c128 was a bitch to code d011-stuff on in the early 90's, and that was the only thing i wanted to do back then :P.
2013-08-06 01:29
Martin Piper

Registered: Nov 2007
Posts: 644
Replacing DRAM with SRAM should fix the issue. It might need some extra logic circuitry.
2013-08-06 17:01
algorithm

Registered: May 2002
Posts: 702
Thats a bit overkill however
2013-08-06 17:30
chatGPZ

Registered: Dec 2001
Posts: 11136
and zero-x (i think) tried it - doesnt fix it at all (i guess you must disconnect/prevent the VIC from writing to it too)
2013-08-06 22:25
Zer0-X

Registered: Aug 2008
Posts: 78
Quote: and zero-x (i think) tried it - doesnt fix it at all (i guess you must disconnect/prevent the VIC from writing to it too)

Nope, haven't tried it. Tempting tho.
2013-08-06 22:28
chatGPZ

Registered: Dec 2001
Posts: 11136
mmmh then who was it.... i somehow remember someone tried what happens.... *shrug*
2013-08-06 23:12
tokra

Registered: Jun 2011
Posts: 9
I think the difference between using VSP and using illegal opcodes is that VSP should work according to the specification while with illegal opcodes you are just lucky that they work despite the specification.

So, even though the failure rate may just be 2% of 0.2% in standard cases (1 in 25.000) this can be considered a hardware-defect.

On the other hand I could imagine if Commodore engineers in 1982 knew about this error and had presented it to Jack Tramiel he would have just shrugged it off as "works good enough" and gone to production anyway.

Now, 30 years later the situation is different. There are demos that use this effect and a lot of machines that crash when it is used (my C128D included). So I for one would very much welcome a hardware-fix that rectifies this issue at last.
2013-08-07 05:17
WVL

Registered: Mar 2002
Posts: 886
I can only agree with this. While VSP was never a 'feature', changing the value of $d011 surely always was intended. It can only be described as a defect that changing $d011 (for whatever your purpose is) can crash some machines.
Previous - 1 | 2 | 3 | 4 | 5 | 6 - Next
RefreshSubscribe to this thread:

You need to be logged in to post in the forum.

Search the forum:
Search   for   in  
All times are CET.
Search CSDb
Advanced
Users Online
psych
tomz/TIDE
CopAss/Leader
Guests online: 101
Top Demos
1 Next Level  (9.8)
2 Mojo  (9.7)
3 Coma Light 13  (9.7)
4 Edge of Disgrace  (9.6)
5 Comaland 100%  (9.6)
6 No Bounds  (9.6)
7 Uncensored  (9.6)
8 Wonderland XIV  (9.6)
9 Memento Mori  (9.6)
10 Bromance  (9.5)
Top onefile Demos
1 It's More Fun to Com..  (9.7)
2 Party Elk 2  (9.7)
3 Cubic Dream  (9.6)
4 Copper Booze  (9.5)
5 TRSAC, Gabber & Pebe..  (9.5)
6 Rainbow Connection  (9.5)
7 Dawnfall V1.1  (9.5)
8 Quadrants  (9.5)
9 Daah, Those Acid Pil..  (9.5)
10 Birth of a Flower  (9.5)
Top Groups
1 Nostalgia  (9.3)
2 Oxyron  (9.3)
3 Booze Design  (9.3)
4 Censor Design  (9.3)
5 Crest  (9.3)
Top Logo Graphicians
1 Sander  (9.9)
2 Facet  (9.6)
3 Mermaid  (9.4)
4 Pal  (9.4)
5 Shine  (9.3)

Home - Disclaimer
Copyright © No Name 2001-2024
Page generated in: 0.061 sec.