Log inRegister an accountBrowse CSDbHelp & documentationFacts & StatisticsThe forumsAvailable RSS-feeds on CSDbSupport CSDb Commodore 64 Scene Database
You are not logged in - nap
CSDb User Forums


Forums > C64 Coding > New VSP discovery
2013-07-19 21:01
lft

Registered: Jul 2007
Posts: 369
New VSP discovery

First off, this is what we already knew: VSP causes the VIC chip to briefly
place a logically undefined value on the DRAM address lines during the
halfcycle following the write to d011. If the undefined value coincides with
the RAS signal, every memory cell with an xxx7 or xxxf address is at risk of
getting corrupted. The relative timing of the undefined value and RAS depends
on several factors including temperature.

We also knew that the undefined value could be delayed slightly if VSP was
triggered by setting the DEN bit instead of modifying YSCROLL. This was enough
to avoid a crash on some machines.

I wanted to investigate whether there were other ways of controlling the timing
of the undefined value. Based on a combination of educated guesswork, luck and
plenty of trial-and-error, I could observe the following: The timing depends on
the specific 3-bit value that is written to YSCROLL, as well as the 3-bit value
that was stored in YSCROLL previously.

This means that we can trigger VSP using one of 56 methods (eight different
YSCROLL values for various rasterlines, seven non-matching YSCROLL values to
switch from), each with slightly different timing.

Using the techniques from my Safe VSP demo, I created a tool that would trigger
VSP many times, check if memory got corrupted, and keep track of the number of
crashes caused by each of the 56 methods. I then looked for a pattern in these
statistics.

Intriguingly, if I arranged the 56 crash counters in a grid with the vertical
axis corresponding to the rasterline and the horizontal axis corresponding to
the exclusive-or between the rasterline and the dummy value that was stored in
d011 prior to the VSP, then the crashes would tend to occur only in a subset of
the columns. When my crash prone c64 is powered on, the VIC chip is cold, and
there are no crashes. Within a minute, crashes start to appear in column 7
(meaning that all three bits of YSCROLL were flipped). As the chip heats up,
more crashes begin to appear in columns 3, 5 and 6 (two bits flipped). After
several more minutes, crashes show up also in columns 1, 2 and finally 4 (a
single bit flipped), but by this time, there are no longer any crashes in
columns 5, 6 or 7. Finally, when the VIC chip has reached a stable working
temperature, my machine no longer crashes.

This is what it might look like four minutes after power-on:



Now, let me stress that I only have one VSP-crashing c64, and these results
might not carry over to other machines. I hope they do, though. I would very
much like you (yes, you!) to run VSP Lab (described below) on your crash prone
machines and report what happens.

Is this useful? Short answer: Yes, very. But it hinges on whether the behaviour
of my c64 is typical. Even without the mentioned regularity in the columns, it
would be possible to find a few safe combinations for a given machine and a
given temperature. But the regularity makes it so much more practical and also
easier to explain to all C64 users, not just coders.

Let's refer to the seven columns as "VSP channels". For a given machine at a
given temperature, some of these channels are safe, and possibly some of them
are unsafe. It takes about 5-10 minutes for the VIC chip to reach its working
temperature. If you know that e.g. VSP channel #5 is safe on your machine, and
you can somehow tell a demo or game to use that specific channel, then VSP
won't crash.

My measurement tool evolved into a program called VSP Lab, depicted above,
which you can use to find out which VSP channels are safe to use on your
machine. It triggers a lot of VSP operations and visualises the crashes in a
grid, where each column corresponds to a VSP channel. Remember that a cold and
a hot VIC behave differently, so don't trust the measurements until about ten
minutes after power-on. You can reset the grid highlights using F1 to see if
channels which were unsafe before have become safe.

Demos and games could prompt the user for a VSP channel to use, or try to
determine it automatically using the same technique that VSP Lab is based on.

From a coding point of view, all you then have to do in order to implement
crash-free VSP, is to prepare the value X that you'll write to d011 to trigger
VSP, and the value Y which is X ^ vsp_channel. Then, on the rasterline where
you want to do VSP, you just wait until the time is right and do:

        sty $d011
        stx $d011


On the VSP Lab disk image, there's a small demo effect that you can run. It
will ask you for a VSP channel to use, and if you give it a safe number, it
should not crash.

This technique is so simple and non-intrusive that it's quite feasible to patch
existing games and demos, VSP-fixing them.

Also, this discovery explains the old wisdom that if you attempt VSP more than
once per frame, the routine will be more likely to crash. Here's why: In a demo
effect, you typically perform VSP on a fixed rasterline, so the value you write
to d011 will be constant. It is reasonable to assume that the old value of
YSCROLL will also be constant. Therefore, a given VSP effect will consistently
end up in the same VSP channel. On a machine with N safe VSP channels, the
probability of survival is therefore p = N / 7. If you do VSP on two different
rasterlines, each VSP will likewise end up in a channel, but not necessarily
the same one. The probability that both end up in a safe channel is p*p. If we
assume that most crash prone machines have at least one safe channel, we have
0 < p < 1 and therefore p*p < p. Q.E.D. To verify this, I patched vice to
report the channel every time VSP was performed. Sure enough, VSP&IK+
consistently uses VSP channel 1, as does Royal Arte. Krestage 3 uses VSP
channel 2. The intro of Tequila Sunrise, which performs VSP twice per frame,
uses VSP channels 1 and 3, and so does Safe VSP.

Finally, I will attempt to explain the observed behaviour at the electronical
level. Suppose each bit of YSCROLL is continually compared to the corresponding
bit in the Y raster counter, using XOR gates. The outputs of the XOR gates are
routed to a triple-input NOR gate, the output of which is therefore high if and
only if the three bits match. A triple-input NOR gate in NMOS would consist of
a pull-up resistor and three pull-down transistors. But the output of the NOR
gate is not a perfect boolean signal, because the transistors are not ideal.
When they are closed, they act like small-valued resistors, pulling the output
towards -- but not all the way down to -- ground potential. When YSCROLL
differs from the raster position by three bits, all three transistors
contribute, and the output reaches a low voltage. When the difference is two
bits, only two transistors pull, so the output voltage is slightly higher. For
a one-bit difference, the voltage is even higher (but still a logic zero, of
course). When we trigger VSP, all transistors stop pulling the voltage down,
and because of the resistor, the output voltage will begin to rise. But the
time it needs in order to rise to a logic one depends on the voltage at which
it begins. Thus, the more bits that change in YSCROLL, the longer it takes
until the match signal is asserted.

I have a fair amount of confidence in this theory, but need more data to
confirm it. And, once again, this is only of practical use if the average crash
prone machine has safe channels, like mine does. So please check your
equipment! I'm looking forward to your reports.
 
... 43 posts hidden. Click here to view all posts....
 
2013-08-04 12:46
chatGPZ

Registered: Dec 2001
Posts: 11386
at this point i would want to recheck if replacing the ram and preventing the VIC from writing to it wouldnt completely fix the problem.... :)
2013-08-04 16:39
tlr

Registered: Sep 2003
Posts: 1790
Quoting WVL
I can imagine some simple piece of addon electronics that measure the powerup state and put the result on some register that the ROM can read and display on the screen on powerup.
...
Then the user could decide to powerup again to get to a stable state for his/her machine.

Wouldn't it be better to make some electronics locking the phase to the vic-II?
2013-08-05 02:02
Skate

Registered: Jul 2003
Posts: 494
after writing this comment, i noticed similar comments are already written. so, i post it like "+1".

this whole thing is very interesting but test results show me that what we really need is a VSP safe machine, not a safe VSP routine. i believe same machine may give different test results with different power supplies or even different places with different electricity infrastructure.

now, thanks to lft, we have a great VSP test utility. now some hardware gurus should build some easy to implement hardware fix and we should all apply this patch. amiga guys do many hw patches, right? we can do as well.
2013-08-05 19:05
Kabuto
Account closed

Registered: Sep 2004
Posts: 58
I agree that the ideal solution would be a hardware modification since this bug can also be triggered by non-VSP code that accidentally creates a VSP condition. Odds are pretty low (re-enabling display or changing vscroll without timing will trigger VSP with a probability of about 0.2%, and then, according to the plots, up to 2% of those might actually crash) but even simple basic games that just modify $D011 for a cheap screen shake effect can crash.
2013-08-05 19:57
tlr

Registered: Sep 2003
Posts: 1790
Sure, it would be nice to have fixed but there are many other machine/temp dependent stuff that is flaky (e.g LAX, ANE). Should we fix those too?

The question is obviously retorical. The difference is that VSP is commonly used.

Still it's definately on the border of what should be considered safe to use. It seems to me that LAX #$00 is safer than vsp for instance.

I'd rather we found a safe/safer way to use the unmodified hardware, which I believe is exactly what lft is trying to do.
2013-08-05 20:21
HCL

Registered: Feb 2003
Posts: 728
Hmm.. My c64 and c128 ran the test for an hour+ at BFP, but they were both clean (!). This is puzzeling since i know that at least the c128 was a bitch to code d011-stuff on in the early 90's, and that was the only thing i wanted to do back then :P.
2013-08-06 01:29
Martin Piper

Registered: Nov 2007
Posts: 722
Replacing DRAM with SRAM should fix the issue. It might need some extra logic circuitry.
2013-08-06 17:01
algorithm

Registered: May 2002
Posts: 705
Thats a bit overkill however
2013-08-06 17:30
chatGPZ

Registered: Dec 2001
Posts: 11386
and zero-x (i think) tried it - doesnt fix it at all (i guess you must disconnect/prevent the VIC from writing to it too)
2013-08-06 22:25
Zer0-X
Account closed

Registered: Aug 2008
Posts: 78
Quote: and zero-x (i think) tried it - doesnt fix it at all (i guess you must disconnect/prevent the VIC from writing to it too)

Nope, haven't tried it. Tempting tho.
Previous - 1 | 2 | 3 | 4 | 5 | 6 - Next
RefreshSubscribe to this thread:

You need to be logged in to post in the forum.

Search the forum:
Search   for   in  
All times are CET.
Search CSDb
Advanced
Users Online
Act-Otl/Outlaws
zscs
rexbeng
Guests online: 88
Top Demos
1 Next Level  (9.7)
2 13:37  (9.7)
3 Mojo  (9.7)
4 Coma Light 13  (9.6)
5 Edge of Disgrace  (9.6)
6 What Is The Matrix 2  (9.6)
7 The Demo Coder  (9.6)
8 Uncensored  (9.6)
9 Comaland 100%  (9.6)
10 Wonderland XIV  (9.6)
Top onefile Demos
1 No Listen  (9.6)
2 Layers  (9.6)
3 Cubic Dream  (9.6)
4 Party Elk 2  (9.6)
5 Copper Booze  (9.6)
6 Dawnfall V1.1  (9.5)
7 Rainbow Connection  (9.5)
8 Onscreen 5k  (9.5)
9 Morph  (9.5)
10 Libertongo  (9.5)
Top Groups
1 Performers  (9.3)
2 Booze Design  (9.3)
3 Oxyron  (9.3)
4 Triad  (9.3)
5 Censor Design  (9.3)
Top Musicians
1 Rob Hubbard  (9.7)
2 Mutetus  (9.7)
3 Jeroen Tel  (9.7)
4 Linus  (9.6)
5 Stinsen  (9.6)

Home - Disclaimer
Copyright © No Name 2001-2024
Page generated in: 0.084 sec.