| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Sample streaming from tape
Has anyone done four bit sample streaming from tape before? I'm thinking the time between pulses can be directly translated to four bit volume values as they are read from tape.
Practically useless, except for pure geekiness. But I wanted to know if it's been done, before I do it myself. |
|
| |
chatGPZ
Registered: Dec 2001 Posts: 11290 |
I think I've seen something like that back in the day. No idea whether it worked with specially prepared data, though.
Why limit it to 4 bit though? |
| |
Repose
Registered: Oct 2010 Posts: 225 |
Interesting idea. I've done something related; just play a tape and listen to the read bit. You can hear 1-bit music this way. |
| |
chatGPZ
Registered: Dec 2001 Posts: 11290 |
THAT was all over the mags back in the days :) |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: Interesting idea. I've done something related; just play a tape and listen to the read bit. You can hear 1-bit music this way.
1 bit is different. :)
I'm thinking 4-bit to use the full SID volume range.
Instead of trying to binary encode the 4 bits used for the SID volume, I'm thinking use pulse length to fill the number range from 0 to 15 (4 bits) volume.
This means creating a tape with 16 different pulse lengths basically. |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
I think ”Pulse” by pixel for the vic-20 uses this technique for music playback somewhere in the loading sequence. |
| |
Repose
Registered: Oct 2010 Posts: 225 |
So the question is, what's the max frequency of pulses you can read? Based on the music I heard, it seems at 1kHz. Your average pulse length would be 8, so your sample rate would only be 128Hz.
Or look at it this way, to get phone quality sample rate (8kHz), you'd need pulses at 64KHz. I'm pretty sure that's not possible.
In fact, your encoding method is not efficient. You face the same tradeoffs as disk encoding. It's only that pulse length can give the wrong value with a slightly different speed, but that's hidden due to the nature of audio.
If you write 1/0 directly with occasional sync, that's most efficient, then the factor is only 4 instead of 8.
This is still a very low sample rate. If you get fancy with ADPCM, you could do at best 2x ratio.
This concept could only play bass/drums at best.
The best system I could imagine is using some DSP to find a combo of SID waveforms as a basis function to recreate the psychoacoustically most relevant frequencies, then the sample rate needs to be more related to how quickly the sound changes, and you could get something like the "mp3" demo.
But for normal coders, basically just d'n'b. |
| |
Repose
Registered: Oct 2010 Posts: 225 |
On 2nd thought, I estimated pulse speed in a very poor way. Looking at typical tape response, a cheap tape deck would do 10kHz, so somewhere below that. That's still pretty poor sampling rate. |
| |
MagerValp
Registered: Dec 2001 Posts: 1065 |
Use Macbeth's delta encoding instead (0/+1/-1), it only requires three different pulses. |
| |
Frantic
Registered: Mar 2003 Posts: 1641 |
Vaguely related. Video streaming from tape:
Deep Throat |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: So the question is, what's the max frequency of pulses you can read? Based on the music I heard, it seems at 1kHz. Your average pulse length would be 8, so your sample rate would only be 128Hz.
Or look at it this way, to get phone quality sample rate (8kHz), you'd need pulses at 64KHz. I'm pretty sure that's not possible.
In fact, your encoding method is not efficient. You face the same tradeoffs as disk encoding. It's only that pulse length can give the wrong value with a slightly different speed, but that's hidden due to the nature of audio.
If you write 1/0 directly with occasional sync, that's most efficient, then the factor is only 4 instead of 8.
This is still a very low sample rate. If you get fancy with ADPCM, you could do at best 2x ratio.
This concept could only play bass/drums at best.
The best system I could imagine is using some DSP to find a combo of SID waveforms as a basis function to recreate the psychoacoustically most relevant frequencies, then the sample rate needs to be more related to how quickly the sound changes, and you could get something like the "mp3" demo.
But for normal coders, basically just d'n'b.
Eh? On PAL, for 8KHz I'd need a pulse every ~123 cycles, right? Assuming one tape pulse encodes one SID volume store of 4 bits.
The TAP file format has a resolution of 8 cycles, so each of the 16 SID volume levels would be, in the TAP file, a pulse length of X to X+15. Where X is a lower boundary constant to allow enough cycles to trigger on tape pulse, read the pulse length with a CIA timer, reset the timer, store to SID, loop...
So in reality the tape pulses in cycle terms could be 64 to 192 cycles to map directly to volume levels 0 to 15. Even an upper limit of 192 cycles would, in PAL, give 5131 Hz, right? |
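A minimal Python sketch of the arithmetic above, assuming a PAL clock of 985,248 Hz, a base of 64 cycles and 8 cycles per volume step. The function names are made up for illustration and this is not the code from any actual loader. Note that with 8-cycle steps the 16 levels occupy 64 to 184 cycles; 192 is the upper edge of that range.

```python
PAL_CLOCK_HZ = 985_248  # PAL C64 CPU clock, cycles per second

BASE_CYCLES = 64  # assumed lower bound X from the post
STEP_CYCLES = 8   # assumed spacing between levels (TAP resolution)


def volume_to_pulse(volume: int) -> int:
    """Pulse length in cycles that encodes a 4-bit volume (0..15)."""
    if not 0 <= volume <= 15:
        raise ValueError("volume must be 0..15")
    return BASE_CYCLES + volume * STEP_CYCLES


def pulse_to_volume(cycles: int) -> int:
    """Decode a measured pulse length to the nearest level, clamped to 0..15."""
    level = (cycles - BASE_CYCLES + STEP_CYCLES // 2) // STEP_CYCLES
    return max(0, min(15, level))


def sample_rate_hz(cycles_per_pulse: float) -> float:
    """Samples per second if every pulse carries one volume value."""
    return PAL_CLOCK_HZ / cycles_per_pulse
```

For example, `sample_rate_hz(192)` comes out at roughly 5131 Hz and `sample_rate_hz(123)` at roughly 8010 Hz, matching the figures in the post.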
| |
chatGPZ
Registered: Dec 2001 Posts: 11290 |
You can also abuse the "long gap" to make gaps with 1 cycle granularity... i'd certainly try that, use 8bit, and the mahoney playback. |
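For reference, a sketch of how one pulse might be written into TAP data, assuming the version 1 layout as it is commonly documented: a nonzero byte means that many units of 8 cycles, and a zero byte is followed by a 24-bit little-endian exact cycle count (the "long gap"). The helper name is hypothetical, and whether an emulator or a real deck copes with short pulses written this way is untested here.

```python
def tap_bytes_for_pulse(cycles: int) -> bytes:
    """Encode one pulse length (in CPU cycles) for a version 1 TAP data stream."""
    if not 0 < cycles < (1 << 24):
        raise ValueError("cycle count must fit in 24 bits and be positive")
    if cycles % 8 == 0 and cycles <= 255 * 8:
        # Ordinary pulse: one byte, 8-cycle granularity.
        return bytes([cycles // 8])
    # "Long gap" form: zero marker plus exact 24-bit cycle count, low byte first.
    return bytes([0]) + cycles.to_bytes(3, "little")
```

A 128-cycle pulse becomes the single byte `$10`, while a 123-cycle pulse needs the four-byte form `$00 $7b $00 $00`, so 1-cycle granularity costs four times the TAP file space per pulse.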
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
I might also try the Mahoney sample method. If I really want to remove the rest of my hair. |
| |
Bansai
Registered: Feb 2023 Posts: 40 |
Quoting MagerValp: Use Macbeth's delta encoding instead (0/+1/-1), it only requires three different pulses.
I take it that saturating arithmetic on the integrator equation would self-correct for tape dropouts where pulses are lost or corrupted? That is, at some point the value would incorrectly want to +1 the $d418 volume beyond 15 or -1 below 0, and saturating math would eventually clamp the value stream right back to where it should be given how overdriven 4-bit samples are anyway. |
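A small Python sketch of that self-correction, under the assumption that the decoder simply adds each delta to the current volume and clamps to 0..15. The function name and the example stream are invented for illustration.

```python
def decode_deltas(deltas, start=8):
    """Integrate a 0/+1/-1 delta stream with saturation at 0 and 15."""
    volume = start
    out = []
    for d in deltas:
        volume = max(0, min(15, volume + d))  # saturating add
        out.append(volume)
    return out


# Intended stream: ramp up to 15, all the way down to 0, then up to 4.
clean = [+1] * 7 + [-1] * 15 + [+1] * 4
# Same stream with the first +1 pulse lost on tape.
dropped = clean[1:]
```

With the lost pulse the decoder runs one step low until the signal reaches 0, where the clamp absorbs the error; both `decode_deltas(clean)` and `decode_deltas(dropped)` end on 4. The error only heals when the signal actually hits a rail, so a quiet passage would stay offset.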
| |
Hoogo
Registered: Jun 2002 Posts: 103 |
Staying within the restrictions of the TAP format is reasonable. But better check Slushload V3 to see minimum reliable pulse lengths.
16 pulse lengths for 4 bit are nice to handle defects on tape. But you will get a higher sample rate if you use 4 pulse lengths and write 2*2 bit.
Also your bit rate will be variable.
And finally: An analog magnetic tape may behave more weird than a TAP file suggests. |
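Whether two short pulses beat one long one depends on the minimum pulse length. A rough Python comparison, assuming all pulse lengths are used equally often and both schemes share the same base and spacing; the function name is hypothetical and the parameter pairs are the base/spacing figures mentioned in this thread.

```python
def avg_cycles_per_sample(base, step, lengths, pulses_per_sample):
    """Average cycles needed per 4-bit sample for a pulse-length alphabet."""
    avg_pulse = base + step * (lengths - 1) / 2  # mean of the evenly used lengths
    return avg_pulse * pulses_per_sample


# 16 lengths, one pulse per sample, versus 4 lengths, two pulses per sample.
one_pulse_56_16 = avg_cycles_per_sample(56, 16, 16, 1)  # 176.0
two_pulse_56_16 = avg_cycles_per_sample(56, 16, 4, 2)   # 160.0
one_pulse_64_8 = avg_cycles_per_sample(64, 8, 16, 1)    # 124.0
two_pulse_64_8 = avg_cycles_per_sample(64, 8, 4, 2)     # 152.0
```

Under these assumptions the 2*2-bit scheme is faster only while the base is below 4.5 times the spacing, since the fixed base is paid twice per sample.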
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Proof of concept ready. :)
Download this: https://github.com/martinpiper/C64Public/raw/master/IRQTape/Tap..
Run it in PAL Vice. It will wait on a black screen for a TAP file to be attached.
Download this TAP file: https://github.com/martinpiper/C64Public/raw/master/IRQTape/vic..
Use "File->Attach tape image..." in Vice (do not use auto-start!) and press play on the tape using "File->Datasette control".
You should see narrow colour bars and hear a sample from Space Ace.
If you want a command line for Vice then this should work: x64sc.exe -remotemonitor -dstapewobble 0 -dsspeedtuning 0 -1 vice.tap TapeStreamSamples.prg |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
The sample rate is variable, since the 4-bit chunks from the tape for the volume are variable. However the average sample rate is 5.1KHz since the average pulse length is 192 cycles. |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
The code changes for this example. https://github.com/martinpiper/C64Public/compare/e7dc2ef85d1c72..
Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d. |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Would be interesting to see how this works on real HW. If you just translate pulse length to sample value, then any imperfections in detecting the length will translate to some noise which is probably unproblematic.
This encoding is basically frequency modulation (as in FM radio).
If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound. |
| |
Krill
Registered: Apr 2002 Posts: 2940 |
Quoting Martin Piper: Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d.
Main thread running over a thousand NOPs, interrupt kicks in, does stuff, and jumps back to start of NOP desert.
Did any tape loaders use this to improve accuracy (= increase bit rate)? =) |
| |
Krill
Registered: Apr 2002 Posts: 2940 |
Quoting tlr: If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.
Hmm... wouldn't the larger diffs imply a steeper gradient on the time-domain wave, and thus should be encoded with the shorter pulses? |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Quote: Quoting tlr: If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.
Hmm... wouldn't the larger diffs imply a steeper gradient on the time-domain wave, and thus should be encoded with the shorter pulses?
I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results. |
| |
Krill
Registered: Apr 2002 Posts: 2940 |
Quoting tlr: I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.
I wouldn't. :)
Gut feeling says there's some kind of optimum to be achieved by having a special switch-token pulse length to flip between the two (apparently concurrent) goals, though. |
| |
Krill
Registered: Apr 2002 Posts: 2940 |
And on another thought... differential encoding alone would have to rely on lossless/error-free encoding, lest it degrade quickly (without some intermediate absolute literals).
This sort of application, however, has no need to reliably tell apart symbols, unlike tape loaders.
So, it's probably a good idea to have some encoding that forgives the random blooper (and allows for higher bitrates/tighter symbol packing at the cost of some noise). =) |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: Quoting Martin Piper: Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d.
Main thread running over a thousand NOPs, interrupt kicks in, does stuff, and jumps back to start of NOP desert.
Did any tape loaders use this to improve accuracy (= increase bit rate)? =)
Yeah, it does reduce jitter, which allows the pulse lengths to be shorter. At the moment I'm using 16 cycles, if it could be brought down to 8 cycles reliably this would improve sample quality. |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: Would be interesting to see how this works on real HW. If you just translate pulse length to sample value, then any imperfections in detecting the length will translate to some noise which is probably unproblematic.
This encoding is basically frequency modulation (as in FM radio).
If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.
My C2N is dead. If someone can try on real hardware and let me know that would be great. At the moment I'm using Vice with tape wobble at 10, which still sounds OK. |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: Quoting tlr: I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.
I wouldn't. :)
Gut feeling says there's some kind of optimum to be achieved by having a special switch-token pulse length to flip between the two (apparently concurrent) goals, though.
I might try delta pulse lengths for +1 and -1, with "no change" being any longer pulse. That would allow consecutive unchanged samples to use one pulse of a long length instead of having to encode "0" often.
Then every 16/32/64 samples read the full 4 bit token to reset any delta read errors. |
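A rough Python sketch of how such a decoder could look. Everything concrete here is an assumption for illustration: the pulse lengths for +1, -1 and "hold", the 16-pulse resync interval, and the absolute mapping used on resync pulses.

```python
UP, DOWN, HOLD_MIN = 64, 80, 96  # hypothetical pulse lengths in cycles
ABS_BASE, ABS_STEP = 64, 16      # hypothetical mapping for full 4-bit pulses
RESYNC_EVERY = 16                # every 16th pulse carries an absolute level


def classify(cycles):
    """Sort a delta pulse into 'up', 'down' or 'hold' by nearest boundary."""
    if cycles >= (DOWN + HOLD_MIN) // 2:
        return "hold"  # any long pulse means "no change"
    if cycles >= (UP + DOWN) // 2:
        return "down"
    return "up"


def decode(pulses, start=8):
    """Return the volume after each pulse, resyncing on absolute pulses."""
    volume = start
    out = []
    for i, cycles in enumerate(pulses):
        if i % RESYNC_EVERY == 0:
            # Absolute pulse: discard any accumulated delta error.
            level = (cycles - ABS_BASE + ABS_STEP // 2) // ABS_STEP
            volume = max(0, min(15, level))
        else:
            kind = classify(cycles)
            if kind == "up":
                volume = min(15, volume + 1)
            elif kind == "down":
                volume = max(0, volume - 1)
        out.append(volume)
    return out
```

In this model a hold pulse keeps the same volume for however long the pulse lasts, so a run of unchanged samples costs a single pulse, and a misread delta can only persist until the next absolute pulse.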
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Improved sample rate. Same instructions and links as post #16 above...
Download this PRG: https://github.com/martinpiper/C64Public/raw/master/IRQTape/Tap..
Download this TAP file: https://github.com/martinpiper/C64Public/raw/master/IRQTape/vic..
You should see narrow colour bars indicating the volume this time. This time it's using "Tom's Diner". |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Quoting Krill: Quoting tlr: I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.
I wouldn't. :)
Gut feeling says there's some kind of optimum to be achieved by having a special switch-token pulse length to flip between the two (apparently concurrent) goals, though.
My assumption is based on the fact that the spectrum of typical songs falls off towards high frequencies, thus the changes of higher frequency would statistically have smaller steps.
This can of course be tested with some actual data.
Quoting Krill: And on another thought... differential encoding alone would have to rely on lossless/error-free encoding, lest it degrade quickly (without some intermediate absolute literals).
Yes, errors would be handled badly. |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Quote: I might try delta pulse lengths for +1 and -1, with "no change" being any longer pulse. That would allow consecutive unchanged samples to use one pulse of a long length instead of having to encode "0" often.
Then every 16/32/64 samples read the full 4 bit token to reset any delta read errors.
My idea was to not only encode +1 and -1, but 0, +1, -1, +2, -2 and perhaps more as single pulses. The reasoning behind this is that the difference between the pulses need not be a multiple of the shortest one. |
| |
Hoogo
Registered: Jun 2002 Posts: 103 |
Quote: Quoting Martin Piper: Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d.
Main thread running over a thousand NOPs, interrupt kicks in, does stuff, and jumps back to start of NOP desert.
Did any tape loaders use this to improve accuracy (= increase bit rate)? =)
At least to test jitter. Iirc, 95% Jitter is surprisingly small, within a 6 cycle window. But the other 5% are far off, so you have to increase your pulse lengths. And if you want to write with a PC, you will stick to 44100 anyways.
But why is there so much noise in this sample?? |
| |
Oswald
Registered: Apr 2002 Posts: 5074 |
Quote: Improved sample rate. Same instructions and links as post #16 above...
Download this PRG: https://github.com/martinpiper/C64Public/raw/master/IRQTape/Tap..
Download this TAP file: https://github.com/martinpiper/C64Public/raw/master/IRQTape/vic..
You should see narrow colour bars indicating the volume this time. This time it's using "Tom's Diner".
that sounds very good |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: At least to test jitter. Iirc, 95% Jitter is surprisingly small, within a 6 cycle window. But the other 5% are far off, so you have to increase your pulse lengths. And if you want to write with a PC, you will stick to 44100 anyways.
But why is there so much noise in this sample??
Downloaded from YouTube. Volume increased to increase the range of volume bits used. Also the quantization is accentuating rumble and hiss. |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Quote: Downloaded from YouTube. Volume increased to increase the range of volume bits used. Also the quantization is accentuating rumble and hiss.
With this variable sample rate, how do you do your downsampling/alias filtering of the source material? |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: With this variable sample rate, how do you do your downsampling/alias filtering of the source material?
I don't, which obviously doesn't help at all. :)
New build using IRQ (and a NOP fill) is available at the same links as post #16. The sound quality is improved because there is less jitter in the pulse detection. I was also able to add timer underflow detection which clamps the volume appropriately.
Anyone recognise the music? :) |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Hmm, so at the moment the 16 volume values are separated by 16 cycles, which is 256 cycles in total. A base value of 56 cycles is added. This means an early pulse or longer pulse (due to tape speed or wobble) has the high timer bit set.
The timer is set up to count cycles to measure the pulse length... The read of the high timer value and the associated branch adds precious cycles to the IRQ, which reduces the overall maximum volume value rate from the tape.
However... what if the timer was set up to count not cycles but "cycles divided by 2" instead? This would allow the middle 128 value range of the timer to represent the desired pulses and anything outside that range to represent undesired values and clamp the volume appropriately.
This would mean the timer is doing the underflow check, not the CPU, which would save precious cycles in the IRQ and allow a higher sample rate. |
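A Python model of just the arithmetic, assuming the 56-cycle base and 16-cycle spacing mentioned above, and assuming early pulses clamp to 0 and late ones to 15. It does not model the CIA itself; the function name is made up.

```python
BASE_CYCLES = 56  # base value from the post
STEP_CYCLES = 16  # spacing between volume levels from the post


def volume_from_half_cycle_count(cycles):
    """Map a pulse length to a volume using a count of cycles divided by 2."""
    ticks = (cycles - BASE_CYCLES) // 2  # counter advance past the base
    if ticks < 0:
        return 0       # pulse arrived early: clamp low (assumed)
    if ticks > 127:
        return 15      # pulse arrived late: clamp high (assumed)
    return ticks >> 3  # 8 ticks (16 cycles) per volume level
```

Halving the count squeezes the 256-cycle valid window into 128 ticks, so "in range" becomes a 7-bit value and anything with a higher bit set is out of range, which is what would let the hardware do the range check.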
| |
Hoogo
Registered: Jun 2002 Posts: 103 |
You can't have less than 2 cycles of jitter, so a long column of INX can do the counting instead. Just add a Jmp * at the end. |
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Quote: You can't have less than 2 cycles of jitter, so a long column of INX can do the counting instead. Just add a Jmp * at the end.
very clever, then you can just interweave NOPs and INXs to achieve the counting rate you desire. Also, if you have decided on an acceptable range for each transition, the INX/NOP chain need not be repeated, you can add a branch where there is supposed to be no transition. |
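To make the counting rate concrete, a small Python model of the X register after a given number of cycles in such a chain. It assumes the repeating pattern is one INX followed by a fixed number of NOPs (2 cycles each on the 6510) and ignores interrupt latency; the function name is invented.

```python
INX_CYCLES = 2
NOP_CYCLES = 2


def x_after(elapsed_cycles, nops_per_inx=0):
    """X register value after running an INX/NOP chain for elapsed_cycles."""
    period = INX_CYCLES + nops_per_inx * NOP_CYCLES
    full_periods, remainder = divmod(elapsed_cycles, period)
    # The INX of a partially run period counts once it has completed.
    completed = full_periods + (1 if remainder >= INX_CYCLES else 0)
    return completed & 0xFF  # X is 8 bits wide and wraps
```

A pure INX column counts elapsed cycles divided by 2, and one NOP per INX makes it count cycles divided by 4, so `x_after(100)` is 50 and `x_after(100, 1)` is 25.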
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Quote: You can't have less than 2 cycles of jitter, so a long column of INX can do the counting instead. Just add a Jmp * at the end.
Could do. But I actually want to use the mainline for other code. :) Perhaps even enable the screen with some scroller... |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Interesting. If you're using vice, in the monitor do "> d011 1b" and the sample quality with the screen on can be heard.
Not too bad. Time for a scroller? :) |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Links updated with 6.3KHz sample rate now.
Timer A and B are used, which means the CIA can do the underflow and overflow check for free. So the IRQ sample play is much simpler. |
| |
ws
Registered: Apr 2012 Posts: 248 |
Nice Megablast! I'm really curious where this will lead, quality-wise.
I had a thought, that surely is sort of defeating the actual streaming idea, but i'd like to ask it anyways:
If the recording was recorded slowed down, the loading of the sample buffered and then replayed faster, could that improve overall quality in any significant way?
yeah i know, "why not load actual sampled data, then", but i just had to ask. i was thinking maybe this way perhaps one and a half RAM fillings could be played without having to pre-buffer again, like a Shortplay Mode.
[@ scroller: maybe a screen-off-with sprites scroller would be keeping the quality at max?] |
| |
Martin Piper
Registered: Nov 2007 Posts: 698 |
Instead of using 16 cycles as a multiplier for each volume level, I tried 8 cycles. The sound quality is better due to being able to play back at a higher frequency. However it only sounds better in Vice with zero tape wobble. The default Vice tape wobble of 10 cycles will cause an overall drop in audio quality.
This is a side effect of using frequency modulation to encode the volume on a device prone to frequency shifts. :)
A multiplier of 16 cycles, whilst being easy to encode in the TAP file, also seems to be a sweet spot for "accurate enough" reading of sample volumes at a high enough speed for quality purposes.
Turning on the screen causes high frequency noise, due to the bad lines introducing a regular spike in the volume read from the tape.
WangFM Megablast |
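A back-of-envelope Python check of why the 8-cycle spacing suffers more. It assumes the wobble setting of 10 behaves like an offset of anywhere from -10 to +10 cycles on a pulse, which is a guess about the emulator rather than something taken from its documentation, and that the decoder rounds to the nearest level.

```python
def misread_count(step, wobble=10):
    """Count the offsets in -wobble..+wobble that decode to the wrong level."""
    half = step // 2
    return sum(
        1
        for offset in range(-wobble, wobble + 1)
        if (offset + half) // step != 0  # nonzero means a neighbouring level
    )


misreads_8 = misread_count(8)    # spacing of 8 cycles per level
misreads_16 = misread_count(16)  # spacing of 16 cycles per level
```

Under these assumptions 13 of the 21 possible offsets land on the wrong level with 8-cycle spacing, against 5 of 21 with 16-cycle spacing.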
| |
MagerValp
Registered: Dec 2001 Posts: 1065 |
Starting to sound pretty nice!
Would be interesting to hear if delta coding with three or five pulses sounds better. Both would at least double the frequency. |
| |
Krill
Registered: Apr 2002 Posts: 2940 |
Pretty sure that proper resampling of the input wave to suit the encoding would make for a tremendous improvement in quality, much more than tinkering with C-64 side code details. |
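One way to at least follow the variable rate when converting the source is to let the encoder keep its own clock: pick the source sample at the current time, emit the pulse for it, then advance the clock by that pulse's length. A minimal Python sketch, assuming a 56-cycle base, 16-cycle spacing and source samples normalised to 0.0..1.0; it uses nearest-sample lookup with no low-pass filtering, so it is a starting point rather than proper resampling.

```python
PAL_CLOCK_HZ = 985_248  # PAL C64 CPU clock, cycles per second
BASE_CYCLES = 56        # assumed base pulse length
STEP_CYCLES = 16        # assumed spacing between volume levels


def encode(samples, source_rate_hz):
    """Turn samples (floats 0.0..1.0) into a list of pulse lengths in cycles."""
    pulses = []
    t_cycles = 0
    duration_cycles = len(samples) * PAL_CLOCK_HZ / source_rate_hz
    while t_cycles < duration_cycles:
        # Source sample nearest to the current playback time.
        index = min(len(samples) - 1, int(t_cycles * source_rate_hz / PAL_CLOCK_HZ))
        level = max(0, min(15, int(samples[index] * 16)))
        pulse = BASE_CYCLES + level * STEP_CYCLES
        pulses.append(pulse)
        t_cycles += pulse  # loud samples take longer, so the rate drops
    return pulses
```

A filtered version would average the source over each pulse's duration instead of taking a single point, which needs a second pass because the duration depends on the value being encoded.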
| |
tlr
Registered: Sep 2003 Posts: 1762 |
Quote: I think ”Pulse” by pixel for the vic-20 uses this technique for music playback somewhere in the loading sequence.
for the record, this can be found here: https://eyes-uk.itch.io/pulse
srcs:
- https://github.com/SvenMichaelKlose/nipkow
- https://github.com/SvenMichaelKlose/pulse |
| |
soci
Registered: Sep 2003 Posts: 478 |
I'm wondering, has anyone tried this with an actual Datasette yet? |