[CSDb] - User Forums - Sample streaming from tape

2023-10-31 04:25

Martin Piper

Registered: Nov 2007
Posts: 698

Sample streaming from tape

Has anyone done 4-bit sample streaming from tape before? I'm thinking the time between pulses can be translated directly into 4-bit volume values as they are read from tape.
Practically useless, except for pure geekiness. But I wanted to know if it's been done before I do it myself.

2023-10-31 13:30

chatGPZ

Registered: Dec 2001
Posts: 11290

I think I have seen something like that back in the day. No idea if that worked with specially prepared data though.

Why limit it to 4 bit though?

2023-10-31 13:52

Repose

Registered: Oct 2010
Posts: 225

Interesting idea. I've done something related; just play a tape and listen to the read bit. You can hear 1-bit music this way.

2023-10-31 13:55

chatGPZ

Registered: Dec 2001
Posts: 11290

THAT was all over the mags back in the days :)

2023-10-31 14:39

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: Interesting idea. I've done something related; just play a tape and listen to the read bit. You can hear 1-bit music this way.

1 bit is different. :)

I'm thinking 4-bit to use the full SID volume range.

Instead of trying to binary encode the 4 bits used for the SID volume, I'm thinking use pulse length to fill the number range from 0 to 15 (4 bits) volume.
This means creating a tape with 16 different pulse lengths basically.

2023-10-31 15:03

tlr

Registered: Sep 2003
Posts: 1762

I think "Pulse" by pixel for the VIC-20 uses this technique for music playback somewhere in the loading sequence.

2023-10-31 16:20

Repose

Registered: Oct 2010
Posts: 225

So the question is, what's the max frequency of pulses you can read? Based on the music I heard, it seems at 1kHz. Your average pulse length would be 8, so your sample rate would only be 128Hz.
Or look at it this way, to get phone quality sample rate (8kHz), you'd need pulses at 64KHz. I'm pretty sure that's not possible.
In fact, your encoding method is not efficient. You face the same tradeoffs as disk encoding. It's only that pulse length can give the wrong value with a slightly different speed, but that's hidden due to the nature of audio.
If you write 1/0 directly with occasional sync, that's most efficient, then the factor is only 4 instead of 8.
This is still a very low sample rate. If you get fancy with ADPCM, you could do at best 2x ratio.
This concept could only play bass/drums at best.
The best system I could imagine is using some DSP to find a combo of SID waveforms as a basis function to recreate the psychoacoustically most relevant frequencies, then the sample rate needs to be more related to how quickly the sound changes, and you could get something like the "mp3" demo.
But for normal coders, basically just d'n'b.

2023-10-31 16:24

Repose

Registered: Oct 2010
Posts: 225

On 2nd thought, I estimated pulse speed in a very poor way. Looking at typical tape response, a cheap tape deck would do 10kHz, so somewhere below that. That's still pretty poor sampling rate.

2023-10-31 17:21

MagerValp

Registered: Dec 2001
Posts: 1065

Use Macbeth's delta encoding instead (0/+1/-1), it only requires three different pulses.

2023-10-31 19:17

Frantic

Registered: Mar 2003
Posts: 1641

Vaguely related. Video streaming from tape:

Deep Throat

2023-11-01 14:39

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: So the question is, what's the max frequency of pulses you can read? Based on the music I heard, it seems at 1kHz. Your average pulse length would be 8, so your sample rate would only be 128Hz.
Or look at it this way, to get phone quality sample rate (8kHz), you'd need pulses at 64KHz. I'm pretty sure that's not possible.
In fact, your encoding method is not efficient. You face the same tradeoffs as disk encoding. It's only that pulse length can give the wrong value with a slightly different speed, but that's hidden due to the nature of audio.
If you write 1/0 directly with occasional sync, that's most efficient, then the factor is only 4 instead of 8.
This is still a very low sample rate. If you get fancy with ADPCM, you could do at best 2x ratio.
This concept could only play bass/drums at best.
The best system I could imagine is using some DSP to find a combo of SID waveforms as a basis function to recreate the psychoacoustically most relevant frequencies, then the sample rate needs to be more related to how quickly the sound changes, and you could get something like the "mp3" demo.
But for normal coders, basically just d'n'b.

Eh? On PAL, for 8KHz I'd need a pulse every ~123 cycles, right? Assuming one tape pulse encodes one SID volume store of 4 bits.

The TAP file format has a resolution of 8 cycles, so each of the 16 SID volume levels would be, in the TAP file, a pulse length of X to X+15. Where X is a lower boundary constant to allow enough cycles to trigger on tape pulse, read the pulse length with a CIA timer, reset the timer, store to SID, loop...

So in reality the tape pulses in cycle terms could be 64 to 192 cycles to map directly to volume levels 0 to 15. Even an upper limit of 192 cycles would, in PAL, give 5131 Hz, right?
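
The arithmetic in this post can be checked with a few lines of Python. This is only a sketch: it assumes the PAL C64 clock of 985248 Hz and the 8-cycle TAP granularity mentioned above, and the function names and the `base` parameter (standing in for X) are made up for illustration.

```python
PAL_CLOCK_HZ = 985_248  # PAL C64 CPU clock in Hz (assumed for all figures here)
TAP_UNIT_CYCLES = 8     # one TAP pulse byte counts 8 CPU cycles


def cycles_per_pulse(rate_hz, clock_hz=PAL_CLOCK_HZ):
    """CPU cycles between pulses for a given pulse (sample) rate."""
    return clock_hz / rate_hz


def rate_for_pulse(cycles, clock_hz=PAL_CLOCK_HZ):
    """Pulse (sample) rate in Hz if every pulse lasts `cycles` CPU cycles."""
    return clock_hz / cycles


def volume_to_tap_byte(volume, base=8):
    """TAP pulse byte for a 4-bit volume; `base` is the hypothetical X."""
    if not 0 <= volume <= 15:
        raise ValueError("volume must be 0..15")
    return base + volume
```

With `base=8` the shortest pulse is 64 cycles and the longest is 184 cycles, so the 192-cycle figure is a slightly pessimistic bound. `rate_for_pulse(192)` gives about 5131 Hz, and `cycles_per_pulse(8000)` gives about 123 cycles, matching the numbers in the post.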

2023-11-01 14:43

chatGPZ

Registered: Dec 2001
Posts: 11290

You can also abuse the "long gap" to make gaps with 1 cycle granularity... i'd certainly try that, use 8bit, and the mahoney playback.

2023-11-01 15:15

Martin Piper

Registered: Nov 2007
Posts: 698

I might also try the Mahoney sample method. If I really want to remove the rest of my hair.

2023-11-01 21:41

Bansai

Registered: Feb 2023
Posts: 40

Quoting MagerValp

Use Macbeth's delta encoding instead (0/+1/-1), it only requires three different pulses.

I take it that saturating arithmetic on the integrator equation would self-correct for tape dropouts where pulses are lost or corrupted? That is, at some point the value would incorrectly want to +1 the $d418 volume beyond 15 or -1 below 0, and saturating math would eventually clamp the value stream right back to where it should be given how overdriven 4-bit samples are anyway.
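
A small sketch of the saturating integrator described here, in Python rather than 6502 code. The function name and the starting volume are assumptions for illustration; nothing here is taken from Macbeth's actual player.

```python
def decode_deltas(deltas, start=8):
    """Integrate -1/0/+1 steps into 4-bit volumes, clamping to 0..15."""
    volume = start
    out = []
    for delta in deltas:
        # Saturating add: never leave the 4-bit range of the $d418 volume.
        volume = max(0, min(15, volume + delta))
        out.append(volume)
    return out


# A stream that drives the volume into the upper rail and back down.
stream = [+1] * 10 + [-1] * 3

good = decode_deltas(stream, start=8)  # decoder in the correct state
bad = decode_deltas(stream, start=6)   # decoder that earlier lost two +1 pulses
```

Both decoders end at volume 12, because the clamp at 15 absorbs the earlier error. The self-correction only happens when the signal actually hits 0 or 15, so it relies on the samples being overdriven as suggested above.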

2023-11-06 18:41

Hoogo

Registered: Jun 2002
Posts: 103

Staying within the restrictions of the TAP format is reasonable. But better check Slushload V3 to see minimum reliable pulse lengths.

16 pulse lengths for 4 bit are nice to handle defects on tape. But you will get a higher sample rate if you use 4 pulse lengths and write 2*2 bit.

Also your bit rate will be variable.

And finally: An analog magnetic tape may behave more weird than a TAP file suggests.
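
The trade-off between 16 pulse lengths (one pulse per sample) and 4 pulse lengths (two pulses per sample) can be estimated roughly. The sketch below assumes every symbol is equally likely and that pulse lengths are `base + step * symbol` cycles; both are simplifying assumptions, and the example base and step values are hypothetical.

```python
def avg_cycles_per_sample(levels, pulses_per_sample, base, step):
    """Mean cycles needed to deliver one 4-bit sample.

    Assumes symbols 0..levels-1 are uniformly distributed and that a
    symbol s is stored as a pulse of base + step * s cycles.
    """
    mean_pulse = base + step * (levels - 1) / 2
    return pulses_per_sample * mean_pulse


# Base 56 cycles, 16 cycles between symbols:
one_pulse_wide = avg_cycles_per_sample(16, 1, base=56, step=16)
two_pulses_wide = avg_cycles_per_sample(4, 2, base=56, step=16)

# Base 64 cycles, 8 cycles between symbols:
one_pulse_tight = avg_cycles_per_sample(16, 1, base=64, step=8)
two_pulses_tight = avg_cycles_per_sample(4, 2, base=64, step=8)
```

Under these assumptions the 2*2 bit scheme wins with the wider spacing (160 versus 176 cycles per sample) but loses with the tighter one (152 versus 124), so whether it gives a higher sample rate depends on how large the minimum pulse length is relative to the spacing.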

2023-11-10 15:00

Martin Piper

Registered: Nov 2007
Posts: 698

Proof of concept ready. :)
Download this: https://github.com/martinpiper/C64Public/raw/master/IRQTape/Tap..

Run it in PAL Vice. It will wait on a black screen for a TAP file to be attached.

Download this TAP file: https://github.com/martinpiper/C64Public/raw/master/IRQTape/vic..
Use "File->Attach tape image..." in Vice (do not use auto-start!) and press play on the tape using "File->Datasette control".

You should see narrow colour bars and hear a sample from Space Ace.

If you want a command line for Vice then this should work: x64sc.exe -remotemonitor -dstapewobble 0 -dsspeedtuning 0 -1 vice.tap TapeStreamSamples.prg

2023-11-10 15:04

Martin Piper

Registered: Nov 2007
Posts: 698

The sample rate is variable, since the 4-bit chunks from the tape for the volume are variable. However the average sample rate is 5.1KHz since the average pulse length is 192 cycles.

2023-11-11 02:39

Martin Piper

Registered: Nov 2007
Posts: 698

The code changes for this example. https://github.com/martinpiper/C64Public/compare/e7dc2ef85d1c72..

Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d.

2023-11-11 12:03

tlr

Registered: Sep 2003
Posts: 1762

Would be interesting to see how this works on real HW. If you just translate pulse length to sample value, then any imperfections in detecting the length will translate to some noise which is probably unproblematic.

This encoding is basically frequency modulation (as in FM radio).

If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.

2023-11-11 12:32

Krill

Registered: Apr 2002
Posts: 2940

Quoting Martin Piper

Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d.

Main thread running over a thousand NOPs, interrupt kicks in, does stuff, and jumps back to start of NOP desert.

Did any tape loaders use this to improve accuracy (= increase bit rate)? =)

2023-11-11 12:39

Krill

Registered: Apr 2002
Posts: 2940

Quoting tlr

If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.

Hmm... wouldn't the larger diffs imply a steeper gradient on the time-domain wave, and thus should be encoded with the shorter pulses?

2023-11-11 12:57

tlr

Registered: Sep 2003
Posts: 1762

Quote: Quoting tlr
If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.
Hmm... wouldn't the larger diffs imply a steeper gradient on the time-domain wave, and thus should be encoded with the shorter pulses?

I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.

2023-11-11 13:11

Krill

Registered: Apr 2002
Posts: 2940

Quoting tlr

I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.

I wouldn't. :)

Gut feeling says there's some kind of optimum to be achieved by having a special switch-token pulse length to flip between the two (apparently concurrent) goals, though.

2023-11-11 13:29

Krill

Registered: Apr 2002
Posts: 2940

And on another thought... differential encoding alone would have to rely on lossless/error-free encoding, lest it degrade quickly (without some intermediate absolute literals).

This sort of application, however, has no need to reliably tell apart symbols, unlike tape loaders.

So, it's probably a good idea to have some encoding that forgives the random blooper (and allows for higher bitrates/tighter symbol packing at the cost of some noise). =)

2023-11-11 14:06

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: Quoting Martin Piper
Accuracy would be improved by using an IRQ instead of polling bit 4 on $dc0d.
Main thread running over a thousand NOPs, interrupt kicks in, does stuff, and jumps back to start of NOP desert.

Did any tape loaders use this to improve accuracy (= increase bit rate)? =)

Yeah, it does reduce jitter, which allows the pulse lengths to be shorter. At the moment I'm using 16 cycles, if it could be brought down to 8 cycles reliably this would improve sample quality.

2023-11-11 14:08

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: Would be interesting to see how this works on real HW. If you just translate pulse length to sample value, then any imperfections in detecting the length will translate to some noise which is probably unproblematic.

This encoding is basically frequency modulation (as in FM radio).

If you accept relying on precise pulse length detection, then you could employ a differential encoding instead, e.g 0, +1, -1, +2, -2... This way you could choose to have a higher sample rate for small changes, improving high frequency content of the sound.

My C2N is dead. If someone can try on real hardware and let me know that would be great. At the moment I'm using Vice with tape wobble at 10, which still sounds OK.

2023-11-11 14:14

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: Quoting tlr
I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.
I wouldn't. :)

Gut feeling says there's some kind of optimum to be achieved by having a special switch-token pulse length to flip between the two (apparently concurrent) goals, though.

I might try delta pulse lengths for +1 and -1, with "no change" encoded as any longer pulse. That would allow consecutive unchanged samples to use just one long pulse instead of having to encode "0" often.
Then every 16/32/64 samples, read the full 4-bit token to reset any delta read errors.
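
A token-level sketch of this idea in Python. The token names (`abs`, `up`, `down`, `hold`) and the `resync` parameter are hypothetical, and the mapping from tokens to actual pulse lengths is left out.

```python
def encode(samples, resync=16):
    """Turn 4-bit samples into abs/up/down/hold tokens.

    An absolute token is emitted every `resync` samples; in between, the
    stream may only step by +1/-1, and runs of unchanged samples are
    merged into a single hold token.
    """
    tokens = []
    current = None
    since_sync = 0
    for target in samples:
        if current is None or since_sync >= resync:
            current = target
            tokens.append(("abs", current))
            since_sync = 0
        elif target > current:
            current += 1
            tokens.append(("up",))
        elif target < current:
            current -= 1
            tokens.append(("down",))
        elif tokens[-1][0] == "hold":
            tokens[-1] = ("hold", tokens[-1][1] + 1)
        else:
            tokens.append(("hold", 1))
        since_sync += 1
    return tokens


def decode(tokens):
    """Rebuild the volume stream, clamping deltas to 0..15."""
    out = []
    volume = 0
    for token in tokens:
        kind = token[0]
        if kind == "hold":
            out.extend([volume] * token[1])
            continue
        if kind == "abs":
            volume = token[1]
        elif kind == "up":
            volume = min(15, volume + 1)
        else:
            volume = max(0, volume - 1)
        out.append(volume)
    return out
```

For example, `encode([5, 6, 7, 7, 7, 7, 6], resync=4)` emits an absolute token for the first and fifth samples, so a misread `up` or `down` early in the stream stops mattering once the next absolute token arrives.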

2023-11-11 14:47

Martin Piper

Registered: Nov 2007
Posts: 698

Improved sample rate. Same instructions and links as post #16 above...

Download this PRG: https://github.com/martinpiper/C64Public/raw/master/IRQTape/Tap..

Download this TAP file: https://github.com/martinpiper/C64Public/raw/master/IRQTape/vic..

You should see narrow colour bars indicating the volume this time. This time it's using "Tom's Diner".

2023-11-11 16:59

tlr

Registered: Sep 2003
Posts: 1762

Quoting Krill

Quoting tlr
I would think that higher frequency content has lower amplitude in general, but feel free to experiment for optimum results.
I wouldn't. :)

Gut feeling says there's some kind of optimum to be achieved by having a special switch-token pulse length to flip between the two (apparently concurrent) goals, though.

My assumption is based on the fact that the spectrum of typical songs falls off towards high frequencies, thus the changes of higher frequency would statistically have smaller steps.

This can of course be tested with some actual data.

Quoting Krill

And on another thought... differential encoding alone would have to rely on lossless/error-free encoding, lest it degrade quickly (without some intermediate absolute literals).

Yes, errors would be handled badly.
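
One way to test this with actual data is to count how often each delta occurs in a 4-bit sample stream and give the most frequent deltas the shortest pulses. The sketch below does only that; the function names and the default `base` and `step` cycle values are assumptions for illustration.

```python
from collections import Counter


def delta_histogram(samples):
    """Count the differences between consecutive samples."""
    return Counter(b - a for a, b in zip(samples, samples[1:]))


def assign_pulses(hist, base=56, step=16):
    """Give the most frequent deltas the shortest pulse lengths (in cycles).

    Ties are broken by smaller magnitude first, then by value.
    """
    ranked = sorted(hist, key=lambda d: (-hist[d], abs(d), d))
    return {delta: base + i * step for i, delta in enumerate(ranked)}


hist = delta_histogram([8, 8, 9, 9, 8, 8, 10])
table = assign_pulses(hist)
```

Running this over real material would show whether small deltas really dominate, which is the point being debated here.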

2023-11-11 17:01

tlr

Registered: Sep 2003
Posts: 1762

Quote: I might try, delta pulse lengths for +1 and -1, and no change could be any longer pulse, which would allow consecutive unchanged samples to just use one pulse of a long length instead of having to encode "0" often.
Then every 16/32/64 samples read the full 4 bit token to reset any delta read errors.

My idea was to not only encode +1 and -1, but 0, +1, -1, +2, -2 and perhaps more as single pulses. The reasoning behind this is that the difference between the pulses need not be a multiple of the shortest one.

2023-11-11 18:54

Hoogo

Registered: Jun 2002
Posts: 103

At least to test jitter. IIRC, 95% of the jitter is surprisingly small, within a 6-cycle window. But the other 5% is far off, so you have to increase your pulse lengths. And if you want to write with a PC, you will stick to 44100 Hz anyway.

But why is there so much noise in this sample??

2023-11-11 21:49

Oswald

Registered: Apr 2002
Posts: 5074

Quote: Improved sample rate. Same instructions and links as post #16 above...

Download this PRG: https://github.com/martinpiper/C64Public/raw/master/IRQTape/Tap..

Download this TAP file: https://github.com/martinpiper/C64Public/raw/master/IRQTape/vic..

You should see narrow colour bars indicating the volume this time. This time it's using "Tom's Diner".

that sounds very good

2023-11-12 03:29

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: At least to test jitter. IIRC, 95% of the jitter is surprisingly small, within a 6-cycle window. But the other 5% is far off, so you have to increase your pulse lengths. And if you want to write with a PC, you will stick to 44100 Hz anyway.

But why is there so much noise in this sample??

Downloaded from YouTube. Volume increased to increase the range of volume bits used. Also the quantization is accentuating rumble and hiss.

2023-11-12 09:37

tlr

Registered: Sep 2003
Posts: 1762

Quote: Downloaded from YouTube. Volume increased to increase the range of volume bits used. Also the quantization is accentuating rumble and hiss.

With this variable sample rate, how do you do your downsampling/alias filtering of the source material?

2023-11-12 10:32

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: With this variable sample rate, how do you do your downsampling/alias filtering of the source material?

I don't, which obviously doesn't help at all. :)

New build using IRQ (and a NOP fill) is available at the same links as post #16. The sound quality is improved because there is less jitter in the pulse detection. I was also able to add timer underflow detection which clamps the volume appropriately.

Anyone recognise the music? :)

2023-11-12 10:51

Martin Piper

Registered: Nov 2007
Posts: 698

Hmm, so at the moment the 16 volume values are separated by 16 cycles, which is 256 cycles in total. A base value of 56 cycles is added. This means an early pulse or a longer pulse (due to tape speed or wobble) has the high timer bit set.

The timer is set up to count cycles to measure the pulse length... The read of the high timer value and the associated branch adds precious cycles to the IRQ, which reduces the overall maximum volume value rate from the tape.

However... what if the one timer was set up to count not cycles but "cycles divided by 2" instead? This would allow the middle 128 value range of the timer to represent the desired pulses and anything outside that range to represent undesired values and clamp the volume appropriately.

This would mean the timer is doing the underflow check, not the CPU, which would save precious cycles in the IRQ and allow a higher sample rate.
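
For reference, the mapping the CPU currently has to perform can be modelled in Python as below. This is a sketch of the logic only, using the 56-cycle base and 16-cycle spacing from this post; rounding to the nearest level is an assumption, and the real IRQ code may decide boundaries differently.

```python
def pulse_to_volume(cycles, base=56, step=16):
    """Map a measured pulse length in cycles to a 4-bit volume.

    Rounds to the nearest level (an assumption) and clamps pulses that
    are too short or too long to 0 and 15 respectively.
    """
    level = (cycles - base + step // 2) // step
    return max(0, min(15, level))
```

A pulse of 173 cycles decodes to volume 7 and one of 177 cycles to volume 8, while anything far too short (20 cycles) or far too long (400 cycles) is clamped. The clamp is the part that the timer trick is meant to take off the CPU.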

2023-11-12 11:18

Hoogo

Registered: Jun 2002
Posts: 103

You can't have less than 2 cycles of jitter, so a long column of INX can do the counting instead. Just add a Jmp * at the end.

2023-11-12 11:27

tlr

Registered: Sep 2003
Posts: 1762

Quote: You can't have less than 2 cycles of jitter, so a long column of INX can do the counting instead. Just add a Jmp * at the end.

Very clever. Then you can just interleave NOPs and INXs to achieve the counting rate you desire. Also, if you have decided on an acceptable range for each transition, the INX/NOP chain need not be repeated; you can add a branch where there is supposed to be no transition.

2023-11-12 12:47

Martin Piper

Registered: Nov 2007
Posts: 698

Quote: You can't have less than 2 cycles of jitter, so a long column of INX can do the counting instead. Just add a Jmp * at the end.

Could do. But I actually want to use the mainline for other code. :) Perhaps even enable the screen with some scroller...

2023-11-12 12:54

Martin Piper

Registered: Nov 2007
Posts: 698

Interesting. If you're using vice, in the monitor do "> d011 1b" and the sample quality with the screen on can be heard.
Not too bad. Time for a scroller? :)

2023-11-12 15:41

Martin Piper

Registered: Nov 2007
Posts: 698

Links updated with 6.3KHz sample rate now.
Timer A and B are used, which means the CIA can do the underflow and overflow check for free. So the IRQ sample play is much simpler.

2023-11-12 18:27

ws

Registered: Apr 2012
Posts: 248

Nice Megablast! I'm really curious, where this will lead to, quality wise.

I had a thought, that surely is sort of defeating the actual streaming idea, but i'd like to ask it anyways:

If the recording was recorded slowed down, the loading of the sample buffered and then replayed faster, could that improve overall quality in any significant way?
yeah i know, "why not load actual sampled data, then", but i just had to ask. i was thinking maybe this way perhaps one and a half RAM fillings could be played without having to pre-buffer again, like a Shortplay Mode.

[@ scroller: maybe a screen-off-with sprites scroller would be keeping the quality at max?]

2023-11-13 02:39

Martin Piper

Registered: Nov 2007
Posts: 698

Instead of using 16 cycles as a multiplier for each volume level, I tried 8 cycles. The sound quality is better due to being able to play back at a higher frequency. However it only sounds better in Vice with zero tape wobble. The default Vice tape wobble of 10 cycles will cause an overall drop in audio quality.

This is a side effect of using frequency modulation to encode the volume on a device prone to frequency shifts. :)

A multiplier of 16 cycles, whilst being easy to encode in the TAP file, also seems to be a sweet spot for "accurate enough" reading of sample volumes at a high enough speed for quality purposes.

Turning on the screen causes high frequency noise, due to the bad lines introducing a regular spike in the volume read from the tape.

WangFM Megablast
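
The wobble observation can be illustrated with a very crude model: assume each pulse is read with an error drawn uniformly from -wobble to +wobble cycles, and that the decoder rounds to the nearest level. This is not how VICE models tape wobble; it is only an assumption to show why the narrower spacing suffers more.

```python
def misread_fraction(step, wobble):
    """Fraction of timing offsets in -wobble..+wobble that decode to the
    wrong level, for levels spaced `step` cycles apart and
    round-to-nearest decoding (both assumptions)."""
    offsets = range(-wobble, wobble + 1)
    wrong = sum(1 for w in offsets if (w + step // 2) // step != 0)
    return wrong / len(offsets)


wide = misread_fraction(step=16, wobble=10)
narrow = misread_fraction(step=8, wobble=10)
```

In this toy model about 24% of the offsets land on the wrong level with 16-cycle spacing, against about 62% with 8-cycle spacing, and both drop to zero with no wobble.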

2023-11-13 07:56

MagerValp

Registered: Dec 2001
Posts: 1065

Starting to sound pretty nice!

Would be interesting to hear if delta coding with three or five pulses sounds better. Both would at least double the frequency.

2023-11-13 08:33

Krill

Registered: Apr 2002
Posts: 2940

Pretty sure that proper resampling of the input wave to suit the encoding would make for a tremendous improvement in quality, much more than tinkering with C-64 side code details.
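
A sketch of what resampling to suit the encoding could look like on the PC side: because each pulse length depends on the volume it carries, the encoder can track the real playback time and read the source at that position instead of at fixed intervals. The names, the default base and step, and the PAL clock are assumptions. The sketch uses plain linear interpolation, does no alias filtering, and treats each volume as lasting for its own pulse, which ignores the one-pulse delay of a real player.

```python
PAL_CLOCK_HZ = 985_248  # PAL C64 CPU clock in Hz (assumption)


def encode_variable_rate(source, source_rate, base=56, step=16):
    """Pick 4-bit volumes from `source` (floats in 0.0..1.0, sampled at
    `source_rate` Hz) at the times the tape pulses will actually end."""
    t = 0.0  # playback time in seconds
    volumes = []
    while True:
        pos = t * source_rate
        i = int(pos)
        if i + 1 >= len(source):
            break
        frac = pos - i
        # Linear interpolation between the two neighbouring source samples.
        value = source[i] * (1 - frac) + source[i + 1] * frac
        volume = max(0, min(15, round(value * 15)))
        volumes.append(volume)
        # The pulse carrying this volume lasts base + step * volume cycles.
        t += (base + step * volume) / PAL_CLOCK_HZ
    return volumes
```

One second of silence at 44100 Hz yields roughly 17600 volumes, while one second at full scale yields only about 3300, which shows how strongly the effective sample rate depends on the signal.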

2023-11-13 08:51

tlr

Registered: Sep 2003
Posts: 1762

Quote: I think "Pulse" by pixel for the VIC-20 uses this technique for music playback somewhere in the loading sequence.

for the record, this can be found here: https://eyes-uk.itch.io/pulse

srcs:
- https://github.com/SvenMichaelKlose/nipkow
- https://github.com/SvenMichaelKlose/pulse

2023-11-13 17:52

soci

Registered: Sep 2003
Posts: 478

I'm wondering, has anyone tried this with an actual Datasette yet?

All times are CET.
