CSDb User Forums


2018-12-08 20:16
ChristopherJam

Registered: Aug 2004
Posts: 1409
Tape loaders using more than two pulse widths for data

I’ve been thinking a bit about a novel tape encoding, but it would rely on (among other things) having more than two pulse widths. So far as I can see, none of the old turbo loaders used a long pulse for anything beyond end of byte/end of block - the bitstream itself was just a sequence of short and medium pulses (usually around 200 cycles for short, 300 to 450 for medium).

Is there any particular reason none of the popular loaders used (eg) four pulse widths for each bitpair? Even using quite widely separated durations of 210/320/450/600 would then lead to an average time per bit of just 197 cycles, a massive improvement on the 265 cycles per bit you’d get for 210/320.
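
The two figures above can be reproduced with a minimal sketch, assuming uniformly random data (every pulse width equally likely) and a flat mapping of one bit per pulse for two widths or one bitpair per pulse for four. The function name is only illustrative.

```python
def flat_cycles_per_bit(durations, bits_per_pulse):
    # Assumes every pulse width is equally likely, i.e. uniformly random data.
    mean_pulse = sum(durations) / len(durations)
    return mean_pulse / bits_per_pulse

two_widths = flat_cycles_per_bit([210, 320], 1)             # one bit per pulse
four_widths = flat_cycles_per_bit([210, 320, 450, 600], 2)  # one bitpair per pulse
print(two_widths, four_widths)   # 265.0 197.5
```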

(The idea I’ve been working on is a bit more complex than that, but if a simple bitpair scheme wouldn’t work for some reason, then the idea I have would need to be scaled back somewhat. Promising that long pulses were used for framing, mind..)
 
... 23 posts hidden ...
 
2018-12-17 09:44
thrust64

Registered: Jun 2006
Posts: 8
I could be wrong, but I think your math is slightly incorrect.

Simple example:
Cycles|Bits| Prob.| Cycles per bit * probability
------+----+------+-----------------------------
  100 | 0  | 0.5  | 50.00
  150 | 10 | 0.25 | 18.75
  200 | 11 | 0.25 | 25.00
------+----+------+-----------------------------
            Total:  93.75
That's easy to verify: 0 takes 100 cycles and 1 takes (150+200)/2/2 = 87.5 cycles on average. So the average length per bit is (100+87.5)/2 = 93.75 cycles.

Using the same math for my own loader I get:
Cycles|Bits| Prob. | Cycles per bit * probability
------+----+-------+-----------------------------
  167 |11  |0.25   | 20.875
  217 |01  |0.25   | 27.125
  267 |101 |0.125  | 11.125
  317 |100 |0.125  | 13.208
  367 |000 |0.125  | 15.292
  417 |0011|0.0625 |  6.516
  467 |0010|0.0625 |  7.297
------+----+-------+-----------------------------
             Total: 101.4375
So that would be ~101.4 cycles/bit vs ~102.9 as in your math (both using Huffman).
2018-12-17 11:00
ChristopherJam

Registered: Aug 2004
Posts: 1409
You're making the same mistake I was a few days ago :)

Symbols that output longer bit sequences consume more of the original file when it's being converted to a pulse stream, so they need to be weighted accordingly when you're computing the expected cycles per bit.

I just encoded a sequence of 1000 random bits with the [0,10,11]=>[100,150,200] encoding, and found 322 bare zeros, 165 "10" pairs, and 174 "11" pairs;
total duration 322*100+165*150+174*200=91750 cycles, or 91.75 cycles per bit, quite close to the 91.667 I get from the python snippet below:


from math import log
def log2(x):  return log(x)/log(2)

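def cycles_per_bit_given_arithmetic_code(code):
    # code is a list of (pulse duration in cycles, symbol probability) pairs
    den = num = 0
    for dur,p in code:
        bits = -log2(p)         # bits of input consumed by this symbol
        weight = p*bits         # this is critical
        num += weight*dur/bits  # expected cycles contributed (= p*dur)
        den += weight           # expected bits contributed
    cycles_per_bit = num/den
    return cycles_per_bit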

def hc_to_ac(hc_code):  # convert a huffman code to corresponding arithmetic code
    return [(d,0.5**len(s)) for d,s in hc_code]

hccode = [
        (100, '0'),
        (150, '10'),
        (200, '11'),
        ]

print(cycles_per_bit_given_arithmetic_code(hc_to_ac(hccode)))


or in table form:
+--------+------+-------+---------+-------+-------+
| Cycles | Bits | Prob. | w = p*b |  c/b  | c/b*w |  (b = len(Bits))
+--------+------+-------+---------+-------+-------+
|  100   | 0    | 0.50  |   0.5   | 100.0 |  50.0 |
|  150   | 10   | 0.25  |   0.5   |  75.0 |  37.5 |
|  200   | 11   | 0.25  |   0.5   | 100.0 |  50.0 |
+--------+------+-------+---------+-------+-------+

total weighted rates = 137.5
total weights        =   1.5
weighted average     =  91.667
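
For anyone who wants to repeat the random-bits experiment, here is a rough sketch of how it could be done. It is not the script used for the numbers above; the seed, the sample size and the names `CODE` and `encode` are arbitrary choices. It compares a measured cycles-per-bit figure with the closed form (expected cycles per symbol divided by expected bits per symbol).

```python
import random

CODE = {'0': 100, '10': 150, '11': 200}   # codeword -> pulse duration in cycles

def encode(bits):
    """Split a bit string into codewords; return (pulse durations, bits consumed)."""
    pulses, i = [], 0
    while i < len(bits):
        word = '0' if bits[i] == '0' else bits[i:i + 2]
        if word not in CODE:       # a lone trailing '1' cannot be encoded, drop it
            break
        pulses.append(CODE[word])
        i += len(word)
    return pulses, i

rng = random.Random(1)             # fixed seed, arbitrary
bits = ''.join(rng.choice('01') for _ in range(200000))
pulses, used = encode(bits)
measured = sum(pulses) / used

# Closed form: expected cycles per symbol over expected bits per symbol.
expected = (0.5 * 100 + 0.25 * 150 + 0.25 * 200) / (0.5 * 1 + 0.25 * 2 + 0.25 * 2)
print(round(expected, 3), round(measured, 2))
```

The closed form is the same weighted average as in the table: 137.5 cycles per symbol over 1.5 bits per symbol.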
2018-12-17 11:25
thrust64

Registered: Jun 2006
Posts: 8
On average, I would expect ~333 bare 0 bits plus ~167 "10" and ~167 "11" pairs. Then we would have 333*100 + 167*150 + 167*200 = 91,750 for 1001 bits. Which about matches your number.

Quote:
You're making the same mistake I was a few days ago :)

Looks so. :)

Quote:
Symbols that output longer bit sequences consume more of the original file when it's being converted to a pulse stream, so they need to be weighted accordingly when you're computing the expected cycles per bit.

I stand corrected. You are right.

Thanks for explaining again.
2018-12-17 11:56
thrust64

Registered: Jun 2006
Posts: 8
So when we look at the results and graphs (also from enthusi), what would be the best way for a fast but still robust solution?

There are lots of factors we can play with:
1. short pulse length
2. pulse gap
3. number of pulse lengths (a result of 1. + 2.)
4. ???

And what is causing the inaccuracies?
- the varying tape speed (how much? +/-5%?) (increasing pulse gaps can cope with that)
- frequencies? E.g. high frequencies have a lower amplitude than lower ones, causing fewer correct reads (then we would need decreasing pulse gaps to handle that)
- enthusi's graph shows some intervals of heavy distortions, what is causing these?
- varying frequencies? (e.g. a low frequency followed by a high one gives less accurate results than two high ones)
- aging?
- what else?

This is really tricky. :)
2018-12-17 15:36
Hoogo

Registered: Jun 2002
Posts: 105
Quoting thrust64
...There are lots of factors we can play with:
1. short pulse length
2. pulse gap
3. number of pulse lengths (a result of 1. + 2.)
4. ???
4. 1 pulse length for all gaps, or make a pulse as long as the following gap?
5. Use a Datasette for recording, or can hi-fi equipment create better signals?
6. ???

I'd base everything on pulse lengths of 1/44100 s (one sample at 44100 Hz) or multiples of it. It's much easier to do this kind of stuff on a PC.
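
To get a feel for what sample-aligned pulses mean in CPU cycles, here is a small sketch. It assumes a PAL C64 (CPU clock 985248 Hz; NTSC machines run at 1022727 Hz), and the range of sample counts is picked only for illustration.

```python
PAL_CLOCK_HZ = 985248      # C64 PAL CPU clock; NTSC machines run at 1022727 Hz
SAMPLE_RATE_HZ = 44100

cycles_per_sample = PAL_CLOCK_HZ / SAMPLE_RATE_HZ   # roughly 22.3 cycles

# Pulse lengths in cycles for a few whole-sample durations (illustrative range).
for samples in range(4, 9):
    print(samples, round(samples * cycles_per_sample))
```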
Quoting thrust64
And what is causing the inaccuracies?
- the varying tape speed (how much? +/-5%?) (increasing pulse gaps can cope with that)
- frequencies? E.g. high frequencies have a lower amplitude than lower ones, causing fewer correct reads (then we would need decreasing pulse gaps to handle that)
- enthusi's graph shows some intervals of heavy distortions, what is causing these?
- varying frequencies? (e.g. a low frequency followed by a high one gives less accurate results than two high ones)
- aging?...
Did some tests last Sunday.

The reading test showed a little in $d020/$d418 when something was received. That happened every 1 or 2 seconds, even when no tape was running. The space was crowded, cable was running close to the monitor.

A simple write test was writing one byte repeatedly, with 22 cycles for each bit. There is a scope available, but I need more practice in its usage (and more hands).

Seems that writing 10101010 creates a flat line, 00001111 gives a very nice wave.

Most likely I will do more tests next weekend.
2018-12-17 15:46
Hoogo

Registered: Jun 2002
Posts: 105
Quoting ChristopherJam
I just encoded a sequence of 1000 random bits with the [0,10,11]=>[100,150,200] encoding...
Could you show something for 4 pulse lengths, please?
I did a little test in Basic V2 with 2KBit of random data.
"1/00/01/111" was always longer than "00/01/10/11", "0/1/000/111" was even worse.
2018-12-17 19:57
ChristopherJam

Registered: Aug 2004
Posts: 1409
Quoting Hoogo
Could you show something for 4 pulse lengths, please?
I did a little test in Basic V2 with 2KBit of random data.
"1/00/01/111" was always longer than "00/01/10/11", "0/1/000/111" was even worse.


Sure, I’ll put something together tomorrow. But can I ask what pulse lengths you were using? 00/01/10/11 will always minimise the count of pulses you output, but I wouldn’t expect it to minimise the total recording length unless your four pulse lengths are quite similar to each other.

Those hardware tests sound useful; I’d love to know at what frequency your square waves stop being reliably readable, and how much difference the wave shape makes to that threshold (eg, if 50 cycles high/50 cycles low is fine, how about 60/40?). Sadly I sold my datasettes a few months ago, so I can’t measure anything myself at the moment.
2018-12-17 21:18
Hoogo

Registered: Jun 2002
Posts: 105
Quoting ChristopherJam
...But can I ask what pulse lengths you were using? 00/01/10/11 will always minimise the count of pulses you output, but I wouldn’t expect it to minimise the total recording length unless your four pulse lengths are quite similar to each other.
I have used 1 pulse + 1 to 4 gaps of equal size, so 2...5 overall length.

For the random test data:

00 207
01 232
10 259
11 202
=>1800 Bits, length 3156 pulses+gaps.
If all appeared 225 times, length would be 3150.

1 365
00 297
01 311
111 73
=>1800 Bits, length 3230.
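
The totals can be recomputed from the counts with a short sketch, assuming the cost model described above (1 pulse plus 1 to 4 gaps, so 2 to 5 units per symbol). The helper names are made up for this example; the second set of counts is used.

```python
# Each symbol costs 1 pulse plus 1 to 4 gaps, so 2..5 units in total.
UNITS = [2, 3, 4, 5]

def total_units(counts):
    return sum(n * u for n, u in zip(counts, UNITS))

def total_bits(codewords, counts):
    return sum(n * len(w) for w, n in zip(codewords, counts))

words = ['1', '00', '01', '111']
counts = [365, 297, 311, 73]                            # second set of counts above
print(total_bits(words, counts), total_units(counts))   # 1800 3230
print(total_units([225] * 4))                           # 3150, the even-split case
```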

The arithmetic distribution would be best at 39%, 26%, 19%, 16%, but I don't see how to find a nice Huffman code for that.
2018-12-17 22:17
ChristopherJam

Registered: Aug 2004
Posts: 1409
oh!

Um, that's not a Huffman code. (You can't tell from the bits read so far whether you've finished the current codeword or not.)
Try
0
10
110
111
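
The "can you tell when a codeword has finished" property is the prefix-free condition: no codeword may be the start of another one. A minimal check, with a made-up function name, could look like this:

```python
def is_prefix_free(codewords):
    """True if no codeword is a prefix of a different codeword."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

print(is_prefix_free(['1', '00', '01', '111']))   # False: '1' is a prefix of '111'
print(is_prefix_free(['0', '1', '000', '111']))   # False
print(is_prefix_free(['0', '10', '110', '111']))  # True
```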


The algorithm is to combine the two smallest probability items into a tree node that has the two items as children, and replace the children in your list of things to process with the parent. (Have a look at the function huffman_encode in compare_encodings.py linked above - I lifted it from rosettacode.org )

Naming the codes for the four pulse lengths A B C & D
codes C and D combine first, so whether you get a flat code (two bits each) or not depends on whether the next combine step joins code B with {parent of C & D} or with code A

If the gap is around 28% of the minimum pulse length or more, the combined likelihood of C and D is less than the likelihood of A, so an unbalanced tree is generated.
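
The combine step can be sketched in a few lines. This is a separate illustration, not the huffman_encode function from compare_encodings.py, and it only tracks codeword lengths rather than the actual bit patterns. The probabilities are copied from the 28% and 27% tables further down in this post.

```python
import heapq
from itertools import count

def huffman_lengths(probs):
    """Return the Huffman codeword length for each probability in probs."""
    tiebreak = count()                       # avoids comparing lists when probabilities tie
    heap = [(p, next(tiebreak), [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    while len(heap) > 1:
        p1, _, a = heapq.heappop(heap)       # the two least likely nodes...
        p2, _, b = heapq.heappop(heap)
        for i in a + b:                      # ...move one level deeper in the tree
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(tiebreak), a + b))
    return lengths

# Probabilities for pulses A..D, copied from the tables in this post.
print(huffman_lengths([0.36380, 0.27410, 0.20651, 0.15559]))  # 28% gap -> [1, 2, 3, 3]
print(huffman_lengths([0.36057, 0.27376, 0.20785, 0.15781]))  # 27% gap -> [2, 2, 2, 2]
```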


spacing of (100+ 100*n)
+----------------+-----------------+--------------+
| Pulse duration | arithmetic code | huffman code |
+----------------+-----------------+--------------+
| 100 cycles     | 0.51879         | 1            |
| 200 cycles     | 0.26914         | 01           |
| 300 cycles     | 0.13963         | 001          |
| 400 cycles     | 0.07244         | 000          |
+----------------+-----------------+--------------+
mean cycles per bit, arithmetic code = 105.6
mean cycles per bit,    huffman code = 107.1

spacing of (100+  28*n)
+----------------+-----------------+--------------+
| Pulse duration | arithmetic code | huffman code |
+----------------+-----------------+--------------+
| 100 cycles     | 0.36380         | 0            |
| 128 cycles     | 0.27410         | 10           |
| 156 cycles     | 0.20651         | 111          |
| 184 cycles     | 0.15559         | 110          |
+----------------+-----------------+--------------+
mean cycles per bit, arithmetic code =  68.6
mean cycles per bit,    huffman code =  71.1

spacing of (100+  27*n)
+----------------+-----------------+--------------+
| Pulse duration | arithmetic code | huffman code |
+----------------+-----------------+--------------+
| 100 cycles     | 0.36057         | 11           |
| 127 cycles     | 0.27376         | 10           |
| 154 cycles     | 0.20785         | 01           |
| 181 cycles     | 0.15781         | 00           |
+----------------+-----------------+--------------+
mean cycles per bit, arithmetic code =  68.0
mean cycles per bit,    huffman code =  70.2
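
The "arithmetic code" columns can be reproduced with a short sketch, under the assumption that the ideal distribution makes every pulse cost the same number of cycles per bit, i.e. p = 2^(-duration/c), with c chosen so the probabilities sum to 1. The function name and the bisection bounds are arbitrary.

```python
def optimal_cycles_per_bit(durations):
    """Find c such that the probabilities 2**(-d/c) of all pulses sum to 1."""
    lo, hi = 1.0, 10000.0
    for _ in range(100):                      # plain bisection
        c = (lo + hi) / 2
        if sum(2 ** (-d / c) for d in durations) > 1:
            hi = c                            # probabilities too large, c must shrink
        else:
            lo = c
    return c

for gap in (100, 28, 27):
    durations = [100 + gap * n for n in range(4)]
    c = optimal_cycles_per_bit(durations)
    probs = [round(2 ** (-d / c), 5) for d in durations]
    print(gap, round(c, 2), probs)
```

The resulting c is the mean cycles per bit for the arithmetic code, and the probabilities should agree with the tables to within rounding.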
2018-12-17 23:07
Hoogo

Registered: Jun 2002
Posts: 105
Got it! Soo obvious now :)