You're making the same mistake I was a few days ago :)
Symbols that output longer bit sequences consume more of the original file when it's being converted to a pulse stream, so they need to be weighted accordingly when you're computing the expected cycles per bit.
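To make the weighting concrete, here is a minimal sketch in Python. It assumes uniformly random input bits, so a codeword of k bits turns up with probability 2^-k, and it assumes the codewords form a complete prefix-free set. The function name `cycles_per_bit` and the 0/10/11 => 100/150/200 mapping are only illustrative choices.

```python
def cycles_per_bit(code):
    """code maps each prefix-free codeword (a bit string) to its pulse length in cycles."""
    # Under random input, a k-bit codeword is emitted with probability 2**-k.
    exp_cycles = sum(2.0 ** -len(word) * cycles for word, cycles in code.items())
    # Longer codewords eat more of the source file, so weight by bits consumed too.
    exp_bits = sum(2.0 ** -len(word) * len(word) for word in code)
    return exp_cycles / exp_bits

print(round(cycles_per_bit({"0": 100, "10": 150, "11": 200}), 2))  # 91.67
```

The point is that the result is expected cycles per pulse divided by expected bits per pulse, not an average of the per-symbol ratios.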
...There are lots of factors we can play with:
1. short pulse length
2. pulse gap
3. number of pulse lengths (a result of 1. + 2.)
4. ???
And what is causing the inaccuracies?
- the varying tape speed (how much? +/-5%?) (increasing pulse gaps can cope with that)
- frequencies? E.g. high frequencies have a lower amplitude than lower ones, causing fewer correct reads (then we would need decreasing pulse gaps to handle that)
- enthusi's graph shows some intervals of heavy distortion; what is causing these?
- varying frequencies? (e.g. a low frequency followed by a high one gives less accurate results than two high ones)
- aging?...
I just encoded a sequence of 1000 random bits with the [0,10,11]=>[100,150,200] encoding...
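For anyone who wants to repeat that kind of experiment, a rough sketch in Python follows. The greedy prefix parse, the fixed seed and the names `CODE` and `encode` are my own assumptions, not the poster's actual program, and the numbers it prints will depend on the random bits drawn.

```python
import random

CODE = {"0": 100, "10": 150, "11": 200}  # codeword -> pulse length in cycles

def encode(bits, code=CODE):
    """Greedily parse a bit string into prefix codewords and return the pulse lengths."""
    pulses, word = [], ""
    for b in bits:
        word += b
        if word in code:          # a complete codeword: emit its pulse
            pulses.append(code[word])
            word = ""
    return pulses, word           # word = trailing bits that did not complete a codeword

rng = random.Random(1)            # arbitrary fixed seed so runs are repeatable
bits = "".join(rng.choice("01") for _ in range(1000))
pulses, leftover = encode(bits)
consumed = len(bits) - len(leftover)
print(len(pulses), "pulses,", sum(pulses), "cycles,", sum(pulses) / consumed, "cycles per bit")
```

With enough random bits the measured cycles per bit should settle near the weighted expectation for this mapping.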
Could you show something for 4 pulse lengths, please? I did a little test in BASIC V2 with 2 Kbit of random data. "1/00/01/111" was always longer than "00/01/10/11", "0/1/000/111" was even worse.
...But can I ask what pulse lengths you were using? 00/01/10/11 will always minimise the count of pulses you output, but I wouldn’t expect it to minimise the total recording length unless your four pulse lengths are quite similar to each other.
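A small sketch of that trade-off, again assuming uniformly random input bits. The helper names `stats`, `fixed` and `skewed`, and the two spacings tried (100 + 100*n and 100 + 27*n cycles), are just illustrative picks.

```python
def stats(code):
    """Return (pulses per bit, cycles per bit) for a complete prefix code on random bits."""
    bits = sum(2.0 ** -len(w) * len(w) for w in code)           # expected bits per pulse
    cycles = sum(2.0 ** -len(w) * c for w, c in code.items())   # expected cycles per pulse
    return 1.0 / bits, cycles / bits

def fixed(lengths):
    # 2 bits per pulse, whatever the pulse length
    return dict(zip(["00", "01", "10", "11"], lengths))

def skewed(lengths):
    # the shortest pulse carries the shortest codeword
    return dict(zip(["1", "01", "001", "000"], lengths))

for step in (100, 27):
    lengths = [100 + step * n for n in range(4)]
    print(step, "fixed:", stats(fixed(lengths)), "skewed:", stats(skewed(lengths)))
```

The fixed 2-bit code gives 0.5 pulses per bit in both cases, which is fewer than the skewed code, but it only wins on cycles per bit when the spacing is small (70.25 vs about 70.64 at 27 cycles); at a spacing of 100 cycles it loses (125 vs about 107.1).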
spacing of (100 + 100*n)
+----------------+-----------------+--------------+
| Pulse duration | arithmetic code | huffman code |
+----------------+-----------------+--------------+
| 100 cycles     | 0.51879         | 1            |
| 200 cycles     | 0.26914         | 01           |
| 300 cycles     | 0.13963         | 001          |
| 400 cycles     | 0.07244         | 000          |
+----------------+-----------------+--------------+
mean cycles per bit, arithmetic code = 105.6
mean cycles per bit, huffman code = 107.1

spacing of (100 + 28*n)
+----------------+-----------------+--------------+
| Pulse duration | arithmetic code | huffman code |
+----------------+-----------------+--------------+
| 100 cycles     | 0.36380         | 0            |
| 128 cycles     | 0.27410         | 10           |
| 156 cycles     | 0.20651         | 111          |
| 184 cycles     | 0.15559         | 110          |
+----------------+-----------------+--------------+
mean cycles per bit, arithmetic code = 68.6
mean cycles per bit, huffman code = 71.1

spacing of (100 + 27*n)
+----------------+-----------------+--------------+
| Pulse duration | arithmetic code | huffman code |
+----------------+-----------------+--------------+
| 100 cycles     | 0.36057         | 11           |
| 127 cycles     | 0.27376         | 10           |
| 154 cycles     | 0.20785         | 01           |
| 181 cycles     | 0.15781         | 00           |
+----------------+-----------------+--------------+
mean cycles per bit, arithmetic code = 68.0
mean cycles per bit, huffman code = 70.2
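The arithmetic-code columns look consistent with choosing each pulse's probability as 2^(-length/C), where C is the cycles-per-bit figure that makes the probabilities sum to 1. That reading is my assumption about how the numbers were produced, not something stated in the post. Under it, a sketch that recovers C by bisection (the name `arithmetic_optimum` is made up for this example):

```python
def arithmetic_optimum(lengths):
    """Find C such that sum(2**(-L/C)) == 1; return C and the pulse probabilities."""
    lo, hi = 1.0, 1e5                      # the sum grows with C, so bisect on C
    for _ in range(100):
        mid = (lo + hi) / 2
        if sum(2.0 ** (-L / mid) for L in lengths) < 1.0:
            lo = mid                       # probabilities too small: C must be larger
        else:
            hi = mid
    c = (lo + hi) / 2
    return c, [2.0 ** (-L / c) for L in lengths]

c, probs = arithmetic_optimum([100, 200, 300, 400])
print(round(c, 1), [round(p, 5) for p in probs])
```

For the 100/200/300/400 spacing this should land on roughly 105.6 cycles per bit with the first probability near 0.51879, in line with the first table.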