[CSDb] - User Forums - Speech-Synthesis

2005-09-04 17:18

Bamu®
Account closed

Registered: May 2005
Posts: 1332

Speech-Synthesis

Hi,

Only a few people have used ring modulation to simulate speech.
I want to try it too, but unfortunately I have no idea how it works.
Before I waste time testing, I'll ask here: does anyone know how it's done?

Btw: can anyone tell me what the voice in AGEMIXER's song "Da Shit (Eastwood Jack's Red..)" says?

[... 149 posts hidden ...]

2014-09-23 13:07

Agemixer

Registered: Dec 2002
Posts: 38

Mixer: Exactly, that's why... if there are no variable-speed trackers around with ultra flexibility, I have to work on my own AgentRacker.

2015-02-28 14:31

SIDWAVE
Account closed

Registered: Apr 2002
Posts: 2238

check this short video
https://www.youtube.com/watch?v=LMfNzQFMaeA&index=1&list=UUgOH6..

how do i begin in the right way ?
my goal is to make it say something..
like "sidwave" would be a nice thing.

how to start properly, make an S ?
and how to proceed ?

i can poke around endlessly with notes and waveforms,
but i hit good stuff only at random.

in the video, i especially like the really low scratch-bass sound, but i don't even know what triggers it..

?

Jammer help!

2015-02-28 15:59

Jammer

Registered: Nov 2002
Posts: 1289

I hear you're quite close in this example. Try $51 for the carrier waveform. First of all, the pitch difference between the carrier (first channel) and the modulator (the channel after it) cannot be too vast. In practice you cannot make the carrier very low, so aim for octaves 3 and 4. If you have a C-2 carrier and a C-6 modulator, it won't do the trick yet :) Achieving an exact phoneme is kind of trial and error, unfortunately. The higher the carrier is, the higher the modulator also has to be to make the sound prominent. I haven't tested this on the 6581, unfortunately; I assume $57 sounds different there, and you might make nice use of saturation to bring out some details.

To make an S sound you... basically use noise :D:D:D
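
For reference, waveform bytes like $51 and $57 are values for a SID voice control register. Here is a minimal sketch that decodes such a byte into its bits, following the standard SID register layout. The function name is made up for illustration, and $81 (noise plus gate) is only added here as a typical example of a noise setting, not something taken from the post.

```python
# Bit layout of a SID voice control register (offset 4 within each voice).
CONTROL_BITS = [
    (0x80, "NOISE"),
    (0x40, "PULSE"),
    (0x20, "SAWTOOTH"),
    (0x10, "TRIANGLE"),
    (0x08, "TEST"),
    (0x04, "RING"),
    (0x02, "SYNC"),
    (0x01, "GATE"),
]


def decode_control(value):
    """Return the names of the bits set in a control-register byte."""
    return [name for mask, name in CONTROL_BITS if value & mask]


# $51 and $57 come from the post; $81 is an illustrative noise value.
for byte in (0x51, 0x57, 0x81):
    print(f"${byte:02X}: {' + '.join(decode_control(byte))}")
```

Decoded this way, $57 is $51 with the ring-modulation and sync bits added.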

2015-03-01 16:13

Agemixer

Registered: Dec 2002
Posts: 38

Hmm, is there some limit for a reply in CSDb? I think my msg visited cyberspace, no idea if it wanted to go that way..

2015-03-01 16:19

Agemixer

Registered: Dec 2002
Posts: 38

Trying again.. shortly:

Easy stuff! :)

Easier to start with humming - make carrier voice of that.
Then, on the channel to the "right of the carrier", add your vowel pitches, starting with "A": try out all the half notes and pick the one that's nearest. Do the same for the vowels A, E, I, O, U, Y, and also for some consonants like J and M.

Consonants might be harder: make a new instrument to hit a new "hard consonant". That should be very easy with SDI and its amount of HR settings! It should make a big difference!

It's also easy with SDI because you can see both channels side by side. I used Sync, and there you can't see your tones in a WYSIWYG way.

You need to tie subsequent vocal tones.

For S, F, and such tones you definitely need noise (just like Jammer said).

2015-03-01 16:38

Agemixer

Registered: Dec 2002
Posts: 38

The basic idea is real singing. Sing some easy, long vowel and try to reproduce that voice as closely as possible with SDI instruments. It is easier than you think (with SDI).

One good vocal example is Retro Gold Love (lousy consonants):

MUSICIANS/A/Agemixer/007-Retro_Gold_Love_T01.sid_CSG8580R5.mp3

Then about multispeed - I find it a particular "must" for singing, because the timing resolution for vocals has to be finer than for most usual instruments. This mostly applies to consonants.

Then there's one very important secret that applies to human singing - I don't want to spoil it - you will find that simple secret yourself if you are into vocals, so if you know... don't tell! :)

2015-03-02 16:48

Soren

Registered: Dec 2001
Posts: 547

The only time I fiddled with it on c64, I am pretty sure I did it like I did on a synth of mine - saw wave+bandpass filter, resonance, ringmod... for the vowels.
I am not a fan of this kind of speech synth on c64 - it usually just doesn't sound good enough to me.
It's more fun for me to create other types of instruments ;)

2021-10-26 13:53

2bt

Registered: Jun 2021
Posts: 9

I was fiddling with the C port of SAM a while ago (https://github.com/vidarh/SAM). If you run it with -debug, it prints out lots of interesting information. The last part is a nice table with all the data that the backend uses to generate the audio. This is the table for "hello".

 flags ampl1 freq1 ampl2 freq2 ampl3 freq3 pitch
------------------------------------------------
   7C     0    14     0    73     0    93    57
    0     4    16     3    70     2    92    56
    0    13    18    11    66     4    91    55
    0    13    18    11    66     4    91    55
    0    13    18     9    60     4    94    55
    0    13    17     6    48     2   102    56
    0    11    16     4    36     1   110    56
    0    11    16     4    36     1   110    56
    0    11    16     4    36     1   110    56
    0    11    16     4    35     1   106    56
    0    13    17     6    33     1    97    56
    0    15    18     9    30     0    88    55
    0    15    18     9    30     0    88    55
    0    15    18     9    30     0    88    55
    0    15    18     9    30     0    88    55
    0    15    18     9    30     0    87    55
    0    15    16     8    30     0    85    56
    0    13    15     6    29     0    83    57
    0    13    13     5    29     0    81    58
    0    11    12     4    28     0    80    58
    0    11    12     4    28     0    80    58
    0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0
    0     0     0     0     0     0     0     0
...

Each row represents a time slice. The columns are the flags (which indicate noise), the amplitude and frequency of each of the three formants, and the voice pitch. The final audio signal is the sum of the three formant oscillators, which are synced to the pitch.
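
To experiment with a table like this, it can help to read it into per-frame records first. The following is a rough sketch, not part of SAM: it assumes the flags column is hexadecimal (because of the `7C`) and all other columns are decimal, and the names `Frame`, `parse_frames` and `reduce_for_sid` are made up for illustration. Treating any non-zero flag as "noise" is also an assumption.

```python
from typing import NamedTuple


class Frame(NamedTuple):
    flags: int
    ampl1: int
    freq1: int
    ampl2: int
    freq2: int
    ampl3: int
    freq3: int
    pitch: int


def parse_frames(text):
    """Parse debug-table rows; header, ruler and '...' lines are skipped."""
    frames = []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) != 8:
            continue
        try:
            flags = int(cols[0], 16)            # assumption: flags are hex
            rest = [int(c) for c in cols[1:]]   # assumption: rest is decimal
        except ValueError:
            continue                            # e.g. the column header line
        frames.append(Frame(flags, *rest))
    return frames


def reduce_for_sid(frames):
    """Keep (pitch, freq1, freq2, noisy) per frame and drop trailing silence."""
    reduced = [(f.pitch, f.freq1, f.freq2, f.flags != 0) for f in frames]
    while reduced and reduced[-1][:3] == (0, 0, 0):
        reduced.pop()
    return reduced


SAMPLE = """\
 flags ampl1 freq1 ampl2 freq2 ampl3 freq3 pitch
------------------------------------------------
   7C     0    14     0    73     0    93    57
    0     4    16     3    70     2    92    56
    0    13    18    11    66     4    91    55
    0     0     0     0     0     0     0     0
"""

for row in reduce_for_sid(parse_frames(SAMPLE)):
    print(row)
```

The values stay in the table's own units here; converting them to real frequencies is a separate step that this sketch does not attempt.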

It turns out that the third formant isn't really that important. Neither is amplitude, really. What's left maps well to the SID. Voice 1 for the pitch and voices 2 and 3 for the first and second formant. Sync voice 2 to voice 1 and sync voice 3 to voice 2 (bummer, since syncing voice 3 to voice 1 is not possible, but that is ok).
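
As a rough illustration of that voice mapping, here is a sketch that turns one time slice into SID register writes. It is not the code behind the SID file mentioned in this post. It assumes the pitch and the two formant frequencies have already been converted to Hz (the post does not give that conversion), uses the PAL clock, and picks the triangle waveform arbitrarily. Volume, ADSR and noise handling are left out, and `frame_writes` is a hypothetical helper name.

```python
PAL_CLOCK_HZ = 985248        # C64 PAL system clock
SID_BASE = 0xD400
VOICE_STRIDE = 7             # seven registers per voice
FREQ_LO, FREQ_HI, CONTROL = 0, 1, 4
GATE, SYNC, TRIANGLE = 0x01, 0x02, 0x10


def hz_to_sid_freq(hz, clock_hz=PAL_CLOCK_HZ):
    """Convert Hz to a 16-bit SID frequency value (Fout = Fn * clock / 2^24)."""
    value = round(hz * 16777216 / clock_hz)
    return max(0, min(0xFFFF, value))


def frame_writes(pitch_hz, f1_hz, f2_hz, waveform=TRIANGLE):
    """Return (address, value) pairs for one time slice.

    Voice 1 carries the pitch, voices 2 and 3 carry the formants.
    The SYNC bit of voice 2 syncs it to voice 1, and the SYNC bit of
    voice 3 syncs it to voice 2.
    """
    writes = []
    for voice, hz in enumerate((pitch_hz, f1_hz, f2_hz)):
        base = SID_BASE + voice * VOICE_STRIDE
        freq = hz_to_sid_freq(hz)
        control = waveform | GATE
        if voice > 0:
            control |= SYNC
        writes.append((base + FREQ_LO, freq & 0xFF))
        writes.append((base + FREQ_HI, freq >> 8))
        writes.append((base + CONTROL, control))
    return writes


# Hypothetical input values in Hz, chosen only to show the output format.
for addr, value in frame_writes(110.0, 700.0, 1200.0):
    print(f"${addr:04X} <- ${value:02X}")
```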

I managed to render a 1x SID that sounds a lot like SAM (http://langnerd.de/sid/sam-test.sid). It has some rough edges, but I think it shows that the SID is totally capable of rendering understandable speech without samples. In my estimate, we have only really scratched the surface of what's possible.

2021-10-26 14:04

Frantic

Registered: Mar 2003
Posts: 1627

Quote:

we have only really scratched the surface of what's possible

Most definitely. Actually, the player code in defMON started out as part of a project to play "voice data" similar to what you have done (nicely!) here. Would be interesting to see what people who are skilled in this particular area can come up with. I think some quite unbelievable things would be possible.

2021-10-26 15:33

ChristopherJam

Registered: Aug 2004
Posts: 1370

Nice work, 2bt.

I must admit, I was already thinking the original SAM port to the c64 has a lot of scope for optimisation/quality improvement, even just as a soft synth.

Certainly still many things to try :)
