How can I use a TTS for live translation?

Kronos_09990 · 2022-02-13 22:56:26

I would like to train a model on my voice so that I can stream in a language other than my native one, with the final audio going into OBS. If that's possible, how can I do it? If not, how could I do it for a recording instead of a stream, so that the generated audio lines up with my speech and its total duration matches the recorded video?

What I've thought about is using coqui-STT, for example, to transcribe my speech into a text file. Then I translate the text and use coqui-TTS to generate an audio file in the other language. Finally, I stretch or compress the audio in Audacity so that it has the same duration as the video.
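To make the last step concrete, here is a rough sketch of how the duration matching could be scripted instead of done by hand in Audacity. It assumes ffmpeg is installed and uses its `atempo` audio filter; the function names and file names are made up for illustration. Since `atempo` has historically accepted only factors between 0.5 and 2.0 per instance, larger changes are split into a chain of stages.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as w:
        return w.getnframes() / w.getframerate()


def atempo_chain(factor):
    """Split a tempo factor into atempo stages, each within [0.5, 2.0]."""
    if factor <= 0:
        raise ValueError("tempo factor must be positive")
    stages = []
    while factor > 2.0:
        stages.append(2.0)
        factor /= 2.0
    while factor < 0.5:
        stages.append(0.5)
        factor /= 0.5
    stages.append(factor)
    return ",".join(f"atempo={s:.6f}" for s in stages)


def fit_command(src, dst, audio_seconds, video_seconds):
    """Build an ffmpeg command that retimes src to last video_seconds.

    A factor above 1 speeds the audio up (shorter output); below 1 slows it down.
    """
    factor = audio_seconds / video_seconds
    return ["ffmpeg", "-y", "-i", src, "-filter:a", atempo_chain(factor), dst]


if __name__ == "__main__":
    # Hypothetical numbers: 12 s of generated speech for a 10 s video clip.
    print(" ".join(fit_command("translated.wav", "fitted.wav", 12.0, 10.0)))
```

The returned list can be passed to `subprocess.run`. Large factors will sound unnatural, so fitting each sentence separately is probably better than stretching the whole track at once.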

I couldn't think of any way to do that live and feed the result into OBS. I would at least like some ideas on how to make the process as fast as possible, so that most of my time goes into the recording itself rather than into the other steps.
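The closest thing to a live version I can picture is processing short chunks of speech one after another, roughly like the sketch below. The `transcribe`, `translate`, `synthesize` and `play` callables are placeholders that I would have to fill in with real tools (none of them are actual library functions), and the `tts_out` sink name is an assumption: the idea would be to create a PulseAudio null sink with `pactl load-module module-null-sink sink_name=tts_out`, play the generated audio into it, and have OBS capture that sink's monitor.

```python
import queue
import threading


def run_pipeline(chunks, transcribe, translate, synthesize, play):
    """Push audio chunks through STT -> translation -> TTS, in order.

    All four stage arguments are hypothetical callables supplied by the caller.
    """
    pending = queue.Queue(maxsize=4)  # small buffer so capture cannot run far ahead

    def producer():
        for chunk in chunks:
            pending.put(chunk)
        pending.put(None)  # sentinel: no more input

    worker = threading.Thread(target=producer, daemon=True)
    worker.start()

    played = []
    while True:
        chunk = pending.get()
        if chunk is None:
            break
        text = transcribe(chunk)
        if not text.strip():
            continue  # skip silence / empty transcripts
        audio = synthesize(translate(text))
        play(audio)
        played.append(audio)
    worker.join()
    return played


def paplay_command(wav_path, sink="tts_out"):
    """Command that plays a WAV file into the (assumed) null sink."""
    return ["paplay", f"--device={sink}", wav_path]
```

Even if this works, the translated audio would lag behind my voice by at least one chunk plus the processing time of all three stages.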

Last edited by Kronos_09990 (2022-02-13 23:01:10)

Arch Linux
