Proof-of-concept of a conversational robit using GPT, WhisperAI, and Mimic 3. Inspired by https://www.youtube.com/watch?v=bO-DWWFolPw.
- Install requirements: `python -m pip install -r requirements.txt`
  - You may have to install some additional libraries depending on your system. For example, in Ubuntu, you may have to run `sudo apt install portaudio19-dev`.
- Get yourself a running transcription service. I've been using whisper-jax with good success.
- If you want to use Mimic 3 instead of OpenAI for TTS:
  - Spin up Mimic 3, for example using Docker: https://mycroft-ai.gitbook.io/docs/mycroft-technologies/mimic-tts/mimic-3#docker-image
  - Use `Custom_TTSHelper` instead of `OpenAI_TTSHelper` in `AIHelper`'s `__init__.py`.
  - Change the `format` argument in `AudioHelper.say()` to `wav` (Mimic 3 outputs a WAV, OpenAI an MP3).
- Put your settings in the env (or stick them in some other way, I'm not your mom).
  - This project uses `dotenv`, so all you have to do is create a `.env` file in the root of the repo.
  - Supported settings:
    - `OPENAI_API_KEY` - API key for OpenAI. Required (chat responses use GPT, and TTS uses OpenAI's TTS model by default).
    - `ROBIT_TRANSCRIPTION_ENDPOINT` - Endpoint for a transcription service. Defaults to `http://localhost:4444/transcribe`.
    - `ROBIT_TTS_ENDPOINT` - Endpoint for a TTS service. For use with `Custom_TTSHelper`. Defaults to `http://localhost:59125/api/tts` (the default Mimic 3 URL).
    - `ROBIT_LOG_LEVEL` - Specify output log level. Recommend setting this to `DEBUG`, especially for now.
- Run `robit.py`: `python robit.py` (or some other way. I don't know your life. Calm down.)
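
For a rough idea of how the settings and the Mimic 3 option above fit together, here is a minimal sketch. It is not this repo's code: `load_settings` and `build_tts_request` are hypothetical names made up for illustration. It assumes `dotenv` has already loaded your `.env` into the environment, and that the TTS server takes the text as a plain POST body and answers with WAV bytes, the way Mimic 3's documented curl example does. The source doesn't state a default log level, so the sketch leaves it unset.

```python
import os
import urllib.request

# Defaults copied from the supported settings listed above.
DEFAULT_TRANSCRIPTION_ENDPOINT = "http://localhost:4444/transcribe"
DEFAULT_TTS_ENDPOINT = "http://localhost:59125/api/tts"


def load_settings(env=None):
    """Hypothetical helper: resolve the supported settings from a mapping.

    Pass a dict for testing; by default it reads os.environ, which is
    assumed to already contain whatever dotenv loaded from .env.
    """
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        # The README marks this setting as required.
        raise RuntimeError("OPENAI_API_KEY is required")

    return {
        "OPENAI_API_KEY": api_key,
        "ROBIT_TRANSCRIPTION_ENDPOINT": env.get(
            "ROBIT_TRANSCRIPTION_ENDPOINT", DEFAULT_TRANSCRIPTION_ENDPOINT
        ),
        "ROBIT_TTS_ENDPOINT": env.get("ROBIT_TTS_ENDPOINT", DEFAULT_TTS_ENDPOINT),
        # No default is documented, so this stays None when unset.
        "ROBIT_LOG_LEVEL": env.get("ROBIT_LOG_LEVEL"),
    }


def build_tts_request(text, endpoint):
    """Hypothetical helper: build (but do not send) a POST request.

    Assumption: the server accepts the raw text as the request body,
    as in Mimic 3's curl example.
    """
    return urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),
        method="POST",
    )
```

To actually get audio out of it, you'd pass the request to `urllib.request.urlopen(...)` and read the response bytes, which only works if a TTS server is listening at that endpoint. Building the request separately from sending it keeps the sketch checkable without a running server.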