
Build a ChatGPT based Voice Assistant

ohdarling
5 min read · Nov 3, 2024



This project started when I saw FOLOTOY on Twitter sharing a voice assistant built with a large language model, along with the open-source framework it used. I had been wanting to try out LLM-related libraries and code, so I followed along and set up a demo application.

Following that introduction, I used the same framework to build a local voice assistant. However, since I don't have a GPU and the smaller language models performed poorly in my tests, I used OpenAI's API for the LLM part.

Overall Introduction

This voice assistant primarily uses the following frameworks and services:

  • snowboy: Used for wake-word detection; it can also handle recording and supports voice activity detection (VAD).
  • faster-whisper: Used for speech-to-text conversion. It is a reimplementation of OpenAI’s Whisper model that runs significantly faster than the official version.
  • SpeechRecognition: Used for recording. After snowboy detects the wake word, this library records the subsequent speech and passes it to Whisper for transcription.
  • EmotiVoice: Used for text-to-speech. After the user’s question is answered by GPT through the OpenAI API, EmotiVoice converts the text response to audio and plays it back. A sketch of how these pieces fit together follows the list.
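To make the flow concrete, here is a minimal sketch of how these pieces could be wired together, assuming snowboy provides the wake-word callback, EmotiVoice is running as a local HTTP service, and the OpenAI API key is set in the environment. The http://localhost:8000/tts endpoint, the model names, and the file paths are illustrative placeholders rather than the exact setup from the original project.

import requests
import speech_recognition as sr
from faster_whisper import WhisperModel
from openai import OpenAI

whisper = WhisperModel("base", device="cpu", compute_type="int8")
client = OpenAI()  # reads OPENAI_API_KEY from the environment

def record_question() -> str:
    """Record the user's question after the wake word and transcribe it."""
    recognizer = sr.Recognizer()
    with sr.Microphone(sample_rate=16000) as source:
        audio = recognizer.listen(source, phrase_time_limit=10)
    with open("question.wav", "wb") as f:
        f.write(audio.get_wav_data())
    segments, _ = whisper.transcribe("question.wav")
    return "".join(segment.text for segment in segments)

def ask_gpt(question: str) -> str:
    """Send the transcribed question to the OpenAI chat API."""
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

def speak(text: str) -> None:
    """Send the reply to a locally running EmotiVoice service (placeholder URL)."""
    resp = requests.post("http://localhost:8000/tts", json={"text": text})
    with open("answer.wav", "wb") as f:
        f.write(resp.content)
    # Playback can then be done with any audio library, e.g. pydub or simpleaudio.

def on_wake_word() -> None:
    """Callback to register with snowboy's hotword detector."""
    answer = ask_gpt(record_question())
    speak(answer)

# snowboy wiring (the exact import path depends on how snowboy was built/installed):
# detector = snowboydecoder.HotwordDetector("resources/model.pmdl", sensitivity=0.5)
# detector.start(detected_callback=on_wake_word, sleep_time=0.03)

With a wiring like this, the assistant sits idle until snowboy fires the callback, then runs one record, transcribe, query, and speak round trip per question.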
