๐ŸŽ™๏ธ WhispererAI ๐Ÿค–

An intelligent voice-based AI assistant that transcribes speech and answers questions in real time using OpenAI's Whisper and Llama models.


✨ Core Features

  • 🎤 Real-time Audio Recording & Transcription: Capture and convert speech to text instantly.
  • 🧠 Local Speech Recognition: Utilizes the Whisper Base model for efficient on-device processing.
  • 💡 AI-Powered Responses: Leverages Llama (via Ollama) for intelligent question answering.
  • 🔊 High-Quality Audio Processing: Includes noise filtering for clearer audio input.
  • 🚀 CUDA Acceleration: Supports GPU acceleration for faster performance.
  • 💻 Cross-Platform Compatibility: Works on Windows, Linux, and macOS.

๐Ÿ› ๏ธ Tech Stack

  • Programming Language: Python 3.8+
  • Speech-to-Text: OpenAI Whisper (Base model)
  • Language Model: Llama (via Ollama)
  • Core Libraries:
    • PyTorch
    • Transformers
    • SoundFile
  • Audio Backend: FFMPEG
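
For orientation, here is how these pieces fit together to transcribe a clip. This is a minimal sketch, assuming a mono WAV file named question.wav (a hypothetical filename); the actual app.py wires the stack together differently.

    # Transcribe a pre-recorded clip with the Whisper Base model.
    import soundfile as sf
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model="openai/whisper-base")
    audio, rate = sf.read("question.wav")            # samples + sample rate
    result = asr({"raw": audio.astype("float32"),    # pipeline expects float32 mono
                  "sampling_rate": rate})
    print(result["text"])                            # the transcribed question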

📋 Prerequisites

  • ๐Ÿ Python 3.8 or higher.
  • ๐ŸŽฎ CUDA-capable GPU (Optional, but highly recommended for performance).
  • ๐ŸŽž๏ธ FFMPEG installed and accessible in your system's PATH.
  • ๐Ÿฆ™ Ollama installed and running locally.
  • ๐ŸŽง A compatible audio input device (Defaults to HyperX Cloud Stinger Core Wireless on Windows, or system default otherwise).
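
The first three items can be sanity-checked from Python before going any further. A quick sketch (not part of the project):

    import shutil
    import torch

    # FFMPEG must be discoverable on PATH for audio decoding.
    print("ffmpeg:", shutil.which("ffmpeg") or "NOT FOUND")
    # CUDA is optional but strongly recommended.
    print("CUDA available:", torch.cuda.is_available())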

🚀 Getting Started: Installation

  1. Clone the Repository:

    git clone https://github.com/yourusername/WhispererAI.git
    cd WhispererAI
  2. Set Up a Virtual Environment:

    python -m venv venv
    • On Windows:
      venv\Scripts\activate
    • On macOS/Linux:
      source venv/bin/activate
  3. Install Dependencies:

    pip install -r requirements.txt
  4. Install and Run Ollama:

    • Download and install Ollama from ollama.com.
    • Ensure the Ollama service is running.
    • Pull the Llama model you intend to use (e.g., ollama pull llama3.2).
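
To confirm step 4 worked, you can query Ollama's REST API, which listens on port 11434 by default. A minimal sketch using only the standard library:

    import json
    from urllib.request import urlopen

    # /api/tags lists every model the local Ollama instance has pulled.
    with urlopen("http://localhost:11434/api/tags", timeout=5) as resp:
        models = [m["name"] for m in json.load(resp)["models"]]
    print("Installed models:", models)
    if not any(name.startswith("llama3.2") for name in models):
        print("llama3.2 missing; run: ollama pull llama3.2")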

💻 How to Use

  1. Launch the Application:

    python app.py
  2. Interact with the Assistant:

    • Press R to Start Recording your voice.
    • Press S to Stop Recording and process the audio.
    • Press C to Clear the terminal screen.
    • Press Q to Quit the application.
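
When you press S, the recorded audio is transcribed and the resulting text is sent to the language model. The question-answering half of that step can be reproduced on its own against Ollama's default endpoint; this is a sketch of the equivalent raw HTTP call, and app.py may use a client library instead:

    import json
    from urllib.request import Request, urlopen

    def ask_llama(question: str) -> str:
        # Non-streaming generate call against the default Ollama endpoint.
        payload = json.dumps({"model": "llama3.2",
                              "prompt": question,
                              "stream": False}).encode()
        req = Request("http://localhost:11434/api/generate", data=payload,
                      headers={"Content-Type": "application/json"})
        with urlopen(req, timeout=120) as resp:
            return json.load(resp)["response"]

    print(ask_llama("What is the capital of France?"))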

โš™๏ธ Configuration Details

The application comes with the following default settings:

  • Audio Sample Rate: 48kHz
  • Audio Channels: Mono
  • Whisper Model: openai/whisper-base
  • LLM (via Ollama): llama3.2 (Ensure this model is available in your Ollama setup)
  • Processing Device: CUDA (if available), otherwise CPU.
  • Audio Filters:
    • High-pass: 50Hz
    • Low-pass: 15kHz
    • Volume Boost: 1.5x
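
The filter settings translate into a band-pass stage followed by a gain stage. A sketch of that chain using scipy.signal (scipy is assumed here for illustration; the project's own filtering code may differ):

    import numpy as np
    from scipy.signal import butter, sosfilt

    RATE = 48_000  # default sample rate from the settings above

    def clean_audio(samples: np.ndarray) -> np.ndarray:
        # 50 Hz high-pass removes rumble; 15 kHz low-pass removes hiss.
        highpass = butter(4, 50, btype="highpass", fs=RATE, output="sos")
        lowpass = butter(4, 15_000, btype="lowpass", fs=RATE, output="sos")
        filtered = sosfilt(lowpass, sosfilt(highpass, samples))
        # 1.5x volume boost, clipped back into the valid float range.
        return np.clip(filtered * 1.5, -1.0, 1.0)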

🎤 Audio Device Setup

  • Windows: Attempts to automatically detect "Microphone (HyperX Cloud Stinger Core Wireless DTS)".
  • Linux/macOS: Uses the default system audio input device.
  • โ„น๏ธ If the preferred device isn't found, the application will list available audio devices. You may need to modify app.py to specify your device.

๐Ÿค Contributing

Contributions are welcome! If you have improvements or bug fixes, please:

  1. Fork the repository.
  2. Create a new branch (git checkout -b feature/YourAmazingFeature).
  3. Commit your changes (git commit -m 'Add some AmazingFeature').
  4. Push to the branch (git push origin feature/YourAmazingFeature).
  5. Open a Pull Request.

โš ๏ธ Important Notes

  • Ensure Ollama is running with the specified model before starting WhispererAI.
  • Configure your audio input device in app.py if the default settings don't work for your setup.
  • For the best performance, a CUDA-capable GPU is recommended.

๐Ÿ“ License

This project is licensed under the MIT License. See the LICENSE file for details.


Happy Whispering! 💬
