Squawk - Real-Time Local Text-to-Speech with AI v0.0.4

Supported Bit Versions: 64-bit
Source Code URL: https://github.com/occ-ai/obs-squawk
Minimum OBS Studio Version: 30.1.0
Supported Platforms: Windows, Mac OS X, Linux

The OBS Squawk plugin adds powerful voice generation capabilities to OBS by leveraging sherpa-onnx. With this plugin, you can generate speech on the fly and in real time inside OBS, without any external services or network access.

If you like this work, leave a review or ⭐ the GitHub repo, and consider supporting our work on GitHub, Patreon or OpenCollective. Learn about AI for streaming and content on OCC AI.
Need help? Live support on https://discord.gg/Z6CuJQH4S2

Features

  • OBS Audio Source: Seamlessly integrates with OBS as an audio source.
  • Sherpa-onnx: Uses sherpa-onnx for high-quality voice synthesis and cloning. Everything is built in, with no reliance on external software!
  • Cross-Platform: Works the same way on every operating system that OBS supports: Windows, Mac and Linux.
  • Extensive Voice Library: Access to a huge library of pre-trained voices for dozens of languages.
  • Automated Generation: Monitor an OBS source or a text file; when the content changes, speech is generated. Well suited to automation and to live Transcription-Translation-Generation use cases.
  • Real-Time & Lightweight: The VITS architecture used for speech generation is resource efficient and runs in real time on a modest CPU.

Usage

  1. Open OBS and add a new Source: select "Squawk Text to Speech" from the list of available audio sources.
  2. Configure the source settings:
    • Select Voice: Choose a pre-trained voice package from the library. Models will be downloaded as necessary.
    • Select the Speaker ID: some voice packages have multiple speakers (even hundreds).
    • You can generate speech directly from the plugin settings by clicking the button.
    • Set up the monitoring of a Text source or a file.
  3. Send text to the monitored text source or text file to produce the speech audio.
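Step 3 can be driven from a script when a text file is being monitored. The following is a minimal sketch, not part of the plugin: the file name `squawk_input.txt` and the helper `send_text` are hypothetical, and it assumes the plugin speaks whatever the file contains once its content changes.

```python
from pathlib import Path

# Hypothetical path: point this at the file selected in the source settings.
MONITORED_FILE = Path("squawk_input.txt")


def send_text(text: str, path: Path = MONITORED_FILE) -> None:
    """Overwrite the monitored file with a single line of text."""
    # Collapse newlines and repeated whitespace so the file holds one clean line.
    line = " ".join(text.split())
    path.write_text(line + "\n", encoding="utf-8")
```

Calling `send_text("Hello from a script!")` overwrites the file with that one line. Overwriting rather than appending keeps the file down to the latest message, which is a design choice of this sketch and not a documented requirement of the plugin.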
Author
royshilkrot

Latest updates

  1. v0.0.4 - debounce, line-by-line, fixes

    In this release: Adding "Debounce" to prevent repeated speech on rapid changes. Adding...
  2. v0.0.3 - Bug and crash fixes, Speed control!

    In this release: - Fix speaker ID bugs - Add speed control - Prepare for UI translations...
  3. v0.0.2 - Critical bugfixes

    Fixing critical bugs reported by early users. Thank you!
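
The "Debounce" mentioned for v0.0.4 refers to a general technique: wait until the input has stopped changing for a short period before acting on it. The sketch below illustrates that idea only. It is not the plugin's implementation, and the class name `Debouncer`, its `delay` parameter, and the polling style are all assumptions made for the example.

```python
import time
from typing import Callable, Optional


class Debouncer:
    """Report a value only after it has stayed unchanged for `delay` seconds."""

    def __init__(self, delay: float, clock: Callable[[], float] = time.monotonic):
        self.delay = delay
        self.clock = clock
        self._pending: Optional[str] = None
        self._changed_at = 0.0
        self._last_emitted: Optional[str] = None

    def update(self, value: str) -> Optional[str]:
        """Feed the latest observed value; return it once it has settled."""
        now = self.clock()
        if value != self._pending:
            # The text changed: remember it and restart the quiet-period timer.
            self._pending = value
            self._changed_at = now
            return None
        if value != self._last_emitted and now - self._changed_at >= self.delay:
            # Unchanged for long enough and not yet reported: emit it once.
            self._last_emitted = value
            return value
        return None
```

A caller would poll the monitored text, pass each reading to `update`, and generate speech only when the return value is not `None`. Rapid intermediate edits such as "he", "hel", "hello" then produce a single utterance.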

Latest reviews

This has been doing a great job of reading my transcriptions. Very pleased, though still working through some quirks where last said things keep getting redisplayed/spoken.
You won't find anything else better than this ⭐
great.