voice-chat-transcriber

Automatically joins voice channels and transcribes speech in real time using local, offline speech recognition. Transcriptions are posted as embeds in a #logs-vc channel with per-user attribution (name and avatar). Supports running multiple bot instances in parallel to cover more than one voice channel at a time.

How It Works

1. A user joins a voice channel.
2. An available bot claims the channel and joins it.
3. Each user's audio is captured separately and streamed through Vosk for recognition.
4. Completed transcriptions are posted to #logs-vc as an embed attributed to that user.
5. When the channel is empty, the bot leaves and frees itself up for other channels.
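
The claim and release steps above can be pictured as a small registry that lets exactly one idle bot own a channel at a time. This is a minimal sketch, not the project's actual code: the class name `ChannelClaims`, its methods, and the bot IDs are all hypothetical, and it assumes every bot instance runs in the same JVM and shares one registry object. Instances running as separate processes would need some external coordinator instead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not taken from the project source.
// Tracks which bot instance has claimed which voice channel.
final class ChannelClaims {
    private final Map<Long, String> botByChannel = new HashMap<>();
    private final Map<String, Long> channelByBot = new HashMap<>();

    /**
     * Claims the channel for the bot. Fails if the channel already has a bot
     * or if the bot is already busy in another channel.
     */
    synchronized boolean tryClaim(long channelId, String botId) {
        if (botByChannel.containsKey(channelId) || channelByBot.containsKey(botId)) {
            return false;
        }
        botByChannel.put(channelId, botId);
        channelByBot.put(botId, channelId);
        return true;
    }

    /** Releases the channel, but only if this bot is the one holding it. */
    synchronized boolean release(long channelId, String botId) {
        if (!botId.equals(botByChannel.get(channelId))) {
            return false;
        }
        botByChannel.remove(channelId);
        channelByBot.remove(botId);
        return true;
    }
}
```

In this sketch a bot would call `tryClaim` when a user joins an unclaimed channel and `release` once the channel is empty, which makes it available for other channels again.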

Logs are kept on Discord for only one week and are then cleaned up automatically.
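
The one-week retention rule can be expressed as a simple age check. The sketch below is illustrative only: `LogRetention`, `RETENTION`, and `isExpired` are hypothetical names, and how the bot actually finds and deletes old messages is not described here.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not taken from the project source.
final class LogRetention {
    // One week, matching the retention period stated above.
    static final Duration RETENTION = Duration.ofDays(7);

    /** Returns true when a log posted at postedAt is strictly older than one week at time now. */
    static boolean isExpired(Instant postedAt, Instant now) {
        return postedAt.plus(RETENTION).isBefore(now);
    }
}
```

Passing the current time in as a parameter, rather than calling `Instant.now()` inside the method, keeps the check easy to test.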

Vosk Model

The bot requires a Vosk speech recognition model. Two recommended options:

| Model | Size | Word Error Rate | Notes |
| --- | --- | --- | --- |
| `vosk-model-small-en-us-0.15` | ~40 MB | ~9.85% | Fast, low resource usage |
| `vosk-model-en-us-0.22` | ~1.8 GB | ~5.69% | Higher accuracy, recommended |

Download a model from alphacephei.com/vosk/models and extract it to a model/ directory in the project root.
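
A quick sanity check before starting the bot can catch a missing or empty model/ directory. The sketch below is an assumption-laden illustration: `ModelCheck` and `looksExtracted` are hypothetical names, and the check only confirms that the directory exists and is not empty, not that its contents are a valid Vosk model. Model archives often unpack into a folder named after the model, so the files may need to be moved so that they sit directly under model/.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper for illustration only; not taken from the project source.
final class ModelCheck {
    /** Returns true if modelDir is an existing directory with at least one entry. */
    static boolean looksExtracted(Path modelDir) {
        if (!Files.isDirectory(modelDir)) {
            return false;
        }
        // Files.list must be closed, hence the try-with-resources.
        try (Stream<Path> entries = Files.list(modelDir)) {
            return entries.findAny().isPresent();
        } catch (IOException e) {
            // Treat an unreadable directory the same as a missing model.
            return false;
        }
    }
}
```

For example, `ModelCheck.looksExtracted(Path.of("model"))` could be called at startup to print a helpful message instead of failing later when the model is loaded.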
