discord2sum: local voice transcription bot delivers meeting minutes to Telegram, Slack

Open-source Discord bot transcribes voice calls locally using Whisper, summarizes with local or cloud LLMs, and routes minutes to Telegram or Slack. Audio never has to leave your server. Built for teams who need a paper trail without changing how they work.

What it does

discord2sum is a Discord bot that turns voice channel conversations into structured meeting minutes. When participants leave a voice channel, the bot delivers a summary covering what was discussed, decisions made, action items, and who was present.

The bot uses local Whisper-based speech-to-text processing. Audio stays on your server. For summarization, you can route transcripts to a hosted LLM (OpenAI) or keep everything local with Ollama. Minutes can be delivered to Telegram (default), Slack via webhooks, or any HTTP endpoint as JSON.
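To make the delivery step concrete, here is a minimal sketch of how minutes might be rendered for a Slack incoming webhook or posted to a generic HTTP endpoint as JSON. It is not taken from the discord2sum codebase: the Minutes class, its field names, and the helper functions are hypothetical, chosen to mirror the summary contents described above. The only external fact relied on is that Slack incoming webhooks accept a JSON body with a "text" field.

```python
import json
import urllib.request
from dataclasses import dataclass, field
from typing import List


@dataclass
class Minutes:
    """Hypothetical shape for a meeting summary (not discord2sum's schema)."""
    channel: str
    participants: List[str]
    discussed: List[str] = field(default_factory=list)
    decisions: List[str] = field(default_factory=list)
    action_items: List[str] = field(default_factory=list)


def _section(title: str, items: List[str]) -> str:
    # Render one titled bullet list; show a placeholder when the list is empty.
    body = "\n".join(f"- {item}" for item in items) or "- (none)"
    return f"*{title}*\n{body}"


def to_slack_payload(m: Minutes) -> dict:
    """Build the {"text": ...} body that Slack incoming webhooks accept."""
    text = "\n\n".join([
        f"Minutes for #{m.channel}",
        _section("Present", m.participants),
        _section("Discussed", m.discussed),
        _section("Decisions", m.decisions),
        _section("Action items", m.action_items),
    ])
    return {"text": text}


def post_json(url: str, payload: dict, timeout: float = 10.0) -> int:
    """POST a JSON payload to any HTTP endpoint and return the status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

In this sketch, calling post_json(webhook_url, to_slack_payload(minutes)) would send the summary to Slack, while passing a plain dict of the same fields to post_json would cover the generic JSON endpoint case.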

The privacy angle

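Most call transcription products upload audio to cloud services. discord2sum transcribes audio locally and applies configurable retention limits. Pair it with Ollama for summarization and nothing leaves your server; choose the hosted OpenAI option and the transcript text, though not the audio, goes to a third-party API. This matters for teams handling sensitive discussions or operating under data residency requirements. The trade-off: you're responsible for hosting and managing the pipeline yourself.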
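A retention limit of this kind can be as simple as deleting recordings once they pass a maximum age. The sketch below illustrates the idea only; the function name, the directory layout, and the age-based policy are assumptions, not a description of how discord2sum implements retention.

```python
import time
from pathlib import Path
from typing import List, Optional


def purge_expired(directory: Path, max_age_seconds: float,
                  now: Optional[float] = None) -> List[Path]:
    """Delete files older than max_age_seconds and return the removed paths.

    `now` can be injected so the policy is testable without waiting.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(directory.iterdir()):
        # Compare each file's last-modified time against the cutoff.
        if path.is_file() and now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path)
    return removed
```

Run on a schedule (for example, after each delivered summary), a call such as purge_expired(Path("recordings"), 24 * 3600) would keep at most a day of audio on disk; the "recordings" directory name is a placeholder.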

Market context

Discord has shifted from gaming platform to workplace tool, particularly among APAC startups using voice channels for agile standups. The bot ecosystem around workplace Discord is fragmented: mostly hobbyist projects with low adoption (most under 30 GitHub stars), released under MIT or GPL licenses, with no enterprise SLAs.

Similar projects rely on commercial APIs like Deepgram, Azure Speech, or Gemini for transcription and summarization. API costs scale with usage. Self-hosted alternatives exist but require infrastructure and maintenance overhead.

What to watch

Accuracy in multi-speaker, noisy environments remains a challenge without premium APIs. Discord's Terms of Service around commercial bot use create compliance questions for enterprise deployment. The project has minimal traction (no recent commits or releases as of February 2026) and lacks the support infrastructure enterprises expect.

The approach is intentionally minimal: no calendar integration, no meeting invites, no UI. Just one job: when a call ends, send a summary. Whether that's enough depends on whether your team values simplicity over feature depth.

The pattern

This sits in a broader category of self-hosted automation pipelines, often built with n8n or similar tools, connecting Discord voice to Whisper transcription to webhook delivery. The implementation details matter less than the question: do you trust a hobbyist bot with your team's discussions, or do you need vendor accountability?
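The pattern itself is small enough to sketch in a few lines. The version below wires the three stages together as injected callables, so a local Whisper model, an Ollama or OpenAI summarizer, and a Telegram or webhook sender could each be swapped in. All names here are hypothetical and the stages are stubs; it shows the shape of the pipeline, not discord2sum's actual code.

```python
from typing import Callable

# Each stage is a plain function so backends can be swapped independently.
Transcribe = Callable[[str], str]   # audio file path -> transcript text
Summarize = Callable[[str], str]    # transcript text -> meeting minutes
Deliver = Callable[[str], None]     # meeting minutes -> side effect (send)


def run_pipeline(audio_path: str, transcribe: Transcribe,
                 summarize: Summarize, deliver: Deliver) -> str:
    """Run voice -> transcript -> minutes -> delivery for one finished call."""
    transcript = transcribe(audio_path)
    if not transcript.strip():
        # Nothing was said (or nothing was recognized): skip the summary.
        return ""
    minutes = summarize(transcript)
    deliver(minutes)
    return minutes


if __name__ == "__main__":
    # Stub stages standing in for Whisper, an LLM, and a webhook sender.
    sent = []
    run_pipeline(
        "call.wav",
        transcribe=lambda path: "We agreed to ship on Friday.",
        summarize=lambda text: "Decision: " + text,
        deliver=sent.append,
    )
    print(sent)
```

Keeping the stages as separate callables also makes the trust question easier to reason about: the only component that touches a third party is whichever function you pass in for summarization or delivery.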