Insanely Fast Whisper: A High-Performance Open-Source Speech Transcription Tool

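📅 2026-03-25 15:11 · Nav Toor · Artificial Intelligence · 2 min read · 1396 characters · Score: 80
Tags: Whisper, AI, Open Source, Speech Transcription, Flash Attention

📌 One-sentence summary: An open-source tool built on Whisper that provides fast local audio transcription, which the post presents as faster than commercial cloud API services.

📝 Detailed summary: This post introduces "Insanely Fast Whisper", an open-source project for transcribing audio locally. With Flash Attention 2 it runs up to 19x faster than unoptimized Whisper, and the post positions it as faster than commercial cloud APIs from OpenAI, Google, and AWS. It supports speaker diarization and translation, runs on NVIDIA GPUs and Apple Silicon, and offers a low-cost, privacy-friendly alternative to paid transcription services.

📊 Article info: AI score: 80 · Source: Nav Toor (@heynavtoor) · Author: Nav Toor · Category: Artificial Intelligence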

🚨 OpenAI charges $0.006/minute. Google charges $0.024. AWS charges $0.024. Someone just open sourced a tool that does it for $0. And it's faster than all of them.

It's called Insanely Fast Whisper. And that's not hype. That's the benchmark.

150 minutes of audio. 98 seconds to transcribe. On your own machine. No API key. No cloud. No per-minute billing.
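For scale, here is a rough sketch of what that same 150-minute job would cost at the per-minute API rates quoted at the top of the post. The rates are taken from the post as-is and not checked against current price lists; the helper name `api_cost` is made up for illustration.

```python
# Per-minute rates as quoted in the post (not verified against current pricing).
RATES_USD_PER_MIN = {"OpenAI": 0.006, "Google": 0.024, "AWS": 0.024}

AUDIO_MINUTES = 150  # length of the benchmark audio mentioned in the post


def api_cost(minutes: float, rate_per_min: float) -> float:
    """Cost in USD of transcribing `minutes` of audio, rounded to cents."""
    return round(minutes * rate_per_min, 2)


costs = {name: api_cost(AUDIO_MINUTES, rate) for name, rate in RATES_USD_PER_MIN.items()}

for name, usd in costs.items():
    print(f"{name}: ${usd:.2f} for {AUDIO_MINUTES} minutes")
```

At these rates a single 150-minute file costs under four dollars, so the savings from running locally only add up at volume.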

Here's what the numbers look like:

→ Whisper Large v3 + Flash Attention 2: 150 min of audio in 98 seconds

→ Distil Whisper + Flash Attention 2: 150 min in 78 seconds

→ Standard Whisper without optimization: 31 minutes for the same job

→ That's a 19x speedup. Same model. Same accuracy. Just faster.
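The 19x figure follows directly from the numbers listed above. A quick check, using only the timings quoted in the post (variable names are illustrative):

```python
AUDIO_S = 150 * 60        # 150 minutes of audio, in seconds
BASELINE_S = 31 * 60      # standard Whisper without optimization: 31 minutes
LARGE_V3_FA2_S = 98       # Whisper Large v3 + Flash Attention 2
DISTIL_FA2_S = 78         # Distil Whisper + Flash Attention 2 (a different, smaller model)

# Speedup of the optimized run over the unoptimized baseline.
speedup_large = BASELINE_S / LARGE_V3_FA2_S
speedup_distil = BASELINE_S / DISTIL_FA2_S

# How many seconds of audio are processed per second of wall-clock time.
realtime_large = AUDIO_S / LARGE_V3_FA2_S
realtime_distil = AUDIO_S / DISTIL_FA2_S

print(f"Large v3 + FA2: {speedup_large:.1f}x vs baseline, {realtime_large:.1f}x real time")
print(f"Distil + FA2:   {speedup_distil:.1f}x vs baseline, {realtime_distil:.1f}x real time")
```

Note that "same model, same accuracy" applies to the Large v3 comparison; the Distil Whisper row uses a distilled model, so its larger speedup is not a like-for-like comparison.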

Here's what it does:

→ One command to transcribe any audio file or URL

→ Speaker diarization — knows WHO said WHAT

→ Transcription AND translation to other languages

→ Runs on NVIDIA GPUs and Mac (Apple Silicon)

→ Flash Attention 2 for maximum speed

→ Clean JSON output with timestamps

→ Works with every Whisper model variant
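To give a sense of what sits behind the features above, here is a minimal sketch of chunked, batched Whisper inference through the Hugging Face `transformers` pipeline with Flash Attention 2 enabled. It is not the tool's own source code. The helper names `build_pipeline_kwargs` and `transcribe`, the batch size, and the chunk length are assumptions chosen for illustration, and running `transcribe` requires `torch` and `transformers` (plus `flash-attn` on an NVIDIA GPU).

```python
from typing import Any


def build_pipeline_kwargs(device: str = "cuda:0", use_flash: bool = True) -> dict[str, Any]:
    """Keyword arguments for transformers.pipeline(); needs no heavy imports."""
    kwargs: dict[str, Any] = {
        "task": "automatic-speech-recognition",
        "model": "openai/whisper-large-v3",
        "device": device,
    }
    if use_flash:
        # Flash Attention 2 needs the separate flash-attn package and a
        # supported NVIDIA GPU; pass use_flash=False on Apple Silicon.
        kwargs["model_kwargs"] = {"attn_implementation": "flash_attention_2"}
    return kwargs


def transcribe(path: str, device: str = "cuda:0", use_flash: bool = True) -> dict[str, Any]:
    """Transcribe one audio file and return the text plus timestamped chunks."""
    # Imported lazily so the helper above works without torch installed.
    import torch
    from transformers import pipeline

    pipe = pipeline(torch_dtype=torch.float16, **build_pipeline_kwargs(device, use_flash))
    # Long audio is split into 30-second chunks and decoded in batches.
    # batch_size=24 is an illustrative value; lower it if the GPU runs out of memory.
    return pipe(path, chunk_length_s=30, batch_size=24, return_timestamps=True)
```

On a Mac, something like `transcribe("meeting.mp3", device="mps", use_flash=False)` would be the equivalent call (the file name is hypothetical). The result is a dictionary with the full `text` and a list of `chunks`, each carrying a `timestamp` pair, which maps naturally onto the JSON output mentioned in the list.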

Here's the wildest part: Otter.ai charges $100/year. Rev charges $1.50/minute. Descript charges $24/month. Enterprise transcription contracts cost thousands.

Podcasters, journalists, researchers, lawyers, content creators — anyone still paying for transcription is lighting money on fire.

8.8K GitHub stars. 633 forks. MIT License.

100% Open Source.

(Link in the comments)

Published: 2026-03-25 15:11:35 · Indexed: 2026-03-25 16:00:43