arXiv 2410.15608
Moonshine: Speech Recognition for Live Transcription and Voice Commands
By Nat Jeffries, Evan King, et al.
Published 2024-10-21
Wiki summary
This paper introduces Moonshine, a family of speech recognition models optimized for live transcription and voice command processing. Moonshine uses an encoder-decoder transformer architecture and employs Rotary Position Embedding (RoPE) instead of traditional absolute position embeddings. The models are trained on speech segments of varying lengths without zero-padding, leading to greater efficiency…
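
To make the RoPE idea concrete, here is a minimal sketch in plain Python. It is not Moonshine's implementation. The function names `rope` and `dot`, the interleaved pairing of dimensions, and the base of 10000 are assumptions taken from the common RoPE formulation, not from this summary. The sketch rotates each pair of vector components by an angle proportional to the token position, and then checks the property that motivates RoPE: the attention score between a query and a key depends only on their relative offset, not on their absolute positions.

```python
import math


def rope(vec, position, base=10000.0):
    """Apply a rotary position embedding to one vector (illustrative only).

    Consecutive pairs (vec[i], vec[i + 1]) are rotated by the angle
    position * base ** (-i / d), where d is the vector dimension.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x0, x1 = vec[i], vec[i + 1]
        # Standard 2-D rotation of the pair by theta.
        out.extend([x0 * c - x1 * s, x0 * s + x1 * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an unscaled attention score."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical query and key vectors, chosen only for illustration.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.75, -0.5, 1.0]

# Both pairs of positions are two steps apart, so the scores should agree
# up to floating-point error.
score_near_start = dot(rope(q, 3), rope(k, 1))
score_further_on = dot(rope(q, 10), rope(k, 8))
```

Because the score depends only on the offset between positions, a segment of any length is encoded the same way wherever it starts, which fits naturally with training on variable-length inputs.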