

CMU Pocketsphinx is an open source speech recognition toolkit that also includes tools for acoustic modeling. It is designed for low-resource platforms, which makes it a candidate for applications on mobile devices as well as in server environments.
This toolkit is intended for software companies and developers who need to implement speech-to-text functionality. It provides prebuilt models for several languages (US English, UK English, French, Mandarin, German, Dutch, and Russian) and supports building custom models for other languages.
Beyond basic recognition, the toolkit includes tools for keyword spotting and pronunciation evaluation. Because it is released under a BSD-like license, it may be used in commercial products, and commercial support is available.
Buyers and developers should confirm they have the technical expertise to manage acoustic and language models, as the toolkit requires these to function and does not include built-in format converters for encoded audio files.
A speech recognition engine designed for operation on low-resource platforms.
Tools for acoustic modeling that support training the system for specific speech patterns.
A mode used to search for and detect specific keyphrases in a continuous speech stream.
Prebuilt models for several languages, plus support for creating models for additional languages.
Tools designed to evaluate how accurately spoken words are pronounced.
Features that support the alignment of audio data with text.
Implementing speech recognition on Android or iOS devices where system resources are limited.
Using keyword spotting to trigger specific application actions when a keyphrase is detected.
Deploying speech recognition on Unix or Windows servers for batch or live processing.
Building acoustic and language models for specialized domains or other languages.
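The keyword-spotting use case above amounts to mapping each detected keyphrase to an application action. The following is a minimal sketch of that dispatch step only, in Python. It does not call any Pocketsphinx API: the `KeyphraseDispatcher` class, the `run` helper, and the list of detected phrases are hypothetical stand-ins for whatever the recognizer reports in keyword-spotting mode.

```python
from typing import Callable, Dict, Iterable, List, Optional


def normalize(phrase: str) -> str:
    """Lowercase a phrase and collapse repeated whitespace."""
    return " ".join(phrase.lower().split())


class KeyphraseDispatcher:
    """Hypothetical helper: maps keyphrases to application callbacks."""

    def __init__(self) -> None:
        self._actions: Dict[str, Callable[[], str]] = {}

    def register(self, keyphrase: str, action: Callable[[], str]) -> None:
        # Store the callback under the normalized form of the keyphrase.
        self._actions[normalize(keyphrase)] = action

    def handle(self, detected: str) -> Optional[str]:
        # Look up the detected phrase; unknown phrases are ignored.
        action = self._actions.get(normalize(detected))
        return action() if action is not None else None


def run(dispatcher: KeyphraseDispatcher, detections: Iterable[str]) -> List[str]:
    """Feed a stream of detected phrases through the dispatcher.

    `detections` is an assumption: it stands in for the keyphrases a
    recognizer would report while listening to continuous speech.
    """
    results: List[str] = []
    for phrase in detections:
        outcome = dispatcher.handle(phrase)
        if outcome is not None:
            results.append(outcome)
    return results
```

In an application, each callback registered with `register` would start the relevant action (for example, opening a voice command session), and the loop in `run` would be driven by the recognizer's output rather than a fixed list.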
Pricing information was not available. The toolkit itself is open source; buyers should confirm current pricing for commercial support on the vendor website.
It runs on Unix, Windows, iOS, Android, and various hardware platforms.
Yes, it is released under a BSD-like license that permits commercial distribution.
Yes, it provides prebuilt models for French, Mandarin, German, Dutch, and Russian, and supports the building of models for other languages.
No, the decoders do not include format converters. Audio must be converted to PCM format (typically 16 kHz, 16-bit, little-endian, mono) using a tool such as ffmpeg before processing.
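As an illustration of the audio requirement, here is a minimal Python sketch that builds an ffmpeg command for the conversion and checks whether a WAV file already matches the typical target format. The function names and constants are hypothetical. The ffmpeg options used (`-i`, `-ar`, `-ac`, `-c:a pcm_s16le`) are standard ffmpeg flags, but the exact format a given model expects should be confirmed against its documentation.

```python
import io
import wave
from typing import List

# Typical target format from the text: 16 kHz, 16-bit, mono PCM.
TARGET_RATE = 16000      # samples per second
TARGET_WIDTH = 2         # bytes per sample (16-bit)
TARGET_CHANNELS = 1      # mono


def ffmpeg_convert_command(src: str, dst: str) -> List[str]:
    """Build an ffmpeg argument list that writes 16 kHz, 16-bit, mono WAV.

    The list can be passed to subprocess.run(); ffmpeg must be installed.
    """
    return [
        "ffmpeg",
        "-i", src,                    # input file in any format ffmpeg can read
        "-ar", str(TARGET_RATE),      # resample to 16 kHz
        "-ac", str(TARGET_CHANNELS),  # downmix to one channel
        "-c:a", "pcm_s16le",          # signed 16-bit little-endian PCM
        dst,
    ]


def is_decoder_ready(wav_file) -> bool:
    """Return True if a PCM WAV file is 16 kHz, 16-bit, mono.

    `wav_file` may be a path or a binary file-like object. The standard
    `wave` module raises wave.Error for files it cannot parse.
    """
    with wave.open(wav_file, "rb") as w:
        return (
            w.getframerate() == TARGET_RATE
            and w.getsampwidth() == TARGET_WIDTH
            and w.getnchannels() == TARGET_CHANNELS
        )
```

A caller might run `is_decoder_ready("input.wav")` first and invoke ffmpeg through `subprocess.run(ffmpeg_convert_command("input.mp3", "input.wav"), check=True)` only when conversion is needed.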
Source category: Software Development
Source subcategory: Machine Learning Framework
CMU Pocketsphinx is an open source speech recognition toolkit for software developers. It supports mobile and server workflows, including keyword spotting and acoustic modeling. Buyers should note that audio format conversion requires an external tool such as ffmpeg, and that model training requires technical expertise.