Maria Marcano: Speech Recognition with PocketSphinx

Sunday, September 9, 2012

This post brings together information from different sources about speech recognition, gives a brief overview of the CMUSphinx project, and shows how to get started with PocketSphinx on Windows.

The CMUSphinx project is an open source speech recognition project developed at Carnegie Mellon University. It consists of various tools used to build speech applications:

CMUclmtk — language model tools
Sphinxtrain — acoustic model training tools

It also provides the following recognizers (decoders):

Pocketsphinx: A fast recognizer written in C and designed for real-time use. It supports desktop applications and mobile devices, and it depends on the Sphinxbase library.
Sphinx3: Speech recognizer intended for researchers.
Sphinx4: Speech recognizer written in Java.

Let’s look at how HMM-based speech recognition works. It first learns the characteristics (or parameters) of a set of sound units, and then uses what it has learned about the units to find the most probable sequence of sound units for a given speech signal. The process of learning about the sound units is called training. The process of using the knowledge acquired to deduce the most probable sequence of units in a given signal is called decoding, or simply recognition.
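
To make training and decoding concrete, here is a minimal Python sketch of a toy HMM. It is not how Pocketsphinx is implemented: real recognizers model continuous acoustic features, whereas this example assumes discrete, made-up observation symbols ("a", "b", "c") and two sound units (N and OW, the phones of the word "no"). The function names and the hand-labeled training data are hypothetical. Training is done by counting over labeled sequences, and decoding uses the Viterbi algorithm.

```python
from collections import Counter, defaultdict
import math

# Hypothetical labeled data: each item is (sound unit, observed symbol).
TRAINING = [
    [("N", "a"), ("OW", "b"), ("OW", "b")],
    [("N", "a"), ("N", "a"), ("OW", "b")],
    [("N", "a"), ("OW", "c"), ("OW", "b")],
]
UNITS = ["N", "OW"]
SYMBOLS = ["a", "b", "c"]


def train(tagged_sequences):
    """Training: estimate start, transition and emission counts."""
    start = Counter()
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for seq in tagged_sequences:
        prev = None
        for unit, obs in seq:
            if prev is None:
                start[unit] += 1
            else:
                trans[prev][unit] += 1
            emit[unit][obs] += 1
            prev = unit
    return start, trans, emit


def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability, so unseen events stay possible."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))


def decode(observations, units, symbols, start, trans, emit):
    """Decoding (Viterbi): most probable unit sequence for the observations."""
    n_u, n_s = len(units), len(symbols)
    # best[u] holds (log probability, path) of the best path ending in unit u.
    best = {
        u: (log_prob(start, u, n_u) + log_prob(emit[u], observations[0], n_s), [u])
        for u in units
    }
    for obs in observations[1:]:
        new_best = {}
        for u in units:
            score, path = max(
                ((best[p][0] + log_prob(trans[p], u, n_u), best[p][1]) for p in units),
                key=lambda item: item[0],
            )
            new_best[u] = (score + log_prob(emit[u], obs, n_s), path + [u])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


start, trans, emit = train(TRAINING)
print(decode(["a", "b", "b"], UNITS, SYMBOLS, start, trans, emit))
# ['N', 'OW', 'OW']
```

The split into train and decode mirrors the two phases described above: the counts are the learned parameters, and decode only reads them.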

Set up Pocketsphinx on Windows

Environment: Windows 7 and Visual Studio 2012, sphinxbase-0.7, pocketsphinx-0.7

Name the folders sphinxbase and pocketsphinx and keep them in the same parent directory, because the pocketsphinx project has external dependencies that use relative paths such as “..\..\..\sphinxbase\include\sphinxbase\ad.h”.

To test the installation, let's run pocketsphinx_continuous.exe. This tool runs speech recognition in two modes: continuous listening from a microphone and continuous file transcription. Running it requires the following:

Copy sphinxbase.dll to the build folder, for example C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\bin\Debug.
The parameter -hmm: the directory containing the acoustic model files.
The parameter -lm: the word trigram language model input file.
The parameter -dict: the main pronunciation dictionary (lexicon) input file.

The following command runs it with the models included in the project:

pocketsphinx_continuous.exe -hmm C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\hmm\en_US\hub4wsj_sc_8k ^
-dict C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\lm\en_US\cmu07a.dic ^
-lm C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx\model\lm\en_US\wsj0vp.5000.DMP

This is the output of me saying “no no no”: [screenshot of the recognizer output]

I will take a closer look at the project to evaluate the accuracy of the recognition more thoroughly.
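
The decoder reports an error when the acoustic model or the language model is not specified, so it can help to assemble the command in a small script that always passes -hmm, -dict and -lm and checks that the paths exist before launching. This is only a sketch: the helper names are hypothetical, and the paths are the ones used in this post, so adjust them to your own checkout.

```python
import os
import subprocess
from pathlib import PureWindowsPath

# Location used in this post; change it to match your machine.
POCKETSPHINX = PureWindowsPath(r"C:\Project\SpeechRecognition\CMUSphinx\pocketsphinx")


def build_command(root=POCKETSPHINX):
    """Return the pocketsphinx_continuous command line as an argument list."""
    model = root / "model"
    return [
        str(root / "bin" / "Debug" / "pocketsphinx_continuous.exe"),
        "-hmm", str(model / "hmm" / "en_US" / "hub4wsj_sc_8k"),
        "-dict", str(model / "lm" / "en_US" / "cmu07a.dic"),
        "-lm", str(model / "lm" / "en_US" / "wsj0vp.5000.DMP"),
    ]


def missing_paths(command):
    """List the executable and model paths that do not exist on this machine."""
    paths = [command[0]] + command[2::2]  # the executable plus each flag's value
    return [p for p in paths if not os.path.exists(p)]


if __name__ == "__main__":
    cmd = build_command()
    missing = missing_paths(cmd)
    if missing:
        print("Missing files or folders:", *missing, sep="\n  ")
    else:
        # Starts the recognizer with the same arguments as the command above.
        subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems if any of the paths contain spaces.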

Terminology

Language model: assigns a probability P(w1, ..., wm) to a sequence of m words by means of a probability distribution. Language modeling is used in many natural language processing applications such as speech recognition, machine translation, part-of-speech tagging, parsing and information retrieval. In speech recognition and in data compression, such a model tries to capture the properties of a language and to predict the next word in a speech sequence.
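
As a rough illustration of how such a probability can be computed, the sketch below builds a bigram model, where each word depends only on the previous one. This is a simplification chosen for brevity (the model passed to -lm above is a trigram model), the three-sentence corpus is made up, and the estimates are plain unsmoothed counts, so any unseen word pair gets probability zero.

```python
from collections import Counter

# Hypothetical toy corpus.
CORPUS = ["no no no", "yes no", "no yes"]


def train_bigram(sentences):
    """Count contexts and word pairs, with <s> and </s> as sentence markers."""
    contexts, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(words[:-1])
        bigrams.update(zip(words, words[1:]))
    return contexts, bigrams


def sentence_probability(sentence, contexts, bigrams):
    """P(w1, ..., wm) as a product of P(word | previous word)."""
    words = ["<s>"] + sentence.split() + ["</s>"]
    probability = 1.0
    for prev, word in zip(words, words[1:]):
        if contexts[prev] == 0:
            return 0.0  # context never seen in training
        probability *= bigrams[(prev, word)] / contexts[prev]
    return probability


contexts, bigrams = train_bigram(CORPUS)
print(sentence_probability("no no no", contexts, bigrams))
print(sentence_probability("yes yes", contexts, bigrams))  # 0.0, "yes yes" never occurs
```

With this corpus, "no no no" scores higher than "no yes no", which is the kind of preference a recognizer uses to choose between acoustically similar hypotheses.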

HMM (hidden Markov model): a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states.


Comments:

Unknown, October 19, 2012 at 8:57 PM
I run this example, but I get error :(

Anonymous, October 22, 2012 at 8:53 AM
Great tutorial!! it worked perfect. Thank you

Anonymous, October 22, 2012 at 8:55 AM
Do you have a tutorial about how to compile the "hello world" example on Windows?

★圣龙战圣の小心我射你★, November 27, 2012 at 3:20 AM
Getting this...
ERROR: "acmod.c", line 84: Acoustic model definition is not specified neither with -medef option nor with -hmm
:(

Unknown, February 26, 2013 at 1:50 PM
getting this error plz help me out

READY....
ERROR: "pocketsphinx.c", line 625: No search module is selected, did you forget
to specify a language model or grammar?
FATAL_ERROR: "continuous.c", line 274: Failed to start utterance

Unknown, February 27, 2013 at 9:04 AM
i did it myself.:)

Unknown, August 12, 2013 at 7:36 AM
how

Unknown, May 9, 2014 at 9:31 AM
I got this error

Debug Assertion Failed!

Program: .....Parser.exe

file f:\dd\vctools\crt_bld\Self_x86\crt\src\fopen.c

Line 54

Expression: (file!=NULL)"

Could you please help me??

Unknown, May 14, 2014 at 7:09 AM
Thanks for the explanation. It works!
