Discussion  
Discussion or questions about the speech module.
Updated Feb 4, 2010 by gundl...@gmail.com

Discussion

The wiki makes for a pretty simple forum. Feel free to discuss or ask questions below. I'll be happy to help you out.

Comment by drfrog...@gmail.com, Aug 14, 2008

I'm getting

ImportError: No module named win32com.client

Is there other stuff that needs to be installed? Speech SDK or COM stuff?

Comment by project member gundl...@gmail.com, Aug 14, 2008

Right on both counts. From the Installation notes on the homepage:

If you don't already have it, you'll also need pywin32 (for Python 2.5 or for Python 2.4) and the Microsoft Speech kit (installer here).

Could you let me know how you found my module and how you went about getting it onto your system, so I can know better how to make those instructions less missable?

Michael

Comment by drfrog...@gmail.com, Aug 14, 2008

winxp

I think I just read too fast

i found out about it on reddit.com

everything seems good after installing those two modules

thanks

Comment by Da.Dekud...@gmail.com, Aug 17, 2008

Think you could add an optional parameter to speech.input() that allows text, as well as speech? That would be very useful, particularly when it is having trouble recognizing voice.

Comment by project member gundl...@gmail.com, Aug 18, 2008

Hi,

It might be doable, but I'm not sure -- the builtin raw_input() function blocks the thread while you enter text, so if speech.input() wanted to allow text input as well, I think it would be forced to call raw_input(), thus forcing the user to type something in order to continue. Maybe it could be done by listening for individual keystroke events, or running things on multiple threads, but no guarantees.

As this would also complicate speech.input()'s interface, I'm leaving it out for now. If more people clamor for this, or if someone sent a working patch, I'd be more inclined to support it.

Be aware that speech.input() can be cancelled with Ctrl-C, so if you're having trouble with speech recognition, you could try something like

import speech

try:
    answer = speech.input("Say something, or press Ctrl-C if frustrated.")
except KeyboardInterrupt:
    # Fall back to the keyboard when recognition is not cooperating.
    answer = raw_input("OK, just type your answer: ")

Good luck, Michael

Comment by Da.Dekud...@gmail.com, Aug 20, 2008

You have a function that checks for voice input, but does other stuff until that happens, right? What if you made a raw_input, but if voice comes, skip it? Would that be possible?

Comment by project member gundl...@gmail.com, Aug 21, 2008

Right, that's the "multiple thread" approach I mentioned above -- not sure if it could be made to work, but possible. If you do get it to work, by all means post it here for everyone else's benefit! :)
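
For anyone who wants to try the multiple-thread approach, here is a minimal sketch of the general idea: run each input source on its own daemon thread and take whichever answers first. It is not part of speech.py; `first_answer` is a hypothetical helper, and whether speech.input() behaves well off the main thread is untested here.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2


def first_answer(sources, timeout=None):
    """Run each zero-argument callable in `sources` on its own daemon
    thread and return (label, value) for whichever finishes first.

    `sources` maps a label to a callable. Returns None if nothing
    answers within `timeout` seconds.
    """
    results = queue.Queue()

    def worker(label, func):
        try:
            results.put((label, func()))
        except Exception:
            # A source that fails simply never answers.
            pass

    for label, func in sources.items():
        thread = threading.Thread(target=worker, args=(label, func))
        thread.daemon = True  # do not keep the process alive for the loser
        thread.start()

    try:
        return results.get(timeout=timeout)
    except queue.Empty:
        return None
```

A call might look like `first_answer({"voice": lambda: speech.input("Say or type something"), "keyboard": raw_input})`. Note that the losing thread keeps blocking (a pending raw_input() still waits for Enter), which is the complication mentioned earlier in this thread.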

Comment by Da.Dekud...@gmail.com, Aug 24, 2008

I wouldn't be able to do it-- I'm no good with Python as of yet. :P

Is it possible to have this work even if the Python Window is minimized? Meaning, the window is minimized to the task bar, but if I talk to the mic, it still listens, and acts like a normal Python script? Thanks! I'm coding myself a laptop robot (because I'm kind of lazy, and I'd love to have her help me out :P) and it would be great if I could have the window minimized, and still be able to talk to her.

Comment by project member gundl...@gmail.com, Aug 25, 2008

Yep, talking to a minimized window works fine.

Comment by arunkak...@gmail.com, Aug 31, 2008

I have Python 2.5 on Win XP Prof SP 2. I have installed the Microsoft Speech SDK and pywin32 linked in your install page.

When I first tried 'import speech' in IDLE I got:

  File "C:\Python25\Lib\site-packages\win32com\client\gencache.py", line 554, in AddModuleToCache
    dict = mod.CLSIDToClassMap
AttributeError: 'module' object has no attribute 'CLSIDToClassMap'

After a little googling, I found a post regarding deleting the 'gen_py' directory in %WINDIR%\Temp. I did that and am now getting this error for which I can't seem to find any solution:

>>> import speech
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    import speech
  File "C:\Python25\lib\site-packages\speech-0.5.1-py2.5.egg\speech.py", line 112, in <module>
TypeError: Error when calling the metaclass bases
    cannot create 'NoneType' instances

Comment by project member gundl...@gmail.com, Aug 31, 2008

I'm sorry that you're having trouble! I'm using the same setup as you (except XP Home) and can't reproduce this.

A couple of suggestions:

  1. Type import speech in the regular Python shell (Start -> Run -> "cmd" -> "python.exe" or maybe "python25.exe") and see if you still have a problem. If not, IDLE is somehow playing badly with pywin32.
  2. If you feel like doing some digging, get the .tar.gz version of pyspeech (also available on PyPI for download), unzip it into a directory, and try import speech again from within that directory. Aka Start -> Run -> "cmd", then "cd <whatever unzipped directory has the file 'speech.py' in it>", then "python.exe". When you import speech it will use that local speech.py file rather than the installed egg, and your traceback will have more info in it.
  3. If all else fails, ask on the pywin32 mailing list, as this seems to be a problem with pywin32 as opposed to speech.py.
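
For suggestion 2, a quick way to confirm which speech.py Python actually picked up is to look at the module's `__file__`. A minimal sketch; `module_location` is a hypothetical helper name, not something pyspeech provides:

```python
def module_location(name):
    """Import the top-level module `name` and return the file it came from.

    Returns None for modules without a __file__ (for example built-ins).
    """
    module = __import__(name)
    return getattr(module, "__file__", None)
```

Calling `module_location("speech")` from the unzipped directory should give a path to the local speech.py rather than to the installed egg.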

Good luck and please follow up with your findings,

Michael

Comment by arunkak...@gmail.com, Aug 31, 2008

A quick follow up. The problem turned out to be linked to the first error in my post above. There are two 'gen_py' directories, one in %WINDIR%\Temp, and the other in %PYTHON%\Lib\site-packages\win32com\. Deleting the one in site-packages and re-importing the speech module fixed the problem for me.

Arun
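
The fix above can be scripted. This is only a sketch under assumptions: the two locations are the ones named in this thread, `gen_py_candidates` and `remove_gen_py` are hypothetical helper names, and the exact layout may differ on other installs.

```python
import os
import shutil


def gen_py_candidates(windir, python_home):
    """Return the two gen_py cache locations mentioned in this thread."""
    return [
        os.path.join(windir, "Temp", "gen_py"),
        os.path.join(python_home, "Lib", "site-packages", "win32com", "gen_py"),
    ]


def remove_gen_py(paths):
    """Delete every path in `paths` that is an existing directory.

    Returns the list of directories that were actually removed.
    """
    removed = []
    for path in paths:
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

On Windows one would call something like `remove_gen_py(gen_py_candidates(os.environ["WINDIR"], sys.prefix))` (after `import sys`) and then try `import speech` again, as described above.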

Comment by project member gundl...@gmail.com, Aug 31, 2008

Thanks, Arun. I should have paid more attention and noticed you mentioning the Temp directory... glad you set things right.

Michael

Comment by grauw...@gmail.com, Sep 23, 2008

What about non-standard voices? How can I choose the voice?

Grauwelf

Comment by project member gundl...@gmail.com, Sep 24, 2008

I only support very simple text-to-speech. Check out pyTTS for lots of flexibility in choosing voices. My module is more focused on speech-to-text.

Comment by tang.j...@gmail.com, Oct 20, 2008

Hi,

It's very kind of you to provide this speech.py. It's very helpful.

I want to write an application that runs in the background, grabs the user's voice input, and does speech-to-text.

But I don't know how to do this. Is it possible to write a Python app that runs in the background and still gets the user's voice input as normal? Do you have any ideas about this?

Any hint would be much appreciated. Thanks in advance.

Comment by project member gundl...@gmail.com, Oct 21, 2008

Hi Jiyu,

I've just responded to your private email as well -- sorry to take so many days to get back to you.

I believe that installing the Microsoft Speech SDK will fix the problem you were having, at which point the example code will work as intended. It is the basic building block for a background python app that gets user voice input and runs a callback function on the text.

I wrote http://musicbutler.googlecode.com as a test application, which runs in the background and controls your stereo in response to speech input. It's still in alpha, but it's a proof-of-concept for you if the example.py doesn't provide enough detail.

Thanks and good luck!

Comment by Da.Dekud...@gmail.com, Jan 2, 2009

I'm not sure if you are still supporting this, but if so, would it be possible for you to write a function that adds words to the dictionary (temporarily?)

For example. I say, "I have a bad case of the heebie-geebies" the system would be like, "what?" and say something completely different. Would it be possible for you to create a function that, should I put the phrase, "heebie-geebies" in it as a parameter (perhaps in an array?) it will recognize it if it can't find a better match?

I recall playing with: http://surguy.net/articles/speechrecognition.xml

And though I couldn't get it to work correctly, there is one snippet that could prove quite useful:

if __name__ == '__main__':
    wordsToAdd = "One", "Two", "Three", "Four"
    speechReco = SpeechRecognition(wordsToAdd)
    while 1:
        pythoncom.PumpWaitingMessages()

Any chance of this happening?

Comment by project member gundl...@gmail.com, Jan 3, 2009

AFAIK, speech either works in dictation mode, in which it recognizes words based off of an unmodifiable dictionary, or in command mode, in which it tries to match your string of text to what you said.

The snippet you refer to (specifying words, then pumping messages) is done under the hood in speech.py by

listener = speech.listenfor(["one", "two", "three", "four"])

which pumps messages as long as listener.islistening().

So the short answer is, I don't think it's doable. The best you can hope for is specifying specific phrases that you expect it to hear, or else work with the standard dictation dictionary.

Good luck, Michael
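
To illustrate the "specify the phrases you expect" route, here is a small sketch built around listenfor(). Assumptions: the callback receives `(phrase, listener)`, the module is passed in as a parameter so the helper can be tried without a microphone, and `listen_until` is a hypothetical name, not part of speech.py.

```python
import time


def listen_until(speech_module, phrases, stop_phrase, poll_seconds=0.5):
    """Listen for `phrases` and collect what is heard until `stop_phrase`.

    `speech_module` is expected to look like pyspeech: listenfor(phrases,
    callback) returns a listener with islistening() and stoplistening().
    """
    heard = []

    def on_phrase(phrase, listener):
        heard.append(phrase)
        if phrase == stop_phrase:
            listener.stoplistening()

    listener = speech_module.listenfor(phrases, on_phrase)
    # Wait while the listener is still active.
    while listener.islistening():
        time.sleep(poll_seconds)
    return heard
```

With the real module this would be called as `listen_until(speech, ["heebie-geebies", "stop"], "stop")`. An unusual phrase can only be matched this way, by listing it explicitly; it does not extend the dictation dictionary.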

Comment by titusz....@gmail.com, Feb 3, 2009

is it possible to use an audio file as input for speech recognition?

Comment by project member gundl...@gmail.com, Feb 5, 2009

titusz.pan, I haven't tried it, but I would love to know if it worked! I'm thinking about making a Skype audio bot, and that would almost certainly involve feeding speech.py the audio from a call as a file.

If you can get it to work, please post here for the good of the community, and I'll roll your work into the module!

Comment by Da.Dekud...@gmail.com, Feb 7, 2009

And there is no way, as far as you know, to add new phrases to the standard dictionary?

Comment by project member gundl...@gmail.com, Feb 8, 2009

Correct.

Comment by titusz....@gmail.com, Feb 10, 2009

Here is a hint that Vista speech recognition can be fed with audio files: http://www.mymsspeech.com/microphones/prod_details.asp?prodID=228 (see WSRToolkit feature number 7).

Comment by Da.Dekud...@gmail.com, Feb 20, 2009

I appear to be having trouble importing a Python file that has already imported speech.

http://code.google.com/p/pyspeech/issues/detail?id=17

Comment by project member gundl...@gmail.com, Mar 2, 2009

Thanks for your bug report. See http://code.google.com/p/pyspeech/issues/detail?id=17 for the solution.

Comment by chiragja...@gmail.com, Jul 5, 2009

While writing my Python code, how can I find out which voices are installed? On my Windows Vista machine Microsoft Anna is the default, but if I install espeak for Windows, three more voices are added automatically. I want to know which voice options are available to the user, so that I can show them in my application and let the user select one directly from there.

I want something like:

speak = win32com.client.Dispatch('Sapi.SpVoice')

voices = speak.get_all_voices()

so that voices contains a list of all the currently available voices. Please note that get_all_voices() is an imaginary function, used only for the sake of clarity; I want something like it. Is it available or not?

Thanks and regards

Chirag

Comment by project member gundl...@gmail.com, Jul 10, 2009

Hi Chirag, sorry for the delay in responding. pyspeech does not give you any control or insight over the installed voices; it just uses the one that you have currently selected for the system.

pyspeech is more focused on speech input, where you talk to the computer; it has minimal speech output support just as a convenience. You might check out the pyTTS package for more control over speech output.

Good luck!
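
For readers who want to query SAPI directly rather than through pyspeech, the SpVoice automation object has a GetVoices() method that returns a collection of voice tokens. The sketch below is not part of pyspeech; `voice_descriptions` and `installed_voices` are hypothetical helper names, and the second one needs Windows with pywin32 installed.

```python
def voice_descriptions(sp_voice):
    """Return the description string of every voice `sp_voice` offers.

    `sp_voice` is a SAPI.SpVoice automation object, or anything with the
    same GetVoices() / Count / Item(i) / GetDescription() shape.
    """
    tokens = sp_voice.GetVoices()
    return [tokens.Item(i).GetDescription() for i in range(tokens.Count)]


def installed_voices():
    """Windows only: ask SAPI for the installed voices via pywin32."""
    # Imported lazily so the helper above stays usable on any platform.
    import win32com.client
    return voice_descriptions(win32com.client.Dispatch("SAPI.SpVoice"))
```

The list returned by `installed_voices()` could then be shown to the user in the application's own voice picker.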

Comment by ChrisM6794@gmail.com, Jul 10, 2009

This module is great! Do you know if there's any way to run two instances at one time on separate microphones? I'm thinking of having two people talking on their own USB headsets at once (different voice profiles) and it being able to listen to both of them.

Comment by project member gundl...@gmail.com, Jul 10, 2009

Man, that is a neat idea. Windows lets you do that -- have one profile trained on a male voice and the other on a female voice, or something? How do you normally tell it which profile to use for which microphone, when interacting directly with Windows?

I haven't heard of that and haven't any idea what the COM interop commands might be that would need to be implemented to get pyspeech to support that. You might ask over at the dragonfly project, as they are doing similar work and may have thought about this feature.

Comment by ChrisM6794@gmail.com, Jul 10, 2009

Windows lets you train separate profiles and select which one is active through the Speech Properties dialog in the Control Panel; but it looks like maybe you can only have one audio device and profile selected at any given time. But, I don't know if that's a limitation of the engine itself or if they just didn't put more options in the dialog.

Comment by project member gundl...@gmail.com, Jul 10, 2009

If Windows doesn't expose it in the UI, I'm going to guess that you can't do it through COM. But I'd be pleased to be proven wrong if you want to dig through their (poor) COM docs and find a way to do it! :)

Comment by ChrisM6794@gmail.com, Jul 14, 2009

I am going to take a crack at running two audio devices with separate profiles, although my head may explode in the process. I know nothing whatsoever about COM so it will be interesting, but looking at this page suggests that maybe you can set the AudioInput and the Profile properties... or something:

http://msdn.microsoft.com/en-us/library/ms722071(VS.85).aspx

If I ever get something going, I'll let you know and contribute it back.

Comment by ChrisM6794@gmail.com, Jul 15, 2009

I've got a basic demo up and working! It seemed incredibly complicated at first, but turned out to be pretty convenient:

- For the recognizer we use a SAPI.SpInProcRecognizer instead, which is an instance bound to a specific process instead of one shared between all processes. The interface is the same though, so all your existing code still works.

- We can iterate over the profiles with recognizer.GetProfiles() and set one for the recognizer.Profile property. If you don't pick a profile, it will default to "Default Speech Profile".

- The same goes for recognizer.GetAudioInputs() and the recognizer.AudioInput property. Note that you must specify an audio device here; it won't automatically use the default device like the shared one does.

That's actually all it takes: now you can run one process with a given audio device and profile, and a second process with a second audio device and profile. It's not clear to me whether you can run both of them in the same process (probably not), but that shouldn't be a huge problem for me (we can use Popen or multiprocessing to spawn a process for each device).

You can see my code here: http://bitbucket.org/kiv/speech/ Please do whatever you want with it.
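
The steps described in this comment can be sketched roughly as follows. This is not the code from the bitbucket repository; `pick_token` and `configure_recognizer` are hypothetical helpers, matching tokens by GetDescription() is an assumption, and so is the idea that plain attribute assignment is enough to set Profile and AudioInput under pywin32.

```python
def pick_token(tokens, wanted):
    """Return the first token whose description contains `wanted`
    (case-insensitive), or None if nothing matches."""
    for i in range(tokens.Count):
        token = tokens.Item(i)
        if wanted.lower() in token.GetDescription().lower():
            return token
    return None


def configure_recognizer(recognizer, profile_name, device_name):
    """Bind `recognizer` to one speech profile and one audio input."""
    profile = pick_token(recognizer.GetProfiles(), profile_name)
    device = pick_token(recognizer.GetAudioInputs(), device_name)
    if profile is None or device is None:
        raise LookupError("profile or audio device not found")
    recognizer.Profile = profile
    recognizer.AudioInput = device
    return recognizer
```

On Windows the recognizer would come from `win32com.client.Dispatch("SAPI.SpInProcRecognizer")`, with one process per headset, each calling `configure_recognizer` with its own profile and device names.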

Comment by ChrisM6794@gmail.com, Jul 15, 2009

Wow, the wiki markup mangled my post a lot. Ah well.

Comment by project member gundl...@gmail.com, Aug 28, 2009

ChrisM6794: That's awesome! I've been busy since you wrote and haven't had a chance to check out your code until today. Way to go! I hope to have time some day to add a simple API to speech.py to optionally support multiple profiles. It would default to one, but you could specify a second if needed. I'll add a feature request for the future, pointing to your bitbucket code. (Of course, patches are very welcome :)

Comment by project member gundl...@gmail.com, Aug 28, 2009

ChrisM6794: http://code.google.com/p/pyspeech/issues/detail?id=20 . Also, I got emailed your original wiki post, so no worries, I saw it in the original form :)

Comment by lpogor...@gmail.com, Nov 3, 2009

This works great in XP, but in Vista the speech recognizer recognizes not only the words in my grammar (e.g., the dwarfs in one of the example programs), but also system commands. Is there any way to turn off the Microsoft Speech Recognizer from accepting system commands? I think if it were in "dictation-only mode", this would work, but I don't see the functionality to do this.

Comment by project member gundl...@gmail.com, Nov 4, 2009

lpogorman: Sorry, but I don't own a Vista machine; all my Vista testing has been through users reporting their experience.

Does this happen both when you use listenfor(phrases) and when you use listenforanything()? Or only in one of these modes?

Comment by lpogor...@gmail.com, Nov 4, 2009

It happens for both listenfor() and listenforanything() modes. It seems that the system first tests recognition against voiced system commands. So if, for instance, you say "Open WordPad", then WordPad pops up, and this is not echoed in the program containing listenforanything(). I think if the program were able to specify dictation-only mode, then system commands would be disregarded. However, I can't see if this is possible from the Python program, and I can't figure out if there is the ability to turn off recognition of the system commands.

As I say, the XP speech was designed better to enable dictation and command-and-control separation. Maybe I should upgrade Vista to Windows 7 and hope it's better than Vista speech.

Comment by whitecro...@bak.rr.com, Nov 15, 2009

Hi Gundlach! It's Loren from Daniweb! Thought I'd FINALLY get to telling you how great your code is!

I'm using it on my robot, NINA, and it's fabulous. My robot is pretty much completed and I'm working on a complex chat-kind of feature. Last Halloween, people were pretty impressed by my project (I stood outside with the robot and helped serve candy). They were especially excited that it was speech-commanded!

Great work, Gundlach! Many, many, many thanks!

By the way, any idea how your code might work on a Windows 7 machine?

Comment by project member gundl...@gmail.com, Dec 3, 2009

lpogorman: Sorry you've had trouble. I just bought a Windows 7 machine so will hopefully be able to diagnose this at some point in the next few months (unfortunately somewhat loaded with work ATM so not a lot of time for fun projects!)

Comment by project member gundl...@gmail.com, Dec 3, 2009

Loren: hi! You made my day. I'm so glad you've enjoyed the project and made such an awesome project of your own!

No idea how it will work on Windows 7, except for the comments you see in this thread re: Windows Vista (as Windows 7 has a lot of Vista under the covers.) I just got a Windows 7 machine so hope to try it out myself in the next few months.

Is NINA's code on the web? Does she physically move? How the heck did you do it?

Comment by tkoe...@gmail.com, Jan 20, 2010

has anybody else noticed that when using listenfor in the callback example it seems to keep repeating the last valid phrase it heard until you say something else?

Comment by project member gundl...@gmail.com, Jan 21, 2010

@tkoenig: Are you on Windows XP? On XP I've not seen a problem. I haven't yet tested on Vista or 7.

Comment by priyankp...@gmail.com, Jun 7, 2010

Hey, I'm getting an error like this:

Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    from Bio.Tools import translate
ImportError: No module named Tools

Is there anything missing from the installation?

Comment by project member gundl...@gmail.com, Jun 10, 2010

@priyankpatel4903: I don't use pyshell or Bio.Tools, whatever those are. Maybe you installed pyshell and should get rid of it? Sorry I can't help besides to tell you that it's not a pyspeech problem.

Comment by jeffcham...@gmail.com, Jul 8, 2010

First of all, I love pyspeech!

Second, is there any way to make Windows 7 stop responding to commands outside of my Python program? It's incredibly annoying when I say one thing and Windows thinks I'm talking to it and, for example, minimizes windows or switches windows, etc.

Basically, is there a way to limit speech input to ONLY my program and prevent Windows from listening for its own commands?

Any information would be VERY much appreciated!

Comment by project member gundl...@gmail.com, Jul 19, 2010

Hi Jeff, I'm glad to hear you like pyspeech! :) And, sorry to have taken so long to respond -- I've been out of the country with no internet access.

I've only used pyspeech a bit in Windows 7 (I did most of the work in XP and haven't added more features since then), but during that time I didn't have the problems you mentioned. Maybe I just didn't say the right things. Are you able to turn off Windows 7 speech recognition and then start your pyspeech-enabled program successfully? Maybe under the Control Panel there's a way to ask it to not listen for anything except for Dictation?

Sorry I can't be of more help -- all of my time is spent on AdBlock for Chrome+Safari these days! Last tip: you might try out Dragonfly, another module like pyspeech; the author may have figured out a way around this problem already.

Comment by saulpila...@gmail.com, Aug 18, 2010

Hi, when I call input() the program gets blocked even though I said a command (this only happens when I use code that also uses listenfor()).

Comment by project member gundl...@gmail.com, Aug 19, 2010

Hi Saulpila2000. After reviewing the code (which I haven't looked at in a while) I would guess that your listenfor() handler is receiving your speech instead of your input() call. You could test this by making your listenfor() handler just print out the text phrase that it received; then call input(), speak into it, and see if listenfor() prints instead.

The workaround would be something like:

listener = listenfor(whatever)
# ...
# now it's time to call input(), so you do this:
listener.stoplistening()
x = input()  # listenfor isn't around to interfere
listener = listenfor(whatever)  # restart listenfor()

hope this helps (and I hope my guess is correct!)

Comment by saulpila...@gmail.com, Aug 19, 2010

No, I tried calling stoplistening() first, but the program still doesn't respond even though I say a command... Is there another way to assign a spoken command, like command = what you say? Something that works like input() but REALLY WORKS?

Comment by saulpila...@gmail.com, Aug 19, 2010

OK, I solved my problem, but I have to say the command twice for it to happen. Is there a way to emulate a command?

Comment by MissLHan...@gmail.com, Sep 2, 2010

Hey before I start, I have python 2.7 already installed, do I need to downgrade or is the package upward compatible?

Comment by project member gundl...@gmail.com, Sep 3, 2010

Hi MissLHansen, the 2.x series introduces no breaking changes, so the 2.5 code should work with 2.7. I honestly can't remember what the difference between the 2.4 and 2.5 pyspeech versions is now... it may just have to do with the setup.py packaging.

Anyway, let me know if you have trouble! - Michael

Comment by Da.Dekud...@gmail.com, Sep 29, 2010

Is it at all possible to turn off / hide the Windows Speech toolbar (and most of its features!) while using pyspeech? For example, I'd like to be able to talk, but not have to worry about the Windows 7 accessibility speech program.

Comment by project member gundl...@gmail.com, Sep 29, 2010

I'm not aware of how to do this in Windows 7. If you do figure it out, please post it here!

Comment by Da.Dekud...@gmail.com, Sep 29, 2010

I believe it is the same as Vista. Though it is, indeed, possible to hide the toolbar, I don't yet see a way to prevent it from writing down everything I say to it...which appears to be the biggest flaw in using pySpeech in Windows Vista or 7.

I'll let you know what I find out.

Comment by jake...@gmail.com, Dec 18, 2010

I have Python 2.5 with everything installed correctly. For some reason I get this error message:

Traceback (most recent call last):
  File "C:\Documents and Settings\Jared\My Documents\Speech test.py", line 1, in <module>
    import speech
  File "C:\Documents and Settings\Jared\My Documents\speech.py", line 55, in <module>
    from win32com.client import constants as constants
  File "C:\Python25\lib\site-packages\win32com\__init__.py", line 5, in <module>
    import win32api, sys, os
ImportError: DLL load failed: The specified module could not be found.

Any help?

Comment by fil...@gmail.com, Jun 18, 2011

Any progress on getting speech.py to work on windows 7?

Comment by rohc...@gmail.com, Jan 21, 2012

I have downloaded pyspeech for Python 2.5, but will it work on 2.7? I also installed it using easy_install, but when I try to import it in my program, it is unable to do so. Kindly tell me what I need to do.

Comment by sjkre...@gmail.com, Feb 21, 2012

When I use speech.input() the Microsoft GUI opens up and indicates that it is OFF. Shouldn't it be "Listening"? I'm using windows7 with python 2.7

Comment by troubled...@gmail.com, Feb 24, 2012

Question! What voice recognition software do you suggest when using it with the code?

Comment by loureir...@gmail.com, Apr 9, 2012

Anyone found out if this works well in Windows 7 (especially Home Premium 64bits) with Python 2.7 ? Thanks in advance!

