Mac speech recognition - text to speech, and speech to text

Mac speech recognition FAQ: How can I work around the bugs in the Apple Mac speech recognition software?

I'd like to say I've been having a great time with the Mac OS X speech recognition capabilities in OS X 10.6 (Snow Leopard), but the truth is that they seem to have a lot of bugs. Many AppleScript developers on the internet are saying that Apple apparently "broke" the speech recognition server in Leopard, and still hasn't fixed it in Snow Leopard. That's a real bummer, because speech recognition is a lot of fun to work with.

Mac speech recognition workarounds

Fortunately, Mac AppleScript developers keep coming up with workarounds, so here's one Mac speech recognition workaround that shows how to get your Mac OS X system to:

  1. Prompt you with a question,
  2. Listen for your reply from a list of possible replies, and
  3. Take some action based on your reply.

I can't take credit for most of this script; it's based on this excellent thread on MacScripter.net, in particular the workaround posted at the end of the thread, which partially solves the problem. My own addition is the last step, where I have to kill the Mac SpeechRecognitionServer on my system to get my AppleScript script to keep running.

A Mac OS X "text to speech" and "speech to text" example

Without any further ado, here's an AppleScript "text to speech" and "speech to text" example, with a few comments to make it all easier to understand. (AppleScript comments begin with the "--" characters.)

-- the computer says this
say "Do you think the Cubs will win the World Series this year?"

-- the computer listens for possible answers. we've told it to
-- listen for either "yes" or "no"
tell application "SpeechRecognitionServer" to set theResponse to listen for {"yes", "no"}
if theResponse is "yes" then
	-- if you answer yes, the computer responds here
	say "Wow, you have a lot of faith."
else
	-- if you answer no, the computer responds here
	say "The odds are definitely against them."
end if

-- the Mac Snow Leopard SpeechRecognitionServer won't go away until it times out with
-- an error, so kill it here
delay 1
do shell script "killall SpeechRecognitionServer"
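
If you want the script to be a little more forgiving, here's a sketch of the same idea with a timeout and some error handling. I'm assuming here that the "listen for" command accepts a "giving up after" parameter (in seconds) and raises an error when that time runs out; the ten-second value and the "Sorry" message are just illustrations, so check the details against the dictionary on your own system.

```applescript
-- the computer says this
say "Do you think the Cubs will win the World Series this year?"

-- start with an empty response so we can tell when nothing was recognized
set theResponse to ""
try
	tell application "SpeechRecognitionServer"
		-- assumption: "giving up after" is the timeout, in seconds
		set theResponse to listen for {"yes", "no"} giving up after 10
	end tell
on error
	-- we land here if the server reports an error (assumed to include a timeout)
	say "Sorry, I did not catch that."
end try

if theResponse is "yes" then
	say "Wow, you have a lot of faith."
else if theResponse is "no" then
	say "The odds are definitely against them."
end if

-- killall returns a non-zero status when no matching process exists, and
-- "do shell script" turns that into an AppleScript error, so the cleanup
-- gets its own try block
delay 1
try
	do shell script "killall SpeechRecognitionServer"
end try
```

The second try block is there so the script still ends cleanly when the server has already gone away on its own.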

Mac OS X text/speech speak/listen example - summary

I hope this simple Mac OS X "text to speech" and "speech to text" example is helpful. Again, kudos to the AppleScript developers on MacScripter.net for the initial example and bug fix.

I just went looking for the AppleScript dictionary for the Mac SpeechRecognitionServer, and I think I now see why there are so many problems with it: it appears to be built on Carbon, Apple's older programming technology. I found the server application by using the Unix locate command, and then browsed to this location with the AppleScript Dictionary browse command:

/System/Library/Frameworks/Carbon.framework/Versions/A/Frameworks/SpeechRecognition.framework/Versions/A/Resources/SpeechRecognitionServer.app

As you can see from that path name, the SpeechRecognitionServer has a Carbon footprint. (Sorry, bad joke.) Hopefully one day Apple will update this to its newer Cocoa developer framework.
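
If you'd like to run the same search without leaving AppleScript, here's a rough sketch that calls the Unix locate command through "do shell script". It assumes the locate database has been built on your Mac (it may not be by default), and the dialog text is just my own wording.

```applescript
try
	-- locate prints one matching path per line
	set theMatches to do shell script "locate SpeechRecognitionServer.app"
	-- show only the first match; there may be many files inside the .app bundle
	display dialog "First match: " & paragraph 1 of theMatches
on error
	-- reached when locate fails or prints nothing
	display dialog "locate found nothing (its database may not be built on this Mac)."
end try
```

Once you have the path, you can browse to it when opening a dictionary, as described above.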

These bugs and this Carbon finding have me down a little bit this morning, but I'm really trying to get to a point where I can interact with my Mac OS X system using the "text to speech" and "speech to text" capabilities. I'd like to do all the usual things people dream about with the Mac "text to speech" capability, including having the system prompt me with questions and listen for answers, read the weather, news, stock market and email reports, and read documents, such as Wikipedia pages.

On the flip side, I'd like to use the Mac "speech to text" capability (the ability for the Mac OS X system to listen to my speech) to tell the Mac what to do, including playing music or radio stations, or, again, opening and reading (speaking) documents.