Eighteen months before Apple released its Siri software on the iPhone 4S, I found myself stranded in Dease Lake, British Columbia, and wrote a little speech recognition, text-to-speech, and computer interaction application. Here's a demo of how my "Siri-like" Mac application works:
If this demo looks impressive at all, 99% of that is due to the open source Sphinx4 speech recognition project. All I did was wrap a little "logic" code around that project, and resolve a few bugs that you can run into when talking to a computer that also talks back to you.
The software uses a very simple mapping from what the computer thinks you said to actions I've defined. (The computer can't "learn" new actions, and the voice commands it accepts are also predefined. I'd like to make it much smarter, but alas, time and bills have a way of pulling me away from this project.)
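That phrase-to-action mapping can be sketched in a few lines of Java. To be clear, the phrases and responses below are illustrative placeholders I made up for this sketch, not the actual commands or code in my application:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

// A minimal sketch of a "recognized phrase -> predefined action" table.
// The phrase strings and replies here are hypothetical examples.
public class CommandMap {

    private final Map<String, Supplier<String>> actions = new HashMap<>();

    public CommandMap() {
        // each predefined voice command maps to a zero-argument action
        actions.put("computer", () -> "Yes?");
        actions.put("thank you", () -> "You're welcome");
    }

    // Look up what the recognizer thinks was said; phrases that aren't
    // in the table are simply ignored (no "learning" of new commands).
    public Optional<String> dispatch(String heard) {
        Supplier<String> action = actions.get(heard.toLowerCase().trim());
        return action == null ? Optional.empty() : Optional.of(action.get());
    }
}
```

The point is just that there's no natural language understanding involved: the recognizer's best-guess string is normalized and used as a plain map key.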
When the computer speaks back to me, I'm just using the AppleScript "say" command. I've programmed some responses to be randomly selected from a list of potential responses, which you can hear when I say things like "computer" or "thank you."
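A sketch of that idea in Java looks like this; the response strings are again hypothetical stand-ins, and the `speak` method simply shells out to `osascript` to run the AppleScript `say` command (so it only does anything on a Mac):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// A minimal sketch: choose one canned reply at random, then speak it
// with the AppleScript "say" command. Response texts are illustrative.
public class Responder {

    private static final List<String> GREETING_REPLIES =
        Arrays.asList("Yes?", "I'm listening.", "At your service.");

    private static final Random random = new Random();

    // pick one response at random from a pool of possible responses
    public static String pick(List<String> pool) {
        return pool.get(random.nextInt(pool.size()));
    }

    // macOS only: run the AppleScript "say" command via osascript
    public static void speak(String text) throws Exception {
        new ProcessBuilder("osascript", "-e", "say \"" + text + "\"")
            .start()
            .waitFor();
    }

    public static void main(String[] args) throws Exception {
        String reply = pick(GREETING_REPLIES);
        System.out.println(reply);
        // on a Mac you'd then call: speak(reply);
    }
}
```

Randomly choosing from a small pool is a cheap trick, but it goes a long way toward making the replies feel less robotic.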
Related technology and tutorials
If you're interested in the technology behind this speech recognition application, here are a few links to get you started:
- The Sphinx4 project
- My Scala tutorials
- My AppleScript tutorials
- My AppleScript alarm clock tutorial, which demonstrates the "say" command
The "eyes" at the top of the screen are a separate application, my Java Xeyes application.
The plain color background is provided by my free Mac Hyde ("Hide Your Desktop") application.
(I should add that the current name for the project is "Sarah," inspired by SARAH, the computer-controlled house on the TV show Eureka.)