|
Speech
We tried working with the available open source speech recognition software, but each package has its own limitations: they require manual training, and understanding their code is a challenge in itself. Given the limited time and the demands of the project, we use Microsoft Speech Recognition (MSR) for Kinect, which is freeware. This has several advantages. It supports many language packs for both speech recognition and speech synthesis. Its grammar and vocabulary support is much more advanced and very easy to use. It does not require any training of the models, and its performance is much better than that of the available open source software.
The above figure shows how speech recognition is performed using MSR. The Kinect handle is requested from the main application. Using the Kinect handle, we initialize the audio stream and start capturing audio. Once this succeeds, we start the speech recognition engine and load the grammar that we want to recognize. We are now ready to listen to the Kinect and wait for commands from the user. When a command arrives, MSR gives us the recognized word together with its confidence. If the confidence is greater than the threshold (0.5), we send the command, its confidence, and the source angle of the user to the Integrator. Otherwise, we first stop the audio capture, the robot says "Again Please" (using TTS), and the user has to repeat the command. | |
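
The decision step at the end of this flow can be sketched as follows. This is only an illustrative sketch in Python, not the project's actual code and not the MSR API: the `Recognition` container, the `handle_recognition` function, and the three callbacks (`send_to_integrator`, `stop_audio_capture`, `say`) are hypothetical names introduced here. Only the threshold of 0.5, the "Again Please" prompt, and the order of the steps come from the description above.

```python
from dataclasses import dataclass
from typing import Callable

CONFIDENCE_THRESHOLD = 0.5  # threshold value stated in the text


@dataclass
class Recognition:
    """One recognition result (hypothetical container, not an MSR type)."""
    command: str
    confidence: float
    source_angle: float  # direction of the speaker as reported by the Kinect


def handle_recognition(
    result: Recognition,
    send_to_integrator: Callable[[str, float, float], None],
    stop_audio_capture: Callable[[], None],
    say: Callable[[str], None],
) -> bool:
    """Forward a confident result; otherwise ask the user to repeat.

    Returns True if the command was forwarded to the Integrator.
    All three callbacks are assumed stand-ins for the real components.
    """
    if result.confidence > CONFIDENCE_THRESHOLD:
        send_to_integrator(result.command, result.confidence, result.source_angle)
        return True
    # Low confidence: stop capturing first, then prompt the user via TTS.
    stop_audio_capture()
    say("Again Please")
    return False
```

Passing the Integrator, the audio capture, and the TTS in as callbacks keeps the threshold logic independent of the Kinect hardware, so it can be exercised with simple stubs. A result whose confidence is exactly 0.5 is treated as rejected here, which is an assumption, since the text only covers the "greater than" and "less than" cases.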