Framework
- The user issues a speech and/or gesture command.
- The Kinect senses the audio signal and/or the skeleton joint trajectory.
- Each command is sent to its respective recognition module.
- The speech and gesture recognition modules run on separate threads.
- Each module sends its recognized command to the Integrator (Primary or Complementary).
- If the speech command is "Go", the Complementary Integrator is called; otherwise the Primary Integrator is called.
- The Primary Integrator waits up to 3 seconds for the command from the other modality (speech or gesture).
- If the other command arrives in time, it decides which command to execute based on the confidence levels and on whether both modalities recognized the same command.
- If the other command does not arrive, it decides the action from the single command alone.
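The Primary Integrator's timeout-and-fuse logic above can be sketched as follows. This is a minimal illustration, not the authors' code: the `Command` class, its fields, and the tie-breaking rule (prefer the higher-confidence command on disagreement) are assumptions filled in from the description.

```python
import queue

class Command:
    """Hypothetical container for one recognized command (names assumed)."""
    def __init__(self, modality, label, confidence):
        self.modality = modality      # "speech" or "gesture"
        self.label = label            # e.g. "Go", "Stop"
        self.confidence = confidence  # recognizer confidence in [0, 1]

def primary_integrate(first, other_queue, timeout=3.0):
    """Wait up to `timeout` seconds for the command from the other
    modality, then decide which command to execute."""
    try:
        other = other_queue.get(timeout=timeout)
    except queue.Empty:
        # The other modality never arrived: act on the single command.
        return first.label
    if first.label == other.label:
        # Both modalities recognized the same command: execute it.
        return first.label
    # Disagreement: assumed here to fall back on the higher confidence.
    return first.label if first.confidence >= other.confidence else other.label
```

For example, if speech recognizes "Go" at confidence 0.6 but a gesture arrives as "Stop" at 0.9, the fused decision is "Stop"; with an empty queue the speech command alone is executed after the timeout.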
Communication between Windows and Linux
- Speech and gesture recognition run on Windows in a Visual C++ environment.
- ROS runs on Linux inside a virtual machine.
- The Integrator sends the command (linear and angular velocity) from Windows to Linux over the network. The Integrator module is the client; an HTTP server running on Linux listens on port 8080, receives the command, and passes it to ROS.
- ROS executes the command on the TurtleBot.
The two machines are connected via a wireless router and communicate through simple server-client scripts.
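A minimal sketch of that client-server exchange, using Python's standard library. The wire format (JSON with `linear`/`angular` fields) and the handler names are assumptions, since the source does not specify them; the hand-off to ROS (e.g. publishing a `geometry_msgs/Twist` on `/cmd_vel`) is left as a comment because it requires a ROS installation.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

PORT = 8080  # the Linux-side server listens on port 8080

class CommandHandler(BaseHTTPRequestHandler):
    """Linux side: receive a velocity command and hand it to ROS."""
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        cmd = json.loads(self.rfile.read(length))
        # In the real system the command would be forwarded to ROS here,
        # e.g. publishing a geometry_msgs/Twist on /cmd_vel via rospy.
        # For this sketch we simply echo it back to confirm receipt.
        body = json.dumps({"received": cmd}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet

def send_command(linear, angular, host="localhost", port=PORT):
    """Windows side (the Integrator): POST the velocities to the server."""
    payload = json.dumps({"linear": linear, "angular": angular}).encode()
    req = Request(f"http://{host}:{port}/", data=payload,
                  headers={"Content-Type": "application/json"})
    with urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Run client and server in one process for demonstration; in the
    # deployed setup they sit on separate machines behind the router.
    server = HTTPServer(("localhost", PORT), CommandHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    print(send_command(0.2, 0.0))
    server.shutdown()
```

On the real setup the client would target the Linux VM's address on the wireless network rather than `localhost`.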