Multimodal interaction systems combine visual information with voice and gestures. They provide flexible and powerful dialogue approaches and enable users to choose one or more of the available interaction modalities. These multiple modalities break down barriers to adopting mobile devices for value-added services, and the integrated use of multiple input modes lets users benefit from the natural approach used in human communication. In order to satisfy the needs of pattern recognition and for image generation and for
Spoken natural language may appeal to users in the general public since it is the main modality used, together with pointing gestures or gaze, in face-to-face human communication. Our work on multimodal human-computer interaction is based on the following two observations. On the one hand, speech- and gesture-based multimodality has been extensively studied, from both a software and an ergonomic point of view. However, speech plus graphics as an output form of multimodality has attracted fewer research studies, especially regarding the utility and
Multimodal Interaction Should Degrade Gracefully
Human interaction degrades gracefully; for example, a face-to-face conversation degrades gracefully in that it still remains effective when one of the participants in the conversation is functionally blind, e.g., when talking over a telephone. This form of graceful degradation is due to the high level of redundancy in human communication. As man-machine interfaces come to include multimodal interaction, we need to ensure that these interfaces degrade gracefully in a manner akin to human conversation.
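
As a rough illustration of this principle, the sketch below sends the same message redundantly over every output modality that is currently available, so that losing one channel (for example the visual one, as on a telephone) still leaves the message delivered. The `Channel` type, the `deliver` function, and the channel names are hypothetical and chosen only for this example; they are not taken from any particular multimodal framework.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Channel:
    """A hypothetical output modality (illustrative only)."""
    name: str
    available: bool
    render: Callable[[str], str]


def deliver(message: str, channels: List[Channel]) -> Dict[str, str]:
    """Render the same message on every available channel.

    Redundancy across channels is what lets the interaction degrade
    gracefully: if one modality is missing, the others still carry
    the full message.
    """
    delivered = {c.name: c.render(message) for c in channels if c.available}
    if not delivered:
        # No modality left at all: this is failure, not degradation.
        raise RuntimeError("no output modality available")
    return delivered


# Assumed scenario: the visual channel is unavailable, as in a phone call.
channels = [
    Channel("visual", available=False, render=lambda m: f"[screen] {m}"),
    Channel("speech", available=True, render=lambda m: f"[spoken] {m}"),
]
result = deliver("Turn left in 200 metres", channels)
print(result)
```

The design choice here is that each channel carries the complete message rather than a fragment of it; an interface that splits information across modalities without overlap would lose content as soon as one modality dropped out.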