Hikaru Date / Professor
Jintae Lee / Assistant Professor
The multimedia devices presently available add sound, still pictures, and moving pictures to traditional computer displays, which show only text and diagrams. Even these few additional functions offer remarkable convenience to the user. For this reason, it is widely believed that these devices will have a very large impact on human society in the near future.
The properties desirable for multimedia devices can be visualized on a graph with two axes. The horizontal axis represents a device's capacity for depth and variety of communication, and the vertical axis represents its capacity for naturalness and high fidelity. Ideally, multimedia devices should be far along both axes if they are to offer the most comfortable and efficient human interfaces. Unfortunately, progress along these dimensions has not gone far enough. Prolonged use of computers still fatigues the body, and communication with computers still falls short of the depth and quality of face-to-face communication between humans.
For example, human beings can communicate with great ease and variety through the spoken word. Computers cannot. Although speech synthesis by computers is nearly perfect, speech recognition has not yet reached a level suitable for practical use with natural speech. Furthermore, speech recognition algorithms are overly sensitive to ambient noise. Speech recognition is still at a primitive stage and needs to evolve further to make multimedia devices more effective.
Sign language is another means of human communication in which computers cannot yet participate as easily as the people who rely on it because of hearing impairment. If a computer could be made to communicate via this medium, many people who have so far been unable to do so could access intellectual services and actively participate in multimedia. Pioneering research on sign language synthesis and recognition has just begun in our laboratory.
As for the dimension of high fidelity, the digitization of high-definition TV is one example of progress along the vertical axis. Another area that shows potential is research in virtual reality, which can convey a simulated experience of the three-dimensional world. Our laboratory is contributing to this area with studies of high-fidelity technology for visual and auditory sensation, aimed at natural, integrated spatial perception with minimal mental effort.
Because fiscal year 1993 was the first year of our laboratory, we spent much of our time assembling the equipment and learning how to use it to its full potential. Signal processing technology is indispensable for the generation, processing, and recognition of visual, auditory, and control signals. Furthermore, it is preferable for multimedia devices to work in real time. These requirements are met not only by high processor speed and large memory, but also by fast algorithms. Therefore, the study of signal processing is fundamental in our laboratory.
This year, Professor Date proposed a modified FIR method for the fast wavelet transform at the International Wavelet Conference held in Italy in October. In order to develop a practical speech recognition machine, Professor Date, in cooperation with Professor Sugiyama of the Human Interface Laboratory, is now developing a noise-canceling microphone based on a new principle. Another member of our laboratory, Professor Lee, continued his sign language research and presented papers at two international conferences in Europe. Professor Lee also prepared the experimental equipment for pen-computer research, in cooperation with Professor Mori of the Image Processing Laboratory.
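As background for the fast wavelet transform mentioned above: one level of a discrete wavelet transform is conventionally computed by passing the signal through a low-pass and a high-pass FIR filter and keeping every second output sample. The Python sketch below illustrates that standard scheme with the two-tap Haar filters. It is not the modified FIR method proposed at the conference, whose details are not given in this report, and the function names are hypothetical.

```python
import math

def fir_decimate(signal, taps):
    """Sliding dot product with the FIR taps, keeping every second output."""
    n = len(taps)
    return [
        sum(taps[k] * signal[i + k] for k in range(n))
        for i in range(0, len(signal) - n + 1, 2)
    ]

def haar_level(signal):
    """One analysis level: low-pass (approximation) and high-pass (detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = fir_decimate(signal, [s, s])    # scaled pairwise sums
    detail = fir_decimate(signal, [s, -s])   # scaled pairwise differences
    return approx, detail

def haar_inverse(approx, detail):
    """Rebuild the even-length input from one analysis level."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append(s * (a + d))
        out.append(s * (a - d))
    return out

samples = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_level(samples)
restored = haar_inverse(approx, detail)
```

Each level halves the number of samples passed on to the next level, which is why the full transform costs time proportional to the signal length. Longer FIR filters follow the same filter-and-decimate pattern but need a boundary rule (for example periodic extension), which this sketch omits.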
Refereed Journal Papers
The multi-microphone system (MMS), which was previously described according to Kirchhoff's integral equation representation of the sound field, was shown to have spatially separable characteristics in sound reception. This paper introduces an adjustment of the MMS that converts it into a super close-talking microphone (CTM) with excellent range-dependent sensitivity by changing the gain distribution of the microphone pairs on the MMS boundary. Initially, a qualitative explanation of this method is given, and then several criteria which determine the optimum characteristics of a close-talking microphone are described. Finally, the results of a computer simulation are described in detail, using a criterion which is a combination of two gain differences. One gain difference is between two points near the MMS, while the other is between a point near the MMS and a point distant from the MMS. It is concluded that spatial differentiation characteristics in the gain distribution, together with a small division factor in the signal processing, promise a super close-talking microphone with a wide frequency range and other excellent features.
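The idea of a near-versus-far gain difference can be illustrated with a much simpler arrangement than the MMS. The Python sketch below compares a single omnidirectional microphone with a two-element differential pair for an on-axis point source in free field. It is only a toy model under stated assumptions (ideal point source, 1 cm spacing, 343 m/s speed of sound, hypothetical function names), not the Kirchhoff-integral MMS formulation or the criterion used in the paper.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def point_source_pressure(r, freq):
    """Complex pressure of an ideal point source at distance r (free field)."""
    k = 2.0 * math.pi * freq / SPEED_OF_SOUND
    return cmath.exp(-1j * k * r) / r

def single_mic_gain(r, freq):
    """Output magnitude of one omnidirectional microphone."""
    return abs(point_source_pressure(r, freq))

def pair_gain(r, freq, spacing=0.01):
    """Output magnitude of a differential pair (front minus back element)."""
    front = point_source_pressure(r, freq)
    back = point_source_pressure(r + spacing, freq)
    return abs(front - back)

def gain_difference_db(gain, r_a, r_b, freq):
    """Gain at distance r_a relative to distance r_b, in decibels."""
    return 20.0 * math.log10(gain(r_a, freq) / gain(r_b, freq))

near, far, freq = 0.05, 1.0, 500.0  # assumed talker and noise distances, Hz
single_db = gain_difference_db(single_mic_gain, near, far, freq)
pair_db = gain_difference_db(pair_gain, near, far, freq)
```

For these assumed values the single microphone follows the inverse-distance law, so `single_db` is about 26 dB, while the differential pair gives a larger near-to-far difference. This stronger range dependence is the property that a close-talking microphone exploits.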
Computer animation and visualization can facilitate communication between the hearing impaired and those with normal speaking capabilities. This paper presents a model of a system that is capable of translating text from a natural language into animated sign language. Techniques have been developed to analyze language and transform it into sign language in a systematic way. A hand motion coding method, as applied to hand motion representation and control, has also been investigated. Two translation examples are given to demonstrate the practicality of the system.
Refereed Proceeding Papers
Simulation of hand motions is a complicated task, since the articulated hand makes complex movements with at least 27 degrees of freedom and is subject to various constraints. A new approach to hand modeling, reflecting the constraints of human hands, is presented. The validity of the presented model is verified through experiments in the automatic recognition of complex hand motions based on the model.
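To make the figure of 27 degrees of freedom concrete, the Python sketch below tallies one joint layout that is common in the hand-modeling literature and applies one frequently cited constraint, namely that the outermost finger joint (DIP) flexes by roughly two thirds of the middle joint (PIP) angle. The layout, the 0 to 110 degree PIP range, and all names are assumptions made for illustration; the abstract does not state which layout or constraints the paper uses.

```python
# Assumed joint layout: 6 DOF for the wrist pose plus 21 for the digits.
HAND_DOF = {
    "wrist_position": 3,
    "wrist_orientation": 3,
    "thumb": 5,
    "index": 4,
    "middle": 4,
    "ring": 4,
    "little": 4,
}

PIP_RANGE_DEG = (0.0, 110.0)  # assumed flexion limits of the PIP joint

def total_dof(layout=HAND_DOF):
    """Sum the degrees of freedom over all parts of the layout."""
    return sum(layout.values())

def constrained_finger(pip_deg):
    """Clamp the PIP angle to its range and derive the coupled DIP angle."""
    low, high = PIP_RANGE_DEG
    pip = min(max(pip_deg, low), high)
    dip = (2.0 / 3.0) * pip  # assumed inter-joint coupling
    return pip, dip

dof = total_dof()
pip, dip = constrained_finger(90.0)
```

Constraints of this kind reduce the number of independent parameters that a recognizer or animator has to estimate, which is one reason a constraint-aware hand model can simplify motion recognition.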
The use of computers to break down communication barriers with the hearing impaired is one of the most challenging tasks in computer applications. This paper summarizes our work on generating and recognizing sign language based on graphic models, along with an in-depth review of the problems in conventional approaches. The discussion is divided into two subtasks: sign language generation and sign language recognition.