◆ Annual Review 2002

Human Interface Laboratory


Masahide Sugiyama
Professor

Michael Cohen
Associate Professor

Jie Huang
Associate Professor

Minoru Ueda
Assistant Professor
Refereed Journal Papers
[j-huang-01:2002]C. Zhao, Y. Ohtake, and J. Huang. Robot Position Calibration Using Colored Rectangle Signboards. J. Three Dimensional Images, 17(1):166-169, 2003.
[j-huang-02:2002]K. Yoshida and J. Huang. Competition between Different Primary Cues for Sound Component Organization in Human Audition. J. Three Dimensional Images, 16(4):42-47, 2002.
[mcohen-01:2002]Michael Cohen, Wenxi Chen, and Daming Wei. Ubiquitous health monitoring and management using mobile telephony. 3D Forum: J. of Three Dimensional Images, 17(1):104-108, 2003.
A mobile phone-based Internet system for health monitoring and management is proposed. This system comprises an Internet server for automatic health information processing and thousands of wearable compound devices for monitoring physiological signals. A wearable compound device (dubbed an "i-monitor") consists of a tiny attachable pad ("i-pad") and an i-mode cellular phone ("i-phone"). Multiple physiological signals are acquired by the i-pad, which is attached to the breast surface beneath the clothes of a patient. Preliminary diagnosis is performed within the i-pad. Whenever an abnormality is detected, the i-pad initiates a connection with the i-phone, transmitting data to the i-phone using the Bluetooth protocol. The i-phone relays this data to an i-health server through the DoCoMo DoPa wireless network using the TCP/IP protocol. The i-health server performs comprehensive analysis of the received data using various data mining methods, and then returns the diagnostic outcome as well as healthcare instructions to the patient.
[mcohen-02:2002]Michael Cohen. The Internet Chair. IJHCI: Int. J. of Human-Computer Interaction, 15(2):297-311, 2003.
A pivot (swivel, rotating) chair is considered as an I/O device, an information appliance. As implemented, the main input modality is orientation tracking, which dynamically selects transfer functions used to spatialize audio in a rotation-invariant soundscape. In groupware situations, like teleconferencing or chat spaces, such orientation tracking can also be used to twist iconic representations of a seated user, avatars in a virtual world, enabling social situation awareness via coupled visual displays, fixed virtual source locations, and projection of non-omnidirectional sources.
[mcohen-03:2002]Noor Alamshah Bolhassan, Dishna Wanasinghe, Owen Newton Fernando, Michael Cohen, and Toshifumi Kanno. Mobile Control in Cyberspace of Image-based & Computer Graphic Scenes and Spatial Audio Using Stereo QTVR and Java3D. 3D Forum: J. of Three Dimensional Images, 16(4):101-106, 2002.
Anticipating ubicomp (for ubiquitous computing) networked applications and information spaces, we have developed and integrated various multimodal I/O devices into a virtual reality groupware system. In this paper, we present one of the many interesting parts of our CVE system, which enables users to view stereographic QTVR movies with spatial audio. We have deployed a Java-equipped mobile phone capable of interacting with this virtual environment groupware suite, interfaced through a "servent," a Server/Client hybrid HTTP TCP/IP gateway. By adjusting audio (intensity stereo) panning, through the Java Media Foundation, of a virtual source with respect to the orientation of a user, a virtual soundscape may be rotated and stabilized, registrable with an actual physical space (for alignment of auditory cues with real-life events or locations). Besides using ordinary QTVR controllers like mouse and keyboard, users can also utilize their mobile phone (DoCoMo i-appli) to control the display. A Java framework integrates these different spatial media modalities: audio and visual.
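The orientation-relative intensity panning described in this abstract can be illustrated with a small sketch. This is not the authors' Java implementation: it is a minimal Python example that assumes a sine/cosine constant-power pan law, compass-style bearings in degrees (clockwise positive), and hypothetical function names (`relative_azimuth`, `stereo_gains`) chosen only for illustration.

```python
import math


def relative_azimuth(source_bearing_deg, user_heading_deg):
    """Bearing of the source relative to the user's facing direction.

    Returns degrees in (-180, 180]; positive means the source is to the right.
    """
    rel = (source_bearing_deg - user_heading_deg) % 360.0
    if rel > 180.0:
        rel -= 360.0
    return rel


def stereo_gains(source_bearing_deg, user_heading_deg):
    """Return (left_gain, right_gain) for a virtual source.

    Assumption: a constant-power (sine/cosine) pan law. Intensity stereo
    cannot express front/back, so rear sources are folded onto the frontal arc.
    """
    rel = relative_azimuth(source_bearing_deg, user_heading_deg)
    if rel > 90.0:
        rel = 180.0 - rel
    elif rel < -90.0:
        rel = -180.0 - rel
    # Map [-90, +90] degrees onto a pan angle in [0, pi/2].
    theta = (rel + 90.0) / 180.0 * (math.pi / 2.0)
    return math.cos(theta), math.sin(theta)
```

Because the user's heading is subtracted from the source bearing before panning, a source stays anchored to a fixed direction in the room as the user turns, which is the stabilization effect the abstract refers to.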
[mcohen-04:2002]Jens Herder and Michael Cohen. The Helical Keyboard: Perspectives for Spatial Auditory Displays and Visual Music. J. of New Music Research, 31(3):269-281, 2002.
Auditory displays with the ability to dynamically spatialize virtual sound sources under realtime conditions enable advanced applications for art and music. The listener can be deeply immersed while interacting and participating in the experience. We review some of those applications while focusing on the Helical Keyboard project and discussing the required technology. Inspired by the cyclical nature of octaves and helical structure of a scale, a model of a piano-style keyboard was prepared, which was then geometrically warped into a helicoidal configuration, one octave/revolution, pitch mapped to height and chroma. It can be driven by MIDI events, realtime or sequenced, a stream that is both synthesized and spatialized by a spatial sound display. The sound of the respective notes is spatialized with respect to sinks, avatars of the human user, by default in the tube of the helix. Alternative coloring schemes can be applied, including a color map compatible with chromastereoptic eyewear. The graphical display animates polygons, interpolating between the notes of a chord across the tube of the helix. Recognition of simple chords allows directionalization of all the notes of a major triad from the position of its musical root. The system is designed to allow, for instance, separate audition of harmony and melody, commonly played by the left and right hands, respectively, on a normal keyboard. Perhaps the most exotic feature of the interface is the ability to fork one's presence, replicating subject instead of object by installing multiple sinks at arbitrary places around a virtual scene so that, for example, harmony and melody can be separately spatialized, using two heads to normalize the octave; such a technique effectively doubles the helix from the perspective of a single listener. Rather than a symmetric arrangement of the individual helices, they are perceptually superimposed in-phase, coextensively,
Refereed Proceedings Papers
[j-huang-03:2002]H. Sato and J. Huang. Investigating the Quantitative Factors for Sound Integration and Segregation in Human Audition. In Proc. 9th Australian Int. Conf. Speech Science and Technology, 2002.
[j-huang-04:2002]J. Huang, K. Kume, A. Saji, M. Nishihashi, T. Watanabe, and W. Martens. Robotic Spatial Sound Localization and Its 3-D Sound Human Interface. In Proc. First Int. Sym. Cyber Worlds, 2002.
[j-huang-05:2002]J. Huang, Y. Ohtake, and C. Zhao. Robot Position Calibration Using Colored Rectangle Signboards. In Proc. Third Int. Conf. Computer and Information Technology, pages 139-142.
[j-huang-06:2002]K. Yoshida and J. Huang. Competition between Different Primary Cues for Sound Component Organization in Human Audition. In Proc. Int. Conf. Humans and Computers, pages 7-12, 2002.
[j-huang-07:2002]J. Huang, K. Kume, M. Saji, M. Nishihashi, T. Watanabe, and W. Martens. A Multimodal Tele-Robot as a Mobile Intelligent Communication Interface - Spatial Processing and 3-D Sound Human Interface -. In Proc. Int. Conf. Distributed Multimedia Systems, pages 464-470.
[j-huang-08:2002]J. Huang. Spatial Auditory Processing for a Hearing Robot. In Proc. IEEE Int. Conf. Multimedia and Expo, 2002.
[mcohen-05:2002]Michael Cohen. Emerging exotic auditory interfaces. In 114th Convention of the AES (Audio Engineering Society), Amsterdam, March 2003.
Anticipating some emerging audio devices and features, this survey outlines some trends in mobile telephony and computing, wearable/intimate multimedia computing, handheld/nomadic/portable interfaces, and embedded systems like robotics, wearable computing, multimedia furniture, and spatially immersive displays, gleaned from recent announcements and publications by industrial and academic laboratories, and the author's own research group. Such extended and enriched audio interfaces, especially coupled with position tracking systems, encourage multipresence, the inhabiting by sources and sinks of multiple spaces simultaneously, allowing, for instance, a user to monitor several aligned spaces at once (conferences, entertainment, navigation, warnings, etc.). Representative instances are cited, and groupware selection functions and their predicate calculus notation are reviewed. Keywords: audio and speech interaction, CVEs (collaborative virtual environments), embedded systems, handheld/mobile/nomadic/portable interfaces, integration of mobile devices and telecommunication, location-aware interaction and location-based services, mobile information device, mobile internet services, multimodal interaction, novel user interfaces, pervasive Java, telematics, telerobotics, ubicomp (ubiquitous computing) (a.k.a. ambient, calm, pervasive) technology, wearable/intimate multimedia computing.
[mcohen-06:2002]Owen Newton Fernando, Tomoya Kamada, William L. Martens, Hiroki Osaka, Noor Alamshah Bolhassan, Michael Cohen, and Takuzou Yoshikawa. "Just Look At Yourself!": Stereographic Exocentric Visualization and Emulation of Stereographic Panoramic Dollying. In Proc. ICAT: Int. Conf. on Artificial Reality and Tele-Existence, pages 146-153, Tokyo: University of Tokyo, December 2002. VRSJ (Virtual Reality Society of Japan).
Previous research introduced the idea of stereographic panoramic browsing, via our VR U C client, subsequently extended to allow not only zooming, a simple magnification-driven manipulation of the image, but also dollying, allowing limited viewpoint motion, enabled by multiple panoramic captures and enabling looming, viewpoint-selective occlusion and revealing of background scene objects. We have integrated this client with a sibling client based on Java3D, allowing realtime visualization of the dollying, and juxtaposition of stereographic CG (computer graphic) and IBR (image-based rendering) displays. The J3D application visualizes and emulates VR U C, integrating cylinders dynamically texture-mapped with a set of panoramic scenes into a 3D CG model of the same space as that captured by the set of panoramic scenes. The transparency of the CG 3D scene and also the photorealistic panoramic scene, as well as the size of the cylinder upon which it is texture-mapped, can be adjusted at runtime to understand the relationship of the spaces. Keywords: mixed reality/virtuality, image-based rendering, panoramic navigation.
[mcohen-07:2002]Yoshiki Yatsuyanagi, Masahiro Sasaki, S??o Yamaoka, Michael Cohen, Takuya Azumi, and Osamu Takeichi. Networked Speaker Array Streaming Back to Client: the World's Most Expensive Sound Spatializer. In Proc. ICAT: Int. Conf. on Artificial Reality and Tele-Existence, pages 162-169, Tokyo: University of Tokyo, December 2002. VRSJ (Virtual Reality Society of Japan).
We have integrated the spatial sound capability of our University-Business Innovation Center's 3D Theater with our collaborative virtual environment suite to build a sound spatializer based on the actual room, which can combine artificially directionalized sources naturally with ambient sounds and transmit to remote users, either unicast to a designated user running multimodal groupware or broadcast through a reflector, who can experience telepresence and dynamic control of artificially spatialized sources. A "reality room" equipped with two sets of sound spatialization speaker arrays is networked to allow remote control. An interface crafted with Java3D presents a 3D model of that space, along with widgets to steer its three mixels of spatial audio, through a perspective display configurable for stereographic or mixed-mode ("security camera") styles. A binaural dummy head suspended from the middle of the ceiling picks up the soundscape, comprising sources spatialized by the hemispherical and equatorial speaker arrays as well as ambient sounds (the voices of people in the room). The stereo pair is captured, sampled, digitized, compressed and encoded by an MPEG-4 broadcaster, and streamed by a network streaming server back to the original remote client, which feedback closes the control/display loop. Keywords: mobile computing, CVE (collaborative virtual environments), groupware, CSCW (computer-supported collaborative work), mixed audio reality, telepresence, immobot, roomware.
[mcohen-08:2002]William L. Martens, Noor Alamshah Bolhassan, and Michael Cohen. Beyond Flat Panning and Zooming: Dolly-Enhanced SQTVR. In Vladimir V. Savchenko, Shietung Peng, and Shuichi Yukita, editors, Proc. CW'02: First Int. Symp. on Cyber Worlds: Theory and Practice, pages 545-552, Tokyo: Hosei University, November 2002.
[mcohen-09:2002]Noor Alamshah Bolhassan, Dishna Wanasinghe, Owen Newton Fernando, Michael Cohen, and Toshifumi Kanno. Mobile Control in Cyberspace of Image-based & Computer Graphic Scenes and Spatial Audio Using Stereo QTVR and Java3D. In Proc. HC2002: 5th Int. Conf. on Human and Computer, pages 92-97, Aizu-Wakamatsu: University of Aizu, September 2002.
Anticipating ubicomp (for ubiquitous computing) networked applications and information spaces, we have developed and integrated various multimodal I/O devices into a virtual reality groupware system. In this paper, we present one of the many interesting parts of our CVE system, which enables users to view stereographic QTVR movies with spatial audio. We have deployed a Java-equipped mobile phone capable of interacting with this virtual environment groupware suite, interfaced through a "servent," a Server/Client hybrid HTTP TCP/IP gateway. By adjusting audio (intensity stereo) panning, through the Java Media Foundation, of a virtual source with respect to the orientation of a user, a virtual soundscape may be rotated and stabilized, registrable with an actual physical space (for alignment of auditory cues with real-life events or locations). Besides using ordinary QTVR controllers like mouse and keyboard, users can also utilize their mobile phone (DoCoMo i-appli) to control the display. A Java framework integrates these different spatial media modalities: audio and visual.
[sugiyama-01:2002]Masatoshi Watanabe and Masahide Sugiyama. Information Retrieval Based on Speech Recognition Results. In Proc. of International Conference on Spoken Language Processing, page ThA32p.2, Sep. 2002.
[sugiyama-02:2002]Takahiro Suzuki, Takafusa Kitazume, and Masahide Sugiyama. Latest achievement of VC project for automatic video caption generation. In Proc. of Multi Media Signal Processing, pages 37-40. IEEE, Dec. 2002.
Unrefereed Papers
[sugiyama-03:2002]Tomokazu Muto and Masahide Sugiyama. Efficient Model Based Voice Decomposition Method Using Tree Structured Codebook. In Proc. of Acoustic Society Japan, pages 157-158, Japan, Sep. 2002. ASJ.
[sugiyama-04:2002]Masafumi Kurita, Takahiro Suzuki, and Masahide Sugiyama. Detection for Video Caption Generation in VCML Player. In Technical Report of Speech Processing, pages 13-18. IEICE/ASJ, Oct. 2002.
[sugiyama-05:2002]Takahiro Suzuki, Masafumi Kurita, and Masahide Sugiyama. Latest Achievement of VC Project for Automatic Video Caption Generation. In Technical Report of HI Research Committee, pages 1-8. HI, IPSJ, Nov. 2002.
[sugiyama-06:2002]Masafumi Kurita and Masahide Sugiyama. Laughter Detection using VQ Classifier and Duration Information. In Proc. of ECEI2002, Aug. 2002.
[sugiyama-07:2002]Tomoya Narita and Masahide Sugiyama. Construction of Music 100 Database. In Univ. of Aizu, editor, Technical Report in Univ. of Aizu, pages TR2002-1-002. Univ. of Aizu, Apr. 2002.
Chapters in Book
[mcohen-10:2002]Michael Cohen. Virtual Reality, pages 1-2. Mathematics. MacMillan Reference, New York, 2002.
Volume 4; ISBN 0-02-865565-6
Grants
[mcohen-11:2002]Michael Cohen. Fukushima Foundation for the Advancement of Science and Education, 2003.
[sugiyama-08:2002]Masahide Sugiyama. Fukushima Prefectural Foundation for Advancement of Science and Education, 2002.
Academic Activities
[j-huang-09:2002]J. Huang, 2002. Program Committee member
[j-huang-10:2002]J. Huang, 2002. Program Committee member
[sugiyama-09:2002]Masahide Sugiyama, Apr. 2002. Associate Editor of ED, IEICE
[sugiyama-10:2002]Masahide Sugiyama, 2002. Editor of Acoustic Technology Series, ASJ
[sugiyama-11:2002]Masahide Sugiyama, 2002. member of Council, ASJ
Ph.D. and Other Theses
[j-huang-11:2002]Yousuke Endor. Graduation Thesis: 3-D Sound Reproduction by a 4-Speaker System, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-12:2002]Ruriko Kaneda. Graduation Thesis: A Method of HRTF Classification and Finding Proper HRTF for Individuals, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-13:2002]Takuya Kobayashi. Graduation Thesis: Data Streaming and Managing in a Telerobot System, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-14:2002]Keita Sato. Graduation Thesis: Mobile Robot Navigation for a Patrol Robot System, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-15:2002]Kousuke Suzuki. Graduation Thesis: Sound Separation based on Perceptual Cues in Human Audition, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-16:2002]Hiroaki Tonooka. Graduation Thesis: Telerobot Operation and Telecommunication Using Cellular Phone, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-17:2002]Izumi Furukawa. Graduation Thesis: Interaction Between Different Primary Cues for Sound Integration and Segregation: Psychological Investigation Part III, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[j-huang-18:2002]Teppei Watanabe. Master Thesis: The Development of a Real-Time Spatial Sound Localization System for a Mobile Robot, University of Aizu, 2002.
Thesis Advisor: Huang, J.
[mcohen-12:2002]Yoshiki Yatsuyanagi. Graduation Thesis: Cross-Fading RSS-10 Driver Enhancement Using Gain Control, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-13:2002]Masahiro Sasaki. Graduation Thesis: Development of a Networked Spatial Audio Speaker Array System using RSS-10s, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-14:2002]Takuzou Yoshikawa. Graduation Thesis: Humanoid Interface for Hybrid CG / Texture-Mapped Space, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-15:2002]Hiroaki Osaka. Graduation Thesis: Development of a Panoramic Browser Emulator Using Runtime-Adjustable Transparency, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-16:2002]Tomoya Kamada. Graduation Thesis: 3-Dimensional and Interactive Graphical User Interface Featuring Panoramic Stereographic Mixed Reality, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-17:2002]Makoto Kawaguchi. Graduation Thesis: Extending a CVE Client for Mobile Phone, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-18:2002]Yoshio Kawase. Graduation Thesis: Pitch Detection with Cepstrum Analysis, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[mcohen-19:2002]Owen Newton Fernando. Master Thesis: Java3D-based Development of a Spatial Audio Multipresence Application Featuring Narrowcasting Functions and Rotatable Loudspeaker Axis (Pairwise-Selected Speakers) for a Collaborative Virtual Environment, University of Aizu, 2002.
Thesis Advisor: Michael Cohen
[sugiyama-12:2002]Yamato Wada. Graduation Thesis: Automatic VCML Document Generation for Sounds with Voice, Music and BGM, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-13:2002]Keishiro Satoh. Graduation Thesis: Quick Image Searching, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-14:2002]Hiroyuki Kanno. Graduation Thesis: Correspondence between Sentences and Words, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-15:2002]Naoki Aida. Graduation Thesis: Characterization of Music Players, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-16:2002]Kouji Itou. Graduation Thesis: Kaibun Generation, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-17:2002]Kei Igawa. Graduation Thesis: Correspondence Generation between Sound and Scenarios, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-18:2002]Masashi Sakai. Graduation Thesis: Efficient Music Retrieval, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-19:2002]Hirokazu Sakuma. Master Thesis: Document Clustering System for Information Retrieval, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-20:2002]Takeshi Akatsuka. Master Thesis: Automatic Generation of Iroha-Uta, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-21:2002]Tomokazu Mutoh. Master Thesis: Model-based Voice Decomposition Method under Time Constraint, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-22:2002]Shin'ichi Takeuchi. Master Thesis: Speaker Indexing in Sound Streams, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama
[sugiyama-23:2002]Takahiro Suzuki. Master Thesis: Development of Complete VCML System, University of Aizu, 2002.
Thesis Advisor: Masahide Sugiyama