|
|
|
Most of the courses taken by engineers and computer science students emphasize scientific discipline and the accumulation of "truth." The Computer Arts Lab. activities include such technically objective factors, but also encourage original expression, subjectively motivated by aesthetics rather than "correctness," sometimes "putting the art before the course!" Unlike many other labs' activities that try to converge on a "right answer" sharable by everyone else, artistic disciplines encourage originality, in which the best answer is one that is like no one else's.
The Computer Arts Lab., through its resident Spatial Media Group,1 researches projects including practical and creative applications of virtual reality and mixed (augmented, enhanced, hybrid, mediated) reality and virtuality; panoramic interfaces and spatially immersive displays (especially stereotelephonics, spatial sound, and stereography); wearable and mobile applications, computing, and interfaces; and networked multimedia, with related interests in CVEs (collaborative virtual environments), groupware, and CSCW (computer-supported collaborative work); hypermedia; digital typography and electronic publishing; force-feedback displays; telecommunication semiotics (models of teleconferencing selection functions); information furniture; way-finding and navigation (including using a Segway personal transporter); entertainment computing; and ubicomp (ubiquitous computing), calm (ambient), and pervasive technology. We are particularly interested in narrowcasting commands, conference selection functions for adjusting groupware situations in which users have multiple presence, virtually existing in more than one space simultaneously. We explore realtime interactive multimedia interfaces: auditory, visual, haptic, and multimodal:
Auditory We are exploring interfaces for multichannel sound, including stereo, quadraphonic, and nearphones (mounted on our Schaire rotary motion platform), as well as two separate speaker array systems in the University-Business Innovation Center (UBIC) 3D Theater.2 The Helical Keyboard,3 refined and extended by Julián Villegas and featuring realtime visual music with spatial sound and stereographic graphics, is permanently installed there. Working with Dr. Durand Begault of NASA, we deployed online courseware, "Sonic,"4 which organizes an introduction to desktop audio and presents many sound samples. We use these contents in the "Intro. to Sound and Audio" graduate school course,5 which is a prerequisite for "Spatial Hearing and Virtual 3D Sound,"6 taught jointly with Prof. Jie Huang of the Human Interface Lab.
With Profs. Robert Fujii and Satoshi Nishimura we host a Computer Music Studio, featuring computer music workstations complemented by assorted amplifiers, racks, mixers, and effects processors.
We annually conduct a Student Cooperative Class Project.7 In the past we sponsored SCCPs on Digital Compositing (using Photoshop and the Gimp8), but in recent years the SCCP has focused on Computer Music,9 studying basic music theory and DTM (desk-top music) software, including samplers and MIDI sequencers,10 to compose and perform student-authored songs.11 This SCCP segues into a graduate-level computer music course.12
Visual We promote creative applications of scientific visualization, encouraging the use of Mathematica13 and stereoscopy,14 including chromastereoscopy15 (3D images with depth layers cued by color). We enjoy exploiting the unique large-format immersive stereographic display in the UBIC 3D Theater. The "M-Project" student CAD and CG circle16 is hosted in our lab, under the supervision of Profs. Satoshi Nishimura and Michael Cohen. We are experimenting with various CAD authoring tools, such as 3DStudioMax, Blender, Maya, and SketchUp. Various group members are exploring the application of CAD techniques to the design and animation of exotic fashions, and to the construction of a model of the university from photographs. We are also exploring creative applications of panoramic imaging and object movies,17 including a virtual tour of the university.18
Haptic We are also exploring the use of haptic interfaces, including force-display joysticks and a rotary motion platform (the Schaire ["shared chair"] Internet Chair). A recently finished project uses the Sudden Motion Sensor in a laptop as a gyroscopic control of avatars in a virtual environment.19 We also convene annual Creative Factory Seminars. Past CFSs explored advanced audio interfaces and panoramic imaging, but in recent years, in conjunction with Prof. Rentaro Yoshioka of the Active Knowledge Engineering Lab., we have conducted a workshop on Haptic Modeling and 3D Printing, using force-feedback CAD workstations20 to make models that are then rapid-prototyped (as stereolithograms) with the DPPL's personal fabricator,21 closing the "idea (stored in brain neurons) → information (stored as bits) → matter (atoms)" pathway.
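The Sudden Motion Sensor is essentially an accelerometer, so one plausible way to drive an avatar is to convert its gravity components into roll and pitch. The sketch below is a minimal illustration under that assumption; the class, the sensor-reading convention, and the sample values are ours, not the project's actual code.

```java
// Illustrative sketch: mapping laptop accelerometer (tilt) readings to avatar orientation.
// The sensor-reading convention is hypothetical; only the trigonometry reflects the general technique.
public class TiltAvatarControl {

    /** Roll (rotation about the front-back axis), in radians, from gravity components. */
    static double roll(double ax, double ay, double az) {
        return Math.atan2(ay, az);
    }

    /** Pitch (rotation about the left-right axis), in radians. */
    static double pitch(double ax, double ay, double az) {
        return Math.atan2(-ax, Math.hypot(ay, az));
    }

    public static void main(String[] args) {
        // Example raw gravity vector (in units of g), as a laptop might report when tipped forward-left.
        double ax = -0.26, ay = 0.37, az = 0.89;
        System.out.printf("avatar roll  = %.1f deg%n", Math.toDegrees(roll(ax, ay, az)));
        System.out.printf("avatar pitch = %.1f deg%n", Math.toDegrees(pitch(ax, ay, az)));
    }
}
```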
Multimodal Using such multimodal interfaces, our students have crafted driving simulators, location-based games featuring the rotary motion platform,22 and synaesthetic (cross-sensory modality) visual and haptic music players (rendering songs as light shows23 or dancing chairs24). Using visual sensing techniques, narrowcasting postures can be recognized and used to control distributed chatspaces or virtual concerts. A student project deployed a microphone array to track a moving sound source, using its network interface to trigger internet appliances (such as lights that follow the source). We are also developing a driving simulator using collision-detection modulation of the force-feedback steering wheel and the rotary motion platform. A recent version of the project features a dual-steering (front and back) fire truck, racing through a 3D model of our campus to reach a fire, piloted by two drivers, and featuring spatial sound effects. We are interested in exploring figurative interfaces to express emotion and to control narrowcasting privacy, using a media mixing system based on the Session Initiation Protocol for advanced conferencing features. We are also exploring extensions of Open Wonderland,25 an open-source framework for developing virtual reality environments. This year, group members have developed windshield wipers that dance, featuring beat detection, a digital phase-locked loop, and articulated wiper gestures.26
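The dancing-wiper beat tracking is described as a digital phase-locked loop; the sketch below illustrates that idea in miniature, assuming onset times have already been extracted from the music. The gain constants and onset values are illustrative, not the project's tuned parameters.

```java
// Minimal sketch of a digital phase-locked loop tracking beat times from detected onsets.
// Onset extraction is assumed to have happened upstream; gains are illustrative, not tuned values.
public class BeatPll {
    double period = 0.5;      // current beat period estimate (s), i.e. 120 BPM
    double nextBeat = 0.0;    // predicted time of the next beat (s)
    final double phaseGain = 0.4, periodGain = 0.1;

    /** Feed one detected onset; returns the updated prediction of the next beat time. */
    double onOnset(double onsetTime) {
        // Advance the prediction until it brackets the observed onset.
        while (nextBeat + period / 2 < onsetTime) nextBeat += period;
        double error = onsetTime - nextBeat;    // phase error between prediction and observation
        nextBeat += phaseGain * error;          // correct phase
        period  += periodGain * error;          // correct period (tempo)
        return nextBeat + period;               // when the wiper should next reach its apex
    }

    public static void main(String[] args) {
        BeatPll pll = new BeatPll();
        double[] onsets = {0.02, 0.55, 1.08, 1.61, 2.14};   // roughly 113 BPM
        for (double t : onsets)
            System.out.printf("onset %.2f s -> next beat predicted at %.2f s%n", t, pll.onOnset(t));
    }
}
```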
We are also exploring mobile (nomadic, portable) computing, working in conjunction with university spin-offs The Designium,27 Eyes, JAPAN,28 and GClue.29 Such keitai (mobile phone)-based interfaces can be used to design kaleidoscopic "wallpaper" screen savers, or to control internet appliances, panoramic imaging, spatial sound, or motion platforms. An exciting project combines spatial sound with way-finding, using GPS tracking, the Segway personal transporter,30 and directional transfer functions.
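The heart of such spatial-sound way-finding is the relative azimuth from the tracked listener to a georeferenced target, which then drives the directional rendering. A minimal sketch under a flat-earth approximation follows; the coordinates, heading, and sign conventions are illustrative assumptions.

```java
// Sketch: relative azimuth from a tracked listener to a georeferenced point of interest,
// suitable for steering a spatial-sound advisory. Flat-earth approximation; coordinates are examples.
public class WayfindingAzimuth {

    /** Bearing (degrees clockwise from north) from (lat1,lon1) to (lat2,lon2), small-distance approximation. */
    static double bearingDeg(double lat1, double lon1, double lat2, double lon2) {
        double dNorth = Math.toRadians(lat2 - lat1);
        double dEast  = Math.toRadians(lon2 - lon1) * Math.cos(Math.toRadians(lat1));
        return (Math.toDegrees(Math.atan2(dEast, dNorth)) + 360.0) % 360.0;
    }

    public static void main(String[] args) {
        double listenerLat = 37.525, listenerLon = 139.938;   // example listener position
        double poiLat = 37.527,      poiLon = 139.941;        // example point of interest
        double headingDeg = 75.0;                             // listener's current heading (GPS/compass)
        double bearing = bearingDeg(listenerLat, listenerLon, poiLat, poiLon);
        // Relative azimuth in (-180, 180]: 0 = straight ahead, positive = to the right.
        double relAz = ((bearing - headingDeg + 540.0) % 360.0) - 180.0;
        System.out.printf("render advisory at %.1f deg relative azimuth%n", relAz);
    }
}
```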
An advanced undergraduate course on "Human Interface and Virtual Reality"31 surveys many of these topics, contextualized by "machinima" (machine cinema) using "Alice,"32 featuring student-designed and -programmed, computer-generated interactive stories with 3D animation (including texture maps, photographic compositing, audio effects, speech synthesis, and background music) and segments on panoramic and turnoramic imagery, stereopsis, and groupware.
Other activities:
We host an annual symposium, the Int. Symposium on Spatial Media,33 inviting experts to share their knowledge and passion regarding such themes as "Spatial Sound and Spatial Telepresence" ('01), "Magic in Math and Music" ('02), "Advanced Multimedia and Virtual Reality" ('03), "Spatial Sound" ('04), "Hearing and Sound Installations" ('05), "Sound, Audio, and Music" ('06), "Interactive Media, Security, and Stereography" ('06), "Music XML and the Structure of Swing, Understanding Color Media, Media Grid, and Visualization Tools" ('07), "Multimedia Computing" ('08), "Systems and Applications" ('09-'10), and "Multimodal Interfaces" ('10-'11).
Our lab sponsors several student performance circles, including the Yasakoi Dance Circle,34 and Disco Mix Club.35 We also sponsor a couple of other student circles, the Dual Boot (Ultimate Frisbee) Flying Disc Club,36 and the Furiten Mah Jongg Circle.37
We are working with Aizu Yougo Gakko,38 the special education school next to the university, to develop multimedia interfaces that can encourage and entertain students with special needs. We have consulted on deployment of switch-adapted media players, have deployed some iPad accessibility applications, and have developed a song selection program using a "step-scan, select" affordance.
Through the research and development, and the deployment and integration, of stereographic, spatial sound, haptic, and mobile applications, including virtual and mixed reality, we nurture scientific and artistic interest in advanced computer-human and human-human communication. Our ultimate domain is the exploration of interfaces and artifacts that are literally sensational.
Some relevant publications and activities:
Julián Villegas and Michael Cohen. Roughness Minimization Through Automatic Intonation Adjustments. JNMR: J. of New Music Research, 39(1):75-92, 2010.
http://www.tandf.co.uk/journals/nnmr. We have created a reintonation system that minimizes the perceptual roughness of parallel sonorities as they are produced. Intonation adjustments are performed by finding, within a user-defined vicinity, a combination of fundamental frequencies that yields minimal roughness. The vicinity constraint limits pitch drift and eases realtime computation. Prior knowledge of the temperament and notes being played is not necessary for the operation of the algorithm. We test a proof-of-concept prototype adjusting equal-temperament intervals, reproduced with a harmonic spectrum, towards pure intervals in realtime. Pitch drift of the rendered music is not prevented but limited. This prototype exemplifies musical and perceptual characteristics of roughness minimization by adaptive techniques. We discuss the results obtained, limitations, possible improvements, and future work.
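To make the idea concrete, the sketch below perturbs one fundamental within a small vicinity and keeps the value giving minimal summed pairwise roughness over the partials of two harmonic tones. The roughness curve is a Plomp-Levelt-style approximation (Sethares' parameterization), which is an assumption; the paper's model and its search over combinations of fundamentals are more general.

```java
// Illustrative sketch of vicinity-constrained roughness minimization for a dyad of harmonic tones.
// Pairwise roughness follows a Plomp-Levelt-style approximation (Sethares' parameterization);
// the paper's actual roughness model, spectra, and search strategy may differ.
public class RoughnessMinimizer {

    /** Approximate roughness contributed by two partials (f1, a1) and (f2, a2). */
    static double partialRoughness(double f1, double a1, double f2, double a2) {
        double s = 0.24 / (0.021 * Math.min(f1, f2) + 19.0);
        double d = Math.abs(f2 - f1);
        return a1 * a2 * (Math.exp(-3.5 * s * d) - Math.exp(-5.75 * s * d));
    }

    /** Total roughness of two harmonic tones, summed over all partial pairs. */
    static double dyadRoughness(double fund1, double fund2, int partials) {
        double total = 0;
        for (int i = 1; i <= partials; i++)
            for (int j = 1; j <= partials; j++)
                total += partialRoughness(fund1 * i, Math.pow(0.88, i - 1),
                                          fund2 * j, Math.pow(0.88, j - 1));
        return total;
    }

    public static void main(String[] args) {
        double f1 = 261.63, f2 = 329.63;    // equal-tempered C4 and E4 (major third)
        double vicinity = 0.02;             // each fundamental may move at most +/- 2%
        double bestF2 = f2, best = dyadRoughness(f1, f2, 6);
        for (double r = -vicinity; r <= vicinity; r += 0.0002) {
            double candidate = f2 * (1.0 + r);          // only f2 is adjusted in this toy example
            double rough = dyadRoughness(f1, candidate, 6);
            if (rough < best) { best = rough; bestF2 = candidate; }
        }
        System.out.printf("adjusted E4: %.2f Hz (ratio %.4f to C4)%n", bestF2, bestF2 / f1);
    }
}
```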
Wai-Man Pang, Jing Qin, Yuqiang Lu, Yongming Xie, Chee-Kong Chui, and Pheng-Ann Heng. CUDA-Accelerated 3D Algebraic Computed Tomography Reconstruction with Motion Compensation. International Journal of Computer Assisted Radiology and Surgery, 6(2):187-199, March 2011.
Purpose To accelerate the simultaneous algebraic reconstruction technique (SART) with motion compensation for speedy and quality computed tomography reconstruction by exploiting CUDA-enabled GPUs. Methods Two core techniques are proposed to fit SART into the CUDA architecture: (1) a ray-driven projection along with hardware trilinear interpolation, and (2) a voxel-driven back-projection that can avoid redundant computation by combining CUDA shared memory. We utilize the independence of each ray and voxel in both techniques to design CUDA kernels that represent a ray in the projection and a voxel in the back-projection, respectively. Thus, significant parallelization and performance boost can be achieved. For motion compensation, we rectify each ray's direction during the projection and back-projection stages based on a known motion vector field. Results Extensive experiments demonstrate that the proposed techniques can provide faster reconstruction without compromising image quality. The processing rate is nearly 100 projections/s, about 150 times faster than a CPU-based SART. The reconstructed image is compared against ground truth visually and quantitatively by peak signal-to-noise ratio (PSNR) and line profiles. We further evaluate the reconstruction quality using quantitative metrics such as signal-to-noise ratio (SNR) and mean-square error (MSE). All these reveal that satisfactory results are achieved. The effects of major parameters such as ray sampling interval and relaxation parameter are also investigated by a series of experiments. A simulated dataset is used for testing the effectiveness of our motion compensation technique. The results demonstrate that our reconstructed volume can eliminate undesirable artifacts like blurring. Conclusion Our proposed method has the potential to realize instantaneous presentation of 3D CT volumes to physicians once the projection data are acquired.
Jing Qin, Wai-Man Pang, Yim-Pan Chui, Tien-Tsin Wong, and Pheng-Ann Heng. A Novel Modeling Framework for Multilayered Soft Tissue Deformation in Virtual Orthopaedic Surgery. Journal of Medical Systems, 34:261-271, June 2010.
Realistic modeling of soft tissue deformation is crucial to virtual orthopedic surgery, especially orthopedic trauma surgery, which involves layered heterogeneous soft tissues. In this paper, a novel modeling framework for multilayered soft tissue deformation is proposed in order to facilitate the development of orthopedic surgery simulators. A multilayered 3D mass-spring model is constructed based on the segmented Chinese Visible Human (CVH) dataset. We adopt a bilinear elasticity scheme to simulate the passive nonlinear biomechanical properties of skin and skeletal muscle. An optimization approach is employed in configuring the parameters of the mass-spring model, with experimental data in the biomechanics literature as benchmarking references. The computation-intensive physics simulation is accelerated by the newly released Physics Processing Unit (PPU), consumer-level hardware tailor-made for physically based computation. With an integration of high-quality volume visualization, our framework serves as an interactive and intuitive platform for virtual orthopedic surgery simulation. A prototype developed based on this framework, as well as a series of experiments performed on it, demonstrates the feasibility of the proposed framework in providing interactive and realistic tissue deformation for orthopedic surgery simulation.
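For readers unfamiliar with mass-spring systems, the core update is compact. The sketch below shows a single semi-implicit Euler time step for a small chain of masses with arbitrary example constants; it is an illustration of the basic technique, not the paper's multilayered, bilinear-elastic, PPU-accelerated implementation.

```java
// Minimal mass-spring sketch: semi-implicit Euler steps for a 1D chain of point masses
// connected by linear springs with damping. Constants are arbitrary illustrative values.
public class MassSpringStep {
    public static void main(String[] args) {
        double[] x = {0.0, 0.11, 0.19};   // positions of three masses (m); springs slightly stretched/compressed
        double[] v = {0.0, 0.0, 0.0};     // velocities (m/s)
        double rest = 0.10, k = 50.0, damping = 0.5, mass = 0.01, dt = 0.001;

        for (int step = 0; step < 5; step++) {
            double[] force = new double[x.length];
            for (int i = 0; i + 1 < x.length; i++) {           // accumulate spring forces
                double stretch = (x[i + 1] - x[i]) - rest;     // positive if the spring is extended
                double f = k * stretch;                        // Hooke's law
                force[i] += f;
                force[i + 1] -= f;
            }
            for (int i = 1; i < x.length; i++) {               // mass 0 is fixed (anchored)
                force[i] -= damping * v[i];
                v[i] += dt * force[i] / mass;                  // semi-implicit Euler: velocity first,
                x[i] += dt * v[i];                             // then position
            }
            System.out.printf("t=%.3f s  x=[%.4f, %.4f, %.4f]%n", (step + 1) * dt, x[0], x[1], x[2]);
        }
    }
}
```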
Wai-Man Pang, Jing Qin, Yim-Pan Chui, and Pheng-Ann Heng. Fast Prototyping of Virtual Reality Based Surgical Simulators with PhysX-enabled GPU. Transactions on Edutainment IV, pages 176-188, 2010.
We present our experience in fast prototyping of a series of important but computation-intensive functionalities in surgical simulators based on the newly released PhysX-enabled GPU. We focus on soft tissue deformation and bleeding simulation, as they are essential but have previously been difficult to prototype rapidly. A multilayered soft tissue deformation model is implemented by extending the hardware-accelerated mass-spring system (MSS) in the PhysX engine. To ensure accuracy, we configure spring parameters analytically and integrate a fast volume preservation method to overcome the volume loss problem in MSS. Fast bleeding simulation with consideration of both patient behavior and mechanical dynamics is introduced. By making use of the PhysX built-in SPH-based fluid solver with careful assignment of parameters, realistic yet efficient bleeding effects can be achieved. Experimental results demonstrate that our approaches can achieve both interactive frame rates and convincing visual effects even when complex models are involved.
Wing-Yin Chan, Dong Ni, Wai-Man Pang, Jing Qin, Yim-Pan Chui, Simon Chun-Ho Yu, and Pheng-Ann Heng. Learning Ultrasound-Guided Needle Insertion Skills through an Edutainment Game. Transactions on Edutainment IV, pages 200-214, 2010.
Ultrasound-guided needle insertion is essential in many minimally invasive surgeries and procedures, such as biopsy, drug delivery, spinal anaesthesia, etc. Accurate and safe needle insertion is a difficult task due to the high requirement of hand-eye coordination skills. Many proposed virtual reality (VR) based training systems put their emphasis on realistic simulation instead of pedagogical efficiency. The lack of structured training scenarios leads to boredom from repetitive operations. To address this, we present a novel training system that integrates game elements in order to retain trainees' enthusiasm. Task-oriented scenarios, time-attack scenarios, and performance evaluation are introduced. In addition, some state-of-the-art technologies are presented, including ultrasound simulation, needle haptic rendering, and a mass-spring-based needle-tissue interaction simulation. These components are shown to be effective in keeping trainees engaged in learning.
Senaka Amarakeerthi, Rasika Ranaweera, and Michael Cohen. Speech-based Emotion Characterization Using Postures and Gestures in CVEs. In Proc. Int. Conf. on Cyberworlds, page (electronic proceedings), Singapore, October 2010.
www3.ntu.edu.sg/SCE/cw2010. Collaborative Virtual Environments (CVEs) have gained increasing popularity in the past two decades. Most CVEs use avatar systems to represent each user logged into a CVE session. Some avatar systems are capable of expressing emotions with gestures, postures, and facial expressions. In previous studies, various approaches have been explored to convey emotional states to the computer, including voice and facial movements. We propose a technique to extract emotions from the voice of a speaker and animate avatars to reflect the extracted emotions in real-time. The system has been developed in "Open Wonderland," a Java-based open-source framework for creating collaborative 3D virtual worlds. In our prototype, six primitive emotional states – anger, dislike, fear, joy, sadness, and surprise – were considered. An emotion classification system, which uses short-time log frequency power coefficients (LFPC) to represent features and hidden Markov models (HMMs) as the classifier, was modified to build an emotion classification unit. Extracted emotions were used to activate existing avatar gestures in Wonderland.
Hiromitsu Sato and Michael Cohen. Using Motion Capture for Realtime Augmented Reality Scenes. In Vitaly Kluev and Michael Cohen, editors, Proc. HC-2010: 13th Int. Conf. on Humans and Computers, pages 58-61, Aizu-Wakamatsu, December 2010.
http://sparth.u-aizu.ac.jp/hc2010, ISBN 978-4-900721-01-2. The purpose of this study is to improve realtime human-computer interfaces using augmented reality. In this article, we present an avatar animated in realtime with a motion capture system and augmented reality. The development of ubiquitous computing has been impressive in recent years, and augmented reality is an instance of ubiquitous computing technology. Augmented reality is related to virtual reality: it is characterized by the composition of real and virtual, showing virtual objects as electronic information that augments the real world. Because computer performance continues to improve, we can display complicated 3D models, and information can easily be added to devices such as mobile telephones. The core technologies of augmented reality are position recognition, image recognition, and orientation recognition. In the past, high-performance machines were needed to process these; with modern hardware, such data can be processed on a mobile phone as well as a notebook computer. Recently, Sekai Camera became famous as augmented reality software that runs on iPhone and Android. Assuming continual improvement in computer performance, much more augmented reality software will appear in the future. However, the means to manipulate information in augmented reality are not yet well developed.
Juliän Villegas and Michael Cohen. "GABRIEL": Geo-Aware BRoad casting for In-Vehicle Entertainment and Localizability. In AES 40th Int. Conf. "Spatial Audio: Sense the Sound of Space", page (electronic proceedings), Tokyo, October 2010.
http://www.aes.org/events/40/. We explore the potential of modern location-aware multimedia applications for tour services and driver navigation advisories: the deployment of telematic resources through spatial sound. We have retrofitted a commercial tour bus with location-aware spatial sound advisories, delivered via wireless headphones (for passengers) and bone conduction headphones (for the driver). The prototype is intended to serve as a "proof of concept" as well as a testbed for anticipated future research in which geo-located bilingual tourist information, navigation instructions, and traffic advisories are rendered simultaneously.
Akira Inoue and Michael Cohen. Time-aware Geomedia Browsing Integration with Virtual Environment. In Vitaly Kluev and Michael Cohen, editors, Proc. HC-2010: 13th Int. Conf. on Humans and Computers, pages 47-54, Aizu-Wakamatsu, December 2010.
http://sparth.u-aizu.ac.jp/hc2010, ISBN 978-4-900721-01-2. We propose a map application named "Human History Maps" that combines data about human history with a MultiSound Manager to form a rich-content map application. Human History Maps aims to make it easier to learn about famous historical figures. It is a web service that combines a map with historical persons, making the tracks of a person's life visible on the map. The system records the people someone met, along with the places, times, and so on, and can visualize the connections among multiple people at the same time. Human History Maps displays its maps with the Google Maps API, developed and offered by Google. Within that, we focus on Google Street View, one of the features of Google Maps, and we provide some extended features for it: functions to combine Street View with sound, and to use AR (augmented reality) so that a user can control the web application interactively. A learning effect arises when a user studies visualized history reinforced with sound; as a result, users can become interested in historical figures and learn actively rather than passively. Moreover, in the future, it will be possible to apply the system to various services and research projects by recording Human History Maps data in our database and sharing it among users.
Prabath Weerasinghe, Rasika Ranaweera, Senaka Amarakeerthi, and Michael Cohen. "Emo Sim": Expressing Voice-Based Emotions in Mobile Interfaces. In Vitaly Kluev and Michael Cohen, editors, Proc. HC-2010: 13th Int. Conf. on Humans and Computers, pages 28-31, Aizu-Wakamatsu, December 2010.
http://sparth.u-aizu.ac.jp/hc2010, ISBN 978-4-900721-01-2. Human interaction with mobile devices is currently a very active research area. Speech enriched with emotions is one of the major ways of exchanging ideas, especially via telephony. By analyzing a voice stream with a system based on hidden Markov models (HMMs) and log frequency power coefficients (LFPC), different emotions can be recognized by a separate system. Using a simple Java client, the recognized emotions are delivered to a server as an index number. A mobile client then retrieves the emotion and displays it through colored icons. Each emotion is mapped to a particular color, as it is natural to use colors to represent various expressions. We believe that with the help of this application one could conceivably change one's way of talking or avoid chatting with somebody whose emotional state is negative!
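The client-side presentation described above (emotion index in, colored icon out) can be captured in a few lines. The six emotion labels follow the companion CVE paper above; the index order and the particular colors below are illustrative assumptions.

```java
import java.awt.Color;

// Sketch of a mobile client's emotion-to-color mapping. The six emotions come from the abstract;
// the index order and color choices here are illustrative assumptions, not the system's actual palette.
public class EmotionColors {
    enum Emotion { ANGER, DISLIKE, FEAR, JOY, SADNESS, SURPRISE }

    static Color colorFor(Emotion e) {
        switch (e) {
            case ANGER:    return Color.RED;
            case DISLIKE:  return new Color(0x80, 0x40, 0x00); // brown
            case FEAR:     return Color.MAGENTA;
            case JOY:      return Color.YELLOW;
            case SADNESS:  return Color.BLUE;
            case SURPRISE: return Color.ORANGE;
            default:       return Color.GRAY;
        }
    }

    public static void main(String[] args) {
        int indexFromServer = 3;                          // index number delivered by the recognizer
        Emotion e = Emotion.values()[indexFromServer];
        System.out.println(e + " -> " + colorFor(e));     // e.g., JOY -> java.awt.Color[r=255,g=255,b=0]
    }
}
```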
Mamoru Ishikawa, Takeshi Matsuda, and Michael Cohen. Guitar-to-MIDI Interface: Guitar Tones to MIDI Notes Conversion Requiring No Additional Pickups. In Proc. AES 129th Conv., pages 28-31, San Francisco, November 2010. Preprint 8209, http://www.aes.org/events/129/. Many musicians, especially guitarists (both professional and amateur), use effects processors. In recent years, a large variety of digital processing effects have been made available to consumers. Further, desktop music, the "lingua franca" of which is MIDI, has become widespread through advances in computer technology and DSP. Therefore, we are developing a "Guitar to MIDI" interface device that analyzes the analog guitar audio signal and emits a standard MIDI stream. Similar products are already on the market (such as the Roland GI-20 GK-MIDI Interface), but almost all of them need additional pickups or guitar modification. The interface we are developing requires no special guitar modification. We describe a prototype platformed on a PC that anticipates a self-contained embedded system.
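Once a fundamental frequency has been estimated from the guitar signal, conversion to a MIDI note number is a one-line formula (A4 = 440 Hz = note 69). The sketch below shows only that final step; pitch detection itself, the hard part, is assumed to have been done upstream.

```java
// Final stage of a guitar-to-MIDI converter: fundamental frequency -> MIDI note number.
// Pitch detection from the analog guitar signal is assumed to have been done upstream.
public class FrequencyToMidi {

    /** MIDI note number for frequency f (Hz), with A4 = 440 Hz = note 69. */
    static int midiNote(double f) {
        return (int) Math.round(69 + 12 * (Math.log(f / 440.0) / Math.log(2)));
    }

    public static void main(String[] args) {
        double[] openStrings = {82.41, 110.00, 146.83, 196.00, 246.94, 329.63}; // standard guitar tuning E2-E4
        for (double f : openStrings)
            System.out.printf("%.2f Hz -> MIDI note %d%n", f, midiNote(f));     // 40, 45, 50, 55, 59, 64
    }
}
```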
Norbert Győrbíró, Henry Larkin, and Michael Cohen. Spaced Repetition Tool for Improving Long-term Memory Retention and Recall of Collected Personal Experiences. In Proc. ACE, Int. Conf. on Advances in Computer Entertainment Technology, pages 28-31, Taipei, November 2010.
http://ace2010.ntpu.edu.tw. A variety of electronic displays available for home use creates opportunities for intelligent applications. This paper presents a semipassive photo reviewing tool for consolidating memories of experiences utilizing personal picture libraries. A form of spaced repetition algorithm is used to create visual journeys which link photos together around a user-chosen central theme. Systematically reviewing images from positive personal experiences can be useful to remember significant events, as well as to balance out stressful events in our lives. The design exploits existing digital home displays and aims to improve usage of media collections.
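As an illustration of "a form of spaced repetition algorithm," the sketch below schedules a photo's next review with expanding intervals, loosely in the style of SM-2; the initial intervals and growth factor are assumptions and do not reproduce the paper's actual scheduler.

```java
import java.time.LocalDate;

// Sketch of an expanding-interval ("spaced repetition") review schedule for a photo,
// loosely SM-2-like. The growth factor and initial intervals are illustrative assumptions.
public class PhotoReviewScheduler {

    /** Next review interval in days, given the previous interval and an ease factor. */
    static int nextInterval(int previousDays, double ease) {
        if (previousDays == 0) return 1;                 // first review: next day
        if (previousDays == 1) return 6;                 // second review: after about a week
        return (int) Math.round(previousDays * ease);    // thereafter: expand geometrically
    }

    public static void main(String[] args) {
        LocalDate review = LocalDate.now();
        int interval = 0;
        double ease = 2.5;                               // a typical SM-2 starting ease factor
        for (int i = 1; i <= 5; i++) {
            interval = nextInterval(interval, ease);
            review = review.plusDays(interval);
            System.out.printf("review #%d of this photo on %s (interval %d days)%n", i, review, interval);
        }
    }
}
```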
Owen Noel Newton Fernando, Michael Cohen, and Adrian David Cheok. Multipresence-Enabled Mobile Spatial Audio Interfaces. In Proc. ICEC: Int. Conf. on Entertainment Computing, pages 434-436, Seoul, September 2010.
http://icec2010.or.kr/xe/. Mobile telephony offers an interesting platform for building multipresence-enabled applications that utilize the phone as a social or commercial assistant. The main objective of this research is to develop multipresence-enabled audio windowing systems for visualization, attention, and privacy awareness of narrowcasting (selection) functions in collaborative virtual environments (CVEs) for mobile devices such as 3rd- and 4th-generation mobile phones. The mobile audio windowing system enhances auditory information on mobile phones and encourages modernization of office- and mobile-based conferencing.
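Narrowcasting (selection) functions such as mute and select admit a compact inclusion rule: a source is rendered unless it is explicitly excluded (muted), or some peer is exclusively included (selected) and it is not. The sketch below illustrates one such rule; the class and method names are our own invention, not the group's implementation.

```java
import java.util.List;

// Minimal sketch of a narrowcasting inclusion rule for sources in a conference:
// a source is audible unless it is muted, or some source is selected ("soloed") and it is not.
public class Narrowcasting {
    static class Source {
        final String name; final boolean muted, selected;
        Source(String name, boolean muted, boolean selected) {
            this.name = name; this.muted = muted; this.selected = selected;
        }
    }

    static boolean audible(Source s, List<Source> all) {
        boolean anySelected = all.stream().anyMatch(x -> x.selected);
        return !s.muted && (!anySelected || s.selected);
    }

    public static void main(String[] args) {
        List<Source> conference = List.of(
                new Source("alice", false, false),
                new Source("bob",   true,  false),   // muted: never heard
                new Source("carol", false, true));   // selected: suppresses unselected peers
        for (Source s : conference)
            System.out.println(s.name + (audible(s, conference) ? " is audible" : " is suppressed"));
    }
}
```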
Norbert Győrbíró, Henry Larkin, and Michael Cohen. Long-term Memory Retention and Recall of Collected Personal Memories. In Computer Graphics (Proc. ACM SIGGRAPH 2010), pages 28-31, Los Angeles, July 2010.
A novel photo view and review personal information system is proposed to increase retention and recall of collected memories by utilizing the spacing effect in learning and presenting a context-aware selection of photos as the "learning material." The primary aim is to improve memory connections by automatically creating "visual photo journeys."
Norbert Győrbíró, Henry Larkin, and Michael Cohen. Collaborative Capturing of Significant Life Memories. In CHI Workshop "Know Thyself: Monitoring and Reflecting on Facets of One's Life", page (electronic proceedings), Atlanta, April 2010.
http://personalinformatics.org/chi2010. The continual advancement in the technology and form factor of personal devices has made possible "memory archiving" and "life logging," the recording of a person's life through wearable and ubiquitous computing technologies. The purpose of this research is to minimize the total data recorded while maximizing the number of most significant or user-preferred media. These media are captured automatically based on the person's arousal level and collaboratively, employing nearby camera phones. The recording groupware application analyzes collective emotion for more accurate capture. Keywords: collaborative life logging, significant memories, mobile groupware.
Rasika Ranaweera, Michael Cohen, and Michael Frishkopf. Virtual World Music— Music Browser in Wonderland. In iED: Immersive Education Initiative Boston Summit, page (electronic proceedings), Boston, April 2010.
http://mediagrid.org/summit/2010_Boston_Summit_program_full.html. As immersive virtual environments and online music networks become increasingly popular, it behooves researchers to explore their convergence: groupware music browsers populated by figurative avatars. Collaborative virtual environments (CVEs), like Second Life, offer immersive experiential network interfaces to online worlds and media. We developed a virtual environment, based upon and similar to the "Music in Wonderland" proof-of-concept by Sun Microsystems, that provides a place where avatar-represented users can go to browse musical databases. Music is a medium used for a wide range of purposes, in different situations, and in very different ways in real life, and different systems and interfaces exist for the broad range of needs in music consumption. Locating a particular recording is well supported by traditional metadata search interfaces, but improving search techniques via different strategies is a growing need. The world music browser is a cylinder coated with a transparent rectangular map of the world, with tracks placed according to their origins, enabling location-aware browsing. A hemisphere is a well-known structure that can be easily defined and projected in the 3D environment either declaratively or procedurally, but a sphere limits the angle of view more and more as objects approach the poles; a cylinder, on the other hand, can effectively indicate geographical location using a suitable projection. This is ideal, as avatars in Wonderland can fly! (Since Wonderland does not permit recumbency, the avatar always stands vertically.) The avatar can move into the cylinder and click to listen to track samples if s/he is within hearing range. Selected tracks are highlighted, and multiple tracks can be heard at the same time: when the user moves from one track to another, their sounds may overlap in between. The system is collaborative: multiple users can hear the same music together, and they can hear each others' speech via voice chat. Wonderland supports stereophonic audio communication: with a headset, participants get realtime immersive stereo audio with distance attenuation, meaning the voices of others present in Wonderland become louder when approached and decay with separation. Audio spatialization supports stereo (two channels) with only left/right positioning, including a delay effect. The stereo effect is applied on the server side by the Audio Treatment Component; two interleaved channels are sent in each packet, one channel delayed by 0-0.63 ms depending on the location of the source relative to the receiver. As this is done on the server, it is possible to mix many audio sources appropriately for each receiver, which also reduces the required bandwidth. Technically, the music collection is an XML file that keeps the list of albums, with the tracks (songs) as child nodes. Using a SAX (Simple API for XML) parser, which invokes callback methods as it reads the XML, generic, artistic, and geographic details about each album can be parsed and displayed in the client browser.
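The left/right positioning described above amounts to delaying one channel by an azimuth-dependent interaural time difference; the sketch below computes such a delay with the classic Woodworth spherical-head approximation, whose maximum for a typical head radius is of the same order as the 0.63 ms ceiling quoted above. The head radius and the choice of this particular formula are assumptions; Wonderland's Audio Treatment Component need not use exactly this model.

```java
// Sketch: azimuth-dependent interaural time difference (ITD) for simple left/right positioning,
// using the Woodworth spherical-head approximation. Constants and formula choice are assumptions.
public class InterauralDelay {
    static final double HEAD_RADIUS_M = 0.0875;   // typical head radius
    static final double SPEED_OF_SOUND = 343.0;   // m/s

    /** Delay (seconds) applied to the far ear for a source at the given azimuth (degrees, 0 = front). */
    static double itdSeconds(double azimuthDeg) {
        double theta = Math.toRadians(Math.min(Math.abs(azimuthDeg), 90.0));
        return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + Math.sin(theta));
    }

    public static void main(String[] args) {
        for (double az : new double[]{0, 30, 60, 90})
            System.out.printf("azimuth %3.0f deg -> delay %.2f ms%n", az, 1000 * itdSeconds(az));
        // A source hard to one side (90 deg) gives roughly 0.66 ms of interaural delay.
    }
}
```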
Julián Villegas, Michael Cohen, Ian Wilson, and William Martens. Influence of Roughness on Preference Regarding Musical Intonation. In 128th Conv. of the Audio Engineering Society, Convention Paper 8017, London, May 2010.
http://www.aes.org/events/128/. An experiment to compare the acceptability of three different music fragments rendered with three different intonations is presented. These preference results were contrasted with those of isolated chords also rendered with the same three intonations. The least rough renditions were found to be those using twelve-tone equal temperament (12-TET). Just intonation (JI) renditions were measured as the roughest and least preferred. A negative correlation between preference and psychoacoustic roughness was also found.
Kyoko Katono and Michael Cohen. DollyZoom Camera Perspective with Alice. In Vitaly Kluev and Michael Cohen, editors, Proc. HC-2010: 13th Int. Conf. on Humans and Computers, pages 55-57, Aizu-Wakamatsu, December 2010.
http://sparth.u-aizu.ac.jp/hc2010, ISBN 978-4-900721-01-2. This paper describes the achievement of the "Vertigo" cinematographic effect in "Alice," machinima software for making 3D animation. By composing virtual camera zoom-in and zoom-out gestures with camera motion, the "dolly zoom" effect can be realized.
Yuya Sasamoto and Michael Cohen. Spatial Sound Control with Yamaha Tenori-On. In Vitaly Kluev and Michael Cohen, editors, Proc. HC-2010: 13th Int. Conf. on Humans and Computers, pages 62-65, Aizu-Wakamatsu, December 2010.
http://sparth.u-aizu.ac.jp/hc2010, ISBN 978-4-900721-01-2. In this research, we explore control of spatial sound using the Yamaha Tenori-On and the University of Aizu Business Innovation Center (UBIC) 3D Theater speaker array. The Tenori-On is a new type of musical interface that combines control and display functions in a matrix of 16 × 16 LED buttons. The 16 × 16 Tenori-On matrix can be operated easily and intuitively, so that anyone can begin playing the instrument right away. Light animation is generated in synchronization with the performance and sound.
Wai-Man Pang. Towards Fast Gabor Wavelet Feature Extraction for Texture Segmentation by Filter Approximation. In Proc. 9th IEEE/ACIS Int. Conf. on Computer and Information Science (ICIS 2010), pages 252-257, 2010.
The Gabor wavelet transform is one of the most effective feature extraction techniques for textures, as Gabor wavelets are believed to be rather consistent with the response of the human visual system (HVS), and many successful examples have been reported in the area of texture analysis. However, the computational complexity of the feature extraction is still high, even for today's computers, especially when large images are involved. This paper attempts to break through the bottleneck in the extraction process, namely to accelerate the convolutions by approximating the originally non-separable Gabor filter kernels with separable ones. Although the resulting features are not exactly the same as the original ones, we show that acceptable results can be achieved for segmentation purposes, while the acceleration is satisfactory, with a time gain of about 30% in the worst case with a MATLAB implementation.
Xinmin Tang, Jing Qin, and Wai-Man Pang. Automatic and Accurate 3D Face Registration under the Guidance of Intra-class Difference Measurement. In Proc. IEEE TENCON 2010, pages 173-178, 2010.
The accuracy of a 3D face registration algorithm greatly influences the effectiveness of 3D face recognition. While a lot of effort has been dedicated to developing automatic and accurate registration methods, these methods usually cannot be compared and evaluated on a fair basis because of the lack of a standard quantitative measurement. In this paper, we propose a practical quantitative analysis method based on intra-class difference to evaluate the accuracy of face registration methods, and apply it to guide the procedures of an automatic nose symmetry plane (NSP) method. After every step, we calculate the mean and standard deviation (STD) of intra-class pose differences for all involved face images to assess the effect of this step and determine how to further improve the accuracy in the following steps. Extensive experiments have been conducted using the FRGC (V1.0 and V2.0) benchmark 3D face dataset to demonstrate the feasibility of our guided registration method.
Jing Qin, Wai-Man Pang, Binh P. Nguyen, Dong Ni, and Chee-Kong Chui. Particle-based Simulation of Blood Flow and Vessel Wall Interactions in Virtual Surgery. In Proc. Symp. on Information and Communication Technology 2010 (SoICT 2010), pages 365-368, 2010.
We propose a particle-based solution to simulate the interactions between blood flow and the vessel wall for virtual surgery. By coupling two particle-based techniques, smoothed particle hydrodynamics (SPH) and a mass-spring model (MSM), we can simulate the blood flow and the deformation of the vessel seamlessly. At the vessel wall, particles are treated both as boundary particles for the SPH solver and as mass points for the MSM solver. We implement an improved repulsive boundary condition to simulate the interactions. The computations of blood flow dynamics and vessel wall deformation are performed in an alternating fashion in every time step. To ensure realism, parameters of both SPH and MSM are carefully configured. Experimental results demonstrate the potential of the proposed method in providing real-time and realistic interactions for virtual vascular surgery systems.
Ka-Wai Ng, Hon-Cheng Wong, Un-Hong Wong, and Wai-Man Pang. Probe-Volume: An Exploratory Volume Visualization Framework. In Proc. 3rd Int. Congress on Image and Signal Processing (CISP 2010), pages 2392-2395, 2010.
Direct volume rendering has been an important technique for visualizing volume data. An unobstructed view of the inspected features of interest can help the user gain insight into the volume data being explored. In this paper, an exploratory volume visualization framework named Probe-Volume is presented. Probe-Volume provides a movable and resizable probing box with which the user can examine a volumetric region of interest enclosed by the box. High-quality volume-rendered results with such a feature can be achieved efficiently by providing two different user-specified transfer functions for the regions inside and outside of the probing box. In addition, our framework allows the user to visualize the volume with four rendering modes, namely Full, MIP, MIDA, and Contour. Combined with the probing box, the user can select different rendering effects for the regions included and excluded by the box in order to improve the visual effects. Experimental results show that Probe-Volume can provide informative direct volume-rendered images of raw volume data efficiently and effectively.
Wai-Man Pang and Hon-Cheng Wong. Compression of Pre-computed Per-pixel Texture Features using MDS. In Proc. 28th Picture Coding Symposium (PCS 2010), pages 390-393, December 2010.
There are many successful experiences of employing texture analysis to improve the accuracy and robustness of image segmentation. Usually, per-pixel texture analysis is required, which involves intensive computation, especially for large images, while precomputing and storing the texture features requires large file space, which is not cost effective. To address these needs, we propose in this paper the use of a multidimensional scaling (MDS) technique to reduce the size of the per-pixel texture features of an image while preserving their textural discriminability for segmentation. Because per-pixel texture features create a very large dissimilarity matrix, making the MDS intractable, a sampling-based MDS is introduced to tackle the problem with a divide-and-conquer approach. A compression ratio of 1:24 can be achieved with an average error lower than 7%. Preliminary experiments on segmentation using the compressed data show satisfactory results, as good as those using the uncompressed features. We foresee that such a method will enable texture features to be stored and transferred more effectively on low-processing-power devices or embedded systems such as mobile phones.
J. Qin, C.-K. Chui, J. Zhang, T. Yang, W.M. Pang, S.H. Teoh, and V. Sudhakar. Wall Stress Analysis of Non-stented/Stented Abdominal Aortic Aneurysm Based on Fluid-Structure Interaction Simulation. In Proc. 24th Int. Congress on Computer Assisted Radiology and Surgery (CARS 2010), pages 154-155, 2010.
Purpose: To estimate the wall stress distribution of non-stented and stented Abdominal Aortic Aneurysm (AAA) based on fluid-structure interaction (FSI) simulation. Methods: The 3D geometric models of the AAA are reconstructed from computed tomography angiographic (CTA) images. A combined logarithm and polynomial strain energy equation is applied to model the elastic properties of the arterial wall, while a novel reduced quasi-linear viscoelastic model is proposed to describe the strain-dependent relaxation behavior. The blood flow across the aneurysm is modeled as an incompressible laminar flow. The obtained blood flow pressure is applied as a load on the AAA mesh with and without stent-graft, and the wall stress distribution is then calculated by the FSI solver provided in ANSYS. Results: A series of in vitro experiments, including uniaxial tensile tests and relaxation tests, were conducted on pieces of human abdominal arteries. Both the combined logarithm and polynomial model and the modified reduced relaxation function fit the experimental data well. Two sets of geometric models with different shapes of AAA and stent-graft are employed to evaluate the proposed method. Our simulation results are consistent with the diagnoses of experienced radiologists. Conclusions: The proposed method can be used both as a new criterion for evaluating the rupture risk of AAA and as a tool for preoperative planning of Endovascular Aneurysm Repair (EVAR). It also has potential application in designing effective patient-specific stent-grafts.
Michael Cohen. Keynote Address: The Future of Immersive Education— Virtual Worlds, Simulators, Games, and Augmented/Mixed Reality in K12 and Higher Education. In FutureCampus Forum, Singapore, May 2010.
http://www.futuregov.net/events/futurecampus-forum-2010/. Immersive education has the potential to redefine the learning space, combining 3D and virtual reality technology with digital media to immerse and engage students. This keynote reviews the power of experiential learning and Dale's "Pyramid of Learning," outlines the potential of multimedia courseware and digital interactive story-telling, and describes "Alice," an object-oriented programming environment for machinima, and "Project Wonderland," a "Second Life"-style groupware for conferencing, chat spaces, distance learning, and immersive virtual learning environments.
Michael Cohen and Julián Villegas. From Whereware to Whence- and Whitherware: Augmented Audio Reality for Position-Aware Services. In ISVRI: Proc. Int. Symp. on Virtual Reality Innovations, Singapore, March 2011.
http://isvri2011.org. Since audition is omnidirectional, it is especially receptive to orientation modulation. Position can be defined as the combination of location and orientation information. Location-based or location-aware services do not generally require orientation information, but position-based services are explicitly parameterized by angular bearing as well as place. "Whereware" suggests using hyperlocal georeferences to give applications location-awareness; "whence- and whitherware" suggests the potential of position-awareness to enhance navigation and situation awareness, especially in realtime high-definition communication interfaces, such as spatial sound augmented reality applications. Combining literal direction effects and metaphorical (remapped) distance effects in whence- and whitherware position-aware applications invites over-saturation of interface channels, encouraging interface strategies such as audio windowing, narrowcasting, and multipresence.
Julián Villegas and Michael Cohen. Hrir~: Modulating Range in Headphone-Reproduced Spatial Audio. In VRCAI: Proc. of the 9th Int. Conf. on Virtual-Reality Continuum and Its Applications in Industry, Seoul, December 2010.
ISBN 978-1-4503-0459-7, http://vrcai2010.org. Hrir~, a new software audio filter for Head-Related Impulse Response (HRIR) convolution, is presented. The filter, implemented as a Pure Data object, allows dynamic modification of a sound source's apparent location by modulating its virtual azimuth, elevation, and range in realtime, the last attribute being missing in surveyed similar applications. With hrir~, users can virtually localize monophonic sources around a listener's head in a region delimited by elevations between −40° and 90°, and ranges between 20 and 160 cm from the center of the virtual listener's head. An application based on hrir~ is presented to illustrate its benefits.
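Conceptually, rendering with an HRIR combines selection of an impulse response for the requested azimuth and elevation, convolution with the source, and a range-dependent gain. The sketch below shows that signal path for one ear with a made-up three-tap "HRIR" and a simple inverse-distance gain; it is an illustration of the general technique, not the hrir~ object's implementation.

```java
// Conceptual sketch of HRIR-based rendering for one ear: convolve the source with the impulse
// response chosen for the requested direction, then scale by a range-dependent gain.
// The three-tap "HRIR" and the gain reference distance are made up for illustration.
public class HrirSketch {

    /** Direct-form FIR convolution: y[n] = sum_k h[k] * x[n-k]. */
    static double[] convolve(double[] x, double[] h) {
        double[] y = new double[x.length + h.length - 1];
        for (int n = 0; n < y.length; n++)
            for (int k = 0; k < h.length; k++)
                if (n - k >= 0 && n - k < x.length)
                    y[n] += h[k] * x[n - k];
        return y;
    }

    public static void main(String[] args) {
        double[] source = {1.0, 0.5, -0.25, 0.0};      // a short monophonic excerpt
        double[] hrirLeft = {0.0, 0.9, 0.3};           // pretend impulse response for the chosen azimuth/elevation
        double rangeCm = 80.0;                          // requested virtual range
        double gain = 20.0 / rangeCm;                   // simple inverse-distance gain, referenced to 20 cm

        double[] left = convolve(source, hrirLeft);
        for (int n = 0; n < left.length; n++) left[n] *= gain;

        for (double v : left) System.out.printf("%.3f ", v);
        System.out.println();
    }
}
```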
Michael Cohen. Under-explored Dimensions in Spatial Sound. In VRCAI: Proc. of the 9th Int. Conf. on Virtual-Reality Continuum and Its Applications in Industry, Seoul, December 2010.
ISBN 978-1-4503-0459-7, http://vrcai2010.org. An introduction to spatial sound in the context of hypermedia, interactive multimedia, and virtual reality is presented. Basic principles of relevant physics and psychophysics are reviewed (ITDs: interaural time differences; IIDs: interaural intensity differences; and frequency-dependent attenuation capturable by transfer functions). Modeling of sources and sinks (listeners) elaborates such models to include intensity, radiation, distance attenuation and filtering, and reflections and reverberation. Display systems are described, including headphones and headsets, loudspeakers, nearphones, stereo, home theater and other surround systems, discrete speaker systems, speaker arrays, WFS (wave field synthesis), and spatially immersive displays. Distributed applications are surveyed, including stereotelephony, chat-spaces, and massively multiplayer online role-playing games (MMORPGs), with references to immersive virtual environments.
Aaron Walsh, Nicole Yankelovich, Michael Gardner, and Michael Cohen. Panel: The Future of Immersive Education— Virtual Worlds, Simulators, Games, and Augmented/Mixed Reality in K12 and Higher Education. In iED: Immersive Education Initiative Boston Summit, Boston, April 2010. http://mediagrid.org/summit/2010_Boston_Summit_program_full.html.
Julián Villegas and Michael Cohen. Workshop: Spatial Sound and Entertainment Computing. In ICEC: Int. Conf. on Entertainment Computing, Seoul, September 2010.
http://icec2010.or.kr/xe/. This half-day tutorial introduces the theory and practice of spatial sound for entertainment computing, including the psychophysical (psychoacoustic) basis of spatial hearing; outlines the mechanisms for creating and displaying spatial sound, including the hardware and software used to realize such systems and display configurations; and reviews some applications of spatial sound to entertainment computing, especially multimodal interfaces featuring spatial sound. Many case studies reify the explanations; animations, videos, and live demonstrations are featured. Participants in the course can expect to acquire a solid introduction to spatial sound and a survey of some applications of spatial sound to entertainment computing, hopefully stimulating some new ideas that they can take back to their labs and hack on their own.
Julián Villegas and Michael Cohen. Principles and Applications of Spatial Hearing, chapter "Mapping Musical Scales Onto Virtual 3D Spaces". World Scientific, 2010.
eISBN 978-981-4299-31-2, http://eproceedings.worldscinet.com/9789814299312/ 9789814299312 005%6.html, http://www.worldscibooks.com/lifesci/7674.html. We introduce an enhancement to the Helical Keyboard, an interactive installation displaying three-dimensional musical scales aurally and visually. The Helical Keyboard's features include tuning stretching mechanisms, spatial sound, and stereographic display. The improvement in the audio display is intended to serve pedagogic purposes by enhancing user immersion in a virtual environment. The newly developed system allows spatialization of audio sources, controlling the elevation and azimuth angles at a fixed range. In this fashion, we could overcome previous limitations of the auditory display of the Helical Keyboard, which heretofore usually displayed only azimuth.
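The Helical Keyboard's geometry can be summarized by mapping chroma (pitch class) to angle around a helix and pitch height to elevation, so that octave-related notes align vertically. The sketch below shows one such mapping; the radius and per-octave rise are arbitrary, and this illustrates the general idea rather than the installation's code.

```java
// Sketch of a helical pitch mapping: pitch class -> angle around the helix, pitch height -> vertical
// position, so octaves stack vertically. Radius and per-octave rise are arbitrary illustrative values.
public class HelicalKeyboard {

    /** Returns {x, y, z} coordinates for a MIDI note number on the helix. */
    static double[] position(int midiNote, double radius, double risePerOctave) {
        double angle = 2 * Math.PI * (midiNote % 12) / 12.0;   // chroma -> angle
        double height = risePerOctave * (midiNote / 12.0);     // pitch height -> elevation
        return new double[]{radius * Math.cos(angle), radius * Math.sin(angle), height};
    }

    public static void main(String[] args) {
        int[] notes = {60, 64, 67, 72};                        // C4, E4, G4, C5
        for (int n : notes) {
            double[] p = position(n, 1.0, 0.5);
            System.out.printf("MIDI %d -> (%.2f, %.2f, %.2f)%n", n, p[0], p[1], p[2]);
        }
        // MIDI 60 and 72 share (x, y) and differ only in height: octaves align vertically.
    }
}
```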
Michael Cohen, October 2011.
Program Committee, NIME 2011 (Int. Conf. on New Interfaces for Musical Expression), http://www.nime2011.org
Michael Cohen, October 2011.
Program Committee, IFIP ICEC 2011 (Int. Conf. on Entertainment Computing), http://www.icec2011.org
Michael Cohen, March 2011.
Co-chair, IEEE 3DUI (Symp. on 3D User Interfaces) Grand Contest, http://conferences.computer.org/3dui/3dui2011/cfp-contest.html
Michael Cohen, March 2010-11.
Executive Committee, IEEE Computer Society Technical Committee on Computer-Generated Music
Michael Cohen, 2010-11. Voting Member, IEEE MMTC (Multimedia Communications Technical Committee)
Michael Cohen, Dec. 2010.
Program Committee, HC-2010: Thirteenth Int. Conf. on Humans and Computers (Hamamatsu and Aizu-Wakamatsu), http://sparth.u-aizu.ac.jp/hc2010
Michael Cohen, September 2010.
Program Committee, IEEE ICEC 2010 (Int. Conf. on Entertainment Computing), http://www.icec2010.or.kr
Michael Cohen, October 2010.
Program Committee, IEEE VSMM (Int. Conf. on Virtual Systems and Multimedia), http://www.vsmm2010.or.kr
Michael Cohen, 2010.
Member, Audio Engineering Society Technical Committee on Spatial Sound
Michael Cohen, 2010-11. Reviewer and Scientific Committee, J. of Virtual Reality and Broadcasting
Michael Cohen, 2010.
Reviewer, Computer Music Journal, http://www.mitpressjournals.org/cmj, http://www.computermusicjournal.org
Sasamoto Yuya (s1150100). Graduation Thesis: "Spatial Sound Control with the Yamaha Tenori-On", School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Saze Shougo (s1150101). Graduation Thesis: "Augmented Reality Interface for Spatial Sound", School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Shimizu Shinichirou (s1150114). Graduation Thesis: "Setting Rigging for Maya with a MEL Script", School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Hiromitsu Sato (m5131106). Master's Thesis: "Using Motion Capture for Realtime Augmented Reality Scenes", Graduate School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Shun Shiratori (s1150117). Graduation Thesis: "Comparison of Images Generated by Mentalray and RenderMan", School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Kyoko Katouno (s1150061). Graduation Thesis: "ZoomDolly Camera Perspective with Alice", School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Mamoru Ishikawa (m5131103). Master's Thesis: "Dancing Wiper in a Driving Simulator", Graduate School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen
Norbert Győrbíró (d8092101). Doctoral Thesis: "Novel Interaction Methods for Recording and Recollecting Significant Personal Experiences", Graduate School of Computer Science and Engineering, March 2011.
Adviser: Michael Cohen
Akira Inoue (m5131102). Master's Thesis: "Time-aware Geomedia Browsing Integration with Virtual Environments", Graduate School of Computer Science and Engineering, March 2011.
Thesis Adviser: Michael Cohen