Annual Review 2011 > Division of Information Systems

Software Engineering Laboratory

Vitaly V. Klyuev

Associate Professor

The main research directions pursued by members of the Software Engineering Lab were as follows.

Semantic Methods for Information Retrieval

This year, the focus was on semantic relatedness measures. Semantic relatedness describes the degree to which concepts are associated via any kind of semantic relationship. Its evaluation is a fundamental natural language processing problem, with applications in word-sense disambiguation, text classification, information retrieval, automatic summarization and many other fields. We proposed a new semantic relatedness measure combining the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. This measure was tested in different applications, including query expansion and cross-lingual information retrieval tasks. This research was done in close cooperation with Prof. Yannis Haralambous from Institute Telecom Bretagne, France.

Results of our investigations and tests were presented at the 5th International Joint Conference on Natural Language Processing (Thailand, 2011), the Federated Conference on Computer Science and Information Systems (Poland, 2011), the 2nd International Conference on Pervasive and Embedded Computing and Communication Systems (Italy, 2012), and the 2012 IEEE International Conference on Information Science and Technology (China, 2012). One paper was published in the Informatica journal.

The aim of our work in the area of text mining was to propose new approaches to sentence alignment from comparable corpora. This research was supported by a Competitive Grant from the University of Aizu, and students of our lab were involved in it. Results of the research were presented at the 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (Czech Republic, 2011) and the XLII Conference on Control Processes and Stability (Russia, 2011).

Software Engineering and Advanced Learning Technologies: Multimedia, Web-based, and Mobile Learning

Software engineering techniques can be a keystone in developing reliable learning systems that help in designing an efficient learning process, whether in e-learning or in conventional learning settings.

Applications of technology can provide course content with multimedia systems, active learning opportunities and instructional technology to facilitate the learning process at all levels. For example, multimedia is an exciting area that spans many disciplines within the learning process: it is a computer-based communication system that integrates and delivers a complete package of audio, video, animations, graphics and text to learners.

Web-based learning is currently a hot research and development area. Its benefits are clear: learners everywhere can enroll in learning activities, communicate with other learners or teachers, and discuss and control their learning progress. The modern university needs to extend lifelong learning opportunities to its students anytime and anyplace to be successful in the global educational marketplace. Online web-based learning is made possible by advancements in network infrastructure and the development of video/voice/multimedia protocols for seamless transport of information. However, it is a challenging task to design an online learning environment that ensures effective, accessible, and secure student interaction, especially in computer engineering courses involving high-tech content, such as a networking laboratory environment that makes extensive use of networking hardware and computer/simulation software tools.

Smart Reminding Systems

A significant number of elders live with memory impairment issues as a result of the normal aging process. Various kinds of supporting systems have therefore been developed to help elders who have mild memory problems. However, most of those systems are not designed to provide reminders for crucial complex human activities in daily life. In our research, we proposed a Smart Reminder System for reminding users of forgotten complex activities in the home environment. Results of the research were presented at the 14th International Conference on Advanced Communication Technology (Korea, 2012) and the 2012 Joint International Conference on Human-Centered Computer Environments (Japan, 2012). A paper submitted to IEEJ Transactions on Electronics, Information and Systems, Section C was accepted for publication (Vol. 132 / No. 6 / Sec. C). The main outcome of this research was the successful defense of a PhD dissertation by our PhD student, Mr. Hapugahage Thilak Chaminda.

International Relations

At the end of 2011, a new agreement between our university and Saint Petersburg State University (http://www.spbu.ru/) on scientific and educational cooperation was signed. After five years of fruitful cooperation between the Faculty of Applied Mathematics and Control Processes, Saint Petersburg State University and our university, this new agreement was expanded to the university-university level. The Software Engineering Lab plays a key role in the collaboration activities. Prof. Smirnov, the coordinator on the side of our Russian partners, paid a brief visit to Aizu in March 2012. We discussed future plans and coordinated our activities.

Within the framework of the joint research program between our universities, Evgeny Pyshkin, Associate Professor at our partner university, worked in Aizu as a visiting researcher in April - June 2010. Results of the joint research, "On Document Evaluation for Better Context-Aware Summary Generation", were presented at the 2nd International Symposium on Aware Computing, sponsored by IEEE, Tainan, Taiwan.

Exchange of Undergraduate Students

Our undergraduate student Mr. Ueno and master's student Min-Hsiang Li visited Saint Petersburg State University, Russia in April 2011 and presented their papers at the XLII Conference on Control Processes and Stability. Russian PhD student Mr. Shachov and master's student Mr. Kotelnikov attended the 2012 Joint International Conference on Human-Centered Computer Environments in March 2012. This exchange of students was carried out in accordance with our agreement with Saint Petersburg State University.

Scientific Events

Our lab, with the support of professors of our university, played the key role in organizing the Joint International Conference on Human-Centered Computer Environments, held in March 2012 at the University of Aizu. Starting from this year, the conference proceedings are published by ACM in its International Conference Proceeding Series (ICPS) and are included in the ACM Digital Library.

Foreign Students

One master's student joined the lab in autumn 2011.

Mr. Zeng from Chaoyang University of Technology, Taiwan enrolled in the dual-degree program (DDP).

A DDP is a system in which students can earn two degrees, one from the home university and one from the partner university, through mutual recognition of credits attained at both. The goals of the program include fostering excellent, internationally educated human resources and strengthening relations between partner universities through concrete exchanges. The Memorandum of Understanding establishing the international dual degree program for students of our university and Chaoyang University of Technology was concluded in 2009.

This is the second time our lab has welcomed a student from Chaoyang University of Technology.

Refereed Journal Papers

[vkluev-01:2011]

Vitaly Klyuev and Yannis Haralambous. A Query Expansion Technique Using the EWC Semantic Relatedness Measure. Informatica, 35:401-406, 2011.

This paper analyses the efficiency of the EWC semantic relatedness measure in an ad-hoc retrieval task. This measure combines the Wikipedia-based Explicit Semantic Analysis (ESA) measure, the WordNet path measure and the mixed collocation index. EWC considers encyclopaedic, ontological, and collocational knowledge about terms. This advantage of EWC is a key factor to find precise terms for automatic query expansion. In the experiments, the open source search engine Terrier is utilised as a tool to index and retrieve data. The proposed technique is tested on the NTCIR data collection. The experiments demonstrated superiority of EWC over ESA.

Refereed Proceedings Papers

[vkluev-02:2011]

Vitaly Klyuev and Yannis Haralambous. Query Expansion: Term Selection using the EWC Semantic Relatedness Measure. In Proceedings of the Federated Conference on Computer Science and Information Systems, pages 195-199. Polish Information Processing Society, September 2011.

This paper investigates the efficiency of the EWC semantic relatedness measure in an ad-hoc retrieval task. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. In the experiments, the open source search engine Terrier was utilised as a tool to index and retrieve data. The proposed technique was tested on the NTCIR data collection. The experiments demonstrated promising results.

[vkluev-03:2011]

V. Klyuev and Y. Haralambous. Query Translation for CLIR: EWC vs. Google Translate. In Proceedings of the 2012 IEEE International Conference on Information Science and Technology, pages 707-713, Wuhan, China, March 2012. IEEE.

A new approach to find accurate translation of search engine queries from Japanese into English for the CLIR task is proposed. The Mecab system and online dictionary SPACEALC are utilized to segment Japanese queries and to get all possible English senses for every term detected. To disambiguate terms, the idea of the shortest path on an oriented graph is applied. Nodes of this graph symbolize word senses and edges connect nodes representing neighboring Japanese terms. The EWC semantic relatedness measure is used to select the most related meanings for the translation results. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. The proposed technique is tested on the NTCIR data collection. Queries generated by Google Translate were used to evaluate the quality of translation.
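The shortest-path disambiguation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `disambiguate`, the edge cost `1 - relatedness`, and the toy relatedness scores are all assumptions made for illustration, with the toy scores standing in for the EWC measure. Each term contributes one layer of candidate senses, edges connect senses of neighboring terms, and the cheapest path through the layers selects one sense per term.

```python
from typing import Callable, List, Sequence


def disambiguate(
    sense_layers: Sequence[Sequence[str]],
    relatedness: Callable[[str, str], float],
) -> List[str]:
    """Choose one sense per term by finding the cheapest path through the layers.

    Assumptions: every layer is non-empty, relatedness returns values in [0, 1],
    and the cost of an edge is 1 - relatedness (hypothetical choice).
    """
    if not sense_layers:
        return []
    # best[i][j] = (cost of the cheapest path ending at sense j of term i, back-pointer)
    best = [[(0.0, -1) for _ in sense_layers[0]]]
    for i in range(1, len(sense_layers)):
        layer = []
        for sense in sense_layers[i]:
            candidates = [
                (best[i - 1][k][0] + 1.0 - relatedness(prev, sense), k)
                for k, prev in enumerate(sense_layers[i - 1])
            ]
            layer.append(min(candidates))
        best.append(layer)
    # Start from the cheapest final sense and follow back-pointers to the first term.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    path = []
    for i in range(len(sense_layers) - 1, -1, -1):
        path.append(sense_layers[i][j])
        j = best[i][j][1]
    return path[::-1]


# Toy relatedness scores (invented for illustration, not EWC values).
TOY_SCORES = {
    ("bank", "money"): 0.9, ("shore", "money"): 0.1,
    ("bank", "currency"): 0.8, ("shore", "currency"): 0.1,
    ("money", "loan"): 0.9, ("currency", "loan"): 0.6,
}


def toy_relatedness(a: str, b: str) -> float:
    """Symmetric lookup in the toy table; unknown pairs score 0."""
    return TOY_SCORES.get((a, b), TOY_SCORES.get((b, a), 0.0))


# Candidate English senses for three neighboring query terms (hypothetical).
LAYERS = [["bank", "shore"], ["money", "currency"], ["loan"]]
print(disambiguate(LAYERS, toy_relatedness))
```

Because edges only connect neighboring terms, the search is a simple dynamic program over the layers, so its cost grows with the number of sense pairs between adjacent terms rather than with the number of complete sense combinations.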

[vkluev-04:2011]

Min-Hsiang Li, Vitaly Klyuev, and Shih-Hung Wu. Sentence Alignment from Wikipedia as Comparable Corpora by using STF-IDTF. In Proc. of the XLII Conference on Control Processes and Stability, pages 390-395. Saint Petersburg State University, Saint Petersburg State University Publishing Press, April 2011.

Cross-lingual applications need multilingual databases consisting of sentences with the same meaning written in different languages. A promising way to create them is to utilize comparable corpora via sentence alignment. We introduce a technique to remove more than 50% of data with a low probability for alignment. We used the stf-idtf measure to calculate the similarity between sentences and extract sentences with the same meaning written in Chinese and English.

[vkluev-05:2011]

Vitaly Klyuev and Yannis Haralambous. Accurate Query Translation for Japanese-English Cross-Language Information Retrieval. In Proceedings of the 2nd International Conference on Pervasive and Embedded Computing and Communication Systems, pages 214-219, Rome, Italy, February 2012. INSTICC, IEICE.

In this paper, a novel approach to translate queries from Japanese into English for the CLIR task is discussed. To get all possible English senses for every Japanese term, the online dictionary SPACEALC is utilized. The EWC semantic relatedness measure is used to select the most related meanings for the results of translation. This measure combines the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. The preliminary tests of the proposed technique are done utilizing the NTCIR data collection. The performance of retrieval is compared with the variant of retrieval using queries generated by Google Translate.

[vkluev-06:2011]

Yannis Haralambous and Vitaly Klyuev. A Semantic Relatedness Measure Based on Combined Encyclopedic, Ontological and Collocational Knowledge. In Proceedings of the 5th International Joint Conference on Natural Language Processing, pages 1397-1402, Chiang Mai, Thailand, November 2011. AFNLP.

We describe a new semantic relatedness measure combining the Wikipedia-based Explicit Semantic Analysis measure, the WordNet path measure and the mixed collocation index. Our measure achieves the currently highest results on the WS353 test.

[vkluev-07:2011]

Min-Hsiang Li, Vitaly Klyuev, and Shih-Hung Wu. A Novel Approach to Sentence Alignment from Comparable Corpora. In The 6th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications, number doi: 10.1109/IDAACS.2011.6072842, pages 618-623. IEEE, September 2011.

This paper introduces a new technique to select candidate sentences for alignment from bilingual comparable corpora. Tests were done utilizing Wikipedia as a source for bilingual data. Our test languages are English and Chinese. A high quality of sentence alignment is illustrated by a machine translation application.

[vkluev-08:2011]

Hapugahage Thilak Chaminda, Vitaly Klyuev, Keitaro Naruse, and Minetada Osano. Recognition of coupling-paired activities in daily life. In V. Klyuev and A. Vazhenin, editors, Proceedings of the 2012 Joint International Conference on Human-Centered Computer Environments, number doi: 10.1145/2160749.2160776, pages 124-130. ACM, ACM New York, NY, USA, March 2012.

Recognition of daily activities is one of the major focuses in health care systems. However, only a limited number of studies have addressed the recognition of complex human activities in daily life. This work therefore attempts to recognize such activities. The study targets activities that are related to some other activity, and it also considers the simultaneous involvement of both of the user's hands in performing the activity. These activities are called Coupling-Paired Activities in this paper. An algorithm was designed to monitor the behavior of both hands and recognize the target activities. The algorithm was tested and evaluated with several Coupling-Paired Activities in daily life. The proposed approach achieved an 80% average recognition rate in the experiments.

[vkluev-09:2011]

Hapugahage Thilak Chaminda, Vitaly Klyuev, and Keitaro Naruse. A Smart Reminder System for Complex Human Activities. In Proceedings of The 14th International Conference on Advanced Communication Technology, sponsored by IEEE, pages 235-240, February 2012.

A significant number of elders live with memory impairment issues as a result of the normal aging process. Various kinds of supporting systems have therefore been developed to help elders who have mild memory problems. In this paper we propose a Smart Reminder System for reminding users of forgotten complex activities in the home environment. The complex activities considered are those that, once initiated, should be completed as originally intended. Due to the strong relationship between initiation and conclusion activities, these activities are called "Coupling Activities" in this paper. Reminders for forgotten Coupling Activities are predicted according to the user's current behaviour, current location and past activity patterns. Wearable sensors are used to gather the data required for identifying the user's context. A reason for forgetting is also predicted with the reminder. Reminders are predicted with minimum supervision by the user, as the system learns the user's dynamic behaviour by itself. The proposed Smart Reminder System achieved an 80% accuracy rate for reminder prediction in a system evaluation carried out with four subjects.

[vkluev-10:2011]

R. Ueno and V.V. Klyuev. Semantic Search Engine Query Expansion using WordNet. In Proc. of the XLII Conference on Control Processes and Stability, pages 402-407. Saint Petersburg State University, Saint Petersburg State University Publishing Press, April 2011.

Getting appropriate information from the Internet is still difficult for many users. For improving that situation, the method called query expansion has been introduced in the research area of information retrieval. Query expansion is the procedure that enriches the query by adding semantically related terms. In this paper, we propose a semantic approach to the query expansion using WordNet and its API, WordNet::SenseRelate::AllWords. Our algorithms analyze, disambiguate all words in the query, and expand it.
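The idea of enriching a query with semantically related terms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `expand_query`, the parameter `max_new_terms`, and the hand-written table `TOY_RELATED` are hypothetical, the table stands in for a lexical resource such as WordNet, and the word-sense disambiguation step described in the abstract is skipped entirely.

```python
from typing import Dict, List


def expand_query(
    query: str,
    related_terms: Dict[str, List[str]],
    max_new_terms: int = 2,
) -> List[str]:
    """Return the original query terms followed by a few related terms per word.

    Assumption: related_terms maps a word to candidate expansions, best first.
    """
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        added = 0
        for candidate in related_terms.get(term, []):
            if added >= max_new_terms:
                break
            # Skip candidates that are already part of the expanded query.
            if candidate not in expanded:
                expanded.append(candidate)
                added += 1
    return expanded


# Hand-written table for illustration only (not taken from WordNet).
TOY_RELATED = {
    "car": ["automobile", "vehicle", "auto"],
    "price": ["cost"],
}

print(expand_query("car price", TOY_RELATED))
```

Capping the number of added terms per word is one simple way to limit query drift; in a sense-aware approach, the candidate list for each word would be drawn only from the sense selected for it.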

Academic Activities

[vkluev-11:2011]

V. Klyuev, Apr. 2011.

Member: IEEE, ACM, IEICE

Ph.D., Master and Graduation Theses

[vkluev-12:2011]

Min-Hsiang Li. Multi-lingual sentence alignment from Wikipedia as multi-lingual comparable corpora. Master's thesis, Graduate School of Computer Science and Engineering, September 2011.

Thesis Adviser: V. Klyuev

[vkluev-13:2011]

Hapugahage Thilak Chaminda. A Smart Reminding System for Coupling Paired Activities. Ph.D. thesis, Graduate School of Computer Science and Engineering, March 2012.

Adviser: V. Klyuev

Others

[vkluev-14:2011]

Vitaly Klyuev and Alexander Vazhenin (Editors). Proceedings of the Joint International Conference on Human-Centered Computer Environments. ACM New York, NY, USA, March 2012.