This web page provides research materials for our image categorization visualization project.
Introduction
Developing effective visual categorization techniques is crucial for accelerating image retrieval from databases, whose sizes are rapidly increasing. The bag-of-features (BoF) model is one of the most popular and promising approaches for extracting the underlying semantics of image databases. Nonetheless, image categorization approaches based on machine learning techniques may not convince us of their validity, since we cannot visually verify how the images have been classified in the high-dimensional image feature space.
Our work aims to visually rearrange the images in the projected feature space by taking advantage of a set of representative features, called visual words, obtained using the bag-of-features model. The main idea is to associate each image with a specific number of visual words to compose a bipartite graph, and then to lay out the images using an anchored map representation, as shown in Figure 1.
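The bipartite-graph construction described above can be sketched as follows. This is a minimal illustration rather than the project's actual implementation; the names `histograms` and `top_k` are our own, and we simply connect each image to its most frequent visual words.

```python
def build_bipartite_edges(histograms, top_k=3):
    """Build image/visual-word bipartite edges from BoF histograms.

    histograms: list of lists, where histograms[i][w] is the frequency
    of visual word w in image i.
    Returns a list of (image_index, word_index) edges, linking each
    image to its top_k most frequent visual words.
    """
    edges = []
    for i, hist in enumerate(histograms):
        # Rank the visual words of this image by descending frequency.
        ranked = sorted(range(len(hist)), key=lambda w: hist[w], reverse=True)
        for w in ranked[:top_k]:
            if hist[w] > 0:  # skip words that never occur in the image
                edges.append((i, w))
    return edges
```

In an anchored map, the visual words would then serve as anchor nodes placed on a circle, and the image nodes would be positioned freely according to these edges.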
Method
In the BoF model, each image is represented as a sparse vector of visual words, as shown in Figure 2. We associate each image with a specific number of visual words to compose a bipartite graph, and then lay out the overall set of images using an anchored map representation, in which the ordering of the anchor nodes is optimized with a genetic algorithm. To handle relatively large image datasets, we perform hierarchical clustering by iteratively merging the most similar pair of images, where the similarity measure is based on the weighted Jaccard coefficient. Voronoi partitioning has also been incorporated into our approach so that we can visually identify the image categorization obtained with a support vector machine.
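The similarity measure and one merge step of the hierarchical clustering can be sketched as below. This is a simplified illustration under our own naming, not the paper's code: histograms are plain lists of visual-word frequencies, merged histograms are formed by summing bins, and the pairwise search is brute force.

```python
def weighted_jaccard(u, v):
    """Weighted Jaccard coefficient of two BoF histograms:
    sum(min(u_i, v_i)) / sum(max(u_i, v_i))."""
    num = sum(min(a, b) for a, b in zip(u, v))
    den = sum(max(a, b) for a, b in zip(u, v))
    return num / den if den > 0 else 0.0

def merge_most_similar(histograms):
    """One hierarchical-clustering step: find the most similar pair of
    histograms, merge them by summing their bins, and return the
    reduced list of histograms."""
    best, pair = -1.0, None
    for i in range(len(histograms)):
        for j in range(i + 1, len(histograms)):
            s = weighted_jaccard(histograms[i], histograms[j])
            if s > best:
                best, pair = s, (i, j)
    i, j = pair
    merged = [a + b for a, b in zip(histograms[i], histograms[j])]
    rest = [h for k, h in enumerate(histograms) if k not in (i, j)]
    return rest + [merged]
```

Repeating `merge_most_similar` until the desired number of clusters remains yields the hierarchy used to display the dataset at multiple levels of detail, as in Figures 4 and 5.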
Figure 2: Overview of the bag-of-features model.
Results
Here, we present several results that are generated from our system.
(Click a thumbnail image to view it at its original resolution.)
(a) Original layout. | (b) Enhanced layout with an optimized circular ordering of visual words annotated with representative images. Images in the same category are brought closer to each other.
Figure 3: Using anchored maps to visualize the bag-of-features image categorization for coin and eyeglass images. ( #{input images} = 20, #{visual words} = 24. ) |
(a) 10% of images. | (b) 30% of images. |
(c) 40% of images. | (d) 100% of images. |
Figure 4: Discriminating car images using the support vector machine at multiple hierarchical levels. Images of the training set are labeled in red (car images) and blue (others). The inferred region of the car images is rendered in yellow through the Voronoi tessellation. ( #{input images} = 240, #{visual words} = 100. )
(a) Coarse level. | (b) Fine level. |
Figure 5: This experiment demonstrates how we can categorize images of a specific category even when the image classifier is trained indirectly with similar-looking images. Here, we represent each image in terms of visual words obtained from training images containing tomatoes, coins, and cars, and try to collect images of round shapes. However, we also take as input images of additional categories, such as CDs and glasses in this example. ( #{input images} = 420, #{visual words} = 100. )
Paper & Video
Yi Gao, Hsiang-Yun Wu, Kazuo Misue, Kazuyo Mizuno, and Shigeo Takahashi: Visualizing Bag-of-Features Image Categorization using Anchored Maps, the 7th International Symposium on Visual Information Communication and Interaction (VINCI 2014), 2014. Paper preprint (PDF, 9.2MB), Video (MOV, 19.5MB)