Monday, March 28, 2011

Image Retrieval Using Histogram Intersection of Detected Edges from Color Images


Data

The image database used in this activity was the initial compilation of images provided by the students taking CMSC 265 (2nd sem 2010-2011) which has 3563 images. The images used for searching came from the database and from a small collection of environment sceneries that I compiled myself consisting of 26 images which were not part of the image database.

Sample Images


Image from the Database




Additional Images

 
 



Feature Used
  • Edges using Sobel Operator
           Edges can have varied semantic meaning in an image. For this activity, edges where detected from the image to be used as basis for "important pixels" that can represent the objects in the image. This is used in the hope to minimize the possible pixels to process. The Sobel operators where used in edge detection.
         Since image resulting from edge detection lost their color information and cannot be further used for color analysis procedures, thus edge detection were not actually used on the original image, but to three single-channel images from the original image - Red channel-only image, Green channel-only image, and Blue channel-only image. The three resulting images were used as basis for identifying important pixels from the images.
       
  • Color
          Image color is primarily the main feature used in searching. Colors were not extracted from the original image but from the three single-channel images  - Red, Green, Blue. Not all pixels in the image where used as samples but only those pixels identified as edges.
         The three values from the three channels where combined as a single value. This would yield 224 possible color values = 28 (for Red values) *   28 (for Green values) *  28 (for Blue values). To minimized this only the most significant bit were set for each channel value, thus resulting only to 83  = 512 possible color values.

Matching Function

    Color Histogram Intersection

      To match a given image from the images from the database, color histogram intersection where used. Images from the database where first transformed to a color histogram with 512 bins using the color and edge feature presented above. This procedure is done off-line. During the image search the image to be searched is then converted to a histogram using the similar procedure used to each image in the database.
      After converting the input image to a histogram, histogram matching is done to all the images in the database. The rank for each image is based from the intersection - the greater the intersection, the higher the rank. 

Results






Generalization
      Color histogram is a very compact way of storing color image information and with the help of edge detection the number of pixels considered was lessen by considering only important pixels and further  compact the required color information. color histogram intersection, being one of the fundamental color image comparison tool is effective in relating color details between images.
      Although  color histogram,edge detection, and color histogram image intersection can sometime produce good results it does not come for free and not completely robust in all possible case. Using color information will not  produce correct result every time because there can be images with object having similar color information but are in any sense unrelated. The full meaning of an object is not captured completely by color alone, Spatial, shape, and texture information are also important which are either ignored or removed by just using color and color histogram. Another drawback with the above mention is slow preprocessing speed since it can run quite slow for images with many possible edges. 

Monday, January 10, 2011

Aerial Images

Source Images








Wednesday, December 22, 2010

Document Layout Analysis

The Problem
Create a program that will analyse a document image and will determine the document's physical layout and in some extent the logical layout of the document.


The Test Document Images




The Solution
My solution is a simple and quite restrictive bottom-up document physical layout analyzer.
Bottom-up layout method  depends on  local information to analyze and build the parts of the document, like black or ON pixels and connected components as opposed to top-down methods that makes use of black/white stripes/spaces/gaps that is use to strip the document into smaller parts for analysis. Many variation have been proposed for each of the two methods and hybrid methods are also available but there is none (based from the literatures that I have read) that will work on any type of document layout. 
The solution presented in this blog makes use of three level/pass of simple connected components (CCs)  analysis which is described below :
1. For the first level the images where converted to binary image using a a naive threshold method that makes use of a predefined gray threshold value of 220. 
2. After binarization, connected components (CCs) where extracted from the binary image and where placed on a list. 
3.After producing the CCs, the components where sorted with respect to their Y and X coordinate in the binary image, having greater priority for the Y value than the X, making the component  with both the least X and Y value (the  topmost-leftmost component) the first on the list. Small components like the dot of the letter "i", period , comma, hypens, and the like plus some unfiltered noise where placed near the end of the list so that they will not hinder later processing.
4. After sorting, the euclidian distance of adjacent pair components on the list where computed, grouped based from the range of the Y - value of the components and then stored in a group-distance list.
5. Each group-distance  where sorted and for each group the 20th percentile was computed. This value will be the maximum allowable distance (MAD) of adjacent connected components to determine which will be linked as one bigger component.
6. The building up of a bigger component will stat by analyzing the list of sorted CCs again. Distance of adjacent pair components where measured again and compared to the computed MAD. If the distance of the pair is acceptable, the component with the lesser X and Y value will be linked to the bigger component  being formed. Once the distance of a chosen pair goes beyond the MAD value, the component with the lesser X and Y value will be linked to the currently being formed component and the building up this bigger component will be stopped and a new bigger component will be built using the other half of the pair. The process is done repeatedly until no pair of component can be produced. The process of linking components is achieved by drawing a solid rectangular binary box that covers all the linked component in the produced binary image in step 1. This linking is shown as transparent rectangular box drawn in the color image. (presented in the Sample Results section below)
7. After linking all the smaller components, the second level processing follows, which is just a repetition of Step 2 - 6 changing only the percentile value to 80 in step 5.
8.The third level processing follows, which is also a repetition of Step 2 - 6 changing only the percentile value to 10 in step 5.



Sample Results

First Pass ( Forming of Words)

 
Second Pass ( Forming of Lines/Sentences)


Third Pass ( Forming of Paragraph)


Final Remarks
The solution presented is far from perfect. As seen from the images above, there are components that were either fail to be linked or has been linked incorrectly. Also, no preprocessing was made to the image, like skew correction, noise removal, and background removal.