## Wednesday, 4 April 2012

### Content-Based Image Retrieval using the cross-correlation coefficient

Have you ever searched Google for images? I am sure everyone reading this article has done so more than once. Wouldn't it be nice if you could search for images that are similar to a given image? For example, if you want to find images of a particular object, you could supply a probe image (the image you want to find similar images of) to the search engine and get back images that resemble it.

Content-based image retrieval (CBIR), also known as query by image content (QBIC), is an active area of computer vision research. Comparing images using the cross-correlation coefficient was close to the state of the art for CBIR in the early 1990s. Comparing the colours in the probe image with those in other test images is one conservative way of doing this, and a histogram is a good way of characterising the colours in an image.

One way of doing this is explained below. There are many other ways, and this is not necessarily the right or the best one.

1. Split the R, G, B channels of the probe image.
2. Split the R, G, B channels of the "test image" that is subject to comparison.
3. Calculate the histograms of the R, G, B channels of each image.
4. Calculate the cross-correlation of the probe-image and "test image" histograms, channel by channel, such that:
   r1 = correlation(img1.R, img2.R);
   r2 = correlation(img1.G, img2.G);
   r3 = correlation(img1.B, img2.B);
5. Finally, multiply the three results together (r1 x r2 x r3).

The result (r) of a cross-correlation calculation lies between -1 and +1. If the value is +1, the two inputs match perfectly on average; for histograms, this means the two images have the same colour distribution. If it is -1, the inputs are perfectly anti-correlated: when pixel values are correlated directly, this corresponds to an image that looks like a photographic negative of the probe image. If r = 0, the images are on average very unlike each other.
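
For reference, the coefficient described above is the standard Pearson correlation. Written for two histograms $H_1$ and $H_2$ with $N$ bins each (these symbols are introduced here for illustration), it is, as far as I know, the same quantity that OpenCV computes for `CV_COMP_CORREL`:

$$
r(H_1, H_2) = \frac{\sum_{i=1}^{N} \bigl(H_1(i) - \bar{H}_1\bigr)\bigl(H_2(i) - \bar{H}_2\bigr)}{\sqrt{\sum_{i=1}^{N} \bigl(H_1(i) - \bar{H}_1\bigr)^2 \, \sum_{i=1}^{N} \bigl(H_2(i) - \bar{H}_2\bigr)^2}},
\qquad
\bar{H}_k = \frac{1}{N} \sum_{i=1}^{N} H_k(i)
$$

The numerator is positive when the two histograms rise and fall together and negative when one rises where the other falls; the denominator scales the result into the range -1 to +1.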

People who are interested in computer vision are likely to be familiar with the OpenCV library. The following (incomplete) code snippet is an example of calculating the cross-correlation coefficient using the OpenCV library in C.

```
// Split img1 into its colour channels and store them in separate images.
// Images loaded with cvLoadImage are stored in B, G, R order, so the
// destination images are listed in that order.
cvSplit(img1, img1B, img1G, img1R, NULL);

// Calculate the histogram of each channel of img1.
// calHistogram is a helper of this program (not an OpenCV function); it is not shown here.
CvHistogram *h1R = calHistogram(img1R, bins, ranges);
CvHistogram *h1G = calHistogram(img1G, bins, ranges);
CvHistogram *h1B = calHistogram(img1B, bins, ranges);

// Split img2 into its colour channels in the same way.
cvSplit(img2, img2B, img2G, img2R, NULL);

// Calculate the histogram of each channel of img2.
CvHistogram *h2R = calHistogram(img2R, bins, ranges);
CvHistogram *h2G = calHistogram(img2G, bins, ranges);
CvHistogram *h2B = calHistogram(img2B, bins, ranges);

// Calculate the correlation coefficient of each pair of channel histograms.
rR = cvCompareHist(h1R, h2R, CV_COMP_CORREL);
rG = cvCompareHist(h1G, h2G, CV_COMP_CORREL);
rB = cvCompareHist(h1B, h2B, CV_COMP_CORREL);

// Multiply the results together.
tempR = rR * rG * rB;
```

The performance:

Some performance evaluation has been done on this program using a test harness called 'Fact'.

```
Error rates calculated from mycbir

tests   TP   TN   FP   FN   accuracy   recall   precision   specificity   class
    8    0    0    8    0       0.00     0.00        0.00          0.00   box
    8    0    0    8    0       0.00     0.00        0.00          0.00   cockeral
    8    6    0    2    0       0.75     1.00        0.75          0.00   dog
    9    7    0    2    0       0.78     1.00        0.78          0.00   horus
    8    7    0    1    0       0.88     1.00        0.88          0.00   ornament
   10    8    0    2    0       0.80     1.00        0.80          0.00   penguin
    8    1    0    7    0       0.12     1.00        0.12          0.00   reindeer
    9    2    0    7    0       0.22     1.00        0.22          0.00   tortoise
   68   31    0   37    0       0.46     1.00        0.46          0.00   overall
```

TP = True Positives
TN = True Negatives
FP = False Positives
FN = False Negatives

The matrix above shows the number of test cases for each class in the “tests” column, followed by the True Positives, True Negatives, False Positives and False Negatives in the next four columns. The application then calculates the accuracy, recall, precision and specificity as described below and displays them in the next four columns. The last column, “class”, shows the type of picture the tests were carried out on.

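Precision is the number of relevant pictures retrieved divided by the total number of pictures retrieved. Recall (also called sensitivity) is the number of relevant pictures retrieved divided by the total number of relevant pictures that exist. Specificity is the number of true negatives divided by the total number of pictures that are actually negative, and accuracy is the fraction of all tests that gave a correct result:

$$
\text{precision} = \frac{TP}{TP + FP}
\qquad
\text{sensitivity} = \text{recall} = \frac{TP}{TP + FN}
$$

$$
\text{specificity} = \frac{TN}{FP + TN}
\qquad
\text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$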
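
To make the four measures concrete, here is a small sketch in C that computes them from the TP, TN, FP and FN counts using the standard definitions (recall = TP / (TP + FN), specificity = TN / (FP + TN)). The names `Metrics`, `safe_ratio` and `compute_metrics` are hypothetical, and reporting 0 when a denominator is 0 is an assumption, chosen because it matches the 0.00 recall shown for “box”, where TP + FN = 0.

```c
#include <assert.h>

typedef struct {
    double accuracy;
    double recall;
    double precision;
    double specificity;
} Metrics;

// Divide num by den, reporting 0 when the denominator is 0 (assumed convention).
double safe_ratio(int num, int den)
{
    return den != 0 ? (double)num / den : 0.0;
}

// Compute the four measures from the raw counts of one class.
Metrics compute_metrics(int tp, int tn, int fp, int fn)
{
    Metrics m;
    assert(tp >= 0 && tn >= 0 && fp >= 0 && fn >= 0);
    m.accuracy    = safe_ratio(tp + tn, tp + tn + fp + fn);
    m.recall      = safe_ratio(tp, tp + fn);
    m.precision   = safe_ratio(tp, tp + fp);
    m.specificity = safe_ratio(tn, fp + tn);
    return m;
}
```

For the “dog” row, `compute_metrics(6, 0, 2, 0)` gives accuracy 0.75, recall 1.00, precision 0.75 and specificity 0.00, which agrees with the table.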

As you can see above, the application failed to find any True Positives for the “box” and “cockeral” pictures; every one of those tests produced a False Positive instead. The best case is the tests on the ornament pictures, where it found 7 TPs and 1 FP out of 8 tests. Overall, the algorithm ran 68 test cases and found 31 True Positives and 37 False Positives, but no True Negatives or False Negatives.
The matrix below is the class confusion matrix.

```
Confusion matrix calculated from mycbir

                                    expected
actual     box  cockeral  dog  horus  ornament  penguin  reindeer  tortoise
box          0         0    0      0         0        0         0         1
dog          0         0    6      0         0        0         0         0
horus        0         0    1      7         0        0         0         4
ornament     1         1    0      0         7        2         7         0
penguin      1         0    0      0         1        8         0         0
reindeer     5         7    0      1         0        0         1         2
tortoise     1         0    1      1         0        0         0         2
```

In this matrix, rows are the results that were obtained and columns are the results that were expected: each number is the number of times the class in the row title was obtained when the class in the column title should have been found. As an example, take the worst case, the “box” column: ornament = 1, penguin = 1, reindeer = 5 and tortoise = 1 were obtained when “box” pictures should have been found, and no “box” picture was ever obtained for them, as the value in the “box” row is 0.

As I mentioned earlier, the best results were obtained on the tests with the “ornament” pictures: the correct “ornament” pictures were obtained 7 times, and a penguin picture just once.
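
Reading the matrix is easier once you see how one can be tallied. The sketch below is only an illustration in C, not how the 'Fact' harness works internally (that is not shown in this article); the names `class_index`, `tally` and `column_total` are hypothetical. Each test adds 1 to the cell whose row is the class obtained and whose column is the class expected.

```c
#include <assert.h>
#include <string.h>

#define NUM_CLASSES 8

// Class names as they appear in the harness output.
const char *const CLASS_NAMES[NUM_CLASSES] = {
    "box", "cockeral", "dog", "horus",
    "ornament", "penguin", "reindeer", "tortoise"
};

// Return the index of a class name, or -1 if it is unknown.
int class_index(const char *name)
{
    for (int i = 0; i < NUM_CLASSES; i++)
        if (strcmp(CLASS_NAMES[i], name) == 0)
            return i;
    return -1;
}

// Record one test: rows are the class obtained, columns the class expected.
void tally(int counts[NUM_CLASSES][NUM_CLASSES],
           const char *expected, const char *actual)
{
    int e = class_index(expected);
    int a = class_index(actual);
    assert(e >= 0 && a >= 0);
    counts[a][e] += 1;
}

// Number of tests run for one expected class: the sum of its column.
int column_total(int counts[NUM_CLASSES][NUM_CLASSES], int expected)
{
    int total = 0;
    for (int a = 0; a < NUM_CLASSES; a++)
        total += counts[a][expected];
    return total;
}
```

Starting from a zeroed matrix and tallying the eight “box” tests (ornament once, penguin once, reindeer five times, tortoise once) reproduces the “box” column: the diagonal cell stays at 0 and the column sums to 8, the number of “box” tests.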

Conclusion:

Cross-correlation is not the best content-based image retrieval technique to date, though it helps us understand how the histograms of two images can vary or be complementary. There is no doubt that analysing the colours in images is a good way of characterising them, but it is hard to compare two images using this factor alone.