Friday 29 April 2016

Content based image retrieval(CBIR) 02--Flow of CBIR, part B

    This is the second part of the flow of CBIR. In this post I record step 6 and step 7; there are only two steps, but the last one is a little complicated.

Step 6 : Build inverted index

void cbir_bovw::build_code_book(size_t code_size)
   hist_type hist;
   hist.load(setting_["hist"].GetString() +
             std::string("_") +

   invert_index invert;
   ocv::cbir::build_inverted_index(hist, invert);
   ["inverted_index"].GetString() +
              std::string("_") +

  This part is quite straightforward: invert_index is simply an encapsulation of std::map and std::vector. Applying an inverted index may improve the accuracy of the CBIR system, but this still needs to be measured.
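  To make the idea concrete, here is a minimal sketch of an inverted index built from std::map and std::vector. It is not the real invert_index class: the names simple_inverted_index, build_index and candidates are hypothetical, and the histograms are assumed to be plain vectors of word counts. Each visual word maps to the images that contain it, so a query only has to be compared with images that share at least one word with it.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

// Hypothetical simplification: visual word id -> indices of images using it.
using simple_inverted_index = std::map<std::size_t, std::vector<std::size_t>>;

// hists[i][w] is the count of visual word w in image i.
inline simple_inverted_index
build_index(std::vector<std::vector<double>> const &hists)
{
    simple_inverted_index index;
    for(std::size_t img = 0; img != hists.size(); ++img){
        for(std::size_t word = 0; word != hists[img].size(); ++word){
            if(hists[img][word] > 0){
                index[word].push_back(img);
            }
        }
    }
    return index;
}

// Collect every image that shares at least one visual word with the query.
inline std::set<std::size_t>
candidates(simple_inverted_index const &index,
           std::vector<double> const &query_hist)
{
    std::set<std::size_t> result;
    for(std::size_t word = 0; word != query_hist.size(); ++word){
        if(query_hist[word] <= 0){ continue; }
        auto const it = index.find(word);
        if(it != index.end()){
            result.insert(it->second.begin(), it->second.end());
        }
    }
    return result;
}
```

  Only the images returned by candidates would then need a full histogram comparison, which is where any saving would come from.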

Step 7 : Search image

  After step 6, I have prepared most of the tools of this CBIR system, so it is time to start searching. I have four ways to search for an image, as shown in pic00.

  As usual, a graph is worth a thousand words. The first solution (pic01) is the easiest one: it uses neither IDF (inverse document frequency) nor spatial information.

//The api of this function is poor, but I think it is
//acceptable in this small example. However, in a real project
//we should not let this kind of code exist: bad code
//attracts more bad code, and in the end your project
//becomes very hard to maintain
double measure_impl(Searcher &searcher,
                    ocv::cbir::f2d_detector &f2d,
                    BOVW const &bovw,
                    hist_type const &hist,
                    arma::Mat<cbir_bovw::feature_type> const &code_book,                    
                    rapidjson::Document const &doc,
                    rapidjson::Document const &setting)
    //total_score saves the number of "hit" images of ukbench
    double total_score = 0;
    auto const folder =
    auto const files = ocv::file::get_directory_files(folder);
    for(int i = 0; i != files.size(); ++i){        
        cv::Mat gray =
                cv::imread(folder + "/" + files[i],
        //f2d.get_descriptor is the bottleneck
        //of the program, more than 85% of the computation
        //time is spent in it
        auto describe =
        //transfer cv::Mat to arma::Mat without copy
        arma::Mat const
        //build the histogram of the image we want to search
        auto const target_hist =
        //search the image     
        auto const result =
      , hist);        

        //find relevant file of the image "files[i]"
        auto const &value = doc[files[i].c_str()];
        std::set relevant;
        for(rapidjson::SizeType j = 0;
            j != value.Size(); ++j){
        //increment total_score for each of the first 4 images
        //of the search result that belongs to the relevant images
        for(size_t j = 0; j != relevant.size(); ++j){
            auto it = relevant.find(files[result[j]]);
            if(it != std::end(relevant)){

    return total_score;

  That is it. I wrote down how to apply IDF and spatial info on github.
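  For readers who do not want to open the repository, the textbook form of IDF weighting can be sketched as below. This is only an illustration under my own assumptions, not the code from github: compute_idf and apply_idf are hypothetical names, all histograms are assumed to have the same length, and a word that never occurs simply gets weight 0.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// idf[w] = log(N / df[w]), where N is the number of images and df[w] is
// the number of images whose histogram contains visual word w.
// Words that never occur get weight 0 (an assumption of this sketch).
inline std::vector<double>
compute_idf(std::vector<std::vector<double>> const &hists)
{
    assert(!hists.empty());
    std::size_t const words = hists.front().size();
    std::vector<double> idf(words, 0.0);
    for(std::size_t w = 0; w != words; ++w){
        std::size_t df = 0;
        for(auto const &h : hists){
            if(h[w] > 0){ ++df; }
        }
        if(df != 0){
            idf[w] = std::log(static_cast<double>(hists.size()) /
                              static_cast<double>(df));
        }
    }
    return idf;
}

// Scale every bin of a histogram by the idf weight of its visual word.
inline std::vector<double>
apply_idf(std::vector<double> hist, std::vector<double> const &idf)
{
    assert(hist.size() == idf.size());
    for(std::size_t w = 0; w != hist.size(); ++w){
        hist[w] *= idf[w];
    }
    return hist;
}
```

  A word that appears in every image gets log(1) = 0, so it no longer influences the comparison, while rare words count for more.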


Without inverse document frequency(IDF) and spatial verification(pic01) : 3.044
With inverse document frequency : 3.035
With spatial verification : 3.082
With inverse document frequency and spatial verification : 3.13
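
  To read the scores above: the evaluation code counts how many of the first 4 search results belong to the relevant images of the query, so I assume each number is that count averaged over all queries, with 4 as the best possible value. The sketch below shows this computation; hits_in_top and average_score are hypothetical names, not functions from the repository.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Count how many of the first top_n results are relevant to the query.
inline std::size_t
hits_in_top(std::vector<std::string> const &ranked,
            std::set<std::string> const &relevant,
            std::size_t top_n = 4)
{
    std::size_t hits = 0;
    for(std::size_t i = 0; i != ranked.size() && i != top_n; ++i){
        if(relevant.count(ranked[i]) != 0){ ++hits; }
    }
    return hits;
}

// Average the per-query hit counts; with top_n == 4 the best score is 4.
inline double
average_score(std::vector<std::size_t> const &hits_per_query)
{
    assert(!hits_per_query.empty());
    double total = 0;
    for(auto const h : hits_per_query){ total += static_cast<double>(h); }
    return total / static_cast<double>(hits_per_query.size());
}
```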

  In conclusion, applying both IDF and spatial verification gives the best results. The results could be improved if I invested more time in tuning the parameters, such as the size of the code book or the parameters of kaze, or in trying another feature extractor.

Problems of this solution

1 : It is slow. It takes about 300ms~500ms to extract kaze features and keypoints from a single-channel 640x480 image.
2 : It consumes a lot of memory. kaze uses about 150MB to extract keypoints and features.

  If your application only runs on a local machine, this is not a problem, but if you want to develop a web app, it is a serious one. We need a much faster yet still quite accurate CBIR system if we want to deploy it in a high-traffic web app, as TinEye and Google did.

Wednesday 13 April 2016

Content based image retrieval(CBIR) 01--Flow of CBIR, part A

    Before I dive into the code, let me summarize the flow of CBIR; it is quite straightforward (pic00).


    pic00 shows the general idea of CBIR. In this post I would like to record how to implement steps 1~5 with the code located at github. This example needs too many variables to pass them around by hand, so I prefer to save them in a json file, setting.json.

Step 1 ~ 4 

cv::Mat cbir_bovw::
read_img(const std::string &name, bool to_gray) const
        return cv::imread(name, cv::IMREAD_GRAYSCALE);
        return cv::imread(name);

void cbir_bovw::
    using namespace ocv;

    //use kaze as feature detector and descriptor
    cv::Ptr<cv::KAZE> detector = cv::KAZE::create();
    cv::Ptr<cv::KAZE> descriptor = detector;
    cbir::f2d_detector f2d(detector, descriptor);

    //read the folder path from setting.json
    auto const folder =
    //iterate through the image inside the folder,
    //extract features and keypoints
    for(auto const &name : file::get_directory_files(folder)){
        auto const img = read_img(folder + "/" + name);
            //find the keypoints and features by detector
            //and descriptor
            auto const result = f2d.get_descriptor(img);
            //first is keypoints, second is features
            fi_.add_features(name, result.first,
            throw std::runtime_error("image is empty");

    In this example, I prefer to store the features, keypoints and other info in the hdf5 format, because these data could be very big and the ram of a pc may not be able to hold them all at once.

Step 5 : Build code book

    After I save the features and keypoints into hdf5, it is time to build the code book. What is a code book? In this case, it is just a bunch of representative features (the cluster centers) produced by a clustering algorithm. I pick kmeans for this task because it is fast, robust and supported by armadillo and opencv.
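
    To show what the clustering does, here is a toy version of kmeans (plain Lloyd iterations) written with the standard library only. It is a sketch under stated assumptions and not the armadillo based code used in this project: feature, nearest and toy_code_book are hypothetical names, and the first code_size features are used as the initial centers, which real implementations do not do.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using feature = std::vector<double>;

inline double squared_distance(feature const &a, feature const &b)
{
    assert(a.size() == b.size());
    double sum = 0;
    for(std::size_t i = 0; i != a.size(); ++i){
        double const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Index of the code word closest to f.
inline std::size_t nearest(std::vector<feature> const &codes, feature const &f)
{
    assert(!codes.empty());
    std::size_t best = 0;
    double best_dist = std::numeric_limits<double>::max();
    for(std::size_t c = 0; c != codes.size(); ++c){
        double const dist = squared_distance(codes[c], f);
        if(dist < best_dist){
            best_dist = dist;
            best = c;
        }
    }
    return best;
}

// Plain Lloyd iterations. The first code_size features seed the centers,
// which is a simplification; real implementations pick better seeds.
inline std::vector<feature>
toy_code_book(std::vector<feature> const &features,
              std::size_t code_size, std::size_t iterations)
{
    assert(code_size != 0 && features.size() >= code_size);
    std::vector<feature> codes(
        features.begin(),
        features.begin() + static_cast<std::ptrdiff_t>(code_size));
    std::size_t const dim = features.front().size();
    for(std::size_t it = 0; it != iterations; ++it){
        std::vector<feature> sums(code_size, feature(dim, 0.0));
        std::vector<std::size_t> counts(code_size, 0);
        // assignment step: every feature votes for its nearest center
        for(auto const &f : features){
            std::size_t const c = nearest(codes, f);
            for(std::size_t d = 0; d != dim; ++d){ sums[c][d] += f[d]; }
            ++counts[c];
        }
        // update step: move every center to the mean of its features
        for(std::size_t c = 0; c != code_size; ++c){
            if(counts[c] == 0){ continue; } // keep an empty cluster as is
            for(std::size_t d = 0; d != dim; ++d){
                codes[c][d] = sums[c][d] / static_cast<double>(counts[c]);
            }
        }
    }
    return codes;
}
```

    The returned centers play the role of the code book: each one stands for a group of similar features.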

void cbir_bovw::build_code_book(size_t code_size)
            cb(fi_, setting_["features_ratio"].GetDouble(),
                        arma::uword(15), true);
    cb.get_code_book().save(setting_["code_book"].GetString() +
            std::string("_") +

    After I generate the code book, I try to view what the codes of the code book represent. Visualizing the code book is not necessary, but it can be helpful for debugging. The following (pic01, pic02, pic03) are part of the visualization results of the code book.




    The codes of this post are located at github.

Monday 11 April 2016

Content based image retrieval(CBIR) 00--Use CBIR to find similar images of ukbench

    Content based image retrieval (CBIR) is also called query by image content (QBIC). Google search by image and TinEye are probably the most famous examples in our daily lives. In short, CBIR searches images based on the content of the image, not the name, date, meta data or other info.

    I studied how to implement CBIR from PyImageSearch Gurus. The algorithms are almost the same, but my code is written in c++ and built on top of opencv, hdf5, armadillo, boost and rapidjson. I picked c++ rather than python (PyImageSearch uses python) for this project because

1 : c++ suits building stand-alone packages
2 : I like c++

    This series of posts will not discuss the implementation details of the code (located at github) but will summarize the key points I learned from the CBIR lessons of PyImageSearch Gurus.

    The keys of this CBIR system

1 : Feature detector--kaze
2 : Feature descriptor--kaze
3 : Bag of visual words
4 : Data structure(hdf5, inverted index)
5 : Create Code book(I prefer kmeans)
6 : Quantization(build a histogram)
7 : Tf-idf(Term frequency and inverse document frequency)
8 : Spatial verification
9 : Evaluation
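
    Key 6, quantization, is the step that turns the features of one image into a single histogram. A minimal sketch, assuming the code book and the descriptors are plain vectors of doubles (descriptor, dist2 and quantize are hypothetical names, not part of the project):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using descriptor = std::vector<double>;

// Squared Euclidean distance between two descriptors of equal length.
inline double dist2(descriptor const &a, descriptor const &b)
{
    assert(a.size() == b.size());
    double sum = 0;
    for(std::size_t i = 0; i != a.size(); ++i){
        double const diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// One bin per code word; each descriptor votes for its nearest code word.
inline std::vector<double>
quantize(std::vector<descriptor> const &code_book,
         std::vector<descriptor> const &descriptors)
{
    assert(!code_book.empty());
    std::vector<double> hist(code_book.size(), 0.0);
    for(auto const &d : descriptors){
        std::size_t best = 0;
        double best_dist = dist2(code_book[0], d);
        for(std::size_t c = 1; c != code_book.size(); ++c){
            double const dist = dist2(code_book[c], d);
            if(dist < best_dist){
                best_dist = dist;
                best = c;
            }
        }
        hist[best] += 1.0;
    }
    return hist;
}
```

    Two images can then be compared through their histograms instead of through thousands of raw features.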

   ukbench contains 6376 images; finding the relevant images by hand would be a tedious job, which is why we need CBIR to save us from this kind of labor. Before I begin to summarize the keys of this CBIR system, I will post some examples, since a picture is worth a thousand words.

Case 1 : Find similar image of the camera(pic00) within ukbench


Search result of pic00

Case 2 : Find similar image of the toy(pic01) within ukbench


Search result of pic01
    The most similar images are shown in the first row. By now, I think it should be clear what CBIR intends to solve. We could use it to deal with a lot of problems, like object recognition (however, cnn is the state of the art as I write this post), searching web sites by image (like google and TinEye), removing duplicate images and so on.

    In the next post, I will record part of the flow of this CBIR system and write down how to use the code located on github (without explaining implementation details).