If OpenCV does not have one, why not just create one for it?
1 : The image hash algorithms are not too complicated.
2 : The PHash library already implements many image hash algorithms; we can port them to OpenCV and use PHash as the golden model.
3 : OpenCV is an open source computer vision library. If we find bugs, missing features, or poor performance, we can do something to make it better.
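To give a feel for how simple these algorithms can be, here is a minimal sketch of the general average-hash idea in plain C++. It is not the img_hash implementation: the helper names averageHash8x8 and hammingDistance are hypothetical, and the sketch assumes the image has already been shrunk to an 8x8 grayscale thumbnail (a real pipeline would resize and convert the color first).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper: hash an 8x8 grayscale thumbnail (64 pixels, row-major).
// Bit i of the result is set when pixel i is brighter than the mean.
std::uint64_t averageHash8x8(std::vector<std::uint8_t> const &pixels)
{
    assert(pixels.size() == 64);
    double const mean =
        std::accumulate(pixels.begin(), pixels.end(), 0.0) / 64.0;

    std::uint64_t hash = 0;
    for(std::size_t i = 0; i != 64; ++i) {
        if(pixels[i] > mean) {
            hash |= (std::uint64_t{1} << i);
        }
    }
    return hash;
}

// Hamming distance: the number of bits that differ between two hashes.
// A small distance suggests the two thumbnails look alike.
int hammingDistance(std::uint64_t a, std::uint64_t b)
{
    return static_cast<int>(std::bitset<64>(a ^ b).count());
}
```

Two thumbnails that differ in only a few pixels relative to their means end up only a few bits apart, which is why a simple distance threshold can separate "similar" from "different".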
The good news is that I have implemented all of the algorithms I mentioned above, refined their performance (e.g. block mean hash can now process single-channel images), and freed you from memory management chores. The bad news is that this pull request had not been merged when I wrote this post, so you need to clone/pull it and build it yourself. Fear not, this module only depends on the core and imgproc modules of OpenCV, so it should be fairly easy to build (OpenCV is quite easy to build to begin with :)).
The following examples show you how to use img_hash. You will find it much easier to use than the PHash library because the API is more consistent and you do not need to manage memory yourself.
How to use it
#include <opencv2/core.hpp>
#include <opencv2/core/ocl.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/img_hash.hpp>
#include <opencv2/imgproc.hpp>

#include <iostream>

void computeHash(cv::Ptr<cv::img_hash::ImgHashBase> algo)
{
    cv::Mat const input = cv::imread("lena.png");
    cv::Mat const target = cv::imread("lena_blur.png");

    cv::Mat inHash;     //hash of input image
    cv::Mat targetHash; //hash of target image

    //compute hash of input and target
    algo->compute(input, inHash);
    algo->compute(target, targetHash);

    //Compare the similarity of inHash and targetHash
    //recommended thresholds are written in the header files
    //of every class
    double const mismatch = algo->compare(inHash, targetHash);
    std::cout<<mismatch<<std::endl;
}

int main()
{
    using namespace cv;

    //disabling opencl acceleration may speed up img_hash
    //however, in this post I do not disable the optimization of opencl
    //cv::ocl::setUseOpenCL(false);

    computeHash(img_hash::AverageHash::create());
    computeHash(img_hash::PHash::create());
    computeHash(img_hash::MarrHildrethHash::create());
    computeHash(img_hash::RadialVarianceHash::create());
    //BlockMeanHash supports mode 0 and mode 1, they correspond to
    //mode 1 and mode 2 of the PHash library
    computeHash(img_hash::BlockMeanHash::create(0));
    computeHash(img_hash::BlockMeanHash::create(1));
    computeHash(img_hash::ColorMomentHash::create());
}
With these functions, we can measure how our algorithms perform under different "attacks", such as resize, contrast, noise and rotation. Before we start the tests, let me define the thresholds of "pass" and "fail". One thing to remember: to keep things simple, I only use lena to show the results; different data sets may need different thresholds/algorithms to get the best results.
[Figure: Threshold]
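A pass/fail check of this kind can be sketched in a few lines. This is only an illustration under stated assumptions: the names Better, passes and passRate are hypothetical, and any threshold you feed in is a placeholder to be replaced with the recommended value for the algorithm you use. The direction matters because most of the algorithms treat a smaller compare() value as a better match, while radial variance hash treats a larger value as better.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Whether a smaller or a larger compare() value means "more similar".
enum class Better { Smaller, Larger };

// Hypothetical helper: does one comparison result pass the threshold?
bool passes(double value, double threshold, Better better)
{
    return better == Better::Smaller ? value <= threshold
                                     : value >= threshold;
}

// Hypothetical helper: fraction of comparison results that pass,
// e.g. over all strength levels of one attack.
double passRate(std::vector<double> const &values, double threshold,
                Better better)
{
    assert(!values.empty());
    auto const passed = std::count_if(
        values.begin(), values.end(),
        [=](double v) { return passes(v, threshold, better); });
    return static_cast<double>(passed) / static_cast<double>(values.size());
}
```

Keeping the direction as an explicit parameter avoids silently applying a "smaller is better" rule to a correlation-style score.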
After we determine our thresholds, we can use our beloved lena to do the tests :).
[Figure: lena.png]
Resize attack
[Figure: Resize attack]
Every algorithm (BMH means block mean hash) works very well under different sizes and aspect ratios except radial variance hash; that algorithm works under different sizes, but the aspect ratio must be kept.
Contrast Attack
[Figure: Contrast attack]
Every algorithm works quite well under different contrast levels, although radial variance hash, BMH zero and BMH one do not work well under very low contrast.
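For readers who want to reproduce a contrast attack, here is a minimal sketch on a raw 8-bit grayscale buffer. It assumes a simple linear model (scale each pixel's distance from mid-gray by a gain); the function name adjustContrast is hypothetical and the exact attack parameters used for the figures above are not given in this post.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical contrast attack: scale each pixel's distance from mid-gray
// (128) by `gain`, then clamp to the valid 8-bit range and round.
// gain < 1 lowers the contrast, gain > 1 raises it.
std::vector<std::uint8_t> adjustContrast(
    std::vector<std::uint8_t> const &pixels, double gain)
{
    assert(gain >= 0.0);
    std::vector<std::uint8_t> out;
    out.reserve(pixels.size());
    for(std::uint8_t p : pixels) {
        double const scaled = 128.0 + gain * (p - 128.0);
        double const clamped = std::min(255.0, std::max(0.0, scaled));
        out.push_back(static_cast<std::uint8_t>(clamped + 0.5));
    }
    return out;
}
```

With a very small gain, distinct pixel values collapse onto the same 8-bit level (for example, 100 and 150 both become 128 at a gain of 0.01). That loss of detail is one plausible reason why some hashes struggle under very low contrast.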
Gaussian Noise Attack
[Figure: Gaussian noise attack]
Salt And Pepper Noise Attack
[Figure: Salt and pepper noise attack]
Rotation Attack
[Figure: Rotation attack]
We have gone through all of the tests; now let us measure the hash computation time and comparison time of the different algorithms (my laptop is a Y410P, the OS is Windows 10 64-bit, and the compiler is VC2015 64-bit with Update 2 installed).
You can find all the details of the different attacks here (click me).
Computation Performance Test--img_hash vs PHash library
I use the different algorithms to compute the hashes of 100 images from ukbench (ukbench03000.jpg~ukbench03099.jpg). The source code of the OpenCV comparison is located here (check the functions measure_computation_time and measure_comparison_time; I was using img_hash_1_0 when writing this post), and the source code of the PHash performance test (version 0.94, since I am on Windows) is located here.
In most cases img_hash is faster than PHash, but BMH zero and BMH one are about 30% to 40% slower than the PHash versions. The bottleneck is cv::resize (over 95% of the time is spent in it); to speed things up, we need a faster resize function.
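If you want to run a similar timing on your own machine, a small helper around std::chrono is enough. The sketch below is not the measure_computation_time function from the linked sources; measureMilliseconds is a hypothetical name, and the callable you pass in stands for whatever you want to time (for example, one compute() call per image).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: run `task` `repeats` times and return the total
// elapsed wall-clock time in milliseconds.
template<typename Func>
double measureMilliseconds(Func task, std::size_t repeats)
{
    assert(repeats > 0);
    // steady_clock is monotonic, so the result is not affected by
    // system clock adjustments during the measurement.
    auto const start = std::chrono::steady_clock::now();
    for(std::size_t i = 0; i != repeats; ++i) {
        task();
    }
    auto const end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

Loading the images before starting the clock keeps disk I/O out of the measurement, so the numbers reflect the hash computation itself.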
Find similar image from ukbench
The results look good, but can it find similar images? Of course; let me show you how we can compare the hash value of our target against images from ukbench (for simplicity, I only pick 100 images from ukbench).
[Figure: target]
#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/img_hash.hpp>

#include <iostream>
#include <limits>
#include <string>
#include <vector>

//smaller == true : a smaller compare() value means a better match
//smaller == false : a larger compare() value means a better match
void find_target(cv::Ptr<cv::img_hash::ImgHashBase> algo, bool smaller = true)
{
    using namespace cv::img_hash;

    cv::Mat input = cv::imread("ukbench/ukbench03037.jpg");
    //not a good way to reuse the codes by calling
    //measure_comparison_time (defined in the test sources),
    //please bear with me
    std::vector<cv::Mat> targets = measure_comparison_time(algo, "");

    double idealValue;
    if(smaller)
    {
        idealValue = std::numeric_limits<double>::max();
    }
    else
    {
        //lowest() is the most negative double, min() is the
        //smallest positive one
        idealValue = std::numeric_limits<double>::lowest();
    }
    size_t targetIndex = 0;
    cv::Mat inputHash;
    algo->compute(input, inputHash);
    for(size_t i = 0; i != targets.size(); ++i)
    {
        double const value = algo->compare(inputHash, targets[i]);
        if(smaller)
        {
            if(value < idealValue)
            {
                idealValue = value;
                targetIndex = i;
            }
        }
        else
        {
            if(value > idealValue)
            {
                idealValue = value;
                targetIndex = i;
            }
        }
    }
    std::cout<<"mismatch value : "<<idealValue<<std::endl;
    cv::Mat result = cv::imread("ukbench/ukbench0" +
                                std::to_string(targetIndex + 3000) +
                                ".jpg");
    cv::imshow("input", input);
    cv::imshow("found img " + std::to_string(targetIndex + 3000), result);
    cv::waitKey();
    cv::destroyAllWindows();
}

void find_target()
{
    using namespace cv::img_hash;

    find_target(AverageHash::create());
    find_target(PHash::create());
    find_target(MarrHildrethHash::create());
    find_target(RadialVarianceHash::create(), false);
    find_target(BlockMeanHash::create(0));
    find_target(BlockMeanHash::create(1));
}
You will find that every algorithm gives you back the same image you are looking for.
Conclusion
Average hash and PHash are the fastest algorithms, but if you want a more robust one, pick BMH zero. BMH zero and BMH one give similar results, but BMH one is slower since it needs more computation. Hash comparison for radial variance hash is much slower than for the others, because it needs to find the peak cross-correlation value among 40 combinations. If you want to know how to speed things up and learn more about rotation-invariant image hash algorithms, give this link (click me) a try.
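To see why a peak cross-correlation comparison costs more than a bitwise one, here is a minimal sketch in plain C++. It is an illustration, not the img_hash code: peakCrossCorrelation is a hypothetical name, and reading the "40 combinations" as the circular shifts of a 40-element feature vector is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: normalized cross-correlation of x against every
// circular shift of y, returning the best (largest) value. A value close
// to 1 means some shift of y lines up almost perfectly with x.
double peakCrossCorrelation(std::vector<double> const &x,
                            std::vector<double> const &y)
{
    assert(!x.empty() && x.size() == y.size());
    std::size_t const n = x.size();
    double const meanX = std::accumulate(x.begin(), x.end(), 0.0) / n;
    double const meanY = std::accumulate(y.begin(), y.end(), 0.0) / n;

    double varX = 0.0;
    double varY = 0.0;
    for(std::size_t i = 0; i != n; ++i) {
        varX += (x[i] - meanX) * (x[i] - meanX);
        varY += (y[i] - meanY) * (y[i] - meanY);
    }
    double const denom = std::sqrt(varX * varY);
    if(denom == 0.0) {
        return 0.0; // a flat signal carries no shape to correlate
    }

    double peak = -1.0;
    for(std::size_t shift = 0; shift != n; ++shift) {
        double num = 0.0;
        for(std::size_t i = 0; i != n; ++i) {
            num += (x[i] - meanX) * (y[(i + shift) % n] - meanY);
        }
        peak = std::max(peak, num / denom);
    }
    return peak;
}
```

Each comparison touches every element once per shift, so the work grows with the square of the vector length, whereas a Hamming distance needs a single pass over the bits. Searching over shifts is also what lets a comparison of this kind tolerate rotation.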
You can find the test cases here. If you find this post helpful, please give my repositories (blogCodes2 and my img_hash of opencv_contrib) a star :). If you want to join the development, please open a pull request, thanks.
Cool
I was looking for image hash because when I build opencv_worldxxx, img_hash can't be included in the module. I think it's a CMake issue or a Visual Studio integrator issue.
It is in the opencv_contrib repository
IT'S NOT INCLUDED FROM OPENCV_CONTRIB, STUPID!
It is, you can find it here:
https://github.com/opencv/opencv_contrib/tree/master/modules/img_hash
Good post sir. I was wondering about ways to apply this to videos, particularly in Python, and some basic theory regarding it. Any help would be appreciated.