Qt and openCV: September 2013

Saturday, 28 September 2013

Segmenting leaf by openCV2--01 : By watershed and contours

In the last post(recognize leaf--00) I use threshold and contours to cut the leafs from the images, this time I will give watershed a try.

graph_00

graph_01

graph_02

Step 0

Read the images(graph_00~graph_02) and convert it to gray image.

cv::Mat color_image = cv::imread(Folder + "leaf" + num + ".png");
if(color_image.empty()){
  std::cerr<<"input is empty"<<std::endl;
  return -1;
}

cv::Mat image;
cv::cvtColor(color_image, image, CV_BGR2GRAY);

Step 1

Binarize the image.

cv::Mat binary;
cv::threshold(image, binary, 140, 255, cv::THRESH_BINARY);

Step 2

Remove small objects and noise.

auto const structure = cv::getStructuringElement(cv::MORPH_ELLIPSE, 
                       cv::Size(5, 5));
cv::Mat fore_ground;
cv::erode(binary, fore_ground, structure, cv::Point(-1, -1), 4);

We severely erode the images in order to retain only pixels belonging to the foreground.

Step 3

Some of the objects can't removed by erosion, so we find out the contours of the image and remove unimportant objects from the images in order to get the foreground(255), they will be considered to correspond to objects of interest.

std::vector<std::vector<cv::Point>> contours;
cv::findContours(fore_ground, contours, CV_RETR_EXTERNAL, 
      CV_CHAIN_APPROX_NONE);
OCV::remove_contours(contours, 8000, 50000);
fore_ground.setTo(0);
cv::drawContours(fore_ground, contours, -1, cv::Scalar(255), CV_FILLED);

remove_contours was explained in the previous post.

graph_03

graph_04

graph_05

Step 4

Mark the background as 128 and mark unknown pixels(neither foreground nor background) as 0.

cv::Mat back_ground;
cv::dilate(binary, back_ground, structure, cv::Point(-1, -1), 4);
cv::threshold(back_ground, back_ground, 1, 128, cv::THRESH_BINARY_INV);

graph_06

graph_07

graph_08

Step 5

Combined foreground and background to get marker and use the marker to get the mask we need.

cv::Mat markers = back_ground + fore_ground;       
cv::Mat mask;
markers.convertTo(mask, CV_32S);
cv::watershed(color_image, mask);
mask.convertTo(mask, CV_8U);
cv::threshold(mask, mask, 150, 255, CV_THRESH_BINARY);

graph_09

graph_10

graph_11

Step 6

Generate final result.

cv::Mat result;
color_image.copyTo(result, mask);

graph_12

graph_13

graph_14

As usual, the codes can download from github.

Next post of Recognize leaf by openCV2

~~Next post I will try to use K-means to cluster the images~~. I decided to put k-means as a new topic, next post of this series I will try to segment the leafs into independent leaf.

Friday, 27 September 2013

Segmenting leaf by openCV2--00 : By threshold and findContours

These series of post will try to cut the leaf from the images(graph_00~graph_02) with different solutions.The first solution is based on threshold and contours.

graph_00

graph_01

graph_02

Step 0

read the image and covert it to gray image

cv::Mat color_image = cv::imread(Folder + "leaf" + num + ".png");
if(color_image.empty()){
   std::cerr<<"input is empty"<<std::endl;
   return -1;
}

cv::Mat image;
cv::cvtColor(color_image, image, CV_BGR2GRAY);

Step 1

Apply threshold.

cv::Mat binary;
cv::threshold(image, binary, 140, 255, cv::THRESH_BINARY);

graph_03

graph_04

graph_05

Step 2

Use morphology operations(erosion and close) to remove discontinuous region(erosion) and close the holes(close).

auto const structure = cv::getStructuringElement(cv::MORPH_ELLIPSE, 
                                                 cv::Size(5, 5));
cv::erode(binary, binary, structure);
cv::morphologyEx(binary, binary, cv::MORPH_CLOSE, structure);

graph_06

graph_07

graph_08

Step 3

find contours of the images.

std::vector<std::vector<cv::Point>> contours;
cv::findContours(binary, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE);

graph_09

graph_10

graph_11

Step 4

From step 3, we could see that graph_09~graph_11 still contains a lot of undesired blobs, we could remove them by the area of contours(the pixels in the contours).

void remove_contours(std::vector<std::vector<cv::Point>>&contours, 
                     double cmin, double cmax)
{
    auto it = std::partition(std::begin(contours), std::end(contours), 
    [=](std::vector const &data)
    {
       auto const size = cv::contourArea(data);
       return !(size < cmin || size > cmax);
    });
    contours.erase(it, std::end(contours));
}

After that, we invert the mask(change 0 to 255, 255 to 0).

graph_12

graph_13

graph_14

Final results

Use the masks(graph_12~graph_14) to locate the region of interset, we get the results.

graph_15

graph_16

graph_17

As usual, the codes can download from github.

Next post of Recognize leaf by openCV2

Next post I will try to use watershed to segment the images, maybe it could give us better results.

Sunday, 22 September 2013

Hough transform and openCV2--00 : implementation of standard hough-transform

OpenCV2 provide different implementation of Hough-transform, in this post I want to analyze the implementation of the standard Hough-transform--HoughLinesStandard.You could find the codes of HoughLinesStandard at here(houghLinesStandard).

Before we step into the codes of HoughLinesStandard, better take a look at the tutorial of openCV2(tutorial).

Codes

struct LinePolar
{
    float rho;
    float angle;
};


struct hough_cmp_gt
{
    hough_cmp_gt(const int* _aux) : aux(_aux) {}
    bool operator()(int l1, int l2) const
    {
        return aux[l1] > aux[l2] || (aux[l1] == aux[l2] && l1 < l2);
    }
    const int* aux;
};

static void
HoughLinesStandard( const Mat& img, float rho, float theta,
                    int threshold, std::vector& lines, int linesMax )
{
    int i, j; //#ra0
    float irho = 1 / rho;

    CV_Assert( img.type() == CV_8UC1 ); //#ra1

    const uchar* image = img.data; //#rb1
    int step = (int)img.step; //#rb2
    int width = img.cols; //#ra2
    int height = img.rows; //#ra3

    int numangle = cvRound(CV_PI / theta);
    int numrho = cvRound(((width + height) * 2 + 1) / rho); //#ra4, #1

    AutoBuffer _accum((numangle+2) * (numrho+2)); //#ra5
    std::vector _sort_buf; //#ra6
    AutoBuffer _tabSin(numangle);
    AutoBuffer _tabCos(numangle);
    int *accum = _accum; //#ra7
    float *tabSin = _tabSin, *tabCos = _tabCos;

    memset( accum, 0, sizeof(accum[0]) * (numangle+2) * (numrho+2) );

    float ang = 0;
    for(int n = 0; n < numangle; ang += theta, n++ ) //#rd1
    {
        tabSin[n] = (float)(sin((double)ang) * irho); //#rc1
        tabCos[n] = (float)(cos((double)ang) * irho); //#rc2
    }

    // stage 1. fill accumulator
    for( i = 0; i < height; i++ ) //#rd2
        for( j = 0; j < width; j++ ) //#rd3
        {
            if( image[i * step + j] != 0 ) //#rb3
                for(int n = 0; n < numangle; n++ ) //#rd4
                {
                    int r = cvRound( j * tabCos[n] + 
                                   i * tabSin[n] ); //#2a
                    r += (numrho - 1) / 2; //#2b
                    accum[(n+1) * (numrho+2) + r+1]++; //#rd5, #3
                }
        }

    // stage 2. find local maximums
    for(int r = 0; r < numrho; r++ ) //#rd6
        for(int n = 0; n < numangle; n++ ) //#rd7
        {
            int base = (n+1) * (numrho+2) + r+1;
            if( accum[base] > threshold &&
                accum[base] > accum[base - 1] && 
                accum[base] >= accum[base + 1] &&
                accum[base] > accum[base - numrho - 2] && 
                accum[base] >= accum[base + numrho + 2] )
                _sort_buf.push_back(base);
        }

    // stage 3. sort the detected lines by accumulator value
    std::sort(_sort_buf.begin(), _sort_buf.end(), hough_cmp_gt(accum));

    // stage 4. store the first min(total,linesMax) lines to 
    // the output buffer
    linesMax = std::min(linesMax, (int)_sort_buf.size()); //#re1
    double scale = 1./(numrho+2);
    for( i = 0; i < linesMax; i++ ) //#rd8
    {
        LinePolar line; //#ra8
        int idx = _sort_buf[i];
        int n = cvFloor(idx*scale) - 1; //#4
        int r = idx - (n+1)*(numrho+2) - 1; //#5
        line.rho = (r - (numrho - 1)*0.5f) * rho; //#6
        line.angle = n * theta;
        lines.push_back(Vec2f(line.rho, line.angle)); //#rf1
    }
}

Anatomy of codes

I do not sure that my anatomy are 100% correct, if you find out anything wrong, please give me some advices, thanks a lot.

numrho is the number of distance.According to Digital Image Processing 3rd, the range of the distance is between equation(1).

According to equation(1) and Pythagoras' Theorem, we know that the number of distance should have 2D + 1 units(at least).

It is 2D + 1 but not 2D because we need to include 0.

#2b

r need to do some offset to make sure the value is positive.

accum[(n + 1) * (numrho + 2) + r + 1]++;

(numrho + 2) is the number of columns of the accumalate table

(n + 1) make sure the first row would not be used

(r + 1) make sure the first column would not be used

that is why the dimensions of the accumulate table(_accum) are (numangle+2) * (numrho+2)

because it need two more rows and columns to avoid the pointer access to invalid data.

n is the row number of the table, which corresponding to the theta

r is same as the r at #3, from #3 we know that
idx == accum[(n+1) * (numrho+2) + r+1] => r = idx -(n+1) * (numrho+2) -1

From #5, we regain the r same as #2b, but the r we need is not the #2b provided but the same as #2a, so we need to subtract it.

Refine codes

Thanks to openCV2, we know the details of how to translate the algorithm to codes, this save us a lot of troubles.But I believe we could make the codes become better.

#ra1~#ra8

There are one technique which called lazy evaluation. In short, we could declare the variables we need just in time;In other words, we better avoid to declare the variables before we need it.#ra1~#ra7 declare the variables too early.

example :

we could check the type before we declare any variable

original

int i, j;

float irho = 1.0/rho;

CV_Assert(img.type() == CV_8U);

after refine

CV_Assert(img.type() == CV_8U);

float const irho = 1.0/rho;

#rb1~#rb3

We could access the pixels with more efficient solution

original

const uchar* image = img.data; //#rb1

int step = (int)img.step; //#rb2

//.....a lot of codes

 for( i = 0; i < height; i++ )

        for( j = 0; j < width; j++ )

          if( image[i * step + j] != 0 )

after refine

 for(int i = 0; i < height; ++i){

        uchar const *img_ptr = img.ptr<uchar>(i);

        for(int j = 0; j < width; ++j){

            //access the pixel with cheaper operation

            //original one need (i * img.step + j)

            if( img_ptr[j] != 0 )

#rc1~#rc2

c-style sin only provide us the api which accept "double" because c do not support overload(this suck, really).That means we need to convert the float to double and convert it back float, this is

a waste.If we use std::sin this problem would not exist anymore, because c++ provide different overload for std::sin.

original

tabSin[n] = (float)(sin(double(ang)) * irho);

after refine

tabSin[n] = std::sin(ang) * irho;

#rd1~#rd7

We could declare the variable of for loop just in time, this way the codes would be easier to maintain and read.

original

int i, j;

for(i = 0; i != 10; ++i)

after refine

for(int i = 0; i != 10; ++i)

#re1

Use static_cast to replace old type casting.

original

linesMax = std::min(linesMax, (int)_sort_buf.size());

after refine

linesMax = std::min(linesMax, static_cast<int>(_sort_buf.size()));

#rf1

Call resize and use random access to replace push_back, this could save the cost of memory reallocation and the cost of checking the memory is big enough or not when you call push_back.

original

lines.push_back(cv::Vec2f(lines.rho, lines.angle));

after refine

lines.resize(linesMax);

//.........

lines[i] = cv::Vec2f(lines.rho, lines.angle);

Whatever, this is just my view about the codes, if you find out anything wrong, please give me some comments, thanks.

As always, the codes(houghTransform/standardHough) can download from github

Wednesday, 18 September 2013

otsu threshold and openCV2--00 : implementation of otsu threshold

    First of all, otsu threshold was proposed by Nobuyuki otsu in their paper.This post will talk about the implementation of the otsu threshold of openCV2.

     Following post will introduce the basic idea of otsu threshold based on the text book Digital image processing 3rd, this could help us understand what are the codes trying to do, I would not try to explain the details of otsu threshold becaus there are many good resouces to study, you should read the paper or find a good text book like Digital image processing 3rd if you want to know the whole story.

     Objective of otsu threshold is maximize the between-class variance(9).The basic idea is simple, if the threshold is good enough, the intensity of different class should be pretty different;In other words, if we could make the intensities between classes have biggest different, the threshold should be the best one.

    Class is the component of pixels within specific range.In otsu threshold, we classify the pixels into two classes.If C1 range is within [0, k], then C2 is within [k + 1, L - 1](assume the pixels intensity is within [0, L - 1]).

I want to give some explanation of the equations which will be used by the implementation of openCV2, without some investigation to these equations, it is almost impossible to understand what are the codes doing.

(1) is the total probability of C1

(2) is the total probability of C2.

(3) and (4) is the mean intensity of C1 and C2.

(5) is the global mean of the whole image.

(7) is deduced by (6) and will be used to compute (4).

(8) is the total variance combine by two classes

(9) is the between-class variance.

Now we should be able to study the codes.The implementation of the otsu threshold of openCV2 could be found on otsu threshold.Below is the code from openCV2(I do some small modification to it ).

static double
getThreshVal_Otsu_8u( const Mat& _src )
{
    Size size = _src.size();
    if( _src.isContinuous() )
    {
        size.width *= size.height;
        size.height = 1;
    }
    int const N = 256;
    int h[N] = {0}; //histogram
    for(int i = 0; i < size.height; ++i ) //#1
    {
        uchar const *src = _src.data + _src.step*i;
        
        for(int j = 0; j < size.width; ++j )
            ++h[src[j]];
    }

    double mu = 0, scale = 1./(size.width*size.height);
    for(int i = 0; i < N; ++i )
        mu += i * static_cast<double>(h[i]);

    mu *= scale;  //#2
    double mu1 = 0, q1 = 0;
    double max_sigma = 0, max_val = 0;

    for(int i = 0; i < N; ++i )
    {       
        double const p_i = h[i] * scale; //#3   
        q1 += p_i; //#4
        double const q2 = 1. - q1; //#5

        
        if( std::min(q1,q2) < FLT_EPSILON || 
            std::max(q1,q2) > 1. - FLT_EPSILON )
            continue;  //#6

        mu1 = (mu1 * q1 + i * p_i) / q1; //#7
        double const mu2 = (mu - q1*mu1)/q2; //#8
        double const sigma = q1 * q2 *
             (mu1 - mu2)*(mu1 - mu2); //#9
        if( sigma > max_sigma )
        {
            max_sigma = sigma;
            max_val = i;
        }
    }

    return max_val;
}

The algorithm of openCV2 adopt is pretty straigtforward.

Step 1 : build the histogram of the 8bit input image

Step 2 : find out global mean of equation(4)

Step 3 : iterate i from 0~255, classify the classes from [0, i], [i, L-1]. Find out the i which make the equation(9) become biggest.

Anatomy of codes

The loop of #1 build the histogram.

mu(#2) equal to equation(5).

#3 is the probability of pixel i.

#4 is equal to equation(1).

#5 use equation(8) to find out the result of equation(2).

#6 pass the loop if one of the class probability is too small, this make sense since m1-m2
   would be very small in this case. If you wonder why do the codes check the min and max but
   not just checking min or max, please give this site(otsu problem) a visit, I have the same
   question too.

   #7 find out the result of equation(3)

   #8 equal to equation(7)

   #9 equal to equation(9)

Next post of otsu

There are an extension of otsu--multi otsu method, I would like to implement it by openCV2 next time(openCV2 haven't implemented this algorithm yet when I write this post).

Monday, 16 September 2013

mean shift and openCV2--01 : implementation of meanShift

In the last post(mean shift and openCV2--00) I try to track the region of interest by mean-shift and Camshift, but I haven't tried to understand how to implement mean-shift. In this post, I would like to analyze the implementation of the mean-shift from openCV2(openCV2 mean-shift).

Mean-shift start from an initial location and iteratively around, find the centroid location and repeats the procedure until the window center converges to a stable point. The new centroid could be found by the equation(1) come from wikipedia(mean-shift equation).(2) is the kernal function, which determine the weight of nearby points.OpenCV2 do not choose gaussian distribution as the kernel but using spatial moments as kernal.

Here is the core of the mean-shift.

for( i = 0; i < niters; i++ )
{
   //step 1 : make sure cur_rect has reasonable position and size
   cur_rect = cur_rect & Rect(0, 0, mat.cols, mat.rows); #1
   if( cur_rect == Rect() )
   {
      cur_rect.x = mat.cols/2;
      cur_rect.y = mat.rows/2;
   }
   cur_rect.width = std::max(cur_rect.width, 1);
   cur_rect.height = std::max(cur_rect.height, 1);
        
   //step 2 : calculate the moments
   Moments m = moments(mat(cur_rect));
        
   //step 3 : Calculating center of mass
   if( fabs(m.m00) < DBL_EPSILON ) //#2
      break;
        
   int dx = cvRound( m.m10/m.m00 - window.width*0.5 ); //#3
   int dy = cvRound( m.m01/m.m00 - window.height*0.5 ); 
        
   int nx = std::min(std::max(cur_rect.x + dx, 0), 
                     mat.cols - cur_rect.width); //#4
   int ny = std::min(std::max(cur_rect.y + dy, 0), 
                     mat.rows - cur_rect.height); 
        
   dx = nx - cur_rect.x; //#5
   dy = ny - cur_rect.y;
   cur_rect.x = nx; #6
   cur_rect.y = ny;
        
   // step 4 : Check for coverage centers mass & window
   if( dx*dx + dy*dy < eps ) //#7
      break;
}

Anatomy of codes

The purpose of #1 is make sure the cur_rect(initial position when i == 0) is in the range of the input cv::Mat(mat). If the cur_rect do not in the range of the mat, then cur_rect will equal to Rect().

Purpose of #2 is make sure the m00 big enough.m00, m01, m10 are found by the equation(3).

If we express it by codes, they would be the same as

template<typename Func>
float moment(cv::Mat const &input, Func func)
{
    float sum = 0;
    for(int row = 0; row != input.rows; ++row){
        auto ptr = input.ptr<float>(row);
        for(int col = 0; col != input.cols; ++col){
            sum += func(ptr[col], row, col);
        }
    }

    return sum;
}

int main()
{
    cv::Mat input = (cv::Mat_(3,3) <<
                     1, 2, 3,
                     4, 5, 6,
                     7, 8, 9);

   float const m00 = moment(input, [](float data, int, int)
   {
      return data;
   });

   float const m10 = moment(input, [](float data, int, int col)
   {
      return data * col;
   });

   float const m01 = moment(input, [](float data, int row, int)
   {
      return data * row;
   });
}

#3 estimate how far the horizontal position of the new centroid could be.

#4 measure the new centroid position of horizontal.

std::max(cur_rect.x + dx, 0)

make sure the new centroid horizontal position would not smaller than 0.

std::min(std::max(cur_rect.x + dx, 0), 
                     mat.cols - cur_rect.width);

outer std::min make sure the new centroid horizontal position greater or equal to mat.cols - cur_rect.width.

#5 is the distance between new centroid horizontal position and old centroid horizontal position.

#6 is the new horizontal position of centroid(the center of the window).

#7 measure the distance between old centroid and new centroid are small enough or not, if they
are small enough, the program will end the iteration, else continue until one of the criteria meet(
iteration counts or the distance between old centroid and new centroid).The distance is measure by
euclidean distance(4).The program don't need to take the square root of (dx * dx) + (dy * dy), because the eps already multiply with itself(eps = cvRound(eps*eps)).