Saturday, 28 September 2013

Segmenting leaf by openCV2--01 : By watershed and contours

In the last post(recognize leaf--00) I use threshold and contours to cut the leafs from the images, this time I will give watershed a try.


graph_00
graph_01
graph_02


Step 0

  Read the images(graph_00~graph_02) and convert it to gray image.


cv::Mat color_image = cv::imread(Folder + "leaf" + num + ".png");
if(color_image.empty()){
  std::cerr<<"input is empty"<<std::endl;
  return -1;
}

cv::Mat image;
cv::cvtColor(color_image, image, CV_BGR2GRAY);

Step 1

Binarize the image.



cv::Mat binary;
cv::threshold(image, binary, 140, 255, cv::THRESH_BINARY); 

Step 2

Remove small objects and noise.

auto const structure = cv::getStructuringElement(cv::MORPH_ELLIPSE, 
                       cv::Size(5, 5));
cv::Mat fore_ground;
cv::erode(binary, fore_ground, structure, cv::Point(-1, -1), 4);

We severely erode the images in order to retain only pixels belonging to the foreground.

Step 3

Some of the objects can't removed by erosion, so we find out the contours of the image and remove unimportant objects from the images in order to get the foreground(255), they will be considered to correspond to objects of interest.


std::vector<std::vector<cv::Point>> contours;
cv::findContours(fore_ground, contours, CV_RETR_EXTERNAL, 
      CV_CHAIN_APPROX_NONE);
OCV::remove_contours(contours, 8000, 50000);
fore_ground.setTo(0);
cv::drawContours(fore_ground, contours, -1, cv::Scalar(255), CV_FILLED);

remove_contours was explained in the previous post.

graph_03
graph_04
graph_05



Step 4

  Mark the background as 128 and mark unknown pixels(neither foreground nor background) as 0.


cv::Mat back_ground;
cv::dilate(binary, back_ground, structure, cv::Point(-1, -1), 4);
cv::threshold(back_ground, back_ground, 1, 128, cv::THRESH_BINARY_INV);
graph_06
graph_07
graph_08

Step 5

  Combined foreground and background to get marker and use the marker to get the mask we need.


cv::Mat markers = back_ground + fore_ground;       
cv::Mat mask;
markers.convertTo(mask, CV_32S);
cv::watershed(color_image, mask);
mask.convertTo(mask, CV_8U);
cv::threshold(mask, mask, 150, 255, CV_THRESH_BINARY);

graph_09
graph_10
graph_11

Step 6

Generate final result.


cv::Mat result;
color_image.copyTo(result, mask);

graph_12
graph_13
graph_14
As usual, the codes can download from github.

Next post of Recognize leaf by openCV2

 Next post I will try to use K-means to cluster the images. I decided to put k-means as a new topic, next post of this series I will try to segment the leafs into independent leaf.

Friday, 27 September 2013

Segmenting leaf by openCV2--00 : By threshold and findContours

  These series of post will try to cut the leaf from the images(graph_00~graph_02) with different solutions.The first solution is based on threshold and contours.

graph_00
graph_01
graph_02


Step 0

read the image and covert it to gray image

cv::Mat color_image = cv::imread(Folder + "leaf" + num + ".png");
if(color_image.empty()){
   std::cerr<<"input is empty"<<std::endl;
   return -1;
}

cv::Mat image;
cv::cvtColor(color_image, image, CV_BGR2GRAY);
Step 1
 
  Apply threshold.

 
cv::Mat binary;
cv::threshold(image, binary, 140, 255, cv::THRESH_BINARY);
graph_03
graph_04
graph_05

Step 2 

Use morphology operations(erosion and close) to remove discontinuous region(erosion) and close the holes(close).


auto const structure = cv::getStructuringElement(cv::MORPH_ELLIPSE, 
                                                 cv::Size(5, 5));
cv::erode(binary, binary, structure);
cv::morphologyEx(binary, binary, cv::MORPH_CLOSE, structure);
graph_06
graph_07
graph_08
 Step 3
find contours of the images.
std::vector<std::vector<cv::Point>> contours;
cv::findContours(binary, contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_NONE);
graph_09
graph_10
graph_11
Step 4

From step 3, we could see that graph_09~graph_11 still contains a lot of undesired blobs, we could remove them by the area of contours(the pixels in the contours).

void remove_contours(std::vector<std::vector<cv::Point>>&contours, 
                     double cmin, double cmax)
{
    auto it = std::partition(std::begin(contours), std::end(contours), 
    [=](std::vector const &data)
    {
       auto const size = cv::contourArea(data);
       return !(size < cmin || size > cmax);
    });
    contours.erase(it, std::end(contours));
} 
After that, we invert the mask(change 0 to 255, 255 to 0).

graph_12
graph_13
graph_14
Final results

 Use the masks(graph_12~graph_14) to locate the region of interset, we get the results.

graph_15
graph_16
graph_17

 As usual, the codes can download from github.

Next post of Recognize leaf by openCV2

 Next post I will try to use watershed to segment the images, maybe it could give us better results.

Sunday, 22 September 2013

Hough transform and openCV2--00 : implementation of standard hough-transform

    OpenCV2 provide different implementation of Hough-transform, in this post I want to analyze the implementation of the standard Hough-transform--HoughLinesStandard.You could find the codes of HoughLinesStandard at here(houghLinesStandard).

   Before we step into the codes of HoughLinesStandard, better take a look at the tutorial of openCV2(tutorial).

Codes 

struct LinePolar
{
    float rho;
    float angle;
};


struct hough_cmp_gt
{
    hough_cmp_gt(const int* _aux) : aux(_aux) {}
    bool operator()(int l1, int l2) const
    {
        return aux[l1] > aux[l2] || (aux[l1] == aux[l2] && l1 < l2);
    }
    const int* aux;
};

static void
HoughLinesStandard( const Mat& img, float rho, float theta,
                    int threshold, std::vector& lines, int linesMax )
{
    int i, j; //#ra0
    float irho = 1 / rho;

    CV_Assert( img.type() == CV_8UC1 ); //#ra1

    const uchar* image = img.data; //#rb1
    int step = (int)img.step; //#rb2
    int width = img.cols; //#ra2
    int height = img.rows; //#ra3

    int numangle = cvRound(CV_PI / theta);
    int numrho = cvRound(((width + height) * 2 + 1) / rho); //#ra4, #1

    AutoBuffer _accum((numangle+2) * (numrho+2)); //#ra5
    std::vector _sort_buf; //#ra6
    AutoBuffer _tabSin(numangle);
    AutoBuffer _tabCos(numangle);
    int *accum = _accum; //#ra7
    float *tabSin = _tabSin, *tabCos = _tabCos;

    memset( accum, 0, sizeof(accum[0]) * (numangle+2) * (numrho+2) );

    float ang = 0;
    for(int n = 0; n < numangle; ang += theta, n++ ) //#rd1
    {
        tabSin[n] = (float)(sin((double)ang) * irho); //#rc1
        tabCos[n] = (float)(cos((double)ang) * irho); //#rc2
    }

    // stage 1. fill accumulator
    for( i = 0; i < height; i++ ) //#rd2
        for( j = 0; j < width; j++ ) //#rd3
        {
            if( image[i * step + j] != 0 ) //#rb3
                for(int n = 0; n < numangle; n++ ) //#rd4
                {
                    int r = cvRound( j * tabCos[n] + 
                                   i * tabSin[n] ); //#2a
                    r += (numrho - 1) / 2; //#2b
                    accum[(n+1) * (numrho+2) + r+1]++; //#rd5, #3
                }
        }

    // stage 2. find local maximums
    for(int r = 0; r < numrho; r++ ) //#rd6
        for(int n = 0; n < numangle; n++ ) //#rd7
        {
            int base = (n+1) * (numrho+2) + r+1;
            if( accum[base] > threshold &&
                accum[base] > accum[base - 1] && 
                accum[base] >= accum[base + 1] &&
                accum[base] > accum[base - numrho - 2] && 
                accum[base] >= accum[base + numrho + 2] )
                _sort_buf.push_back(base);
        }

    // stage 3. sort the detected lines by accumulator value
    std::sort(_sort_buf.begin(), _sort_buf.end(), hough_cmp_gt(accum));

    // stage 4. store the first min(total,linesMax) lines to 
    // the output buffer
    linesMax = std::min(linesMax, (int)_sort_buf.size()); //#re1
    double scale = 1./(numrho+2);
    for( i = 0; i < linesMax; i++ ) //#rd8
    {
        LinePolar line; //#ra8
        int idx = _sort_buf[i];
        int n = cvFloor(idx*scale) - 1; //#4
        int r = idx - (n+1)*(numrho+2) - 1; //#5
        line.rho = (r - (numrho - 1)*0.5f) * rho; //#6
        line.angle = n * theta;
        lines.push_back(Vec2f(line.rho, line.angle)); //#rf1
    }
}
 
Anatomy of codes

I do not sure that my anatomy are 100% correct, if you find out anything wrong, please give me some advices, thanks a lot.

 #1 

numrho is the number of distance.According to Digital Image Processing 3rd, the range of the distance is between equation(1).


According to equation(1) and Pythagoras' Theorem, we know that the number of distance should have 2D + 1 units(at least).

It is 2D + 1 but not 2D because we need to include 0.
 
#2b 

r need to do some offset to make sure the value is positive.
#3 

 accum[(n + 1) * (numrho + 2) + r + 1]++;
(numrho + 2) is the number of columns of the accumalate table
(n + 1) make sure the first row would not be used
 (r + 1) make sure the first column would not be used
that is why the dimensions of the accumulate table(_accum) are (numangle+2) * (numrho+2)
because it need two more rows and columns to avoid the pointer access to invalid data.
#4 

 n is the row number of the table, which corresponding to the theta

#5 

r is same as the r at #3, from #3 we know that   
idx == accum[(n+1) * (numrho+2) + r+1]  => r = idx -(n+1) * (numrho+2) -1  

#6 

From #5, we regain the r same as #2b, but the r we need is not the #2b provided but the same as #2a, so we need to subtract it.
Refine codes 

   Thanks to openCV2, we know the details of how to translate the algorithm to codes, this save us a lot of troubles.But I believe we could make the codes become better.

  #ra1~#ra8

There are one technique which called lazy evaluation. In short, we could declare the variables we need just in time;In other words, we better avoid to declare the variables before we need it.#ra1~#ra7 declare the variables too early.

example : 

we could check the type before we declare any variable

original
int i, j;
float irho = 1.0/rho;
CV_Assert(img.type() == CV_8U);

after refine
CV_Assert(img.type() == CV_8U);
float const irho = 1.0/rho;
#rb1~#rb3  
We could access the pixels with more efficient solution

 original

const uchar* image = img.data; //#rb1
int step = (int)img.step; //#rb2
 
//.....a lot of codes 

 for( i = 0; i < height; i++ )
        for( j = 0; j < width; j++ ) 
        {             

          if( image[i * step + j] != 0 ) 
        } 

after refine


 for(int i = 0; i < height; ++i){
        uchar const *img_ptr = img.ptr<uchar>(i);
        for(int j = 0; j < width; ++j){
            //access the pixel with cheaper operation
            //original one need (i * img.step + j)
            if( img_ptr[j] != 0 )

 #rc1~#rc2

c-style sin only provide us the api which accept "double" because c do not support overload(this suck, really).That means we need to convert the float to double and convert it back float, this is
a waste.If we use std::sin this problem would not exist anymore, because c++ provide different overload for std::sin.

  original

tabSin[n] = (float)(sin(double(ang)) * irho);

after refine

tabSin[n] = std::sin(ang) * irho;

#rd1~#rd7 

We could declare the variable of for loop just in time, this way the codes would be easier to maintain and read.

  original
int i, j;
for(i = 0; i != 10; ++i)

after refine
for(int i = 0; i != 10; ++i)

 #re1

Use static_cast to replace old type casting.

 original

linesMax = std::min(linesMax, (int)_sort_buf.size());

after refine

linesMax = std::min(linesMax, static_cast<int>(_sort_buf.size()));


 #rf1

Call resize and use random access to replace push_back, this could save the cost of memory reallocation and the cost of checking the memory is big enough or not when you call push_back.

 
 original

lines.push_back(cv::Vec2f(lines.rho, lines.angle));

after refine
lines.resize(linesMax);
//.........
lines[i] = cv::Vec2f(lines.rho, lines.angle); 

Whatever, this is just my view about the codes, if you find out anything wrong, please give me some comments, thanks. 

As always, the codes(houghTransform/standardHough) can download from github

Wednesday, 18 September 2013

otsu threshold and openCV2--00 : implementation of otsu threshold

    First of all, otsu threshold was proposed by Nobuyuki otsu in their paper.This post will talk about the implementation of the otsu threshold of openCV2.

     Following post will introduce the basic idea of otsu threshold based on the text book Digital image processing 3rd, this could help us understand what are the codes trying to do, I would not try to explain the details of otsu threshold becaus there are many good resouces to study, you should read the paper or find a good text book like Digital image processing 3rd if you want to know the whole story.

     Objective of otsu threshold is maximize the between-class variance(9).The basic idea is simple, if the threshold is good enough, the intensity of different class should be pretty different;In other words, if we could make the intensities between classes have biggest different, the threshold should be the best one.

    Class is the component of pixels within specific range.In otsu threshold, we classify the pixels into two classes.If C1 range is within [0, k], then C2 is within [k + 1, L - 1](assume the pixels intensity is within [0,  L - 1]).
   

   I want to give some explanation of the equations which will be used by the implementation of openCV2, without some investigation to these equations, it is almost impossible to understand what are the codes doing.

(1) is the total probability of C1
(2) is the total probability of C2.
(3) and (4) is the mean intensity of C1 and C2.
(5) is the global mean of the whole image.



(7) is deduced by (6) and will be used to compute (4).
(8) is the total variance combine by two classes
(9) is the  between-class variance.
  
    Now we should be able to study the codes.The implementation of the otsu threshold of openCV2 could be found on otsu threshold.Below is the code from openCV2(I do some small modification to it ).

static double
getThreshVal_Otsu_8u( const Mat& _src )
{
    Size size = _src.size();
    if( _src.isContinuous() )
    {
        size.width *= size.height;
        size.height = 1;
    }
    int const N = 256;
    int h[N] = {0}; //histogram
    for(int i = 0; i < size.height; ++i ) //#1
    {
        uchar const *src = _src.data + _src.step*i;
        
        for(int j = 0; j < size.width; ++j )
            ++h[src[j]];
    }

    double mu = 0, scale = 1./(size.width*size.height);
    for(int i = 0; i < N; ++i )
        mu += i * static_cast<double>(h[i]);

    mu *= scale;  //#2
    double mu1 = 0, q1 = 0;
    double max_sigma = 0, max_val = 0;

    for(int i = 0; i < N; ++i )
    {       
        double const p_i = h[i] * scale; //#3   
        q1 += p_i; //#4
        double const q2 = 1. - q1; //#5

        
        if( std::min(q1,q2) < FLT_EPSILON || 
            std::max(q1,q2) > 1. - FLT_EPSILON )
            continue;  //#6

        mu1 = (mu1 * q1 + i * p_i) / q1; //#7
        double const mu2 = (mu - q1*mu1)/q2; //#8
        double const sigma = q1 * q2 *
             (mu1 - mu2)*(mu1 - mu2); //#9
        if( sigma > max_sigma )
        {
            max_sigma = sigma;
            max_val = i;
        }
    }

    return max_val;
}
 
The algorithm of openCV2 adopt is pretty straigtforward.  

Step 1 : build the histogram of the 8bit input image  

Step 2 : find out global mean of equation(4)  

Step 3 : iterate i from 0~255, classify the classes from [0, i], [i, L-1]. Find out the i which make the equation(9) become biggest.  
 
Anatomy of codes
 
  The loop of #1 build the histogram.  

  mu(#2) equal to equation(5).

  #3 is the probability of pixel i.

  #4 is equal to equation(1).

  #5 use equation(8) to find out the result of equation(2).

  #6 pass the loop if one of the class probability is too small, this make sense since m1-m2
   would be very small in this case. If you wonder why do the codes check the min and max but
   not just checking min or max, please give this site(otsu problem) a visit, I have the same
   question too.

   #7 find out the result of equation(3)

   #8 equal to equation(7)

   #9 equal to equation(9)

Next post of otsu
 
  There are an extension of otsu--multi otsu method, I would like to implement it by openCV2 next time(openCV2 haven't implemented this algorithm yet when I write this post).

Monday, 16 September 2013

mean shift and openCV2--01 : implementation of meanShift

  In the last post(mean shift and openCV2--00) I try to track the region of interest by mean-shift and Camshift, but I haven't tried to understand how to implement mean-shift. In this post, I would like to analyze the implementation of the mean-shift from openCV2(openCV2 mean-shift).

  Mean-shift start from an initial location and iteratively around,  find the centroid location and repeats the procedure until the window center converges to a stable point. The new centroid could be found by the equation(1) come from wikipedia(mean-shift equation).(2) is the kernal function, which determine the weight of nearby points.OpenCV2 do not choose gaussian distribution as the kernel but using spatial moments as kernal.




    Here is the core of the mean-shift.


for( i = 0; i < niters; i++ )
{
   //step 1 : make sure cur_rect has reasonable position and size
   cur_rect = cur_rect & Rect(0, 0, mat.cols, mat.rows); #1
   if( cur_rect == Rect() )
   {
      cur_rect.x = mat.cols/2;
      cur_rect.y = mat.rows/2;
   }
   cur_rect.width = std::max(cur_rect.width, 1);
   cur_rect.height = std::max(cur_rect.height, 1);
        
   //step 2 : calculate the moments
   Moments m = moments(mat(cur_rect));
        
   //step 3 : Calculating center of mass
   if( fabs(m.m00) < DBL_EPSILON ) //#2
      break;
        
   int dx = cvRound( m.m10/m.m00 - window.width*0.5 ); //#3
   int dy = cvRound( m.m01/m.m00 - window.height*0.5 ); 
        
   int nx = std::min(std::max(cur_rect.x + dx, 0), 
                     mat.cols - cur_rect.width); //#4
   int ny = std::min(std::max(cur_rect.y + dy, 0), 
                     mat.rows - cur_rect.height); 
        
   dx = nx - cur_rect.x; //#5
   dy = ny - cur_rect.y;
   cur_rect.x = nx; #6
   cur_rect.y = ny;
        
   // step 4 : Check for coverage centers mass & window
   if( dx*dx + dy*dy < eps ) //#7
      break;
}
Anatomy of codes 
  The purpose of #1 is make sure the cur_rect(initial position when i == 0) is in the range of the input cv::Mat(mat). If the cur_rect do not in the range of the mat, then cur_rect will equal to Rect().

   Purpose of #2 is make sure the m00 big enough.m00, m01, m10 are found by the equation(3).


  If we express it by codes, they would be the same as

template<typename Func>
float moment(cv::Mat const &input, Func func)
{
    float sum = 0;
    for(int row = 0; row != input.rows; ++row){
        auto ptr = input.ptr<float>(row);
        for(int col = 0; col != input.cols; ++col){
            sum += func(ptr[col], row, col);
        }
    }

    return sum;
}

int main()
{
    cv::Mat input = (cv::Mat_(3,3) <<
                     1, 2, 3,
                     4, 5, 6,
                     7, 8, 9);

   float const m00 = moment(input, [](float data, int, int)
   {
      return data;
   });

   float const m10 = moment(input, [](float data, int, int col)
   {
      return data * col;
   });

   float const m01 = moment(input, [](float data, int row, int)
   {
      return data * row;
   });
}

#3 estimate how far the horizontal position of the new centroid could be.

#4 measure the new centroid position of horizontal.

std::max(cur_rect.x + dx, 0)
make sure the new centroid horizontal position would not smaller than 0.

std::min(std::max(cur_rect.x + dx, 0), 
                     mat.cols - cur_rect.width);
outer std::min make sure the new centroid horizontal position greater or equal to mat.cols - cur_rect.width.

#5 is the distance between new centroid horizontal position and old centroid horizontal position.

#6 is the new horizontal position of centroid(the center of the window).

#7 measure the distance between old centroid and new centroid are small enough or not, if they
are small enough, the program will end the iteration, else continue until one of the criteria meet(
iteration counts or the distance between old centroid and new centroid).The distance is measure by
euclidean distance(4).The program don't need to take the square root of (dx * dx) + (dy * dy), because the eps already multiply with itself(eps = cvRound(eps*eps)).



Morphological operations and openCV2--01 : close and open

    Closing(3) and opening(4) are very similar to dilation(1) and erosion(2), but they are less destructive compare to dilation and erosion.From the view of mathematics, opening is erosion followed by a dilation with the same structuring element;Closing is dilation followed by a erosion with the same structuring element.


graph_00(Original image)
    Apply opening or closing in openCV2 is very easy

Close

cv::Mat const structure_elem = cv::getStructuringElement(
                     cv::MORPH_RECT, cv::Size(5, 5));
cv::Mat close_result;
cv::morphologyEx(input, close_result, 
            cv::MORPH_CLOSE, structure_elem);
graph_01(dilate)

graph_02(close)
Open
cv::Mat const structure_elem = cv::getStructuringElement(
                 cv::MORPH_RECT, cv::Size(5, 5));
cv::Mat open_result;
cv::morphologyEx(input, open_result, 
             cv::MORPH_OPEN, structure_elem); 
graph_03(erosion)
graph_04(open)
  Do remember that open and close are idempotent operation, that means the results would be the same if you apply open or close on the same image more than one time.

As usual, the codes can download from github

Sunday, 15 September 2013

Morphological operations and openCV2--00 : dilation and erosion

  There are many explanations about "what is morphological operations", the most intuitive explanation I like come from the book Learning OpenCV by Bradski and Kaehler

  A set of operations that process images based on shapes.
    Clean and easy to understand.More precisely, morphological operations would use the predefined shape(we call it structuring element) to slipped through all of the pixels of the image and replace the anchor pixel(most of the times, it is center) by minimum(erosion) or maximum(dilation) value.

    In morphological, the convention is to have fore-ground as white and background as black. The examples in this post also follow this convention.

Example :

These examples neglect boundary, only process the red pixels of input;green pixel is the anchor pixel, pink pixels are the pixels being changed.

Dilation

0 0 1 1 1 1 1 1                    1 1 1                                            0 0 1 1 1 1 1 1
0 1 1 1 0 1 0 0                    1 1 1                                            0 1 1 1 1 1 1 0     
0 0 0 0 0 1 1 1                    1 1 1                                            0 0 0 0 0 1 1 1

     input                            structuring element                             output


Erosion

0 0 1 1 1 1 1 1                    1 1 1                                            0 0 1 1 1 1 1 1
0 1 1 1 0 1 0 0                    1 1 1                                            0 0 0 0 0 0 0 0     
0 0 0 0 0 1 1 1                    1 1 1                                            0 0 0 0 0 1 1 1

     input                  structuring element                                      output

  You could defined any shape you like(ex : diamond, cross, circle and so on).The effects of dilation and erosion are grow or shrink the shapes.Dilation would grow the white region, shrink the dark region;Erosion would grow the dark region, shrink the white region.

 

 
graph_00(original image)

  I select binary image to process because it is 
  1. easier to understand
  2. most of the times, morphological filters work on binary images

  First, apply Otsu threshold on graph_00

cv::cvtColor(input, input, CV_BGR2GRAY);
cv::threshold(input, input, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);  


 graph_01(graph_00 after binarize)


  get the image after dilation
  
cv::Mat dilation(cv::Mat const &input)
{
    cv::Mat dilation_result = input.clone();
    //get the structuring of rectangle, size is 5 * 5
    cv::Mat const shape = cv::getStructuringElement
                  (cv::MORPH_RECT, cv::Size(5, 5));
    cv::dilate(input, dilation_result, shape);

    return dilation_result;
}
graph_02(dilation)
get the image after erosion

cv::Mat erosion(cv::Mat const &input)
{
    cv::Mat erosion_result = input.clone();
    //get the structuring of rectangle, size is 5 * 5
    cv::Mat const shape = cv::getStructuringElement(
                  cv::MORPH_RECT, cv::Size(5, 5));
    cv::erode(input, erosion_result, shape);

    return erosion_result;
}
graph_03(erosion)

If you dilate or erode it again, the effect would become more obvious.

graph_04(dilate two times)
graph_05(erode two times)

  From graph_02~graph_05, we can find out that dilation could grow the edges of white region;erosion can grow the edges of black region. You could treat dilation and erosion as some sort of filter, dilation will remove any black region which smaller than the structuring element;erosion will remove any white region which smaller than the structuring element.

As usual, the codes can download from github