Sunday, 22 September 2013

Hough transform and openCV2--00 : implementation of the standard Hough transform

    OpenCV2 provides several implementations of the Hough transform. In this post I want to analyze the implementation of the standard Hough transform--HoughLinesStandard. You can find the code of HoughLinesStandard here (houghLinesStandard).

   Before we step into the code of HoughLinesStandard, it is better to take a look at the openCV2 tutorial first (tutorial).

Codes 

struct LinePolar
{
    float rho;
    float angle;
};


struct hough_cmp_gt
{
    hough_cmp_gt(const int* _aux) : aux(_aux) {}
    bool operator()(int l1, int l2) const
    {
        return aux[l1] > aux[l2] || (aux[l1] == aux[l2] && l1 < l2);
    }
    const int* aux;
};

static void
HoughLinesStandard( const Mat& img, float rho, float theta,
                    int threshold, std::vector<Vec2f>& lines, int linesMax )
{
    int i, j; //#ra0
    float irho = 1 / rho;

    CV_Assert( img.type() == CV_8UC1 ); //#ra1

    const uchar* image = img.data; //#rb1
    int step = (int)img.step; //#rb2
    int width = img.cols; //#ra2
    int height = img.rows; //#ra3

    int numangle = cvRound(CV_PI / theta);
    int numrho = cvRound(((width + height) * 2 + 1) / rho); //#ra4, #1

    AutoBuffer<int> _accum((numangle+2) * (numrho+2)); //#ra5
    std::vector<int> _sort_buf; //#ra6
    AutoBuffer<float> _tabSin(numangle);
    AutoBuffer<float> _tabCos(numangle);
    int *accum = _accum; //#ra7
    float *tabSin = _tabSin, *tabCos = _tabCos;

    memset( accum, 0, sizeof(accum[0]) * (numangle+2) * (numrho+2) );

    float ang = 0;
    for(int n = 0; n < numangle; ang += theta, n++ ) //#rd1
    {
        tabSin[n] = (float)(sin((double)ang) * irho); //#rc1
        tabCos[n] = (float)(cos((double)ang) * irho); //#rc2
    }

    // stage 1. fill accumulator
    for( i = 0; i < height; i++ ) //#rd2
        for( j = 0; j < width; j++ ) //#rd3
        {
            if( image[i * step + j] != 0 ) //#rb3
                for(int n = 0; n < numangle; n++ ) //#rd4
                {
                    int r = cvRound( j * tabCos[n] + 
                                   i * tabSin[n] ); //#2a
                    r += (numrho - 1) / 2; //#2b
                    accum[(n+1) * (numrho+2) + r+1]++; //#rd5, #3
                }
        }

    // stage 2. find local maximums
    for(int r = 0; r < numrho; r++ ) //#rd6
        for(int n = 0; n < numangle; n++ ) //#rd7
        {
            int base = (n+1) * (numrho+2) + r+1;
            if( accum[base] > threshold &&
                accum[base] > accum[base - 1] && 
                accum[base] >= accum[base + 1] &&
                accum[base] > accum[base - numrho - 2] && 
                accum[base] >= accum[base + numrho + 2] )
                _sort_buf.push_back(base);
        }

    // stage 3. sort the detected lines by accumulator value
    std::sort(_sort_buf.begin(), _sort_buf.end(), hough_cmp_gt(accum));

    // stage 4. store the first min(total,linesMax) lines to 
    // the output buffer
    linesMax = std::min(linesMax, (int)_sort_buf.size()); //#re1
    double scale = 1./(numrho+2);
    for( i = 0; i < linesMax; i++ ) //#rd8
    {
        LinePolar line; //#ra8
        int idx = _sort_buf[i];
        int n = cvFloor(idx*scale) - 1; //#4
        int r = idx - (n+1)*(numrho+2) - 1; //#5
        line.rho = (r - (numrho - 1)*0.5f) * rho; //#6
        line.angle = n * theta;
        lines.push_back(Vec2f(line.rho, line.angle)); //#rf1
    }
}
 
Anatomy of codes

I am not sure that my anatomy is 100% correct; if you find anything wrong, please give me some advice, thanks a lot.

 #1 

numrho is the number of distance bins. According to Digital Image Processing, 3rd edition, the distance is bounded by equation (1):

    -D <= rho <= D    (1)

where D is the maximum distance that can occur in the image. By Pythagoras' theorem the diagonal is sqrt(width^2 + height^2), which is at most width + height, so the code uses D = width + height. According to equation (1), the number of distance bins should be (at least) 2D + 1 units.

It is 2D + 1 rather than 2D because we also need to include 0.
 
#2b 

r needs an offset to make sure the value used as an index is non-negative.
#3 

 accum[(n + 1) * (numrho + 2) + r + 1]++;
(numrho + 2) is the number of columns of the accumulator table.
(n + 1) makes sure the first row is not used.
(r + 1) makes sure the first column is not used.
That is why the dimensions of the accumulator table (_accum) are (numangle+2) * (numrho+2):
it needs two extra rows and columns so the neighbour checks in stage 2 never access invalid data.
#4 

 n is the row number of the table, which corresponds to theta.

#5 

r is the same as the r at #3. From #3 we know that
idx == (n+1) * (numrho+2) + r+1  =>  r = idx - (n+1) * (numrho+2) - 1

#6 

From #5 we regain the same r as in #2b, but the r we need is not the offset one of #2b but the signed one of #2a, so we need to subtract the offset (numrho - 1) / 2 again, then scale by rho.
Refine codes 

   Thanks to openCV2, we know the details of how to translate the algorithm into code, which saves us a lot of trouble. But I believe we could make the code better.

  #ra1~#ra8

There is a technique called lazy evaluation. In short, we should declare variables just in time; in other words, we had better avoid declaring a variable before we need it. #ra1~#ra7 declare the variables too early.

example : 

We could check the type before we declare any variables.

original
int i, j;
float irho = 1 / rho;
CV_Assert(img.type() == CV_8UC1);

after refine
CV_Assert(img.type() == CV_8UC1);
float const irho = 1 / rho;
#rb1~#rb3  
We could access the pixels in a more efficient way.

 original

const uchar* image = img.data; //#rb1
int step = (int)img.step; //#rb2

//.....a lot of code

for( i = 0; i < height; i++ )
    for( j = 0; j < width; j++ )
    {
        if( image[i * step + j] != 0 )
            //...
    }

after refine

for(int i = 0; i < height; ++i){
    uchar const *img_ptr = img.ptr<uchar>(i);
    for(int j = 0; j < width; ++j){
        //access the pixel with a cheaper operation;
        //the original needs (i * img.step + j)
        if( img_ptr[j] != 0 )
            //...
    }
}
 #rc1~#rc2

The C-style sin only provides an API that accepts double, because C does not support overloading (this sucks, really). That means we need to convert the float to double and convert the result back to float, which is a waste. If we use std::sin this problem does not exist anymore, because C++ provides a float overload of std::sin.

  original

tabSin[n] = (float)(sin(double(ang)) * irho);

after refine

tabSin[n] = std::sin(ang) * irho;

#rd1~#rd7 

We could declare the loop variable just in time; this way the code is easier to maintain and read.

  original
int i, j;
for(i = 0; i != 10; ++i)

after refine
for(int i = 0; i != 10; ++i)

 #re1

Use static_cast to replace the old-style cast.

 original

linesMax = std::min(linesMax, (int)_sort_buf.size());

after refine

linesMax = std::min(linesMax, static_cast<int>(_sort_buf.size()));


 #rf1

Call resize and use random access instead of push_back; this saves the cost of memory reallocation and the cost of checking whether the capacity is big enough every time push_back is called.

 
 original

lines.push_back(cv::Vec2f(line.rho, line.angle));

after refine
lines.resize(linesMax);
//.........
lines[i] = cv::Vec2f(line.rho, line.angle);

Anyway, this is just my view of the code; if you find anything wrong, please give me some comments, thanks.

As always, the code (houghTransform/standardHough) can be downloaded from github.