Wednesday, 27 November 2013

Misunderstanding of c++ 02 -- new and delete must be slower than malloc and free

    Many c programmers believe that new and delete must be slower than new and delete, because they do too many jobs under the hood.new and delete have to deal with exception, new and delete must call constructor and destructor etc.

What kind of mistakes they make

1 :  We could disable exception, today many major compilers could do that for us(g++, clang++, visual c++).

2 : If the constructor and destructor are trivial, new and delete don't need to call the constructor and destructor.

3 : The comparison is unfair, by default, new == malloc + constructor, delete = destructor + free.


    For more details, you could check stack overflow.

  In theory, compiler of c++ should be able to perform same performance about new and malloc, delete and free, but could the compiler do that?

Codes


c
#include <stdio.h>
#include <stdlib.h> // pulls in declaration of malloc, free


int main()
{
    char *a = (char*)malloc(1024);

    for(int i = 0; i != 1024; ++i){
        printf("%c", a[i]);
    }

    free(a);
    a = NULL;

    return 0;
}


c++
#include <cstdio>

int main()
{
    char *a = new char[1024];

    for(int i = 0; i != 1024; ++i){
        printf("%c", a[i]);
    }

    delete []a;

    return 0;
}

c assembly : clang -S -O3 -mllvm --x86-asm-syntax=intel mallocAndNew00.c

.section    __TEXT,__text,regular,pure_instructions
    .globl  _main
    .align  4, 0x90
_main:                                  ## @main
    .cfi_startproc
## BB#0:                                ## %entry
    push    RBP
Ltmp3:
    .cfi_def_cfa_offset 16
Ltmp4:
    .cfi_offset rbp, -16
    mov RBP, RSP
Ltmp5:
    .cfi_def_cfa_register rbp
    push    R14
    push    RBX
Ltmp6:
    .cfi_offset rbx, -32
Ltmp7:
    .cfi_offset r14, -24
    mov EDI, 1024
    call    _malloc
    mov R14, RAX
    mov EBX, 1
    xor EDI, EDI
    jmp LBB0_1
    .align  4, 0x90
LBB0_2:                                 ## %for.body.for.body_crit_edge
                                        ##   in Loop: Header=BB0_1 Depth=1
    movsx   EDI, BYTE PTR [R14 + RBX]
    inc RBX
LBB0_1:                                 ## %for.body
                                        ## =>This Inner Loop Header: Depth=1
    call    _putchar
    cmp EBX, 1024
    jne LBB0_2
## BB#3:                                ## %for.end
    mov RDI, R14
    call    _free
    xor EAX, EAX
    pop RBX
    pop R14
    pop RBP
    ret
    .cfi_endproc


.subsections_via_symbols


c++ assembly : clang++ -S -O3 -std=c++11 -mllvm --x86-asm-syntax=intel mallocAndNew00.cpp
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
 .align 4, 0x90
_main:                                  ## @main
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp3:
 .cfi_def_cfa_offset 16
Ltmp4:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp5:
 .cfi_def_cfa_register rbp
 push R14
 push RBX
Ltmp6:
 .cfi_offset rbx, -32
Ltmp7:
 .cfi_offset r14, -24
 mov EDI, 1024
 call __Znam
 mov R14, RAX
 mov EBX, 1
 xor EDI, EDI
 jmp LBB0_1
 .align 4, 0x90
LBB0_2:                                 ## %for.body.for.body_crit_edge
                                        ##   in Loop: Header=BB0_1 Depth=1
 movsx EDI, BYTE PTR [R14 + RBX]
 inc RBX
LBB0_1:                                 ## %for.body
                                        ## =>This Inner Loop Header: Depth=1
 call _putchar
 cmp EBX, 1024
 jne LBB0_2
## BB#3:                                ## %for.end
 test R14, R14
 je LBB0_5
## BB#4:                                ## %delete.notnull
 mov RDI, R14
 call __ZdaPv
LBB0_5:                                 ## %delete.end
 xor EAX, EAX
 pop RBX
 pop R14
 pop RBP
 ret
 .cfi_endproc


.subsections_via_symbols

  This time the codes of c++ and c have some different, c++ call the assembly of new and delete(?), c call the assembly of malloc and new. c++ check the pointer is point to nullptr or not before delete, but c do not check it anyway.If you want the same behavior, call malloc and free in c++, do make sure what are you doing before you call malloc and free but not new and delete.

codes can download from github.

Monday, 25 November 2013

Misunderstanding of c++ 01 -- c++ must slower than c because c++ always construct and destruct object

    Unlike c, c++ support constructor and destructor, these are powerful tool for resource management. Many(or some?) c programmers claim that constructor and destructor could kill the performance, and they don't want to take the burden of it, they want to gain full control and this seems like an impossible mission in c++. Is this true? Absolutely no.

    What kind of mistakes they make?

1 : We do not have to take the cost of constructor and destructor, it is not a must.In other words, if your structure or class satisfy the requirements of trivial constructor and trivial destructor, you program will not invoke any constructor nor destructor.

A class or struct is trivial constructor if

A constructor of a class A is trivial if all the following are true:
  • It is implicitly defined
  • A has no virtual functions and no virtual base classes
  • All the direct base classes of A have trivial constructors
  • The classes of all the nonstatic data members of A have trivial constructors
If any of the above are false, then the constructor is nontrivial.

A class or struct is trivial destructor if

A destructor of a class A is trivial if all the following are true:
  • It is implicitly defined
  • All the direct base classes of A have trivial destructors
  • The classes of all the nonstatic data members of A have trivial destructors
If any of the above are false, then the destructor is nontrivial.


  All of the POD have trivial destructor and trivial constructor.

2 :  You think constructor and destructor cost you something, but in truth they rarely do(compare with equivalent C behavior)


  In theory, we don't have to take the cost of constructor and destructor,  moreover,  they rarely cost us anything. Theory is good, let us fallback to reality, can compiler get the job done?If you don't sure about it, measure, rather than guessing about performance.

  Following codes are compiled by clang3.3 and clang++3.3 on mac OS X 10.8.5(mac mini, intel cpu).


High level codes and associative assembly of case 1
 
 

c codes
struct trivialStruct
{

int a;
float b;
double c;

};

int main()
{
  struct trivialStruct A;  

  return 0;
}

assembly generated by command "clang -S -O2 -mllvm --x86-asm-syntax=intel trivialConstructDestruct00.c"
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
 .align 4, 0x90
_main:                                  ## @main
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp2:
 .cfi_def_cfa_offset 16
Ltmp3:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp4:
 .cfi_def_cfa_register rbp
 lea RDI, QWORD PTR [RIP + L_.str]
 mov AL, 2
 call _printf
 xor EAX, EAX
 pop RBP
 ret
 .cfi_endproc

 .section __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
 .asciz  "%d, %f, %f"


.subsections_via_symbols


c++ codes
struct trivialStruct
{

int a;
float b;
double c;

};

int main()
{
  trivialStruct A;  

  return 0;
}

assembly generated by command "clang++ -S -O2 -mllvm --x86-asm-syntax=intel trivialConstructDestruct00.cpp"
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
 .align 4, 0x90
_main:                                  ## @main
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp2:
 .cfi_def_cfa_offset 16
Ltmp3:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp4:
 .cfi_def_cfa_register rbp
 lea RDI, QWORD PTR [RIP + L_.str]
 mov AL, 2
 call _printf
 xor EAX, EAX
 pop RBP
 ret
 .cfi_endproc

 .section __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
 .asciz  "%d, %f, %f"


.subsections_via_symbols



  Apparently, c++ have to take the cost of destructor and constructor are false, they are part of the misunderstading because of FUD.

High level codes and associative assembly of case 2

c

#include 
#include 

struct trivialStruct
{

int *a;
float *b;
float *c;

};

void construct_trivial_struct(struct trivialStruct *data)
{
  data->a = (int*)malloc(sizeof(int));
  data->b = (float*)malloc(sizeof(float));
  data->c = (float*)malloc(sizeof(float));
  
  *data->a = 100;
  *data->b = 200;
  *data->c = 300;
}

void destruct_trivial_struct(struct trivialStruct *data)
{
  free(data->a);
  free(data->b);
  free(data->c);
  
  data->a = NULL;
  data->b = NULL;
  data->c = NULL;
}

int main()
{
  struct trivialStruct A;
  construct_trivial_struct(&A);
  printf("%d, %f, %f", *A.a, *A.b, *A.c);
  
  destruct_trivial_struct(&A);

  return 0;
}


assembly generated by command "clang -S -O2 -mllvm --x86-asm-syntax=intel trivialConstructDestruct00.c"
 .section __TEXT,__text,regular,pure_instructions
 .globl _construct_trivial_struct
 .align 4, 0x90
_construct_trivial_struct:              ## @construct_trivial_struct
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp3:
 .cfi_def_cfa_offset 16
Ltmp4:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp5:
 .cfi_def_cfa_register rbp
 push R15
 push R14
 push RBX
 push RAX
Ltmp6:
 .cfi_offset rbx, -40
Ltmp7:
 .cfi_offset r14, -32
Ltmp8:
 .cfi_offset r15, -24
 mov RBX, RDI
 mov EDI, 4
 call _malloc
 mov R14, RAX
 mov QWORD PTR [RBX], R14
 mov EDI, 4
 call _malloc
 mov R15, RAX
 mov QWORD PTR [RBX + 8], R15
 mov EDI, 4
 call _malloc
 mov QWORD PTR [RBX + 16], RAX
 mov DWORD PTR [R14], 100
 mov DWORD PTR [R15], 1128792064
 mov DWORD PTR [RAX], 1133903872
 add RSP, 8
 pop RBX
 pop R14
 pop R15
 pop RBP
 ret
 .cfi_endproc

 .globl _destruct_trivial_struct
 .align 4, 0x90
_destruct_trivial_struct:               ## @destruct_trivial_struct
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp12:
 .cfi_def_cfa_offset 16
Ltmp13:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp14:
 .cfi_def_cfa_register rbp
 push RBX
 push RAX
Ltmp15:
 .cfi_offset rbx, -24
 mov RBX, RDI
 mov RDI, QWORD PTR [RBX]
 call _free
 mov RDI, QWORD PTR [RBX + 8]
 call _free
 mov RDI, QWORD PTR [RBX + 16]
 call _free
 mov QWORD PTR [RBX + 16], 0
 mov QWORD PTR [RBX + 8], 0
 mov QWORD PTR [RBX], 0
 add RSP, 8
 pop RBX
 pop RBP
 ret
 .cfi_endproc

 .section __TEXT,__literal8,8byte_literals
 .align 3
LCPI2_0:
 .quad 4641240890982006784     ## double 200
LCPI2_1:
 .quad 4643985272004935680     ## double 300
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
 .align 4, 0x90
_main:                                  ## @main
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp18:
 .cfi_def_cfa_offset 16
Ltmp19:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp20:
 .cfi_def_cfa_register rbp
 lea RDI, QWORD PTR [RIP + L_.str]
 movsd XMM0, QWORD PTR [RIP + LCPI2_0]
 movsd XMM1, QWORD PTR [RIP + LCPI2_1]
 mov ESI, 100
 mov AL, 2
 call _printf
 xor EAX, EAX
 pop RBP
 ret
 .cfi_endproc

 .section __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
 .asciz  "%d, %f, %f"


.subsections_via_symbols
c++
#include 
#include 

struct trivialStruct
{

trivialStruct();
~trivialStruct();

int *a;
float *b;
float *c;

};

trivialStruct::trivialStruct() : 
a((int*)malloc(sizeof(int))), 
b((float*)malloc(sizeof(float))), 
c((float*)malloc(sizeof(float)))
{
  *a = 100;
  *b = 200;
  *c = 300;
}

trivialStruct::~trivialStruct()
{
  free(a);
  free(b);
  free(c);
  
  a = nullptr;
  b = nullptr;
  c = nullptr;
}

int main()
{
  trivialStruct A;
  printf("%d, %f, %f", *A.a, *A.b, *A.c);

  return 0;
}

 .section __TEXT,__text,regular,pure_instructions
 .globl __ZN13trivialStructC1Ev
 .align 4, 0x90
__ZN13trivialStructC1Ev:                ## @_ZN13trivialStructC1Ev
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp3:
 .cfi_def_cfa_offset 16
Ltmp4:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp5:
 .cfi_def_cfa_register rbp
 push R15
 push R14
 push RBX
 push RAX
Ltmp6:
 .cfi_offset rbx, -40
Ltmp7:
 .cfi_offset r14, -32
Ltmp8:
 .cfi_offset r15, -24
 mov RBX, RDI
 mov EDI, 4
 call _malloc
 mov R14, RAX
 mov QWORD PTR [RBX], R14
 mov EDI, 4
 call _malloc
 mov R15, RAX
 mov QWORD PTR [RBX + 8], R15
 mov EDI, 4
 call _malloc
 mov QWORD PTR [RBX + 16], RAX
 mov DWORD PTR [R14], 100
 mov DWORD PTR [R15], 1128792064
 mov DWORD PTR [RAX], 1133903872
 add RSP, 8
 pop RBX
 pop R14
 pop R15
 pop RBP
 ret
 .cfi_endproc

 .globl __ZN13trivialStructC2Ev
 .align 4, 0x90
__ZN13trivialStructC2Ev:                ## @_ZN13trivialStructC2Ev
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp12:
 .cfi_def_cfa_offset 16
Ltmp13:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp14:
 .cfi_def_cfa_register rbp
 push R15
 push R14
 push RBX
 push RAX
Ltmp15:
 .cfi_offset rbx, -40
Ltmp16:
 .cfi_offset r14, -32
Ltmp17:
 .cfi_offset r15, -24
 mov RBX, RDI
 mov EDI, 4
 call _malloc
 mov R14, RAX
 mov QWORD PTR [RBX], R14
 mov EDI, 4
 call _malloc
 mov R15, RAX
 mov QWORD PTR [RBX + 8], R15
 mov EDI, 4
 call _malloc
 mov QWORD PTR [RBX + 16], RAX
 mov DWORD PTR [R14], 100
 mov DWORD PTR [R15], 1128792064
 mov DWORD PTR [RAX], 1133903872
 add RSP, 8
 pop RBX
 pop R14
 pop R15
 pop RBP
 ret
 .cfi_endproc

 .globl __ZN13trivialStructD1Ev
 .align 4, 0x90
__ZN13trivialStructD1Ev:                ## @_ZN13trivialStructD1Ev
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp21:
 .cfi_def_cfa_offset 16
Ltmp22:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp23:
 .cfi_def_cfa_register rbp
 push RBX
 push RAX
Ltmp24:
 .cfi_offset rbx, -24
 mov RBX, RDI
 mov RDI, QWORD PTR [RBX]
 call _free
 mov RDI, QWORD PTR [RBX + 8]
 call _free
 mov RDI, QWORD PTR [RBX + 16]
 call _free
 mov QWORD PTR [RBX + 16], 0
 mov QWORD PTR [RBX + 8], 0
 mov QWORD PTR [RBX], 0
 add RSP, 8
 pop RBX
 pop RBP
 ret
 .cfi_endproc

 .globl __ZN13trivialStructD2Ev
 .align 4, 0x90
__ZN13trivialStructD2Ev:                ## @_ZN13trivialStructD2Ev
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp28:
 .cfi_def_cfa_offset 16
Ltmp29:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp30:
 .cfi_def_cfa_register rbp
 push RBX
 push RAX
Ltmp31:
 .cfi_offset rbx, -24
 mov RBX, RDI
 mov RDI, QWORD PTR [RBX]
 call _free
 mov RDI, QWORD PTR [RBX + 8]
 call _free
 mov RDI, QWORD PTR [RBX + 16]
 call _free
 mov QWORD PTR [RBX + 16], 0
 mov QWORD PTR [RBX + 8], 0
 mov QWORD PTR [RBX], 0
 add RSP, 8
 pop RBX
 pop RBP
 ret
 .cfi_endproc

 .section __TEXT,__literal8,8byte_literals
 .align 3
LCPI4_0:
 .quad 4641240890982006784     ## double 200
LCPI4_1:
 .quad 4643985272004935680     ## double 300
 .section __TEXT,__text,regular,pure_instructions
 .globl _main
 .align 4, 0x90
_main:                                  ## @main
 .cfi_startproc
## BB#0:                                ## %entry
 push RBP
Ltmp34:
 .cfi_def_cfa_offset 16
Ltmp35:
 .cfi_offset rbp, -16
 mov RBP, RSP
Ltmp36:
 .cfi_def_cfa_register rbp
 lea RDI, QWORD PTR [RIP + L_.str]
 movsd XMM0, QWORD PTR [RIP + LCPI4_0]
 movsd XMM1, QWORD PTR [RIP + LCPI4_1]
 mov ESI, 100
 mov AL, 2
 call _printf
 xor EAX, EAX
 pop RBP
 ret
 .cfi_endproc

 .section __TEXT,__cstring,cstring_literals
L_.str:                                 ## @.str
 .asciz  "%d, %f, %f"


.subsections_via_symbols

    We could see that the part of construct and destruct are same as the assembly generated by c.Although the compiler generate two pieces of constructor and destructor for us, this do not mean the codes will become fatter or slower, because the linker could remove the duplicate the codes for us, even if they are not removed(hard to believe this will happen in modern linker), they will take some space in memory, but will never be evaluated.

    One of the way to verify this is separate the declaration and definition of the struct, generate the assembly by same command and look at the codes, you will find that the assembly only call for one symbol of the constructor. The final step is verify the size of the exe, if the linker haven't removed duplicate codes, the size of the exe should be fatter. The other solution is download a disassemble to analyze the exe by yourself.

    In the conclusion, most of the times we have to construct or destruct the object, so constructor and destructor rarely cost us anything(compare with equivalent c behavior).Even you really don't them, you could disable them by your wish, nothing stop you from gaining full control in c++.

    Codes are available on github.

Saturday, 23 November 2013

Deploy Qt apps to android phone on mac

    After some struggle, I figure out how to deploy android application by Qt5 on my mac mini.Before you start, make sure you remove other version of Qt from your mac, else it may deploy wrong version of Qt onto the android device.

Step 1 : download the tools you need


    About the android NDK, make sure you didn't download the legacy-toolchains(I don't know what is this for yet), download the bigger one.

Step 2 : setup the android configurations

graph_00

  graph_00 show you how to setup the location after you unzip and install(this step is trivial, if you got any question, please leave me a comment).

Step 3 : pick a devices and give it a test

    My device is GT-S6310, I use api 10 and armeabi to deploy the example of calculator(QWidget) onto the device.

graph_01

    Connect the device by usb, click the run button and you will see the app run alive and kicking on the device.Do remember to enable the debugging options of your device.In my case,
it is Settings->Developer options->USB debugging(enable this one), if you don't enabled it, your mac can't detect the device(you could check the device is connect or not by "adb devices").

    I tried other examples like maroon, emitters, same, qml video effects, not all of them are deployable, even they did, not all of the functions are work flawlessly. In my humble opinion, Qt5.2 beta1 still far from productive on android.Hope that the release version of Qt5.2 could give users a much more stable library.


Friday, 22 November 2013

Misunderstanding of c++ 00 -- Before you blame c++, please make sure you know c++

  Inside of c++ object model, Bjarne and the lessons of Scott meyers keep telling us the performance of c++ is at least as good as c.But there are still tons of c programmers who do not understand c++ always denigrade the performance of c++ with their FUD perceptions.There are many forums have the fight between c programmers and c++ programmers, from performance vary to the complexity of the languages and so on.

  None of the language is perfect, it is a truth that c++ have so many defects, but most of the "defects" claimed by those c programmers are totally nonsense, they have no sense what are they talking about. It is natural for programmers to disagree, dislike the features of a language, but before you denigrate any languages, tools or features you don't like, at least do some homework. c is good, but c++ is far more better. Exactly, many big, important softwares are shift from c to c++, not c++ to c.Here are other projects of c++ which demonstrate the ability of c++, showing that c++ could work on embedded devices pretty well.




   Here are some immature criticisms come from those FUD  c programmers.


  • c++ is slower than c because compiler do too many things for us in the backgroud
  • c++ is slower than c because c++ always construct and destruct object
  • new and delete must slower than malloc and free because they have to(?) deal with exception, constructor and destructor
  • Inheritance make c++ become fatter and slower than c
  • virtual is very slow and fat compare with function pointer
  • template will lead to code bloat
  • RAII will cause memory fragementation
  • RAII do not suit for resource like file, mutex, thread and so on
  • macro is easier to use than template
  • c is easier to read and maintain than c++
  • higher level of abstraction equal to slower, so c++ must be slower than c
  • and so on(please tell me what do you  want to add on this list)
  Generally speaking, one of the philosophy of c++ is "you don't pay for what you don't use".

  It is a waste of time to show them their worry are based on the fact that c programmers are not c++ programmers again and again, so I decided to write down the experience and prove on my blog.

   A war between different languages are almost meaningless, because there are too many programmers like to attack the tools they don't know with their self-imagine image without research, prove, learning.As the father of c++ tell us, "Before dismissing C++ features, such as classes and templates, for demanding tasks, learn the basics of their use and try them out. Measure, rather than guessing about performance!".

   In these series of post, I will try my best to explain why those reason of c programmers are wrong, what are the problems of those criticisms.  I am far from an expert of programming or c++,  if you find anything doubt or uncertainty of my post, please leave some comments.

  However, even these series of post will spend a lot of times on discussing the performance of c++, performance is not the only factor of choosing your tools, like portability, maintainability, develop time, level of programmers, but it is  another story to tell.

machine learning--04 : Normal equation

  Besides gradient descent, we could use normal equation to find out the optimal hypothesis function too.


   As equation 1 show, normal equation is far more easier to implement with compare to gradient descent.If the matrix is singular, we could either decrease the number of features, or using SVD to find an approximation of the inverse matrix.


  The pros and cons of gradient descent vs normal equation.


Gradient Descent
   


  • Need to choose alpha
  • Needs many iterations
  • works well even when n(number of features) is large

Normal Equation

  • No need to choose alpha
  • Don't need to iterate
  • Need to compute inverse matrix
  • slow if n(number of features) is very large

  The price of computing the inverse matrix is almost same as O(n^3), this kind of complexity is unacceptable when the number of n is big(10000 or more, depends on your machine).

Saturday, 16 November 2013

machine learning--03 : logistic regression

  According to the open course, logistic regression is same as "classification regression", same as linear regression, it is belong to supervised machine learning algorithm, but the targets of logistic regression are digital, not analog.Following are the memo of this open course.

Example

Email : Spam/Not Spam?
Tumor : Malignant/Benign

y=1, 0
0 : "negative class"(Not Spam)
1 : "positive class""(Spam)

Question
    
1 : Why do we need logistic regression?Can't we just use linear regression to predict the results by setting some threshold?

A : Because the results are far from perfect. Assume we pick the threshold as 0.5

graph_00

The examples of graph_00 works perfectly, but the prediction is very easy to break if we add new sample into the training set as graph_01 show.

graph_01



B : the hypothesis function of linear regression can greater than 1 or less than zero.


2 : What is the model of logistic regression?

The course use sigmoid function as the model, the output range of sigmoid function is [0, 1], this is a neat feature for classification.


  
graph_02


The output of the hypothesis is same as


Monday, 11 November 2013

machine learning--02 : Estimate the batch gradient descent is converge or not

    This is an exercise from the open course of machine learning, one of the purpose of this exercise is measure the batch gradient descent is converge or not.




    Equation 1 is the cost function, the procedures of measuring the convergence of gradient is compare different learning rate of the gradient descent, draw the graph and make sure the value of the cost function is decrease on every iteration.

    At first, I implement the cost function and batch gradient descent.


/**
 *@brief calculate the theta generated by the cost function 
 * of the linear regression
 */
template<typename T>
T cost_function(cv::Mat_<T> const &features, 
cv::Mat_<T> const &labels, 
cv::Mat_<T> const &theta)
{
    cv::Mat_<T> const temp = features * theta - labels;
    cv::Mat_<T> results = temp.t() * temp;
    results /= 2 * labels.rows;

    return *results.template ptr<T>(0);
}

/**
 *@brief find the costs of each iteration
 *@param features input sequence
 *@param labels output sequence
 *@param alpha determine the step of each iteration, smaller alpha would 
 * cause longer time to iterate but with higher chance to converge;
 * larger a;pha will run faster but with higher chance to divert.
 * Since this function has a fixed iteration time, so alpha 
 * only affect accuracy.
 *@param iterate iterate time
 *@return the costs of each iteration
 */
template<typename T>
cv::Mat_<T> batch_gradient_descent_cost(cv::Mat_<T> const &features, 
cv::Mat_<T> const &labels, 
T alpha = 0.07, size_t iterate = 1)
{
    cv::Mat_<T> theta = cv::Mat_<T>::zeros(features.cols, 1);
    cv::Mat_<T> results;
    T const ratio = alpha / features.rows;
    for(size_t i = 0; i != iterate; ++i){
        cv::Mat_<T> const data = ratio * 
                linear_regression(features, labels, theta);
        results.push_back(cost_function(features, labels, theta));
        theta -= data;
    }

    return results;
}

Draw the cost vs iteration graph by qwt.


/**
 * @brief draw and show the charts of cost function vs iteration
 * @param plot a 2-d plotting widget
 * @param features input(features)
 * @param labels output(target)
 * @param a
 */
template<typename T>
void cost_vs_iteration(simple2DPlot &plot, cv::Mat_<T> const &features, 
cv::Mat_<T> const &labels, QApplication &a)
{
    size_t const iterate_times = 50;
    T const ratio[] = {0.01, 0.03, 0.1, 0.3, 1, 1.2};
    QColor const colors[] = { {255, 0, 0}, {0, 255, 0}, {0, 0, 255}, 
                              {255, 255, 0}, {255, 128, 128}, 
                              {128, 0, 128}};
    std::vector<T> iterates(50);
    std::iota(std::begin(iterates), std::end(iterates), 1);
    for(size_t i = 0; i != sizeof(ratio) / sizeof(T); ++i){
        cv::Mat_<T> const costs = batch_gradient_descent_cost<T>(features, 
                                       labels, ratio[i], iterate_times);
        plot.insert_curve(std::begin(iterates), std::end(iterates), 
                          std::begin(costs));
        plot.get_curve(i).setTitle(
          QString::fromStdString(std::to_string(ratio[i])));
        plot.get_curve(i).setPen(colors[i]);
        plot.get_curve(i).setStyle(QwtPlotCurve::Lines);
    }

    plot.replot();
    plot.render_document("costs vs iteration", {300, 200});

    plot.resize(600, 400);
    plot.show();
    a.exec();

    plot.clear();
    cv::Mat_<T> const costs = 
    batch_gradient_descent_cost<T>(features, labels, 1.3, iterate_times);
    plot.insert_curve(std::begin(iterates), std::end(iterates), 
                      std::begin(costs));
    plot.get_curve(0).setTitle("1.3");
    plot.get_curve(0).setStyle(QwtPlotCurve::Lines);
    plot.render_document("diverge", {300, 200});

    plot.replot();
    plot.show();

    a.exec();
}


graph-00 : converge

graph-01 : diverge


 The implementation of simple2DPlot, the codes of main project.

machine learning--01 : linear regression and batch gradient descent

  Linear regression is one of the famous machine learning algorithm.This algorithm assume the relationship between the input(features) and the output(targets) are linear, it use a hypothesis to predict the result.




    Equation 1 is the hypothesis we use to predict the results, usually we define X0 as 1.To find out the parameters, we could use batch gradient descent.



   Since the square error produce by linear regression only has one minimum value(it is a convex), we don't need to worry about the starting point.You could find more details on this website. Implement this algorithm with the help of openCV is pretty simple(octave/matlab even more simpler).


/**
 *@brief linear regression
 *@param features input sequence
 *@param labels output sequence
 *@param theta the parameters we want to find
 *@return new theta
 */
template<typename T>
cv::Mat_<T> linear_regression(cv::Mat_<T> const &features, 
cv::Mat_<T> const &labels, 
cv::Mat_<T> const &theta)
{
    cv::Mat_<T> result = features.clone();
    cv::transpose(result, result);
    result *= (features * theta - labels);

    return result;
}

/**
 *@brief batch gradient descent
 *@param features input sequence
 *@param labels output sequence
 *@param alpha determine the step of each iteration, smaller alpha would 
 * cause longer time to iterate but with higher chance to converge;
 * larger a;pha will run faster but with higher chance to divert.
 * Since this function has a fixed iteration time, so alpha only 
 * affect accuracy.
 *@param iterate iterate time
 *@return theta, the parameters of batch gradient descent searching
 */
template<typename T>
cv::Mat_<T> batch_gradient_descent(cv::Mat_<T> const &features, 
cv::Mat_<T> const &labels, 
T alpha = 0.07, size_t iterate = 1)
{
    cv::Mat_<T> theta = cv::Mat_<T>::zeros(features.cols, 1);
    T const ratio = alpha / features.rows;
    for(size_t i = 0; i != iterate; ++i){
        cv::Mat_<T> const data = ratio * linear_regression
                                        (features, labels, theta);
        theta -= data;
    }

    return theta;
}


  If you can't accept the vectorize equation of the function linear_regression, you could expand it as following.



Source codes can download from github.

Thursday, 7 November 2013

machine learning--00 : supervised vs unsupervised

  Machine learning is a very important and practical skills in computer vision, in order to make a good use of the machine learning algorithm provided by openCV,  I decided to study machine learning and record the things I learned from the open classroom of standford.

  There are two major branch of machine learning algorithms, they are supervised and unsupervised learning.

Supervised learning : teach the program what is the correct answers, the input will contain "features" and "labels".

Unsupervised learning : let the program learn by itself, the input contain "features" but no "labels".

Example of Supervised learning

 Height prediction : There are various numbers of boys between ages(features) of two and eights, we want to predict the heights(labels) of different ages.


graph-00 : ages(x-axis) vs height(y-axis) without predict

   On graph-00,  I plot the ages vs height with dot,  the range of ages are analog but not digital(2.54 years old, 3.33 years old and so on), how could we predict the height of age 4.44 which do not shown on this graph? This is the time when machine learning comes in,  we give the program features(in this case, it is ages) and labels(in this case, it is height), use them to deduce an equation which could map arbitrary ages within 2~8 years old, we call this equation as hypothesis function and make it at-least close to the result of the data set we have.


graph-01 : ages(x-axis) vs height(y-axis) with predict(blue line)
   On graph-01, I map the ages to height by the equation deduce by the batch gradient descent, since the ranges of the labels are analog but not digital, this kind of supervised learning also refer to regression.

    The other examples is image recognition, like the simple OCR engine developed by this blog also leverage the power of supervised learning to predict the character of the image.Because the results are discrete in the OCR example, we also refer that kind of supervised learning as classification.

Example of unsupervised learning

    k-means is one of the unsupervised machine learning algorithm, you give it a bunch of data without telling the program which label the data belong to, let the program determine which label of those data belong to, this also called clustering.


 
graph_02 : before k-means clustering


graph_03 : after k-means clustering

Thursday, 31 October 2013

RAII is a must know technique for c++ programmers, no excuse.

    RAII(resource acquisition is initialization), in a nutshell, is same as "let the object handle the resource".I use the word resource but not memory, because RAII can guard more than memory(mutex, db connnection, socket, file and every resource you can think of).There are many blogs, famous technical forum, gurus suggestion, mention and emphasize how important and what RAII is.

    C++ is a big, complexity language, I have no idea I need to take how many years to learn every corners of this powerful beast, but we do not need to learn all of it. We only need to know the most essential part of c++, RAII is one of them.You may not use any TMP(template meta programming) , multi inheritance(please think carefully before you use this feature), design any allocators, don't know what is policy based design, have no idea how type_traits work, do not familiar with operator overloading, but you will always find out that RAII could help you write good codes, make your codes become much more easier to maintain, read and more robust.

   If you don't believe it, let us look at a simple example.


#include <iostream>
#include <new>

int main()
{
    int *arr = nullptr;
    int *max = nullptr;
    double *double_arr = nullptr;

    try{
    arr = new int[100];
    max = new int(300);
    double_arr = new double[300];

    //.....do something......

    //handle your garbages if something happen
    goto cleanUp;
    //......
  }catch(std::bad_alloc const &ex){
    std::cerr<<ex.what()<<std::endl;
  }

  cleanUp:
    if(arr){
      delete []arr;
      arr = nullptr;
    }
    if(max){
      delete max;
      max = nullptr;
    }
    if(double_arr){
      delete []double_arr;
      double_arr = nullptr;
    }
}

    What is the problem with this example?They are perfectly legal and safe in C++?If you don't get it, let me ask you a problem.

Q : What if we need to handle more resource?

Naive answer(bad practice) : add more if else

The suck codes proposed by the naive answer may look like


#include <iostream>
#include <new>

int main()
{
    int *arr = nullptr;
    int *max = nullptr;
    double *double_arr = nullptr;
    float *float_max = nullptr;
    float *float_arr = nullptr;

    try{
        arr = new int[100];
        max = new int(300);
        double_arr = new double[300];
        float_max = new float(3.14);
        float_arr = new float[300];

        //.....do something......

        //handle your garbages if something happen
        goto cleanUp;
        //......
    }catch(std::bad_alloc const &ex){
        std::cerr<<ex.what()<<std::endl;
    }

cleanUp:
    if(arr){
        delete []arr;
        arr = nullptr;
    }
    if(max){
        delete max;
        max = nullptr;
    }
    if(double_arr){
        delete []double_arr;
        double_arr = nullptr;
    }
    if(float_max){
        delete float_max;
        float_max = nullptr;
    }
    if(float_arr){
        delete []float_arr;
        float_arr = nullptr;
    }
}

   You will have to alter two places whenever you need to add or delete the resources.This kind of c style codes are error prone, hard to maintain and unbearable, they are almost as worse as(even worse than it if we have to take care of exception) the old school c codes which heavily depends on goto to clean up resources, this kind of codes will become unreadable in short times. I believe this is one of the reason why Bjarne said "Aim to write idiomatic C++: avoid simply writing code in the style of your previous language using C++ syntax; there is little to be gained from simply changing syntax.".For any C++ programmers who know how important and powerful RAII is, they can't bear this kind of horrible codes.

    How could RAII help us?


#include <iostream>
#include <memory>
#include <new>
#include <vector>

int main()
{
  try{
   std::vector<int> arr(100);
   std::unique_ptr<int> max(new int(300));
   std::vector<double> double_arr(300);
   std::unique_ptr<double> double_max(new double(300));
   std::vector<float> float_arr(100);
   std::unique_ptr<float> float_max(new float(100));

   //...do something....
   
   //the object will release the resources automatically
   //even the exceptions thrown

  }catch(std::bad_alloc const &ex){
    std::cerr<<ex.what()<<std::endl;
  }
}
   

    Do you see that?The codes become much more easier to read, because we don't need to release the resource manually, we don't need to change the codes in two places.The codes become more robust, because we will not forget to release the resources again.More than that, it is exception safe too.c++ is good because you can deal with pointer and memory management, in c you have to deal with pointer and memory management.

    There are many kinds of coding standards, rules in many teams, not all coding standards suit for all teams and all applications.Whatever it is, RAII is a must know technique and should be obey when you can. If you are using delete in your codes, maybe you are doing something wrong or writing worse codes in C++, even you need to use delete(building some infrastructures, like smart pointer, containers and so on), do remember to protect it by object, always remember to follow the rule of RAII.Pointer is good at access data, but not resource management, leave your resource to class(RAII).

    In my humble opinion, if a C++ programmer don't know the concepts of RAII, he/she will generate lower quality, worse performance, less reliable codes most of the times(just like the suck codes show by the examples).  It is always a good question to test your interviewers, ask them they notice how important RAII in C++ or not.If they don't know it, don't hire them(or teach them if you think the candidates are worth to).If you were an interviewer and find out the company don't take RAII seriously, then you should be careful.

  

   

Wednesday, 30 October 2013

openCV and artificial neural network(ANN)--01 : simple OCR engine with openCV2

  To get a better image of how to make use of the artificial neural network provided by openCV, I decided to do develop a simple OCR(optical character recognition) engine by CvANN_MLP.

  At first, I download the image from stack overflow, they are

graph_00(digits for training)



graph_01(digits for classify)



  Step 1 : segment the digits of graph_00

 
std::string const prefix("/Users/Qt/program/blogsCodes/");
cv::Mat img = cv::imread(prefix + "pic/digits00.png");
if(img.empty()){
   std::cerr<<"can't read image"<<std::endl;
   return - 1;
}

detectCharacter dc;
std::vector training_data = dc.segment(img, {10, 10});


implementation of segment



std::vector detectCharacter::segment(cv::Mat const &input, 
                                     cv::Size const &crop_size)
{    
    CV_Assert(input.type() == CV_8UC3);    

    //step 1 : binarize the image
    cv::cvtColor(input, gray, CV_BGR2GRAY);
    cv::GaussianBlur(gray, gray, cv::Size(5, 5), 0);

    cv::Mat binary = gray;
    cv::threshold(gray, binary, 0, 255, 
                  cv::THRESH_BINARY_INV + cv::THRESH_OTSU);
    
    //step 2 : find the contours
    cv::findContours(binary.clone(), contours, CV_RETR_EXTERNAL, 
                     CV_CHAIN_APPROX_SIMPLE);

    //step 3 : get the bounding rect of each contours
    for(auto &data : contours){
        min_rects.emplace_back(cv::boundingRect(data));
    }

    //step 4 : crop each digits into the vector
    std::vector<cv::Mat> digits;
    for(size_t i = 0, size = contours.size(); i != size; ++i){
        cv::Mat const digit = crop_digit(binary, min_rects[i], 
                                     contours[i], crop_size);
        digits.emplace_back(digit);
    }

    return digits;
}


implementation of crop_digit


/**
 * @brief crop a character in the input
 * @param input : input image
 * @param rect  : location of the character
 * @param points : contours
 * @param size : size of the result
 * @return character after crop
 */
cv::Mat detectCharacter::crop_digit(cv::Mat const &input, 
                                    cv::Rect const &rect, 
                                    std::vector<cv::Point> const &contour, 
                                    cv::Size const &size)
{
    cv::Mat mask = cv::Mat(input, rect);
    cv::Mat digit(mask.size(), mask.type());
    digit.setTo(0);

    auto digit_ptr = digit.ptr<uchar>(0);
    for(int row = 0; row != mask.rows; ++row){
        auto mask_ptr = mask.ptr<uchar>(row);
        for(int col = 0; col != mask.cols; ++col){
            //only take the pixels equal to 255 and 
            //lie in the region surrounded by the contour
            if(*mask_ptr == 255 && 
               cv::pointPolygonTest(contour, 
                                    {col + rect.x, row + rect.y}, 
                                    false) >= 0){
                *digit_ptr = 255;
            }
            ++mask_ptr; ++digit_ptr;
        }
    }

    cv::resize(digit, digit, size);

    return digit;
}

  Before I return the digit, I resize the digit into a specific size, because the samples of the ANN have to be same size and same type.


Step 2 : set up training labels

  This step is very verbose, I have to map the number of the digits after segmentation to appropriate value.Since there are 125 digits after segmentation, it took me some times to make it done.


std::vector<int> const training_labels
{
  7, 4, 8, 2, 1, 6, 4, 4, 3, 3, 9, 5, 6, 6, 5,
  7, 9, 0, 1, 8, 8, 2, 4, 4, 6, 9, 1, 8, 3, 0,
  3, 9, 4, 5, 9, 8, 4, 9, 2, 2, 6, 4, 4, 6, 9,
  5, 5, 5, 0, 1, 1, 2, 5, 8, 3, 9, 1, 0, 7, 2,
  0, 1, 4, 8, 2, 0, 5, 4, 7, 1, 1, 1, 8, 4, 8,
  2, 1, 8, 0, 4, 9, 5, 3, 5, 2, 7, 1, 3, 2, 2,
  8, 5, 0, 5, 5, 9, 0, 6, 4, 4, 8, 3, 9, 0, 7,
  4, 6, 6, 0, 3, 2, 8, 2, 3, 1, 5, 6, 8, 0, 8,
  4, 1, 2, 8, 9
};

Step 3 : write the training data and training labels into xml file

  We could transform the data and labels to suitable form without writing them to xml too, we don't have any obligation to write the data into xml, write it or not is depend on your needs.

void write_digits_xml(std::vector<cv::Mat> &training_data, 
                      std::vector<int> const &training_labels)
{
    cv::Mat train_data;
    cv::Mat train_labels;

    for(auto &data : training_data){
        //same as data.reshape(1, 1),
        //data.convertTo(data, CV_32F);
        OCV::transform_to_svm_training_data(data);
        train_data.push_back(data);
    }

    cv::Mat(training_labels).copyTo(train_labels);

    cv::FileStorage fs("digits.xml", cv::FileStorage::WRITE);
    fs << "TrainingData10x10" << train_data;
    fs << "Labels" << train_labels;
}

Step 4 : Segment the digit of the graph_01(target)

 
cv::Mat target = cv::imread(prefix + "pic/digitTarget00.png");
if(target.empty()){
  std::cerr<<"can't read target"<<std::endl;
  return -1;
}

std::vector<cv::Mat> target_digits = dc.segment(target, {10, 10});
std::vector<cv::Mat> temp_target_digits; //for verification
for(auto &data : target_digits){
   temp_target_digits.emplace_back(data.clone());
   OCV::transform_to_svm_training_data(data);
}

  At step 4,  I segment the digits from graph_01 and create one more buffer--temp_target_digits to store the digits after segmentation, because the target_digits have to transform to one row and one channel matrix, so we need a buffer to retain the digit after segmentation for verification on the next step.

Step 5 : train and classify the digit

 
characterRecognizer cr(prefix + 
       "simpleOCR/build-exercise00-Qt5_1_1_clang_3_2-Release/digits.xml");
cr.train(10, 10);

for(size_t i = 0, size = target_digits.size(); i != size; ++i){
  int const classify = cr.classify(target_digits[i]);
  std::cout<<classify<<std::endl;
  cv::imshow("", temp_target_digits[i]);
  cv::waitKey();
}

  At step 5, I train the neural network, classify the image after segmentation(we need to transform the image want to classify to one row, one channel and CV_32F matrix), and verify the result correct or by comparing the digits after segmentation and the result of classify.

implementation of train


/**
 * @brief overload of train, you can specity the training 
 *        data and training labels by this overload function
 * @param train_data : the training data
 * @param labels : the training labels
 * @param nlayers : layers of the network
 * @param num_char : number of the characters,
 * it is equal to the type of the training labels(training classes)
 */
void characterRecognizer::train(cv::Mat const &train_data, 
                                cv::Mat const &labels, int nlayers, 
                                int num_char)
{
    CV_Assert(!train_data.empty() && !labels.empty());

    num_charecter = num_char;
    int buffer[] = {train_data.cols, nlayers, num_char};
    cv::Mat const layers(1, 3, CV_32S, buffer);
    ann.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);

    //Prepare trainClases
    //Create a mat with n trained data by m classes
    cv:: Mat train_classes;
    train_classes.create(train_data.rows, num_char, CV_32F);
    for( int i = 0; i !=  train_classes.rows; ++i){
        int const label = *labels.ptr<int>(i);
        auto train_ptr = train_classes.ptr<float>(i);
        for(int k = 0; k != train_classes.cols; ++k){
            *train_ptr = k != label ? 0 : 1;
            ++train_ptr;
        }
    }

    cv::Mat const weights = cv::Mat::ones(1, train_data.rows, CV_32FC1);

    //Learn classifier
    ann.train(train_data, train_classes, weights);
}

   The principle of this method could found at this link. In this example I treat all of the region of the segmented digit as features, exactly this is not an optimal solution, there are many ways to extract the features of the segmented digit.

    Although the hit rate is high, from the view of machine learning, this method is far from perfect, because it may overfitting the training set, there are many algorithms to help us verify that it is underfitting, overfitting and give us some hints about how to gain a better training result. Don't jump to the conclusion that adding more training examples can improve the result, collect those examples take a lot of times, use some techniques(learning curve, bias, variance, model selection and so on) to verify how should you improve the result before you decide to collect more training examples. This is same as the performance issues of programming, don't guess, measure.


   The source codes can download from github.

Monday, 28 October 2013

openCV and artificial neural network(ANN)--00 : How to construct the labels of ANN

    openCV2 implement multi-layer perceptrons(MLP), we could use it by declare CvANN_MLP. To train the classifier of the ANN, we need to create training data and labels as we did in the SVM, but the training labels are a bit different.Instead of an Nx1 matrix where N stand for the samples and 1 present the classes.The labels of an ANN are NxM, the N is the same as SVM(sample numbers), M is a classes which set 1 in a position(i, j) if the row i is classified with column j.

    The question is, how do we feed the ANN with proper data format?The document do not mention about this, but thanks to the book and the forum, this question is solved(at-least I think so).

    The way of creating the xml is same as the SVM.But we need some post processing to convert the labels into a two dimensions array before we pass the labels into the function train().

    This example intent to do a simple OCR training with numeric alpha(an examples from the ch5 of the book with some simplification).

    First, we write the data of the samples and associative labels to the cv::Mat.


//for the sake of simplicity, I neglect some errors checking

#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

#include <iostream>

int main ( int argc, char** argv )
{        
    char const *path = argv[1];

    cv::Mat trainingData;    
    //35 means there are 35 samples for '0'; 
    //40 means there are 40 samples for '1' and so on
    int const numFilesChars[]={35, 40, 42, 41, 42, 
                               30, 31, 49, 44, 30, 
                              };
    char const strCharecters[] = {'0','1','2','3','4',
                                  '5','6','7','8','9'};

    cv::Mat trainingLabels(0, 0, CV_32S);
    int const numCharacters = 10;    
    //this part is same as the SVM
    for(int i = 0; i != numCharacters; ++i){
        int numFiles = numFilesChars[i];
        for(int j = 0; j != numFiles; ++j){
            std::cout << "Character "<< strCharacters[i] << 
                      " file: " << j << "\n";
            std::stringstream ss;
            ss << path << strCharacters[i] << "/" << j << ".jpg";
            cv::Mat img = cv::imread(ss.str(), 0);           
            //assume the img is always continuous
            img.reshape(1, 1);
            trainingData.push_back(img);
            trainingLabels.push_back(i);
        }
    }

    trainingData.convertTo(trainingData, CV_32FC1);   

    cv::FileStorage fs("OCR.xml", FileStorage::WRITE);
    fs << "TrainingData" << trainingData;   
    fs << "classes" << trainingLabels;

    return 0;
}
   
second step is read the data and convert the trainingLabels to a proper format of the CvANN_MLP asked for.

cv::FileStorage fs;
fs.open("OCR.xml", cv::FileStorage::READ);
CV::Mat trainData;
Mat classes;
fs["TrainingData"] >> trainData;
fs["classes"] >> classes;

//the 1st element is the input layer(the features of the samples)
//the last one is the outpt layer(how many kinds of data you 
//want to classify?)
//all inbetween are for the hidden ones.
int buffer[] = {trainData.cols, nlayers, numCharacters};
//you could specify the layers as 3 rows, 1 column too
cv::Mat const layers(1, 3, CV_32SC1, buffer);
ann.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);

//Prepare trainClases
//Create a mat with n trained data by m classes
cv:: Mat trainClasses(trainData.rows, numCharacters, CV_32FC1);
for( int i = 0; i !=  trainClasses.rows; ++i){
    int const labels = *classes.ptr<int>(i);
    auto train_ptr = trainClasses.ptr<float>(i);
    for(int k = 0; k != trainClasses.cols; ++k){
       *train_ptr = k != labels ? 0 : 1;
       ++train_ptr;
    }
}

cv::Mat const weights = cv::Mat::ones(1, trainData.rows, CV_32FC1);           

//Learn classifier
ann.train(trainData, trainClasses, weights);


  In the first step, we load the labels as a Nx1 matrix, but in the second step, we need to "unfold" it to NxM matrix( trainClasses).That is what the loop are doing about.N equal to the number of samples and M equal to the number of the data you want to classify.

Ex :
the first sample is '0', so the labels will map to
1 0 0 0 0 0 0 0 0 0
the second sample is '1', so the labels will map to
0 1 0 0 0 0 0 0 0 0
ans so on

Saturday, 26 October 2013

openCV : Construct the images suitable for SVM

  The example from document show us some basic about the SVM, but it don't tell us how to incorporate with image. I had a hard time to figure out how to do it correctly until I study the boo--mastering openCV with practical computer vision projects.

  In a nutshell, the data of SVM needed are training data and labels.

Training data

rows :  corresponding to specific samples(ex : images)

columns :  corresponding to the features(ex : pixels)

In the case of image, all of the pixels values would be considered as features.


cv::Mat trainingData;
for(size_t i = 0; i != numTrainData; ++i){
  std::stringstream name;
  name<<leaf<<i<<".png";
  cv::Mat samples = cv::imread(name.str());
  //reshape it to 1 channel and 1 row
  samples.reshape(1, 1);
  trainingData.push_back(samples);
}
//convert all of the image to floating point values
trainingData.convertTo(trainingData, CV_32F); 

That is, if the original samples are look like a 4X4, after transform, it would become a 1X16 matrix.




Although this is an example for image, but it could apply to everything, the most important part is transform the samples into  a 1 row and 1 channel cv::Mat, put all of the features(data) of the samples into the column.

The function of reshape is very convenient, but it could not tackle with non-continuous image, we have to transform them into a suitable cv::Mat by ourselves.Here is a generic transform function design for images.


template<typename T = uchar, typename U = float>
void transform_to_svm_training_data(cv::Mat &input)
{
    if(input.isContinuous()){
        input = input.reshape(1, 1);
        input.convertTo(input, cv::DataType<U>().depth);        
        return;
    }

    cv::Mat output(1, input.total() * input.channels(), 
                   cv::DataType<U>().depth);
    auto output_ptr = output.ptr<U>(0);
    OCV::for_each_channels<T>(input, [&](T a)
    {
        *output_ptr = a;
        ++output_ptr;
    });

    input = output;

}

  There are one more thing to keep in mind, all of the samples of the training data should have same numbers of features, all of the features should have the same type.The codes of OCV::for_each_channels can found on github, the principal of it could refer to opencv and generic programming.

 Labels

rows : same as the number of rows of the training data, the samples and the rows of the labels are mapping to each other.

columns : specify what are the samples belong to? For example if you were classifying leafs and non- leafts, you would need to specify which one is leaf and which one is non-leaf in the training data(ex : 1 == leaf; -1 == non-leaf).

Examples


cv::Mat trainingImages;
std::vector<int> trainingLabels;
for(int i = 0; i != numPlates; ++i)
{
   std::stringstream ss;
   ss << path_Plates << i << ".jpg";
   cv::Mat img = cv::imread(ss.str(), 0);   
   transform_to_svm_training_data(img);
   //every labels are associated to the training data
   trainingImages.push_back(img);
   trainingLabels.emplace_back(1);
}

for(int i = 0; i != numNoPlates; ++i)
{
   std::stringstream ss;
   ss << path_NoPlates << i << ".jpg";
   cv::Mat img = cv::imread(ss.str(), 0);
   transform_to_svm_training_data(img);
   trainingImages.push_back(img);
   trainingLabels.emplace_back(0);
}

cv::Mat classes;
cv::Mat trainingData = trainingImages;
cv::Mat(trainingLabels).copyTo(classes);
trainingData.convertTo(trainingData, CV_32FC1);
cv::FileStorage fs("SVM.xml", cv::FileStorage::WRITE);
fs << "TrainingData" << trainingData;
fs << "classes" << classes;

Thursday, 24 October 2013

boost spirit2--01 : match recursive braces

   Matching recursive patterns by boost::spirit::qi is very simple, easy to maintain and enhance.

Example : match recursive braces

   case 1 : ( () )         --> match
   case 2 : (               --> unmatch
   case 3 : (() ())       --> match
   case 4 : ((((( )))))  --> match
   case 5 : (((( ( ))))  --> match

   Following grammar can detect case1~case5

  

template<typename Iterator>
struct recursiveBraces : qi::grammar<Iterator, qi::ascii::space_type>
{
    recursiveBraces() : recursiveBraces::base_type(finalRules)
    {
        braces = "(" >> *expression >> ")";
        expression = +braces;
    }

    qi::rule<Iterator, qi::ascii::space_type> braces;
    qi::rule<Iterator, qi::ascii::space_type> expression;    
};

  braces call expression, expression call braces, so the form a recursive loop.If we do some enhancement, we could match more complicated pattern.
  
case 6 : (() ()) ((())) (())     --> match
case 7 : (() ()) ((( ( ))) (())  --> umatch


template<typename Iterator>
struct recursiveBraces : qi::grammar<Iterator, qi::ascii::space_type>
{
    recursiveBraces() : recursiveBraces::base_type(finalRules)
    {
        braces = "(" >> *expression >> ")";
        expression = +braces;
        finalRules += expression;
    }

    qi::rule<Iterator, qi::ascii::space_type> braces;
    qi::rule<Iterator, qi::ascii::space_type> expression;
    qi::rule<Iterator, qi::ascii::space_type> finalRules;    
};


  The grammar express by spirit are elegant and easy to read(you do need to invest your times to study the tutorial), the performance are very good too.It is worth to have some play with it.The full codes are available at github.

Tuesday, 22 October 2013

boost spirit2--00 : exercise00

  Boost spirit is a robust and efficiency parser and output generators libs for c++. Although it is very fast, the learning curve is steep.To better understand the library, I write down this post to record what I have learn from the exercises.

  The implementation of spirit involve a lot of TMP skills, unfortunately, c++11 still do not support concepts, therefore the error messages of compile time are almost unreadable, this is the biggest draw back for me.The other serious issue is long compile time, spirit X3 may reduce the compile times more than 50%, hope it come out quickly.The merits of spirit are obvious too, very fast, clear syntax(similar to EBNF), seamless integration with other c++ codes.

   In this post I would try to transform the input from


0.5">If you are a premiere member of the lynda.com online training library, 
 or if you

5.19">are watching this tutorial on a DVD, you have access to the  
exercise files used throughout the title.

To something like

 
1
00:00:00,500 --> 00:00:05,190
If you are a premiere member of the lynda.com online training library, 
 or if you

2
00:00:05,190 --> 00:00:11,800
are watching this tutorial on a DVD, you have access to  
the exercise files used throughout the title.
 
 
step 1 : parse the string
 
template<typename Iterator>
bool parse_video_data(Iterator begin, Iterator end, 
                      std::vector<std::pair<float, std::string>> 
                      &data)
{
  //if you don't want to insert the "\n\n" in to the string
  bool const r = qi::parse(begin, end, 
                           (qi::float_ >> qi::omit["\">"] >> 
                           *~qi::char_('\n')) % *(qi::eol), 
                           data);
  //if you want to insert the "\n\n" to the string
  //bool const r = qi::parse(begin, end, 
  //                         +(qi::float_ >> qi::omit["\">"] >> 
  //                           qi::raw[*~qi::char_('\n') >> *(qi::eol)]), 
  //                           data);

  if(!r || begin != end){
    return false;
  }

  return r;
}

  This function will parse the time and the string into the vector "data".

ex : 0.5">if you are something
will parse to
float : 0.5
std::string :  "if you are something"

step 2 : transform the times

  There are many solutions to transform the times, like sscanf, stringstream, regular expression, hand made loop, spirit::karma and so on.For the sake of practice, I chose spirit::karma.


BOOST_FUSION_ADAPT_STRUCT(
spiritParser::transformData,
(size_t, index)
(std::vector<std::vector<int>>, nums)
)

template<typename OutputIterator>
struct videoGrammar : karma::grammar<OutputIterator, transformData()>
{
   videoGrammar()
   : videoGrammar::base_type(final_rule)
   {
      using karma::int_;
      using karma::repeat;
      using karma::right_align;
      using karma::uint_;

      first_rule = repeat(2)[right_align(2, '0')[int_ <<":"]]>>
                   right_align(2, '0')[int_] >>",">> 
                   right_align(2, '0')[int_];

      second_rule = first_rule >> " --> " >> first_rule >> "\n";
      final_rule = uint_ >> "\n">> second_rule;

   }

   karma::rule<OutputIterator, transformData()> final_rule;
   karma::rule<OutputIterator, std::vector<std::vector<int>>()> second_rule;
   karma::rule<OutputIterator, std::vector<int>()> first_rule;
};

inline void generate_nums(std::vector<int> &nums, float data)
{
   nums[0] = static_cast<int>(data / 3600);
   nums[1] = static_cast<int>(std::floor(data) / 60) % 60;
   nums[2] = static_cast<int>(std::floor(data)) % 60;
   nums[3] = static_cast<int>((data - std::floor(data)) * 1000.0);
}

template<typename Iterator, typename Grammar>
bool generate_times(Iterator sink, Grammar &grammar, transformData &data, 
                    size_t index, float fir, float sec)
{        
   data.index = index;
   generate_nums(data.nums[0], fir);
   generate_nums(data.nums[1], sec);
   bool r = karma::generate(sink,
                            grammar,
                            data
                           );

   return r;
}

Now we could transform the times from (0.5, 5.19) to (00:00:00,500 --> 00:00:05,190).

 The codes can download from github.