Monday, 28 October 2013

OpenCV and artificial neural network (ANN)--00 : How to construct the labels of an ANN

    OpenCV 2 implements multi-layer perceptrons (MLP); we can use them by declaring a CvANN_MLP. To train the ANN classifier, we need to create training data and labels as we did for the SVM, but the training labels are a bit different. For the SVM the labels are an Nx1 matrix, where N is the number of samples and the single column holds the class of each sample. The labels of an ANN are an NxM matrix instead: N is the same as for the SVM (the number of samples) and M is the number of classes. The element at position (i, j) is set to 1 if sample i belongs to class j, and to 0 otherwise.

    The question is, how do we feed the ANN with the proper data format? The documentation does not mention this, but thanks to the book and the forum, the question is solved (at least I think so).

    The way of creating the XML file is the same as for the SVM, but we need some post-processing to convert the labels into a two-dimensional array before we pass them to the function train().

    This example does a simple OCR training on numeric characters (an example from ch5 of the book, with some simplification).

    First, we write the sample data and the associated labels to cv::Mat objects.


//for the sake of simplicity, some error checking is omitted

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

#include <iostream>
#include <sstream>

int main ( int argc, char** argv )
{        
    char const *path = argv[1];

    cv::Mat trainingData;    
    //35 means there are 35 samples for '0'; 
    //40 means there are 40 samples for '1' and so on
    int const numFilesChars[]={35, 40, 42, 41, 42, 
                               30, 31, 49, 44, 30, 
                              };
    char const strCharacters[] = {'0','1','2','3','4',
                                  '5','6','7','8','9'};

    cv::Mat trainingLabels(0, 0, CV_32S);
    int const numCharacters = 10;    
    //this part is same as the SVM
    for(int i = 0; i != numCharacters; ++i){
        int numFiles = numFilesChars[i];
        for(int j = 0; j != numFiles; ++j){
            std::cout << "Character "<< strCharacters[i] << 
                      " file: " << j << "\n";
            std::stringstream ss;
            ss << path << strCharacters[i] << "/" << j << ".jpg";
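            //flag 0 loads the image as single-channel grayscale
            cv::Mat img = cv::imread(ss.str(), 0);
            //assume the img is always continuous
            //reshape() does not modify img in place, so use its return
            //value: one row holding every pixel of the image
            trainingData.push_back(img.reshape(1, 1));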
            trainingLabels.push_back(i);
        }
    }

    trainingData.convertTo(trainingData, CV_32FC1);   

    cv::FileStorage fs("OCR.xml", cv::FileStorage::WRITE);
    fs << "TrainingData" << trainingData;   
    fs << "classes" << trainingLabels;

    return 0;
}
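
As a side note, img.reshape(1, 1) is the call that turns each image into a single row of trainingData. The sketch below mimics that idea with a plain std::vector so it compiles without OpenCV. flattenToRow is a hypothetical helper, not an OpenCV function, and it assumes a single-channel image stored row by row.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of OpenCV): flatten a grayscale image,
// stored as a vector of pixel rows, into one feature row of floats.
// Pixels are copied in row-major order: row 0 first, then row 1, and so on.
std::vector<float> flattenToRow(
    std::vector<std::vector<unsigned char>> const &image)
{
    std::vector<float> row;
    for(std::size_t r = 0; r != image.size(); ++r){
        for(std::size_t c = 0; c != image[r].size(); ++c){
            row.push_back(static_cast<float>(image[r][c]));
        }
    }
    return row;
}
```

A 2x3 image therefore becomes one row of 6 values, which is why every row of trainingData has as many columns as an image has pixels.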
   
The second step is to read the data and convert trainingLabels into the format that CvANN_MLP asks for.

cv::FileStorage fs;
fs.open("OCR.xml", cv::FileStorage::READ);
cv::Mat trainData;
cv::Mat classes;
fs["TrainingData"] >> trainData;
fs["classes"] >> classes;

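//the 1st element is the input layer (the number of features per sample)
//the last one is the output layer (how many kinds of data you
//want to classify?)
//all in between are the hidden ones; here nlayers is the number of
//neurons in the single hidden layer
//nlayers and numCharacters (10 in this example) must be defined beforehand
int buffer[] = {trainData.cols, nlayers, numCharacters};
//you could specify the layers as 3 rows, 1 column too
cv::Mat const layers(1, 3, CV_32SC1, buffer);
CvANN_MLP ann; //declared in <opencv2/ml/ml.hpp>
ann.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);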

//Prepare trainClasses
//Create an N x M Mat: N training samples by M classes
cv::Mat trainClasses(trainData.rows, numCharacters, CV_32FC1);
for(int i = 0; i != trainClasses.rows; ++i){
    int const label = *classes.ptr<int>(i);
    auto train_ptr = trainClasses.ptr<float>(i);
    //set column "label" of row i to 1 and every other column to 0
    for(int k = 0; k != trainClasses.cols; ++k){
       *train_ptr = k != label ? 0.f : 1.f;
       ++train_ptr;
    }
}

cv::Mat const weights = cv::Mat::ones(1, trainData.rows, CV_32FC1);           

//Learn classifier
ann.train(trainData, trainClasses, weights);


  In the first step, we stored the labels as an Nx1 matrix, but in the second step we need to "unfold" them into an NxM matrix (trainClasses). That is what the loop does. N equals the number of samples and M equals the number of classes you want to distinguish.

Ex :
the first sample is '0', so the labels will map to
1 0 0 0 0 0 0 0 0 0
the second sample is '1', so the labels will map to
0 1 0 0 0 0 0 0 0 0
and so on
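
The same unfolding can be tried without OpenCV. This is only a minimal sketch with std::vector in place of cv::Mat. oneHotLabels is a hypothetical helper (it is not part of OpenCV or of the book), and it assumes the labels are integers in the range [0, numClasses).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: unfold N integer labels into an N x M matrix,
// where M = numClasses. Row i holds 1 in column labels[i] and 0 elsewhere.
std::vector<std::vector<float>> oneHotLabels(std::vector<int> const &labels,
                                             int numClasses)
{
    // assumption: numClasses > 0
    std::vector<std::vector<float>> result(
        labels.size(),
        std::vector<float>(static_cast<std::size_t>(numClasses), 0.0f));
    for(std::size_t i = 0; i != labels.size(); ++i){
        // a label outside [0, numClasses) leaves its row as all zeros
        if(labels[i] >= 0 && labels[i] < numClasses){
            result[i][static_cast<std::size_t>(labels[i])] = 1.0f;
        }
    }
    return result;
}
```

Calling oneHotLabels({0, 1}, 10) gives the two rows shown in the example above: the first row has its 1 in column 0 and the second row has its 1 in column 1.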