The question is: how do we feed the ANN with data in the proper format? The documentation does not mention this, but thanks to the book and the forum, the question is solved (at least I think so).
The way of creating the xml is the same as for the SVM, but we need some post-processing to convert the labels into a two-dimensional array before we pass them to train().
This example does simple OCR training on the numeric characters '0' to '9' (an example from chapter 5 of the book, with some simplification).
First, we write the sample data and the associated labels to a cv::Mat and save them to an xml file.
//for the sake of simplicity, I neglect some error checking
#include <opencv2/imgproc/imgproc.hpp>
#include <opencv2/highgui/highgui.hpp>

#include <iostream>
#include <sstream>

int main(int argc, char **argv)
{
    char const *path = argv[1];
    cv::Mat trainingData;
    //35 means there are 35 samples for '0';
    //40 means there are 40 samples for '1' and so on
    int const numFilesChars[] = {35, 40, 42, 41, 42, 30, 31, 49, 44, 30};
    char const strCharacters[] = {'0', '1', '2', '3', '4',
                                  '5', '6', '7', '8', '9'};
    cv::Mat trainingLabels(0, 0, CV_32S);
    int const numCharacters = 10;

    //this part is the same as for the SVM
    for(int i = 0; i != numCharacters; ++i){
        int const numFiles = numFilesChars[i];
        for(int j = 0; j != numFiles; ++j){
            std::cout << "Character " << strCharacters[i] << " file: " << j << "\n";
            std::stringstream ss;
            ss << path << strCharacters[i] << "/" << j << ".jpg";
            cv::Mat img = cv::imread(ss.str(), 0);
            //assume the img is always continuous;
            //reshape returns a new header, so push back the reshaped row directly
            trainingData.push_back(img.reshape(1, 1));
            trainingLabels.push_back(i);
        }
    }

    trainingData.convertTo(trainingData, CV_32FC1);
    cv::FileStorage fs("OCR.xml", cv::FileStorage::WRITE);
    fs << "TrainingData" << trainingData;
    fs << "classes" << trainingLabels;

    return 0;
}
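One thing to watch out for: push_back only works if every reshaped row has the same number of columns, which means all the sample images must have the same size. If yours do not, resize them to a fixed size before the reshape. Below is a minimal sketch; the helper name prepare_sample and the 10x10 size are placeholders I made up, pick whatever feature size fits your data.

//hypothetical helper: force every sample to a fixed 10x10 patch,
//then flatten it to a single row (the size is only an example)
cv::Mat prepare_sample(cv::Mat const &img)
{
    cv::Mat resized;
    cv::resize(img, resized, cv::Size(10, 10));
    return resized.reshape(1, 1); //one channel, one row
}

//inside the loop you would then write
//trainingData.push_back(prepare_sample(img));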
The second step is to read the data back and convert trainingLabels into the format CvANN_MLP asks for.
//this snippet also needs <opencv2/ml/ml.hpp> besides the headers from the first program
cv::FileStorage fs;
fs.open("OCR.xml", cv::FileStorage::READ);

cv::Mat trainData;
cv::Mat classes;
fs["TrainingData"] >> trainData;
fs["classes"] >> classes;

//the 1st element is the input layer (the number of features of the samples),
//the last one is the output layer (how many kinds of data you want to classify),
//all in between are the hidden layers
int const numCharacters = 10;
int const nlayers = 30; //neurons in the hidden layer, tune it to suit your data
int buffer[] = {trainData.cols, nlayers, numCharacters};
//you could specify the layers as 3 rows, 1 column too
cv::Mat const layers(1, 3, CV_32SC1, buffer);

CvANN_MLP ann;
ann.create(layers, CvANN_MLP::SIGMOID_SYM, 1, 1);

//Prepare trainClasses
//Create a mat with n trained samples by m classes
cv::Mat trainClasses(trainData.rows, numCharacters, CV_32FC1);
for(int i = 0; i != trainClasses.rows; ++i){
    int const label = *classes.ptr<int>(i);
    float *train_ptr = trainClasses.ptr<float>(i);
    for(int k = 0; k != trainClasses.cols; ++k){
        //1 for the column that matches the label, 0 everywhere else
        *train_ptr = (k != label) ? 0.f : 1.f;
        ++train_ptr;
    }
}

cv::Mat const weights = cv::Mat::ones(1, trainData.rows, CV_32FC1);

//Learn classifier
ann.train(trainData, trainClasses, weights);
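By the way, train() called like this uses the default CvANN_MLP_TrainParams. If you want to control the termination criteria or the back-propagation rates yourself, you can pass the params explicitly. A minimal sketch is below; the iteration count and the two scale values are only example numbers, not something taken from the book.

//optional: explicit training parameters (the numbers are just examples)
CvANN_MLP_TrainParams params(
    cvTermCriteria(CV_TERMCRIT_ITER + CV_TERMCRIT_EPS, 1000, 0.01), //stop after 1000 iterations or a tiny change
    CvANN_MLP_TrainParams::BACKPROP, //back-propagation training method
    0.1,   //strength of the weight gradient term
    0.1);  //strength of the momentum term

ann.train(trainData, trainClasses, weights, cv::Mat(), params);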
In the first step we save the labels as an Nx1 matrix, but in the second step we need to "unfold" it into an NxM matrix (trainClasses). That is what the loop is doing: N equals the number of samples and M equals the number of classes you want to classify.
Example:
the first sample is '0', so its label row becomes
1 0 0 0 0 0 0 0 0 0
the second sample is '1', so its label row becomes
0 1 0 0 0 0 0 0 0 0
and so on.
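After training, prediction works in the opposite direction: predict() fills one output row with numCharacters responses, and the column with the largest response is the class. A minimal sketch, assuming feature is a 1 x trainData.cols CV_32FC1 row prepared the same way as the training samples:

cv::Mat output(1, numCharacters, CV_32FC1);
ann.predict(feature, output); //feature: 1 x trainData.cols, CV_32FC1

double maxVal = 0;
cv::Point maxLoc;
cv::minMaxLoc(output, 0, &maxVal, 0, &maxLoc);
//the column index maps back to strCharacters in the first program
std::cout << "predicted class: " << maxLoc.x << "\n";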