Thursday 4 October 2018

Person detection(Yolo v3) with the help of mxnet, able to run on gpu/cpu

    In this post I will show you how to do object detection with the help of the cpp-package of mxnet. Why mxnet? Because the following advantages make it a decent library for standalone project development

1. It is open source and royalty free
2. It has decent support for both GPU and CPU
3. It scales efficiently to multiple GPUs and machines
4. It supports a cpp api, which means you do not need to ask users to install a python environment or ship your source code in order to run your apps
5. It supports many platforms, including windows, linux, mac, aws, android, ios
6. It has a lot of pre-trained models
7. MMDNN supports mxnet, which means we can convert models trained by other libraries to mxnet(although not all models can be converted).

Step 1 : Download the model and convert it to a format the cpp package can load

1. Install anaconda(the version that comes with python3)
2. Install mxnet from the terminal of anaconda
3. Install gluon-cv from the terminal of anaconda
4. Download the model and convert it with the following script

import gluoncv as gcv
from gluoncv.utils import export_block

net = gcv.model_zoo.get_model('yolo3_darknet53_coco', pretrained=True)
export_block('yolo3_darknet53_coco', net)

Step 2 : Load the converted model

void load_check_point(std::string const &model_params,
                      std::string const &model_symbol,
                      Symbol *symbol,
                      std::map<std::string, NDArray> *arg_params,
                      std::map<std::string, NDArray> *aux_params,
                      Context const &ctx)
{
    Symbol new_symbol = Symbol::Load(model_symbol);
    std::map<std::string, NDArray> params = NDArray::LoadToMap(model_params);
    std::map<std::string, NDArray> args;
    std::map<std::string, NDArray> auxs;
    //Every key starts with a 4 characters prefix, "arg:" or "aux:", followed by the name
    for (auto const &iter : params) {
        std::string type = iter.first.substr(0, 4);
        std::string name = iter.first.substr(4);
        if (type == "arg:")
            args[name] = iter.second.Copy(ctx);
        else if (type == "aux:")
            auxs[name] = iter.second.Copy(ctx);
    }

    *symbol = new_symbol;
    *arg_params = args;
    *aux_params = auxs;
}
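The prefix handling in load_check_point can be tried without mxnet. The following is only a minimal sketch: Blob is a hypothetical stand-in for NDArray, and split_params is a name made up for this illustration, not part of the cpp-package.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for mxnet::cpp::NDArray so the sketch builds without mxnet.
using Blob = int;

// Split a map whose keys look like "arg:conv0_weight" or "aux:bn0_moving_mean"
// into two maps keyed by the bare name. Keys with any other prefix are ignored.
void split_params(std::map<std::string, Blob> const &params,
                  std::map<std::string, Blob> *args,
                  std::map<std::string, Blob> *auxs)
{
    assert(args != nullptr && auxs != nullptr);
    for (auto const &kv : params) {
        if (kv.first.size() <= 4) {
            continue; // too short to hold a prefix plus a name
        }
        std::string const prefix = kv.first.substr(0, 4);
        std::string const name = kv.first.substr(4);
        if (prefix == "arg:") {
            (*args)[name] = kv.second;
        } else if (prefix == "aux:") {
            (*auxs)[name] = kv.second;
        }
    }
}
```

Unlike the function above, this sketch skips keys that are too short to carry a prefix, which avoids an out of range substr call on malformed input.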

    You could use the load_check_point function as follows

    Symbol net;
    std::map<std::string, NDArray> args, auxs;
    load_check_point(model_params, model_symbols, &net, &args, &auxs, context);

    //The shape of the input data is fixed once the Executor is bound. If you need
    //a different size, you could rebind the Executor or create a pool of Executors.
    //In order to create the input layer of the Executor, I make a dummy NDArray.
    //The values of "data" can be changed later
    args["data"] = NDArray(Shape(1, static_cast<unsigned>(input_size.height),
                                 static_cast<unsigned>(input_size.width), 3), context);
    executor_.reset(net.SimpleBind(context, args, std::map<std::string, NDArray>(),
                                   std::map<std::string, OpReqType>(), auxs));

    model_params is the location of the weights(ex : yolo3_darknet53_coco.params), and model_symbols is the location of the symbols saved as json(ex : yolo3_darknet53_coco.json).

Step 3: Convert image format

    Before we feed an image into the executor of mxnet, we need to convert it.

NDArray cvmat_to_ndarray(cv::Mat const &bgr_image, Context const &ctx)
{
    cv::Mat rgb_image;
    cv::cvtColor(bgr_image, rgb_image, cv::COLOR_BGR2RGB);
    rgb_image.convertTo(rgb_image, CV_32FC3);
    //This api copies the data of rgb_image into the NDArray. As far as I know,
    //opencv guarantees that a cv::Mat is continuous unless it is a sub matrix of another cv::Mat
    return NDArray(rgb_image.ptr<float>(),
                   Shape(1, static_cast<unsigned>(rgb_image.rows), static_cast<unsigned>(rgb_image.cols), 3),
                   ctx);
}
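To make the layout that cvmat_to_ndarray hands to mxnet more concrete, here is a minimal sketch of the same conversion written without opencv. The function name bgr_to_rgb_float and the row_stride parameter are assumptions made for this illustration; row_stride plays the role of the number of bytes per row that cv::Mat reports through its step member, which is why non continuous images need extra care.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Convert an interleaved 8-bit BGR image into a tightly packed float RGB buffer
// laid out as rows x cols x 3. row_stride is the number of bytes between the
// starts of two consecutive rows; it is larger than cols * 3 when the rows are
// padded or the image is a view into a bigger one.
std::vector<float> bgr_to_rgb_float(std::uint8_t const *bgr, int rows, int cols,
                                    std::size_t row_stride)
{
    assert(bgr != nullptr && rows > 0 && cols > 0);
    assert(row_stride >= static_cast<std::size_t>(cols) * 3);

    std::vector<float> rgb;
    rgb.reserve(static_cast<std::size_t>(rows) * cols * 3);
    for (int r = 0; r < rows; ++r) {
        std::uint8_t const *row = bgr + r * row_stride;
        for (int c = 0; c < cols; ++c) {
            std::uint8_t const *pixel = row + c * 3;
            rgb.push_back(static_cast<float>(pixel[2])); // R
            rgb.push_back(static_cast<float>(pixel[1])); // G
            rgb.push_back(static_cast<float>(pixel[0])); // B
        }
    }
    return rgb;
}
```

Because the output is always packed, it could be passed to a constructor that expects one continuous block of floats even when the source rows are padded.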

Step 4 : Perform object detection on video

void object_detector::forward(const cv::Mat &input)
{
    //By default, input_size_.height equals 256 and input_size_.width equals 320.
    //Yolo v3 has a limitation: width and height of the image must be divisible by 32.
    if(input.rows != input_size_.height || input.cols != input_size_.width){
        cv::resize(input, resize_img_, input_size_);
    }else{
        resize_img_ = input;
    }

    auto data = cvmat_to_ndarray(resize_img_, *context_);
    //Copy the data of the image to the "data"
    data.CopyTo(&executor_->arg_dict()["data"]);
    //Forward is an async api.
    executor_->Forward(false);
}
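If you want to pick an input size other than the default, it has to respect the multiple of 32 rule mentioned in the comments. A minimal sketch of one way to snap an arbitrary size to a valid one follows; round_to_multiple is a hypothetical helper, not something taken from the project.

```cpp
#include <cassert>

// Round value to the nearest positive multiple of base (ties round up).
// The result is never smaller than base, so a tiny input still yields a
// usable network size.
int round_to_multiple(int value, int base)
{
    assert(value > 0 && base > 0);
    int const rounded = ((value + base / 2) / base) * base;
    return rounded < base ? base : rounded;
}
```

For example, round_to_multiple(300, 32) gives 288, while the default sizes 256 and 320 are returned unchanged.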

Step 5 : Draw bounding boxes on image

void plot_object_detector_bboxes::plot(cv::Mat &inout,
                                       std::vector<mxnet::cpp::NDArray> const &predict_results,
                                       bool normalize)
{
    using namespace mxnet::cpp;

    //1. predict_results comes from the output of the Executor(executor_->outputs)
    //2. Must set the Context to cpu because we need to process the data by cpu later
    auto labels = predict_results[0].Copy(Context(kCPU, 0));
    auto scores = predict_results[1].Copy(Context(kCPU, 0));
    auto bboxes = predict_results[2].Copy(Context(kCPU, 0));
    //1. Should call wait because the Forward api of Executor is async
    //2. scores and labels can be treated as one dimensional arrays
    //3. bboxes can be treated as a two dimensional array
    labels.WaitToRead();
    scores.WaitToRead();
    bboxes.WaitToRead();

    size_t const num = bboxes.GetShape()[1];
    for(size_t i = 0; i < num; ++i) {
        float const score = scores.At(0, 0, i);
        if (score < thresh_) break;

        size_t const cls_id = static_cast<size_t>(labels.At(0, 0, i));
        auto const color = colors_[cls_id];
        //pt1 : top left; pt2 : bottom right
        cv::Point pt1, pt2;
        //normalize_points performs the normalization
        std::tie(pt1, pt2) = normalize_points(bboxes.At(0, i, 0), bboxes.At(0, i, 1),
                                              bboxes.At(0, i, 2), bboxes.At(0, i, 3),
                                              normalize, cv::Size(inout.cols, inout.rows));
        cv::rectangle(inout, pt1, pt2, color, 2);

        std::string txt;
        if (labels_.size() > cls_id) {
            txt += labels_[cls_id];
        }
        std::stringstream ss;
        ss << std::fixed << std::setprecision(3) << score;
        txt += " " + ss.str();
        put_label(inout, txt, pt1, color);
    }
}
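The body of normalize_points is not shown in this post, so the following is only a guess at the kind of mapping such a helper performs. It assumes the boxes are expressed in pixels of the resized network input and have to be scaled back to the frame we draw on. The names Box and scale_box are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Corner coordinates of a box in the frame we draw on.
struct Box {
    int x1, y1, x2, y2;
};

// Hypothetical helper: map a box given in network input pixels onto a frame of
// a different size, then clamp it so it stays inside the frame.
Box scale_box(float x1, float y1, float x2, float y2,
              int net_width, int net_height,
              int frame_width, int frame_height)
{
    assert(net_width > 0 && net_height > 0 && frame_width > 0 && frame_height > 0);
    float const sx = static_cast<float>(frame_width) / net_width;
    float const sy = static_cast<float>(frame_height) / net_height;
    auto const clamp_to = [](float v, int upper) {
        int const rounded = static_cast<int>(std::lround(v));
        return std::max(0, std::min(rounded, upper));
    };
    return Box{clamp_to(x1 * sx, frame_width - 1), clamp_to(y1 * sy, frame_height - 1),
               clamp_to(x2 * sx, frame_width - 1), clamp_to(y2 * sy, frame_height - 1)};
}
```

With a 320x256 network input and a 640x512 frame, every coordinate is simply doubled, and anything that falls outside the frame is pulled back to the border.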

     I only mentioned the key points in this post. If you want to study the details, please check the source code on github.

Saturday 29 September 2018

Install cpp package of mxnet on windows 10, with cuda and opencv

    Compiling and installing the cpp-package of mxnet on windows 10 was a little bit tricky at the time of writing this post.

     The install page of mxnet tells us almost everything we need to know, but some details have not been written into the pages yet. Today I would like to write down the pitfalls I met and share how I solved them.


1. Remember to download the mingw dlls from the openBLAS download page and put them somewhere the system can find them, else you will not be able to generate op.h for the cpp-package.

2. Install Anaconda(recommended) or the python package listed on the mxnet install page on your machine and register the path(the path with python.exe), else you will not be able to generate op.h for the cpp-package.

3. Compile the project without the cpp-package first, else you may not be able to generate op.h.

Cmake commands for reference, change them to suit your own needs

a : Run these commands first

cmake -G "Visual Studio 14 2015 Win64" ^

cmake --build . --config Release

b : Run these commands, with the cpp package on

cmake -G "Visual Studio 14 2015 Win64" ^

cmake --build . --config Release --target INSTALL

4. After you compile and install the libs, you may find that some headers are missing from the install
path. I was missing nnvm and mxnet-cpp. What I did was copy those folders to the install folder.

    Hope this helps someone who is pulling their hair out when compiling the cpp-package of mxnet on windows 10.

Saturday 4 August 2018

Qt and computer vision 2 : Build a simple computer vision application with Qt5 and opencv3

    In this post, I will show you how to build a dead simple computer vision application with Qt Creator and opencv3 step by step.

Install opencv3.4.1(or newer version) on windows

0. Go to SourceForge and download the prebuilt binary of opencv3.4.2, or build it by yourself

1. Double click on opencv-3.4.2-vc14_vc15.exe and extract it to your favorite folder(pic_00)


2. Open the folder you extracted to(assume you extracted it to /your_path/opencv_3_4_2). You will see a folder called "opencv".

3. Open the QtCreator you installed.

Create a new project by Qt Creator

4. Create a new project

5. You will see a lot of options. For simplicity, let us choose "Application->Non-Qt project->Plain c++ application". This tells QtCreator that we want to create a c++ program without using any Qt components.


6. Enter the path of the folder and name of the project.

7. Click the Next button and use qmake as your build system for now(you can use cmake too, but I always prefer qmake when I am working with Qt).

8. You will see a page asking you to select your kits. A kit is how QtCreator groups different settings like device, compiler, Qt version etc.

9. Click on next. QtCreator may ask whether you want to add the project to version control; for simplicity, select None. Click on finish.

10. If you see a screen like this, that means you have succeeded.


11. Write code to read an image with opencv

#include <iostream>

#include <opencv2/core.hpp>
#include <opencv2/highgui.hpp>

//The purposes of namespaces are
//1. Decrease the chance of name collision
//2. Help you organize your code into logical groups
//Without declaring using namespace std, every time you use
//the classes or functions in the namespace, you have to call them with the
//prefix "std::".
using namespace cv;
using namespace std;

/**
 * main function is the global, designated start function of c++.
 * @param argc Number of the parameters of command line
 * @param argv Content of the parameters of command line.
 * @return any integer within the range of int, meaning of the return value is
 * defined by the users
 */
int main(int argc, char *argv[])
{
    if(argc != 2){
        cout<<"Run this example by invoking it like this: "<<endl;
        cout<<"./step_02.exe lena.jpg"<<endl;
        return -1;
    }

    //If you execute by Ctrl+R, argv[0] == "step_02.exe", argv[1] == "lena.jpg"

    //Open the image
    auto const img = imread(argv[1]);
    if(!img.empty()){
        imshow("img", img); //Show the image on screen
        waitKey(); //Do not exit the program until the user presses a key
    }else{
        cout<<"cannot open image:"<<argv[1]<<endl;

        return -1;
    }

    return 0; //usually we return 0 if everything is normal
}

How to compile and link the opencv lib with the help of Qt Creator and qmake

  Before you can execute the app, you need to compile it and link it to the libraries of opencv. Let me show you how to do it. If you miss steps 12 and 13, you will see a lot of error messages like the ones shown in Pic07 or Pic09.

12. Tell the compiler where the header files are. This can be done by adding the following command in the .pro file

INCLUDEPATH += your_install_path_of_opencv/opencv/opencv_3_4_2/opencv/build/include

  The compiler will tell you it cannot locate the header files if you do not add this line(see Pic07).


  If your INCLUDEPATH is correct, QtCreator should be able to find the headers and use auto complete to help you type less(Pic08).


13. Tell the linker which opencv libraries it should link to with the following command.

LIBS += your_install_path_of_opencv/opencv/opencv_3_4_2/opencv/build/x64/vc14/lib/opencv_world342.lib

Without this step, you will see "unresolved external symbols" errors(Pic09).

14. Change from debug to release.


  Click the icon surrounded by the red region and change it from debug to release. Why do we do that? Because

  • Release mode is much faster than debug mode in many cases
  • The library we link to is built as a release library. Do not mix debug and release libraries in your project unless you are asking for trouble
  I will introduce more details of compile, link, release and debug in the future. For now, just press Ctrl+B to compile and link the app.

Execute the app

  After we compile and link the app, the exe is already in the build folder(the folder shown in Pic11).


  We are almost done now, just a few more steps before the app is up and running.

15. Copy opencv_world342.dll and opencv_ffmpeg342_64.dll(they are placed in /your_path/opencv/opencv_3_4_2/opencv/build/bin) into a new folder(we call it global_dll).

16. Add the path of this folder to the system path. Without steps 15 and 16, the exe will not be able to find the dlls when we execute the app, and you may see the following error when you execute the app from the command line(Pic12). I recommend the tool Rapid Environment Editor(Pic13) for editing your path on windows.


17. Add the command line argument in QtCreator. Without it, the app does not know where the image is when you press Ctrl+R to execute the program.

18. If it succeeds, you should see the app open the image specified in the command line arguments list(Pic15).
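If the app still complains about missing dlls, it can help to check whether the dll folder really ended up in the path. Below is a minimal sketch of such a check; path_contains is a hypothetical helper written for this illustration, and it compares entries exactly, so differences in letter case or a trailing slash are not handled.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Return true when dir is one of the entries of a PATH-style string.
// separator is ';' on windows and ':' on linux/mac.
bool path_contains(std::string const &path_value, std::string const &dir,
                   char separator = ';')
{
    assert(!dir.empty());
    std::istringstream stream(path_value);
    std::string entry;
    while (std::getline(stream, entry, separator)) {
        if (entry == dir) {
            return true;
        }
    }
    return false;
}
```

In a real program the first argument would come from std::getenv("PATH") (declared in <cstdlib>), after checking that the returned pointer is not null.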


  These steps are easy but can be annoying at first. I hope this post can ease your frustration. You can find the source code on github.

Sunday 22 April 2018

Qt and computer vision 1 : Setup environment of Qt5 on windows step by step

    I have not updated my blog for a long time. Today, rather than writing about newer, more advanced deep learning topics like "Modern way to estimate homography matrix(by lightweight cnn)" or "Let us create a semantic segmentation model by PyTorch", I prefer to start a series of topics for newcomers who are struggling to build a computer vision app with c++. I hope my posts can help more people find out that using c++ to develop applications can be as easy as using other "much easier to use" languages(ex : python).

    Rather than introducing most of the features of Qt and opencv like other books do, these topics will introduce, step by step, the subsets of Qt which help us develop decent computer vision applications.

c++ is as easy to use as python, really?

    Many programmers may find this claim nonsense, but my own experience tells me it is not, because I never found languages like python, java or c# to be "much easier to use compared with c++". What makes our perspectives so different? I think the answers are

1. Know how to use c++ effectively.
2. There exist great libraries for the tasks(ex : Qt, boost, opencv, dlib, spdlog etc).

    As long as these two conditions are satisfied, I believe many programmers will reach the same conclusion as mine. In this series I will try my best to help you learn how to develop easy to maintain applications with c++, and show you how to solve those "small yet annoying issues" which may scare away many newcomers.

Install visual c++ 2015

    Some of you may ask "why 2015 but not 2017"? Because at the time of writing this post, cuda does not have decent support for visual c++ 2017 or mingw yet. Cuda is very important to computer vision apps, especially now that deep learning has taken over many computer vision tasks.

1. Go to this page, click on the download button of visual studio 2015.

2. Download visual studio 2015 community(you may need to open an account before you can enter this page)

3. Double click on the exe "en_visual_studio_community_2015_with_update_3_x86_x64_web_installer_xxxx" and wait a few minutes.

4. Install the visual c++ tools as shown in the following picture, make sure you select all of them.

Install Qt5 on windows

1. Go to the download page of Qt
2. Select open source version of Qt

3. Click the download button and wait until qt-unified-windows is downloaded.

4. Double click on the installer, click next->skip->next

5. Select the path you want to install Qt to

6. Select the version of Qt you want to install. Every version of Qt(Qt5.x) has a lot of binary files to download, so only select the ones you need. We install Qt5.9.5 here. Why Qt5.9.5? Because Qt5.9 is a long term support version of Qt, and in theory a long term support version should be more stable.

7. Click next and install.

Test whether Qt5 is installed or not

1. Open QtCreator and run an example. Go to your install path(ex: C:/Qt/3rdLibs/Qt), navigate to your_install_path/Tools/QtCreator/bin and double click on the qtcreator.exe.

2. Select Welcome->Example->Qt Quick Controls 2 - Gallery

3. Click on the example. It may pop up a message box to ask you some questions, you can click on yes or no.

4. Every example you open pops up a help page like this. Whether you keep it open or not is your choice, sometimes it is helpful.

5. First, select the version of Qt you want to use(surrounded by the red bounding box). Second, keep the shadow build option on(surrounded by the green bounding box). Why keep it on? Because a shadow build helps you separate your source code from the built binaries. Third, select whether you want to build your binary as a debug or release version(surrounded by the blue bounding box). Usually we cannot mix debug and release libraries together. I will open another topic to discuss the benefits of debug/release, explain what MT/MD is, which one you should choose etc.

6. Click on the run button or press Ctrl + R, then you should see the example running on your computer.