Equation 1 is the hypothesis we use to predict the results; usually we define x0 as 1. To find the parameters, we can use batch gradient descent.
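Since the original equation image is not reproduced here, the following is a reconstruction of the standard linear-regression hypothesis the text refers to:

```latex
h_\theta(x) = \theta^{T} x = \theta_0 x_0 + \theta_1 x_1 + \cdots + \theta_n x_n,
\qquad x_0 = 1
```

Setting x0 = 1 lets the intercept term theta_0 be handled by the same dot product as the other parameters.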

```cpp
/**
 * @brief gradient of the linear regression cost: X^T * (X * theta - y)
 * @param features input sequence
 * @param labels   output sequence
 * @param theta    the parameters we want to find
 * @return the (unscaled) gradient used to update theta
 */
template<typename T>
cv::Mat_<T> linear_regression(cv::Mat_<T> const &features,
                              cv::Mat_<T> const &labels,
                              cv::Mat_<T> const &theta)
{
    cv::Mat_<T> result = features.clone();
    cv::transpose(result, result);
    result *= (features * theta - labels);
    return result;
}

/**
 * @brief batch gradient descent
 * @param features input sequence
 * @param labels   output sequence
 * @param alpha    step size of each iteration; a smaller alpha takes longer
 *                 to converge but is more likely to do so, while a larger
 *                 alpha runs faster but is more likely to diverge. Since
 *                 this function runs a fixed number of iterations, alpha
 *                 only affects accuracy.
 * @param iterate  number of iterations
 * @return theta, the parameters found by batch gradient descent
 */
template<typename T>
cv::Mat_<T> batch_gradient_descent(cv::Mat_<T> const &features,
                                   cv::Mat_<T> const &labels,
                                   T alpha = 0.07,
                                   size_t iterate = 1)
{
    cv::Mat_<T> theta = cv::Mat_<T>::zeros(features.cols, 1);
    T const ratio = alpha / features.rows;
    for(size_t i = 0; i != iterate; ++i){
        cv::Mat_<T> const data =
            ratio * linear_regression(features, labels, theta);
        theta -= data;
    }
    return theta;
}
```
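In vector form, each pass of the loop above performs the following update, where X is the feature matrix, y the labels, and m = features.rows:

```latex
\theta := \theta - \frac{\alpha}{m}\, X^{T}\!\left(X\theta - y\right)
```

The `ratio` variable in the code is alpha / m, and `linear_regression` supplies the X^T (X theta - y) factor.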

If the vectorized equation inside linear_regression is hard to follow, you can expand it as follows.

The source code can be downloaded from github.