Cost Function
Updated: December 1, 2020
The cost function measures the accuracy of a hypothesis function. The accuracy is given as the average squared difference between the hypothesis's results on the inputs (\(x\)'s) and the actual outputs (\(y\)'s).
\begin{equation} J(\Theta_{0},\Theta_{1})=\frac{1}{2m}\sum_{i=1}^{m}(h_{\Theta}(x_{i}) - y_{i})^{2} \end{equation}
where \(m\) is the number of training examples.
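As a concrete sketch, this cost function can be computed in a few lines of Python with NumPy (the function name `compute_cost` and the example data are hypothetical, my own, assuming the usual linear hypothesis \(h_{\Theta}(x) = \Theta_{0} + \Theta_{1}x\)):

```python
import numpy as np

def compute_cost(theta0, theta1, x, y):
    """Squared error cost J(theta0, theta1) for the linear hypothesis
    h_theta(x) = theta0 + theta1 * x."""
    m = len(x)                           # number of training examples
    predictions = theta0 + theta1 * x    # h_theta(x_i) for every example
    errors = predictions - y             # h_theta(x_i) - y_i
    return np.sum(errors ** 2) / (2 * m)

# Hypothetical example: three points lying exactly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])
print(compute_cost(1.0, 2.0, x, y))  # 0.0 -- the line fits perfectly
print(compute_cost(0.0, 2.0, x, y))  # 0.5 -- offset line, nonzero cost
```

The vectorized NumPy operations compute the sum over all \(m\) examples without an explicit loop.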
This function is also known as the squared error function or the mean squared error. The \(\frac{1}{2}\) is a convenience: it cancels the 2 that appears when the squared term is differentiated (see gradient descent).
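To see the cancellation concretely, here is the differentiation sketched out (my own working from the equation above, assuming the linear hypothesis \(h_{\Theta}(x) = \Theta_{0} + \Theta_{1}x\), so that \(\frac{\partial h_{\Theta}(x_{i})}{\partial \Theta_{1}} = x_{i}\)):

\begin{equation} \frac{\partial}{\partial\Theta_{1}} J(\Theta_{0},\Theta_{1}) = \frac{1}{2m}\sum_{i=1}^{m} 2\,(h_{\Theta}(x_{i}) - y_{i})\,x_{i} = \frac{1}{m}\sum_{i=1}^{m} (h_{\Theta}(x_{i}) - y_{i})\,x_{i} \end{equation}

The 2 produced by the power rule cancels the \(\frac{1}{2}\), leaving a clean \(\frac{1}{m}\) factor in the gradient.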
The basic idea of the cost function is to choose \(\Theta_{0}\) and \(\Theta_{1}\) such that \(h_{\Theta}(x)\) is as close to \(y\) as possible for our training examples \((x,y)\).
In an ideal world, the cost function would have a value of 0 (i.e. \(J(\Theta_{0},\Theta_{1}) = 0\)), which would imply we have a straight line passing through every one of our data points, fitting the training set with perfect accuracy.
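A minimal sketch of this idea in Python (the brute-force grid search and the data are hypothetical, my own; in practice gradient descent is used to minimize \(J\) far more efficiently):

```python
import numpy as np

# Evaluate J over a grid of candidate (theta0, theta1) pairs and keep
# the pair with the lowest cost. Purely illustrative: it shows that the
# parameters of the true line drive the cost to 0.
x = np.array([1.0, 2.0, 3.0])
y = np.array([3.0, 5.0, 7.0])   # points on y = 2x + 1

best = None
for theta0 in np.linspace(-1, 3, 41):
    for theta1 in np.linspace(-1, 3, 41):
        cost = np.sum((theta0 + theta1 * x - y) ** 2) / (2 * len(x))
        if best is None or cost < best[0]:
            best = (cost, theta0, theta1)

print(best)  # approximately (0.0, 1.0, 2.0): J = 0 at the true line
```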