# Cost Function

##### Updated: December 1, 2020

The cost function measures the accuracy of a hypothesis function. It is based on the average squared difference between the hypothesis's predictions for the inputs (\(x\)’s) and the actual outputs (\(y\)’s).

\begin{equation} J(\Theta_{0},\Theta_{1})=\frac{1}{2m}\sum_{i=1}^{m}(h_{\Theta}(x_{i}) - y_{i})^{2} \end{equation}

where \(m\) is the number of inputs (i.e. training examples) and \((x_{i}, y_{i})\) is the \(i\)-th training example.
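As a minimal sketch, the formula can be computed directly in Python. The straight-line form \(h_{\Theta}(x) = \Theta_{0} + \Theta_{1}x\) is an assumption here (the page implies it but does not write it out), and the function names `hypothesis` and `cost` and the sample data are made up for illustration.

```python
def hypothesis(theta0, theta1, x):
    """Assumed straight-line hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def cost(theta0, theta1, xs, ys):
    """Squared error cost: J = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    if m == 0 or m != len(ys):
        raise ValueError("xs and ys must be non-empty and the same length")
    # Sum the squared differences between predictions and actual outputs.
    total = sum((hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys))
    return total / (2 * m)


# Hypothetical training set with m = 3 examples.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# With theta0 = 0 and theta1 = 1 the residuals are -1, -2 and -3,
# so J = (1 + 4 + 9) / (2 * 3), roughly 2.33.
print(cost(0.0, 1.0, xs, ys))
```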

This function is also known as the `squared error function` or `mean squared error` (strictly, with the extra \(\frac{1}{2}\) it is half the mean squared error). The \(\frac{1}{2}\) is a convenience: it cancels the 2 that appears when the squared term is differentiated (see gradient descent).
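To see the cancellation concretely, assume the straight-line hypothesis \(h_{\Theta}(x) = \Theta_{0} + \Theta_{1}x\) (a form this page implies but does not state). Differentiating \(J\) with respect to \(\Theta_{1}\) using the chain rule gives:

\begin{equation} \frac{\partial J}{\partial \Theta_{1}} = \frac{1}{2m}\sum_{i=1}^{m} 2\,(h_{\Theta}(x_{i}) - y_{i})\,x_{i} = \frac{1}{m}\sum_{i=1}^{m}(h_{\Theta}(x_{i}) - y_{i})\,x_{i} \end{equation}

The derivative with respect to \(\Theta_{0}\) has the same form without the trailing \(x_{i}\) factor.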

The basic idea of the `cost function` is to choose \(\Theta_{0}\) and \(\Theta_{1}\) such that \(h_{\Theta}(x)\) is as close to \(y\) as possible for our training examples \((x,y)\). In other words, we look for the parameters that make \(J(\Theta_{0},\Theta_{1})\) as small as possible.
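The sketch below illustrates this by comparing a few candidate parameter pairs and keeping the one with the lowest cost. It assumes the straight-line hypothesis \(h_{\Theta}(x) = \Theta_{0} + \Theta_{1}x\). The data and candidate values are hypothetical, and checking a short list by hand only stands in for a real minimization method such as gradient descent.

```python
def cost(theta0, theta1, xs, ys):
    """J(theta0, theta1) for the assumed hypothesis h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical training set; the points do not lie exactly on one line.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 7.0]

# Hypothetical candidate (theta0, theta1) pairs to compare.
candidates = [(0.0, 1.0), (0.0, 2.0), (1.0, 2.0)]

# Evaluate the cost of each candidate and keep the smallest.
costs = {params: cost(params[0], params[1], xs, ys) for params in candidates}
best = min(costs, key=costs.get)

for params, value in costs.items():
    print(params, round(value, 4))
print("lowest cost:", best)
```

Among these three candidates, the pair \((0, 2)\) gives the smallest cost, so its line sits closest to the training examples.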

In an ideal world, the `cost function` would have a value of 0 (i.e. \(J(\Theta_{0},\Theta_{1}) = 0\)), which would imply we have a straight line that passes through each of our training data points. Any new data point that lies on the same line would then be predicted with perfect accuracy.
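As a small worked example (the data are made up for illustration), take the training points \((1,2)\), \((2,4)\), \((3,6)\) and assume the hypothesis \(h_{\Theta}(x) = \Theta_{0} + \Theta_{1}x\) with \(\Theta_{0} = 0\) and \(\Theta_{1} = 2\). Every prediction matches its output exactly, so each squared term vanishes:

\begin{equation} J(0,2)=\frac{1}{2\cdot 3}\left[(2-2)^{2}+(4-4)^{2}+(6-6)^{2}\right]=0 \end{equation}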