Cost Function
Updated: December 1, 2020
A cost function measures the accuracy of a hypothesis function. It is computed as an average of the squared differences between the hypothesis's predictions for the inputs (x's) and the actual outputs (y's).
$$J(\Theta_0, \Theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\Theta(x_i) - y_i \right)^2$$
where m is the number of inputs (i.e. training examples).
This function is also known as the squared error function or mean squared error. The factor of 1/2 is a convenience: it cancels the 2 that appears when the squared term is differentiated (see gradient descent).
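As a minimal sketch of the formula above, the cost can be computed directly from a list of training examples. The function names `hypothesis` and `compute_cost`, the linear form h(x) = Θ0 + Θ1·x, and the sample data are assumptions made for illustration.

```python
def hypothesis(theta0, theta1, x):
    """Assumed linear hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x


def compute_cost(theta0, theta1, xs, ys):
    """Squared error cost J(theta0, theta1), including the 1/(2m) factor."""
    m = len(xs)
    if m == 0 or m != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    # Squared difference between each prediction and its actual output.
    squared_errors = [
        (hypothesis(theta0, theta1, x) - y) ** 2 for x, y in zip(xs, ys)
    ]
    return sum(squared_errors) / (2 * m)


# Hypothetical training set with m = 3 examples.
xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

# Predictions with theta0 = 0, theta1 = 0.5 are 0.5, 1.0, 1.5, so the
# squared errors are 0.25, 1.0, 2.25 and J = 3.5 / (2 * 3).
cost_half_slope = compute_cost(0.0, 0.5, xs, ys)
print(cost_half_slope)
```

Dividing by 2m rather than m only rescales the cost; the parameters that minimize it are the same either way.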
The basic idea of the cost function is to choose Θ0 and Θ1 such that hΘ(x) is as close as possible to y for our training examples (x,y). In other words, we look for the parameters that minimize J(Θ0,Θ1).
In an ideal world, the cost function would have a value of 0 (i.e. J(Θ0,Θ1)=0), which would imply we have a straight line that passes through every one of our data points. Any new data point that falls on that same line would then be predicted with perfect accuracy.
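To illustrate choosing parameters by cost and the ideal case J = 0, here is a small sketch. It assumes a linear hypothesis h(x) = Θ0 + Θ1·x, and the data set and the list of candidate parameters are hypothetical. The data are constructed to lie exactly on the line y = 1 + 2x, so one candidate reaches a cost of zero.

```python
def compute_cost(theta0, theta1, xs, ys):
    """Squared error cost for the assumed hypothesis h(x) = theta0 + theta1 * x."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)


# Hypothetical training set lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

# Hypothetical candidate (theta0, theta1) pairs to compare.
candidates = [(0.0, 1.0), (1.0, 1.5), (1.0, 2.0), (2.0, 2.0)]

# Evaluate the cost of every candidate and keep the one with the lowest cost.
costs = {c: compute_cost(c[0], c[1], xs, ys) for c in candidates}
best = min(costs, key=costs.get)

for (theta0, theta1), cost in costs.items():
    print(f"theta0={theta0}, theta1={theta1}: J={cost}")
print("best:", best)
```

Comparing a fixed list of candidates is only for illustration; in practice the minimizing parameters are found with a procedure such as gradient descent, and real data rarely lie on a line exactly, so the minimum cost is usually above zero.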