The line of best fit is a linear function used to predict the $y$-value for a given $x$-value once a significant correlation has been found between the two random variables they represent. We denote the prediction for a given $x_i$ by $\hat{y}_i$. Since the relationship between $\hat{y}$ and $x$ is linear, there must be constants $m$ and $b$ such that:
$$\hat{y} = mx + b$$

This being a line of "best fit", the particular values of these constants $m$ and $b$ are such that they minimize the sum of the squared errors:
$$E = \sum_i (\hat{y}_i - y_i)^2$$

To find these constants $m$ and $b$, we note the following:
$$\begin{aligned}
E &= \sum_i (\hat{y}_i - y_i)^2 \\
  &= \sum_i (mx_i + b - y_i)^2 \\
  &= \sum_i \left(m^2x_i^2 + 2bmx_i + b^2 - 2mx_iy_i - 2by_i + y_i^2\right) \\
  &= m^2\sum_i x_i^2 + 2bm\sum_i x_i + nb^2 - 2m\sum_i x_iy_i - 2b\sum_i y_i + \sum_i y_i^2
\end{aligned}$$

That might look intimidating, but remember that all of the sigmas are just constants, formed by adding up various combinations of the $x$- and $y$-coordinates of the original points. In fact, collecting like terms reveals that $E$ is just a parabola with respect to $m$ or $b$:
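$$\begin{aligned}
E(m) &= \left(\sum_i x_i^2\right)m^2 + \left(2b\sum_i x_i - 2\sum_i x_iy_i\right)m + \left(nb^2 - 2b\sum_i y_i + \sum_i y_i^2\right) \\
E(b) &= nb^2 + \left(2m\sum_i x_i - 2\sum_i y_i\right)b + \left(m^2\sum_i x_i^2 - 2m\sum_i x_iy_i + \sum_i y_i^2\right)
\end{aligned}$$

Further, both of these parabolas open upward, since the coefficients on the $m^2$ and $b^2$ terms are both positive (the sum of the $x_i^2$ must be positive unless all of the $x$-coordinates are 0, and of course $n$, the number of points, is positive).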
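As a quick numerical sanity check of the expansion above, the following minimal sketch compares the directly computed sum of squared errors with the collected quadratic in $m$. The function names and the sample points are hypothetical, chosen only for illustration.

```python
def squared_error(m, b, xs, ys):
    """Sum of squared errors E for the line y = m*x + b."""
    return sum((m * x + b - y) ** 2 for x, y in zip(xs, ys))


def quadratic_in_m(b, xs, ys):
    """Coefficients (A, B, C) such that E(m) = A*m**2 + B*m + C for a fixed b."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    syy = sum(y * y for y in ys)
    return sxx, 2 * b * sx - 2 * sxy, n * b * b - 2 * b * sy + syy


# Hypothetical sample points, not taken from the article.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]

A, B, C = quadratic_in_m(0.5, xs, ys)

# The direct sum and the collected quadratic should agree for any slope m.
for m in (-1.0, 0.0, 0.8, 2.0):
    direct = squared_error(m, 0.5, xs, ys)
    expanded = A * m ** 2 + B * m + C
    assert abs(direct - expanded) < 1e-9

# The leading coefficient is the sum of the squared x-coordinates,
# so it is positive here and the parabola opens upward.
assert A > 0
```

The same check could be repeated for $E(b)$ by holding $m$ fixed instead.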
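Since the parabolas open upward, each one has a minimum at its vertex. Recalling that the vertex of $y = ax^2 + bx + c$ occurs at $x = -b/(2a)$ (here $a$, $b$, and $c$ are generic coefficients, not the intercept $b$ of our line), the two vertices occur at: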
$$m = \frac{-2b\sum x_i + 2\sum x_iy_i}{2\sum x_i^2} = \frac{\sum x_iy_i - b\sum x_i}{\sum x_i^2}
\qquad
b = \frac{-2m\sum x_i + 2\sum y_i}{2n} = \frac{\sum y_i - m\sum x_i}{n}$$

Now we have two linear equations in terms of $m$ and $b$. Substitute one into the other to solve this system of equations (perhaps the second into the first) and the solution is revealed:
$$m = \frac{n\sum x_iy_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}
\quad\text{and}\quad
b = \frac{\left(\sum x_i^2\right)\left(\sum y_i\right) - \left(\sum x_i\right)\left(\sum x_iy_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}$$

The formula for $m$ is bad enough, but the formula for $b$ is a monstrosity. However, there is no need to deal with (or even find in the first place) this expression for $b$: in our earlier, far simpler expression for $b$, the only unknown was $m$, and now $m$ is known! Consequently:
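$$m = \frac{n\sum x_iy_i - \left(\sum x_i\right)\left(\sum y_i\right)}{n\sum x_i^2 - \left(\sum x_i\right)^2}
\quad\text{and}\quad
b = \frac{\sum y_i - m\sum x_i}{n}$$

With a little more algebra, we can express $m$ and $b$ in the following way (see if you can prove it):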
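$$m = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}
\quad\text{and}\quad
b = \bar{y} - m\bar{x}$$

where $\bar{x}$ and $\bar{y}$ are the averages of all the $x$-coordinates and all the $y$-coordinates, respectively.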
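Short of a proof, the agreement between the two forms can at least be checked numerically. The sketch below computes $m$ and $b$ once from the raw sums and once from the mean-centered form; the function names and data points are hypothetical, and both functions assume the $x$-coordinates are not all equal (otherwise the denominators are zero).

```python
def fit_from_sums(xs, ys):
    """Slope and intercept from the raw-sum formulas."""
    n = len(xs)
    sx = sum(xs)
    sy = sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    b = (sy - m * sx) / n
    return m, b


def fit_from_means(xs, ys):
    """Slope and intercept from the mean-centered formulas."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    m = numerator / denominator
    b = y_bar - m * x_bar
    return m, b


# Hypothetical sample points, not taken from the article.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]

m1, b1 = fit_from_sums(xs, ys)
m2, b2 = fit_from_means(xs, ys)

# Both forms should give the same line, up to floating-point rounding.
assert abs(m1 - m2) < 1e-9
assert abs(b1 - b2) < 1e-9
```

For these four points, working the sums by hand gives a slope of 0.8 and an intercept of 1.5, which both functions reproduce up to rounding.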
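Finally, recalling the formulas for the sample covariance $s_{xy}$ and the sample variance $s_x^2$ (each divides the corresponding sum above by $n - 1$, and that factor cancels in the ratio), we can also write this as: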
$$m = \frac{s_{xy}}{s_x^2}
\quad\text{and}\quad
b = \bar{y} - m\bar{x}$$
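The covariance form translates almost directly into code. The following is a minimal sketch using only Python's standard `statistics` module, assuming the sample (divide by $n - 1$) definitions of covariance and variance; `sample_covariance` and `fit_from_covariance` are hypothetical helper names, and the data points are made up for illustration.

```python
from statistics import mean, variance


def sample_covariance(xs, ys):
    """Sample covariance s_xy, dividing by n - 1."""
    x_bar = mean(xs)
    y_bar = mean(ys)
    total = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)


def fit_from_covariance(xs, ys):
    """Slope m = s_xy / s_x^2 and intercept b = y_bar - m * x_bar."""
    # statistics.variance is the sample variance (it also divides by n - 1),
    # so the n - 1 factors cancel in the ratio.
    m = sample_covariance(xs, ys) / variance(xs)
    b = mean(ys) - m * mean(xs)
    return m, b


# Hypothetical sample points, not taken from the article.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 3.0, 5.0, 4.0]

slope, intercept = fit_from_covariance(xs, ys)
print(f"y-hat = {slope:.3f} x + {intercept:.3f}")
```

Because the same divisor appears in both $s_{xy}$ and $s_x^2$, using the population versions (dividing by $n$) in both places would give the same slope.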