Saturday, November 18, 2017

Least Squares Linear Fits for Dummies

line of best fit
A staffer who, for a while, wrote for Demand Media Studios (DMS¹) says that it was common for a freelancer to "claim" many similar topics and rewrite the same post several times. Of course, if – like eHowian Ryan Menezes – the freelancer didn't know the answer in the first place, subsequent answers were likely to drift ever farther from factual. Such seems to be the case with our J-school graduate when he attempted to explain "How to Calculate the Slope of a Line of Best Fit" at Sciencing.com.

Ryan had already met with eHow.com success (though not mathematical success) with a prior article about slopes, so the boy apparently figured, "How different can this one be?" Clearly, however, Ryan was punching well above his weight this time, or it looks that way based on the rather... strange way he describes a line of best fit:
"...when the points do show a correlation, a line of best fit will show the extent of the connection. The sharper the slope of the line through the points, the greater the correlation between the points."
Is there really a mathematical measure of sharpness (kurtosis, perhaps)? If so, we doubt it's applicable here. Whatever the case, Menezes goes on to inform his readers that,
"The [best fit] line's slope equals the difference between points' y-coordinates divided by the difference between their x-coordinates..."
...which is at its core a bastardization of how to calculate the slope of a line. Oh, he goes on to describe how to use the coordinates of two points to calculate slope, all right. Ryan's bigger problem, however, is that in getting his BS (sure...) in Journalism, he neglected to learn how to generate a best-fit line, and probably didn't even know such a thing is possible!
What the OQ most likely wanted was to know a method – say, a linear least squares fit – for deriving the formula of the best fit line. Once that formula's in the form y = mx + b, the more numerate student will realize that the slope is m. But how to find that formula? Menezes is clueless.

We know, though. To calculate a linear least squares fit for a set of x,y points:
  1. Calculate the mean of the x coordinates, X
  2. Calculate the mean of the y coordinates, Y
  3. Subtract X from each x coordinate (xi - X)
  4. Subtract Y from each y coordinate (yi - Y)
  5. Calculate the square (xi - X)² for each point
  6. Calculate the product (xi - X)*(yi - Y) for each point
  7. Sum all the products
  8. Sum all the squares
  9. Divide the sum of the products by the sum of the squares.
    The quotient is the slope of a linear least-squares best-fit line. Tedious, but simple; and it's also pretty easy to figure out the y-intercept so you can draw the line. Menezes, however, couldn't even figure out that was what the OQ wanted to know. Is it any wonder that Ryan is collecting Dumbass of the Day award number three?
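For anyone who'd rather let a computer do the tedious part, here's a minimal Python sketch of the nine steps above. The function name least_squares_line and the sample points are our own inventions for illustration, not anything from the Sciencing article. The intercept line goes one step past the list: it assumes only that a least-squares line passes through the point (X, Y), so b = Y - mX.

```python
def least_squares_line(points):
    """Return (slope, intercept) of the least-squares line through (x, y) points.

    Hypothetical helper for illustration; it follows the nine steps in the post.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")

    # Steps 1-2: means of the x and y coordinates (X and Y in the post)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Steps 3, 4, 6, 7: sum of the products (xi - X) * (yi - Y)
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in points)

    # Steps 3, 5, 8: sum of the squares (xi - X)^2
    sum_squares = sum((x - mean_x) ** 2 for x, _ in points)
    if sum_squares == 0:
        raise ValueError("all x values are identical; the slope is undefined")

    # Step 9: the quotient is the slope m
    slope = sum_products / sum_squares

    # Extra step (not in the list): the line passes through (X, Y), so b = Y - m*X
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up sample data, purely for demonstration
sample = [(1, 2), (2, 3), (3, 5), (4, 4)]
m, b = least_squares_line(sample)
print(f"y = {m:.2f}x + {b:.2f}")
```

For those four made-up points the sum of the products is 4 and the sum of the squares is 5, so the slope works out to 0.8 and the intercept to 1.5. Note that the slope comes from all four points at once, not from any two of them.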

¹ Now known as Leaf Group, Demand Media Studios was the parent company of eHow. The initials gave rise to one of our favorite sayings, "You can't spell 'dumbass' without 'DMS'!"
copyright © 2017-2022 scmrak

MM - ALGEBRA
