Least Squares Regression and Interpreting Linear Models


SWBAT interpret the slope and the intercept of a linear model in the context of the data.

Big Idea

Which is the predictor variable and which is the response variable? An interpretation of the results of two regression models will usually make it clear.

Opener: Check Your Work

10 minutes

As students arrive I post the second slide of today's lesson notes, which prompts students to take out the work they did yesterday.  On slides #3-7 are solutions to each of the practice exercises.  I provide the equation of the median-median regression line for each data set, and a graph showing the corresponding scatter plot and line.

I post each solution for about a minute, taking more time if kids have questions.  I circulate to check in with each student.  If I see they're in great shape, I'll prompt them with other things to think about, like how many points are above and below each line or how close the points are to the line and which correlations look the "strongest" or "weakest".  If students are struggling, I make a note of that, but I also reassure them that they're about to learn how a calculator can help with this task.

Next, we will use the first data set from yesterday's work to learn how to use TI-83 calculators to input data and run a regression.

The Basics: Regression on your Calculator

10 minutes

Now comes the reveal: our TI-83 calculators can run a median-median regression for us!  I start by using the first exercise from yesterday's Median Median Handout to give students a quick example of how this works.  I post slide #8 of the lesson notes so we can look at the example together.

I make sure that everyone has a calculator, and I run pretty quickly through this first example.  We enter the data by pressing the STAT button, then choosing "Edit..." and entering x values into L1 and y-values into L2.  Then, as I walk around the room and hold up my calculator, I show everyone how to press STAT again, then arrow over to the CALC menu.  There, we can see that choosing option 3: MedMed will provide us with the same Median-Median line we've already calculated by hand.  Next, I show students that option 4: LinReg (ax+b) provides a different regression model and a slightly different equation.  "This model uses what's called the least squares regression method, which is actually more commonly used in statistics," I say, "and we're going to use it today."
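For anyone curious about what option 4 computes, here is a minimal Python sketch of the standard least squares formulas for a line y = ax + b.  It assumes nothing about the calculator's internals, the function name `least_squares_line` is my own, and the numbers are made up for illustration (they are not the handout's data).

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least squares line y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for illustration only: L1 holds x-values, L2 holds y-values.
L1 = [1, 2, 3, 4, 5]
L2 = [2, 4, 5, 4, 5]

a, b = least_squares_line(L1, L2)
print(f"y = {a:.2f}x + {b:.2f}")
```

The line this produces always passes through the point (mean of x, mean of y), which is one way to sanity-check a calculator result.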

I move quickly on purpose, because after sharing this first example, I'll provide complete notes (on slides #12-15) that will review what we've just done.  I want students to see this example first, before taking notes.

When we have these two different regression lines, I return to slide #3.  I invite students to try to sketch the "least squares line" on their work from yesterday, and I model what I mean on the front board.  We notice that these lines are pretty similar, which raises the question: "Well, sheesh, which one is better?"  I give students the opportunity to share their thoughts, and then I post slides #9-11, which show Desmos-created graphs of the same data.  It's hard to get a real feel for the differences between the two lines when we see them at the same time, so I move between slide #10 (the median-median line) and #11 (the least squares line) so we can see what we might discover.  One thing I want students to note is that the median-median line appears to go "through" three points, "below" three points, and "above" two points.  The least-squares line, on the other hand, has four points above it and four points below.  

"Later this week," I say, "we'll look a little more closely at how to assess the 'goodness' of a regression line."  We'll also see how each of these techniques responds to extreme data -- but that's all a coming attraction.  For now, an informal sneak peek does the trick.

Running Regressions and Interpreting Models: Flight Cost vs. Distance

20 minutes

Now that students have seen one example, I tell everyone that they should take notes on what we just did.  On slide #12 of the lesson notes is an outline of the three steps we'll take every time we run a regression.  Then, slides #13 and #14 review the details behind the first and second steps.

Up to now, we've done everything but interpret our models, because we've been practicing on data sets that consist of only numbers with no context.  "Really, the most important step is to interpret our model and figure out how to use it," I say.  "To see what that means, we'll look at an example."

I flip to slide #16, which shows this screenshot from Kayak.com's (really cool!) map feature, which provides a map with airfares to dozens of cities.  The beauty of this map is that it gives everyone a feel for the data before we begin to analyze it.  Kids can't resist sharing their observations, or hopes for where they want to go.  A quick look at the map makes it tangible that - in general - the farther you travel from New York City, the more a flight is going to cost.

On slide #17, I provide a subset of this data.  It's important to help students recognize that these are just ten of the many cities visible on the map, and that choosing a different set of points might yield a different regression model, but that often we select a smaller sample of the data to analyze.

I tell students to follow the steps they've just seen to enter this data and find the equation for the least-squares regression line, and this is where the fun starts.  I've purposefully provided the data with "Airfare" in the first column and "Miles" in the second, because if students copy the data in that order they'll get a model that makes a little less sense than if it were the other way around.  The purpose of this activity is to help students see that our decision about which value is the predictor variable and which is the response variable is important for making a model.

The default setting on the TI-83 is for L1 to be the predictor and L2 to be the response. If everyone entered the data as they saw it, they should get:

y = 8.54x - 726.32

So what does that really mean?  Using what we know about slope, we can say that this model assumes that the price of a flight is a predictor for how far it will get us, and that we can expect to fly 8.54 miles for every dollar spent on airfare.  That's all well and good, but isn't it also reasonable to say that the length of a flight is the predictor - I might also review the term independent variable - and that cost is the response variable?  What if we want to know the price of a flight in dollars per mile?

Students are relieved to know that they don't have to re-enter the data.  I tell students that we can pass parameters to the LinReg method.  After choosing the LinReg option from the CALC menu, we can use the LIST button to choose L2, then press the comma button, then L1.  This way, L2 will be the predictor variable and L1 will be the response.  In this case, we'll get:

y = 0.074x + 146.63
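Swapping the lists is more than solving the first equation for x.  As a rough illustration, here is a Python sketch that fits a least squares line both ways.  The helper `least_squares_line` and the small data set are my own assumptions for illustration (this is not the airfare data).

```python
def least_squares_line(predictor, response):
    """Return (slope, intercept) for response = slope * predictor + intercept."""
    n = len(predictor)
    mean_p = sum(predictor) / n
    mean_r = sum(response) / n
    spr = sum((p - mean_p) * (r - mean_r) for p, r in zip(predictor, response))
    spp = sum((p - mean_p) ** 2 for p in predictor)
    slope = spr / spp
    return slope, mean_r - slope * mean_p


# Made-up lists for illustration only.
list_1 = [1, 2, 3, 4, 5]
list_2 = [2, 4, 5, 4, 5]

# Comparable to LinReg with L1 as predictor and L2 as response.
slope_a, intercept_a = least_squares_line(list_1, list_2)
# Comparable to LinReg L2,L1: the roles are swapped.
slope_b, intercept_b = least_squares_line(list_2, list_1)

print(f"list_2 = {slope_a:.2f} * list_1 + {intercept_a:.2f}")
print(f"list_1 = {slope_b:.2f} * list_2 + {intercept_b:.2f}")
print(f"1 / first slope = {1 / slope_a:.2f}")
```

In this sketch the second slope is not the reciprocal of the first, and the same is true of the airfare models: 1/8.54 is roughly 0.117, not 0.074.  Each regression minimizes errors in its own response variable, so the two runs really are two different models.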

Before I rush into interpreting this model for students, I post slide #18, which poses the question, "Which one makes more sense?"

After giving students a chance to share their ideas, I post slide #19 and lead a discussion as I annotate it.  I ask the class to consider what they know about slope: it's "rise over run," we might note, and I tell students to think of the word "over" (in other words, the division sign) as "per".  Then, when we include units of measurement in that ratio, we'll be able to read it as "y units per x units," which in this case is "miles per dollar."

I repeat the same steps for slide #20.  We also look at the scatter plots on slides #22-23, which helps to reiterate the idea that the second model makes more sense.  On the second graph, we can see that the domain consists of all positive values of x, the distance travelled.  We can see that even a short flight, of say, 50 miles, will cost about $150.  The domain of the first model is more restricted and can yield some nonsensical conclusions such as, "For $50 you can travel about -300 miles."  Here, it might make sense to point out that although $50 airfares are pretty uncommon, they're not entirely unheard of!
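The arithmetic behind those two statements can be checked with a short Python sketch.  The coefficients are the rounded values from the two models above, and the function names are my own, chosen only for illustration.

```python
def miles_from_dollars(dollars):
    # First model: airfare (dollars) is the predictor, distance (miles) the response.
    return 8.54 * dollars - 726.32


def dollars_from_miles(miles):
    # Second model: distance (miles) is the predictor, airfare (dollars) the response.
    return 0.074 * miles + 146.63


# A $50 ticket under the first model: a negative distance.
print(round(miles_from_dollars(50), 2))

# A 50-mile flight under the second model: a plausible airfare.
print(round(dollars_from_miles(50), 2))

# Airfare at which the first model predicts zero miles.
zero_miles_fare = 726.32 / 8.54
print(round(zero_miles_fare, 2))
```

With these rounded coefficients, the first model gives about -299 miles for $50 and only predicts a positive distance once the fare passes roughly $85, while the second model gives about $150 for a 50-mile flight.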

For the moment, "interpretation" means just reading the slope and y-intercept.  Students will notice the r and r-squared values on their calculator screens, and some will have questions about what these numbers mean.  I tell them I'm glad they noticed!  These numbers can help us assess how "good" a regression model is - but we're going to look more closely at that in a few days.

Practice: Skyscrapers, or Book Work as Needed

35 minutes

A General Approach to Practice

I explain in this narrative video that what I really hope to share here is a way to get students to look at bivariate data sets in two ways: giving each variable a chance to be the predictor with the other as the response.  I happen to use a half-dozen data sets from Pearson's Stats: Modeling the World, but you can use this protocol to dig a little deeper into the example data sets from whatever resource you choose.  Most years, we'll spend another lesson (or at least half of the next one) practicing what I describe here.

As practice, here is what I ask students to do for the textbook examples:

  1. Run the least squares regression both ways.
  2. Interpret the slope and y-intercept of both resulting models.
  3. Answer the question: which model makes more sense, and why?
  4. Answer the question: is there a correlation between these two variables?
  5. Answer the question: do you think we can say that one variable causes the other to change?  How and why?

The 20 Tallest Buildings in the United States: Height vs. Number of Stories*

I've prepared one favorite example here: Interpreting Linear Models.  This can serve as a first practice example, but I usually use it as a quiz after students practice on some textbook problems.  This activity leads students through the first three steps I outline above before posing a few more questions in which students interpret their chosen model.  

Here is my sample solution.  You can see that the slope sounds more reasonable when we say "8.6 feet per story," instead of "0.063 stories per foot."  It takes a little more effort to think about the y-intercept, especially when kids are used to having algebraic examples where it makes sense to let x=0.  

Here, neither y-intercept makes too much sense if we let x=0, but suppose we choose a different value for x and see if it makes sense?  For example, let x=20 in both models.  Could a 20 story building be 646 feet tall?  That doesn't sound completely crazy.  Could a 20 foot tall building be six-and-a-half stories tall?  Now, that's just preposterous!  We can also imagine saying that a skyscraper can start at 474 feet tall, and add an additional 8.6 feet for each story.  On the other hand, saying that a building starts with 5.28 stories and adds 0.063 stories for each foot just feels awkward.
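Here is a quick Python sketch of the x=20 check.  It uses the rounded slopes and intercepts quoted above, and the function names are hypothetical, chosen just for this illustration.

```python
def height_from_stories(stories):
    # Stories as predictor: start at 474 feet, add 8.6 feet per story.
    return 8.6 * stories + 474


def stories_from_height(feet):
    # Height as predictor: start at 5.28 stories, add 0.063 stories per foot.
    return 0.063 * feet + 5.28


x = 20
print(f"A {x}-story building: about {height_from_stories(x):.0f} feet tall")
print(f"A {x}-foot building: about {stories_from_height(x):.2f} stories")
```

Plugging the same x into both models makes the comparison concrete: about 646 feet for 20 stories is believable, while about 6.5 stories in 20 feet is not.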

*Source: http://en.wikipedia.org/wiki/List_of_tallest_buildings_in_the_United_States