In today's lesson, students pull together what they have learned about lines of best fit, regression lines, and correlation coefficients. I tell students that today we will look at some real data from a US census and examine the trends in men's and women's salaries from a statistical perspective. We begin by reading through Making More $ together.
Today students will first make the graphs by hand so they can draw a line of best fit, and later they will use Chromebooks and plot.ly to find the regression line and the correlation coefficient. I have students estimate a correlation coefficient first in Questions #1 and #2, but have them hold off on finding the actual correlation coefficient until they've completed Question #8.
Some students in my class will have trouble writing an equation for their own line of best fit. This is an especially challenging task with this data because they are not starting from Year 0 on the x-axis. When we discuss this problem later in the class, we will spend a lot of time talking about the difference between the technology-generated equation and their own equation. I like this application of real-world data because students learn how to handle different, and sometimes messy, numbers.
Students seem to enjoy comparing their own equations to the equations that plot.ly gives them. They can also compare their estimated correlation coefficient with the technology-generated one.
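For anyone who wants to check the technology's number by hand, here is a minimal sketch of how a Pearson correlation coefficient can be computed. The data values and the student estimate are made up for illustration (they are not the census figures from the task), and I am assuming the tool reports the standard Pearson r.

```python
import math

# Hypothetical data for illustration only; NOT the salary data from the task.
years = [1, 2, 3, 4, 5]
salaries = [2.0, 4.1, 5.9, 8.2, 9.8]


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


estimated_r = 0.9  # a hypothetical student's estimate from the hand-drawn graph
computed_r = pearson_r(years, salaries)

print(f"estimated r = {estimated_r}, computed r = {computed_r:.3f}")
print(f"difference  = {abs(computed_r - estimated_r):.3f}")
```

Students can swap in their own data and estimate to see how close their intuition was.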
Questions #10 and #11 extend the task by adding more data, showing students that a relationship seen in the data at one point in time is not necessarily fixed. Though I think this is an important point, it is not my primary use for this task, so I only assign these questions to students who are working faster than the rest of the class.
There's a lot to discuss with this task. First, I want students to share their graphs and strategies for drawing a line of best fit. Next, I want students to compare the line of best fit they drew with the regression line that was generated by plot.ly (or other technology). We spend some time looking at the regression lines and trying to figure out how the computer decided to put them there. I am looking for students to talk about creating the least amount of distance between each point and the line so they begin to get the idea of a residual for the next lesson.
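The "least distance" idea can be sketched in a few lines of code. This assumes the technology uses ordinary least squares, the usual default for a linear trendline, which picks the line with the smallest sum of squared vertical distances (residuals). The data and the hand-drawn line below are hypothetical, chosen only to illustrate the comparison.

```python
# Hypothetical data for illustration only; NOT the salary data from the task.
years = [1, 2, 3, 4, 5]
salaries = [2.0, 4.1, 5.9, 8.2, 9.8]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The least-squares line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """Add up the squared vertical gaps between each point and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


fit_slope, fit_intercept = least_squares_line(years, salaries)
hand_slope, hand_intercept = 2.0, 0.5  # a hypothetical student's hand-drawn line

fit_error = sum_squared_residuals(years, salaries, fit_slope, fit_intercept)
hand_error = sum_squared_residuals(years, salaries, hand_slope, hand_intercept)

print(f"regression line: y = {fit_slope:.2f}x + {fit_intercept:.2f}")
print(f"squared-residual total, regression line: {fit_error:.3f}")
print(f"squared-residual total, hand-drawn line: {hand_error:.3f}")
```

No hand-drawn line can produce a smaller total than the regression line, which gives students a concrete reason for where the computer put it.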
Next we talk about the equations they wrote for their lines of best fit and the equations for the regression lines. Again, we need to figure out where the y-intercept would be and what x would be, depending on whether we use Year 0 as the starting point (like the technology will) or 1991 as the first year. We use Question #9 to explore this idea, figuring out what value we should put in for x if the year we want to make a prediction about is 2015. Students often find this piece confusing, so we might try a few different examples to make the idea clear.
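A small sketch can show why both starting points describe the same line. The slope and intercept here are made-up values, not the regression results from the task; the point is only that shifting the starting year changes the y-intercept and the x we plug in, but not the prediction.

```python
# Hypothetical regression equation with x as the calendar year.
# These numbers are illustrative only, not results from the task's data.
slope = 500.0
intercept_year0 = -980000.0  # value of the line at x = 0 ("Year 0")

BASE_YEAR = 1991

# The same line rewritten with t = years since 1991.
# Its y-intercept is simply the line's value at the year 1991.
intercept_1991 = slope * BASE_YEAR + intercept_year0


def predict_from_calendar_year(year):
    """Prediction using the calendar year as x."""
    return slope * year + intercept_year0


def predict_from_years_since_1991(t):
    """Prediction using years since 1991 as x."""
    return slope * t + intercept_1991


target_year = 2015
t = target_year - BASE_YEAR  # x-value to use when 1991 is the starting point

print(f"x = {target_year}: {predict_from_calendar_year(target_year)}")
print(f"x = {t}: {predict_from_years_since_1991(t)}")
```

With 1991 as the first year, the value to put in for 2015 is 2015 - 1991 = 24; with Year 0 as the starting point, it is 2015 itself.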
Students have put a bunch of different concepts together today so I want to give them an opportunity to reflect on their work. I might ask them to complete a prompt like, "The most important thing I learned in class today was...."
Making More $ is © 2012 Mathematics Vision Project | MVP, in partnership with the Utah State Office of Education, and is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported license.