You may be right... but you may be wrong



SWBAT determine possible associations from bivariate data in self-constructed scatter plots.

Big Idea

There may be no correlation between apparently unrelated data... but there might be.


15 minutes

Before getting into the main theme of the lesson, I want my students to have some time to think about the difference between correlation and causation. I do this in two steps.

Step 1: I hand each student an Entrance Card. Each student analyzes the graph, reads the corresponding text, and discusses their opinion with a shoulder partner. Students discuss why they agree or disagree with the math student's conclusion. As I wind around the tables, I look out for a pair that disagrees so I can call on them to explain why. Students should say that a correlation between two sets of data doesn't mean one causes the other; playing sports does not necessarily mean you will catch a cold. The students in the survey probably caught a cold for other reasons.

Step 2: I ask students to write the words correlation and causation in large letters, one on the front and one on the back of a sheet of notebook paper. Next, I show brief descriptions of possibly related variables using the resource Correlation or Causation. For each description, every student raises one of the two words over their head. During this activity I want to assess how clearly students understand the difference between these concepts: any two sets of data can show a correlation, but it is causation only when it is clear that one event causes the other.


25 minutes

Our next activity can work well with random pairs of students. Before we begin, we first need to complete the Student Data Sheet. I pass this sheet around and ask each student to fill out an entire row of data. I say, "Please try to be as accurate as possible." Once the sheet is completed, I place it on the document camera for all to see, or copies can be made and given to the groups. (See Completed Student data sheet.jpg.)

I ask each pair of students to choose any two columns of data to work with in this activity. Then, I ask them to predict the correlation that they think exists between these two variables. I'll say something like, "Of course, you'll want to start by thinking about whether the correlation is positive, negative, or nonexistent." I want the process of choosing the data to be engaging to students. At the same time, I want them to build on what has come before.

In this task, my students will be plotting bivariate data by hand. For some it will be the first time in a while. So, I want them to be motivated to see if their initial ideas are correct. At the same time, I will be working hard to make sure that they are working carefully (MP6).

Common dilemmas during this task:

  1. Students may not make reasonable scales for their x- and y-axes. Plotted points may appear too close together, making it difficult to see the trend and draw a best fit line. I look out for this and help students scale the axes conveniently. I also remind students that scales need not begin at 0; they can choose a convenient value to start and end with.
  2. Another issue that comes up is which variable to choose as the independent variable. I generally explain to my students that the x-axis data, or independent variable, is the set they believe affects the other set (the dependent variable). This is best understood through an example: if you say #hrs of sleep affects height, then #hrs of sleep is the independent variable and goes on the x-axis.
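For teachers who want a quick way to suggest a convenient axis scale, here is a minimal Python sketch. It is only one possible approach, not part of the original lesson: the function name `axis_scale`, the default of five intervals, and the sample grades are all assumptions made up for illustration. It rounds the step up to a "nice" value (1, 2, or 5 times a power of ten) and shows that the axis need not start at 0.

```python
import math

def axis_scale(values, intervals=5):
    """Suggest (start, end, step) for an axis covering `values`.

    Hypothetical helper: the step is rounded up to 1, 2, 5, or 10 times
    a power of ten so that tick labels are easy to read.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: widen the range so there is something to plot.
        lo, hi = lo - 1, hi + 1
    raw_step = (hi - lo) / intervals
    magnitude = 10 ** math.floor(math.log10(raw_step))
    step = 10 * magnitude  # fallback; the loop below normally picks a smaller step
    for multiplier in (1, 2, 5, 10):
        if raw_step <= multiplier * magnitude:
            step = multiplier * magnitude
            break
    start = math.floor(lo / step) * step  # round the minimum down to a tick
    end = math.ceil(hi / step) * step     # round the maximum up to a tick
    return start, end, step

# Made-up exam grades, not real student data.
grades = [62, 70, 65, 74, 80, 78, 85]
print(axis_scale(grades))
```

With these made-up grades the suggested axis starts near the smallest grade rather than at 0, which spreads the points out and makes the trend easier to see.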

Once the scatter plot is complete (one plot per pair), the students should work together to draw a best fit line. Here is an example: Completed student scatter plot. My students always like to talk about whether they were right or wrong with their guesses. I take advantage of the momentum this energy gives the class and I use it to flow right into our Closure section. 
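If I want to check a pair's hand-drawn line against a computed one, a short script can help. The sketch below is an optional extra and assumes a least-squares line, which students' hand-drawn lines will only approximate. The function name `best_fit_and_r` and the data values are invented for illustration and do not come from the Student Data Sheet.

```python
from math import sqrt

def best_fit_and_r(xs, ys):
    """Return (slope, intercept, r) for paired data.

    slope and intercept describe the least-squares line y = slope * x + intercept;
    r is the Pearson correlation coefficient.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Made-up sample data: hours of sleep (x) and math exam grade (y).
hours_of_sleep = [5, 6, 6, 7, 8, 8, 9]
exam_grade = [62, 70, 65, 74, 80, 78, 85]

slope, intercept, r = best_fit_and_r(hours_of_sleep, exam_grade)
print(f"best fit line: y = {slope:.2f}x + {intercept:.2f}")
print(f"correlation coefficient r = {r:.2f}")
```

A positive slope and an r close to 1 indicate a strong positive correlation. Neither number says anything about whether one variable causes the other.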


15 minutes

Students are usually eager not only to voice their results, but also to see the scatter plots and best fit lines that their classmates created (see my reflection Relevance Helps). I call on volunteers to come up to the document camera and show the class their scatter plot. Each pair should get a couple of minutes to first state what they predicted and then interpret the result of their scatter plot. Depending on the trend of the graph, I ask the class, "Do you think there is causation here, or just correlation?" I let students speak their minds if this stirs debate. Students should demonstrate that they understand the difference by naming other factors involved in the relationship, showing that one set of data doesn't clearly cause the other.

Here is an example. A group obtained a positive correlation when comparing #hours of sleep to math exam grade (see Completed student scatter plot). In this case the students should state that more sleep does not necessarily result in a better grade. The student's grade could be a result of studying harder, having done all of their homework, having had tutoring the week before the test, and so on. Since there are multiple possible explanations, we cannot assume causation even though there is a correlation.


To help clarify the difference between causation and correlation, I chose this HOMEWORK assignment for the evening. I plan to go over some parts of it at the start of the next lesson.