Grade: 8 Subject: Math Unit: Data Analysis Lesson: 5 of 6 SAT: ProblemSolving+DataAnalysis ACT: Math

Common Mistakes

Overview

Learn to recognize the most common errors students make when working with scatter plots, correlations, and lines of best fit. Knowing how each mistake arises will help you avoid it on tests and assignments.

Practice Problems

Question 1: A student sees a strong correlation between shoe size and vocabulary test scores in a school. They conclude bigger feet cause better vocabulary. What is the error?

Answer: Confusing correlation with causation

Age is the lurking variable: older students have bigger feet AND larger vocabularies. Correlation alone does not prove that one variable causes changes in another.

Question 2: A line of best fit has equation y = 3x + 10. A student calculates y when x = 5 as y = 3(5) = 15. Find the error.

Answer: Forgot to add the y-intercept

Correct calculation: y = 3(5) + 10 = 15 + 10 = 25. Always remember both the slope term AND the y-intercept.
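
The same calculation can be sketched in a few lines of Python. The function name predict and the constants SLOPE and INTERCEPT are illustrative names chosen here, not part of the lesson.

```python
# Line of best fit from Question 2: y = 3x + 10
SLOPE = 3
INTERCEPT = 10

def predict(x):
    """Return the predicted y-value: slope term plus y-intercept."""
    return SLOPE * x + INTERCEPT

mistake = SLOPE * 5    # the student's version: y-intercept left out
correct = predict(5)   # slope term AND y-intercept

print("Student's answer:", mistake)
print("Correct answer:", correct)
```

The two results differ by exactly the y-intercept, which is a quick way to spot this error.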

Question 3: A scatter plot goes down from left to right. A student says this shows a weak correlation. Is this correct?

Answer: No - direction and strength are different properties

Going down shows NEGATIVE correlation (direction), but says nothing about strength. A tight downward pattern is a strong negative correlation.
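
To see direction and strength as separate properties, here is a minimal sketch that computes the correlation coefficient r for two downward-sloping data sets. Both data sets are invented for illustration, and pearson_r is a helper written for this example rather than anything from the lesson.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for paired lists xs and ys."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data, invented for illustration only
xs = [1, 2, 3, 4, 5, 6]
tight_ys = [12, 10, 8.1, 5.9, 4, 2]   # points hug a downward line
loose_ys = [9, 3, 10, 2, 8, 1]        # downward overall, but scattered

r_tight = pearson_r(xs, tight_ys)
r_loose = pearson_r(xs, loose_ys)
print(round(r_tight, 2), round(r_loose, 2))
```

Both values of r come out negative (same direction), but the tight data set gives a value much closer to -1 than the scattered one (different strength).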

Question 4: Data covers ages 10-18. A student uses the line of best fit to predict outcomes for a 50-year-old. What mistake is this?

Answer: Extrapolating far beyond the data range

Predictions are only reliable within or near the range of the original data. The relationship may not hold at extreme values.
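
One simple safeguard is to check an x-value against the data range before trusting a prediction. The sketch below uses the 10-18 age range from the question. The function name is_extrapolation is hypothetical, and the check is deliberately strict: deciding how far "near the range" still counts as reliable is a judgment call it does not try to make.

```python
def is_extrapolation(x, x_min, x_max):
    """True when x lies outside the range of x-values the line was fitted on."""
    return x < x_min or x > x_max

AGE_MIN, AGE_MAX = 10, 18   # data range from Question 4

for age in (14, 18, 50):
    if is_extrapolation(age, AGE_MIN, AGE_MAX):
        label = "extrapolation"
    else:
        label = "within the data range"
    print(age, "->", label)
```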

Question 5: A student draws a line through the first and last points of a scatter plot. Why is this not necessarily the line of best fit?

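Answer: The line of best fit minimizes the overall vertical distance to ALL points

The first and last points might be outliers. The line of best fit considers all data points, not just endpoints. In least-squares regression, it is the line with the smallest possible sum of squared vertical distances from the points to the line.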
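
The comparison can be sketched numerically. This example assumes "best fit" means the least-squares line, which is the usual definition but is not stated in the lesson, and it uses invented data in which the last point sits well above the trend. All function names are illustrative.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def endpoints_line(xs, ys):
    """Return (slope, intercept) of the line through the first and last points."""
    slope = (ys[-1] - ys[0]) / (xs[-1] - xs[0])
    return slope, ys[0] - slope * xs[0]

def sum_squared_residuals(xs, ys, slope, intercept):
    """Total of (actual - predicted)^2 over every data point."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: the last point is an outlier
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 20]

fit_slope, fit_intercept = least_squares_line(xs, ys)
end_slope, end_intercept = endpoints_line(xs, ys)

sse_fit = sum_squared_residuals(xs, ys, fit_slope, fit_intercept)
sse_end = sum_squared_residuals(xs, ys, end_slope, end_intercept)
print("least squares:", sse_fit, "endpoints:", sse_end)
```

The endpoints line passes exactly through two points but misses the three in between by a wide margin, so its total squared error is larger.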

Question 6: A residual is calculated as Predicted - Actual. What is wrong with this?

Answer: The formula is reversed

Residual = Actual - Predicted. A positive residual means the actual point is ABOVE the line; a negative residual means it is BELOW the line.
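
A short sketch of the sign convention, reusing the line y = 3x + 10 from Question 2. The two actual values (28 and 21) are made up for illustration.

```python
def residual(actual, predicted):
    """Residual = Actual - Predicted."""
    return actual - predicted

predicted = 3 * 5 + 10   # the line y = 3x + 10 from Question 2, at x = 5

above = residual(28, predicted)   # hypothetical actual value of 28
below = residual(21, predicted)   # hypothetical actual value of 21
print(above, below)
```

The point at 28 lies above the line and gets a positive residual. The point at 21 lies below the line and gets a negative one.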

Question 7: A student sees r = -0.9 and says this is a weak correlation because it's negative. Correct this misconception.

Answer: -0.9 is a STRONG negative correlation

The sign shows direction (positive or negative). The absolute value shows strength: the closer |r| is to 1, the stronger the correlation. Here |−0.9| = 0.9, which is very strong.
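
The two-step reading of r (sign first, then absolute value) can be sketched as a small function. The cutoffs of 0.7 and 0.3 are an assumed rule of thumb for this example only. Textbooks differ on where "strong" and "moderate" begin, and the lesson does not give cutoffs.

```python
def describe_r(r, strong=0.7, moderate=0.3):
    """Describe r: the sign gives direction, the absolute value gives strength.

    The default cutoffs are an assumed rule of thumb, not a fixed standard.
    """
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size >= strong:
        strength = "strong"
    elif size >= moderate:
        strength = "moderate"
    else:
        strength = "weak"
    return strength + " " + direction

for r in (-0.9, 0.9, -0.2):
    print(r, "->", describe_r(r))
```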

Question 8: A student removes an outlier to make their line fit better. When is this appropriate vs. inappropriate?

Answer: Only remove outliers if there's evidence of measurement error

Removing outliers just to improve fit is data manipulation. Outliers may represent real, important data points.

Question 9: The slope of a line is 2.5. A student says "y increases by 2.5 every time." What's missing from this statement?

Answer: "...for every 1 unit increase in x"

Slope is rate of change: rise over run. You must specify the change in BOTH variables for a complete interpretation.
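
As a quick illustration, the change in y depends on how much x changes, not only on the slope. The function name change_in_y is illustrative.

```python
def change_in_y(slope, change_in_x):
    """Predicted change in y for a given change in x along the line."""
    return slope * change_in_x

SLOPE = 2.5   # slope from Question 9

for dx in (1, 2, 4):
    print("x increases by", dx, "-> y increases by", change_in_y(SLOPE, dx))
```

Only the first case, a 1-unit increase in x, gives the 2.5 that the student quoted.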

Question 10: A student plots (year, sales) with years 2015-2023 on the x-axis. The y-intercept is negative. They say "in year 0, sales were negative." What's wrong?

Answer: The y-intercept has no meaningful interpretation here

Year 0 is far outside the 2015-2023 data range, so reading the y-intercept as a prediction is extrapolation. Here the y-intercept is a mathematical artifact of where the line crosses the y-axis, not a meaningful prediction.
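
One way to see that the intercept is an artifact is to fit the same data twice: once with the raw year on the x-axis, and once with "years since 2015". The sales figures below are invented for illustration, and measuring x from 2015 is a technique swapped in for this example, not something the lesson prescribes.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

years = list(range(2015, 2024))                 # 2015 through 2023
sales = [12, 15, 19, 22, 24, 29, 31, 36, 38]    # hypothetical sales, in thousands

# Fit 1: x is the calendar year, so the intercept sits at "year 0"
slope_raw, intercept_raw = least_squares_line(years, sales)

# Fit 2: x is years since 2015, so the intercept sits at the year 2015
years_since_2015 = [year - 2015 for year in years]
slope_shifted, intercept_shifted = least_squares_line(years_since_2015, sales)

print("raw years:        slope", round(slope_raw, 2), "intercept", round(intercept_raw, 2))
print("years since 2015: slope", round(slope_shifted, 2), "intercept", round(intercept_shifted, 2))
```

The slope is the same in both fits, so the trend has not changed. Only the intercept moves. With raw years it is a large negative number, while with years since 2015 it is the line's predicted sales for 2015, which lies inside the data range.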