Grade: 11
Subject: Mathematics
Unit: Advanced Data Analysis
SAT: Problem Solving and Data Analysis
ACT: Math

Guided Practice: Data Analysis

Overview

This guided practice lesson helps you apply concepts from regression analysis and statistical inference through structured problem-solving. The worked examples show each step of the solution, and the practice problems let you apply the same steps on your own.

Skills Reinforced

  • Calculating and interpreting correlation coefficients
  • Writing and using regression equations
  • Making predictions using linear models
  • Understanding residuals and their meaning
  • Drawing statistical conclusions from sample data

Worked Examples

Example 1: Interpreting Correlation

Problem: A study found r = 0.85 between hours of study and test scores for 50 students. Interpret this value.

Solution:

  1. The correlation coefficient r = 0.85 is positive, indicating a positive relationship.
  2. The magnitude is close to 1, indicating a strong relationship.
  3. r squared = 0.85 x 0.85 = 0.7225, meaning about 72% of the variation in test scores can be explained by the linear relationship with hours of study.
  4. Interpretation: There is a strong positive linear relationship between study hours and test scores. Students who study more tend to score higher.
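
Steps 1 through 3 can be checked numerically. The following is a minimal Python sketch; the function name summarize_correlation is hypothetical and chosen only for illustration. It reports the direction of the relationship and r squared, and leaves the judgment of "strong" or "weak" to you.

```python
def summarize_correlation(r):
    """Return the direction of a correlation and its r squared value."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"
    r_squared = r ** 2  # coefficient of determination
    return direction, r_squared


direction, r_squared = summarize_correlation(0.85)
print(direction)                  # positive
print(f"{r_squared:.4f}")         # 0.7225
print(f"{r_squared * 100:.2f}%")  # 72.25%
```

The same function can be reused to check any practice problem that gives a value of r.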

Example 2: Using a Regression Equation

Problem: The regression equation for predicting test score (y) from study hours (x) is: y = 52 + 4.5x. Predict the score for a student who studies 8 hours.

Solution:

  1. Substitute x = 8 into the equation: y = 52 + 4.5(8)
  2. Calculate: y = 52 + 36 = 88
  3. Answer: The predicted test score is 88 points.
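
The substitution above can also be written as a short Python sketch. The function name predict_score and its parameter names are assumptions made for illustration; the intercept (52) and slope (4.5) come from the equation in the problem.

```python
def predict_score(hours, intercept=52.0, slope=4.5):
    """Predicted test score from the line y = intercept + slope * hours."""
    return intercept + slope * hours


print(predict_score(8))  # 88.0
```

In this model, the slope means each additional hour of study is associated with a predicted score that is 4.5 points higher, and the intercept is the predicted score for a student who studies 0 hours.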

Practice Problems

Work through each problem, showing all steps. Check your answers at the end of this section.

Problem 1

A researcher collects data on the relationship between daily temperature (in degrees F) and ice cream sales (in dollars). The correlation coefficient is r = 0.92. Describe the relationship.

Problem 2

Given the regression equation y = 120 + 15x, where y is ice cream sales (in dollars) and x is temperature (in degrees F), predict sales when the temperature is 75 degrees F.

Problem 3

Using the same regression equation from Problem 2, if actual sales were $1,300 on a 75-degree day, calculate the residual.

Problem 4

A dataset has r = -0.78 between hours of TV watched per day and GPA. Interpret this correlation and calculate r squared.

Problem 5

The slope of a regression line is 2.3 and the y-intercept is 15. Write the regression equation and find y when x = 10.

Problem 6

A sample of 400 voters shows 58% support for a policy. The margin of error at the 95% confidence level is 4.8%. Construct the 95% confidence interval.

Problem 7

For the confidence interval (0.532, 0.628), what is the sample proportion and the margin of error?

Problem 8

Two variables have r = 0.45. Calculate the coefficient of determination and explain what it means.

Problem 9

A regression equation predicts y = 80 for a given x value. If the actual y value is 73, is the residual positive or negative? Calculate it.

Problem 10

Data shows a correlation of r = 0.95 between the number of firefighters at a fire and the amount of damage. Does this prove firefighters cause damage? Explain.

Answer Key

1. Strong positive linear relationship. As temperature increases, ice cream sales tend to increase. r squared = 0.8464 (84.64% of variation explained).

2. y = 120 + 15(75) = 120 + 1125 = $1,245

3. Residual = Actual - Predicted = 1300 - 1245 = $55 (positive residual means actual was higher than predicted)

4. Moderately strong negative linear relationship. More hours of TV per day are associated with lower GPA. r squared = 0.6084 (60.84% of the variation in GPA explained).

5. y = 15 + 2.3x; When x = 10: y = 15 + 2.3(10) = 15 + 23 = 38

6. 58% plus or minus 4.8% gives (53.2%, 62.8%) or (0.532, 0.628)

7. Sample proportion = (0.532 + 0.628) / 2 = 0.58 (58%); Margin of error = (0.628 - 0.532) / 2 = 0.048 (4.8%)

8. r squared = 0.2025 (20.25%). Only about 20% of the variation in y is explained by the linear relationship with x, so the relationship is fairly weak and predictions made from x will be imprecise.

9. Residual = 73 - 80 = -7. The residual is negative because the actual value is less than predicted.

10. No, correlation does not imply causation. The size of the fire is a lurking (confounding) variable: larger fires both require more firefighters and cause more damage.
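
The arithmetic in answers 2, 3, 6, 7, and 9 can be double-checked with a short script. This is only a sketch for self-checking; the helper names (predict, residual, interval_from_estimate, estimate_from_interval) are hypothetical and not part of the lesson.

```python
def predict(x, intercept, slope):
    """Predicted y from the line y = intercept + slope * x."""
    return intercept + slope * x


def residual(actual, predicted):
    """Residual = actual - predicted."""
    return actual - predicted


def interval_from_estimate(p_hat, margin):
    """Confidence interval as (estimate - margin, estimate + margin)."""
    return p_hat - margin, p_hat + margin


def estimate_from_interval(lower, upper):
    """Recover the sample proportion (midpoint) and margin (half-width)."""
    return (lower + upper) / 2, (upper - lower) / 2


predicted_sales = predict(75, intercept=120, slope=15)
print(predicted_sales)                  # 1245  (answer 2)
print(residual(1300, predicted_sales))  # 55    (answer 3)
print(residual(73, 80))                 # -7    (answer 9)

lower, upper = interval_from_estimate(0.58, 0.048)
print(round(lower, 3), round(upper, 3))  # 0.532 0.628  (answer 6)

p_hat, margin = estimate_from_interval(0.532, 0.628)
print(round(p_hat, 3), round(margin, 3))  # 0.58 0.048  (answer 7)
```

Rounding to three decimal places keeps small floating-point differences from showing up in the printed results.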

Next Steps

  • Review any problems you found challenging
  • Practice identifying when to use each formula
  • Move on to Word Problems to apply these skills in real-world contexts
  • Keep a formula sheet handy for quick reference