Grade: 11 | Subject: Science | Unit: Advanced Science | SAT: Problem Solving and Data Analysis | ACT: Science

Data and Graphs

Learn

Data analysis is a critical skill in science. This lesson covers how to organize data, choose appropriate graphs, interpret visualizations, and apply basic statistical analysis to draw valid conclusions.

Types of Data

  • Quantitative Data: Numerical measurements (e.g., temperature, mass, time)
  • Qualitative Data: Descriptive observations (e.g., color, texture, behavior)
  • Continuous Data: Can take any value within a range (e.g., height: 165.3 cm)
  • Discrete Data: Only specific values possible (e.g., number of leaves: 12)

Choosing the Right Graph

Graph Type | Best Used For | Example
Line Graph | Showing change over continuous time or relationship between continuous variables | Temperature vs. time, population growth
Bar Graph | Comparing discrete categories or groups | Growth of plants with different fertilizers
Scatter Plot | Showing correlation between two variables | Height vs. arm span, studying vs. test scores
Histogram | Showing distribution of continuous data | Distribution of heights in a population
Pie Chart | Showing parts of a whole (percentages) | Composition of gases in atmosphere
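
A histogram differs from a bar graph in that continuous values must first be grouped into intervals (bins) before they are counted. The following is a minimal sketch of that grouping step using only the Python standard library. The height values and the 10 cm bin width are invented for illustration and are not part of the lesson.

```python
from collections import Counter

# Hypothetical heights (cm), invented for illustration only
heights = [151.2, 158.9, 162.4, 165.3, 166.0, 168.7, 171.5, 173.1, 179.8, 184.6]
bin_width = 10  # assumed interval width; pick one that suits your data

# Map each value to the lower edge of its bin, e.g. 165.3 -> 160
counts = Counter(int(h // bin_width) * bin_width for h in heights)

# Print a simple text histogram, one row per bin
for lower in sorted(counts):
    print(f"{lower}-{lower + bin_width} cm: {'#' * counts[lower]}")
```

Each row of the printout corresponds to one bar of the histogram. Changing `bin_width` changes how detailed the distribution looks.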

Graph Components

Every properly constructed graph should include:

  • Title: Describes what the graph shows, usually in the form "Dependent Variable vs. Independent Variable"
  • Axis Labels: Clear labels with units of measurement
  • Scale: Appropriate intervals that make data easy to read
  • Legend/Key: Identifies different data series if multiple are shown
  • Data Points: Clearly marked; use different symbols for different groups

Basic Statistical Analysis

Mean (Average): Sum of all values divided by the number of values

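Median: Middle value when the data are arranged in order; with an even number of values, the average of the two middle values

Mode: Most frequently occurring value; a dataset can have more than one mode, or none if no value repeats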

Range: Difference between highest and lowest values

Standard Deviation: Measure of how spread out data is from the mean
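
As a rough illustration, the five measures above can be computed with Python's built-in `statistics` module. The dataset here is hypothetical and was chosen only because its results are easy to check by hand. The sketch uses the population standard deviation (`pstdev`), which treats the data as the whole group; `statistics.stdev` would give the slightly larger sample version.

```python
import statistics

# Hypothetical dataset, invented for illustration only
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)          # sum of values / number of values
median = statistics.median(values)      # average of the two middle values here
modes = statistics.multimode(values)    # list of the most frequent value(s)
data_range = max(values) - min(values)  # highest value minus lowest value
spread = statistics.pstdev(values)      # population standard deviation

print("mean:", mean)
print("median:", median)
print("mode(s):", modes)
print("range:", data_range)
print("standard deviation:", spread)
```

For this dataset the mean is 5, the median is 4.5, the only mode is 4, the range is 7, and the population standard deviation is 2.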

Interpreting Trends

  • Positive Correlation: As one variable increases, the other increases
  • Negative Correlation: As one variable increases, the other decreases
  • No Correlation: No apparent relationship between variables
  • Outliers: Data points that fall significantly outside the general pattern
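
The lesson describes correlation visually, but one common way to put a number on it is the Pearson correlation coefficient r, which ranges from -1 (perfect negative) to +1 (perfect positive). The sketch below is one possible implementation. The datasets are hypothetical, and the 0.3 cutoff used to label the result is an arbitrary choice for illustration, not a standard.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists.

    Assumes both lists contain at least two values and are not constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def describe(r, threshold=0.3):
    """Label r using an illustrative (assumed) cutoff."""
    if r >= threshold:
        return "positive correlation"
    if r <= -threshold:
        return "negative correlation"
    return "little or no correlation"

# Hypothetical data, invented for illustration only
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [10, 8, 7, 5, 3]
flat = [5, 3, 6, 3, 5]

for name, y in [("rising", rising), ("falling", falling), ("flat", flat)]:
    r = pearson_r(x, y)
    print(f"{name}: r = {r:.2f} ({describe(r)})")
```

A value of r near zero only rules out a straight-line relationship, so it is still worth looking at the scatter plot itself.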

Examples

Example 1: Calculating Mean and Identifying Outliers

Data: Plant heights (cm): 12, 14, 13, 15, 14, 42, 13, 14

Mean: (12+14+13+15+14+42+13+14) / 8 = 137 / 8 ≈ 17.1 cm

Median: Arranged in order: 12, 13, 13, 14, 14, 14, 15, 42. With eight values, the median is the average of the 4th and 5th values: (14+14)/2 = 14 cm

Outlier: 42 cm is significantly higher than other values and skews the mean. The median (14 cm) better represents the typical plant height.
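
The calculations in this example can be checked with a short script. The lesson identifies the outlier by eye; the sketch below instead uses the 1.5 × IQR rule, which is one common convention for flagging outliers and is an assumption added here, not part of the lesson.

```python
import statistics

heights = [12, 14, 13, 15, 14, 42, 13, 14]  # plant heights (cm) from Example 1

mean_height = statistics.mean(heights)      # 137 / 8 = 17.125
median_height = statistics.median(heights)  # (14 + 14) / 2 = 14

# Assumption: flag values more than 1.5 x IQR beyond the quartiles.
# The lesson itself only says "significantly outside the general pattern".
q1, _, q3 = statistics.quantiles(heights, n=4)
iqr = q3 - q1
low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [h for h in heights if h < low or h > high]

# Recompute the mean without the flagged values to see their effect
trimmed = [h for h in heights if h not in outliers]
mean_without = statistics.mean(trimmed)

print("mean:", mean_height)
print("median:", median_height)
print("outliers:", outliers)
print("mean without outliers:", round(mean_without, 1))
```

Removing the flagged value brings the mean down to about 13.6 cm, much closer to the median, which shows how strongly a single extreme value can pull the mean.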

Example 2: Selecting the Appropriate Graph

Scenario A: Comparing average test scores between 5 different classes

Best Graph: Bar graph (comparing discrete categories)

Scenario B: Tracking a patient's body temperature every hour for 24 hours

Best Graph: Line graph (continuous change over time)

Scenario C: Investigating whether there's a relationship between hours of sleep and reaction time

Best Graph: Scatter plot (correlation between two variables)

Example 3: Interpreting a Graph

Observation: A scatter plot shows data points forming a downward slope from left to right.

Interpretation: This indicates a negative correlation: as the x-variable increases, the y-variable decreases. For example, if the x-axis is "hours spent on social media" and the y-axis is "test score," the graph suggests that increased social media use is associated with lower test scores.

Caution: Correlation does not imply causation. Other factors could explain the relationship.

Practice

Apply your data analysis skills to these problems.

1. Calculate the mean, median, and range for this dataset: 23, 27, 25, 29, 24, 26, 28, 25

2. A student wants to show how the population of bacteria in a culture changes over 48 hours. What type of graph should they use and why?

3. Data set A has a mean of 50 and standard deviation of 2. Data set B has a mean of 50 and standard deviation of 15. What does this tell you about the two data sets?

4. A scatter plot shows points clustered around a horizontal line. What type of correlation does this indicate?

5. Identify three errors in this graph description: "A graph with no title, x-axis labeled 'time' without units, y-axis labeled 'temperature (°C)' with a scale of 0-1000 for data ranging from 20-30 °C."

6. The following data shows enzyme activity at different temperatures: 10 °C: 5 units, 20 °C: 12 units, 30 °C: 25 units, 40 °C: 18 units, 50 °C: 3 units. Describe the trend and suggest what might explain it.

7. A researcher reports that the average income of participants was $150,000, but most participants earned between $30,000 and $60,000. What measure of central tendency would better represent the typical participant?

8. You have data on the number of students who prefer each of five lunch options. What type of graph would best display this data?

9. A study finds a strong positive correlation between ice cream sales and drowning deaths. Does eating ice cream cause drowning? What might be a confounding variable?

10. Why is it problematic to draw a line of best fit through only 3 data points?

11. Data: 5, 7, 6, 5, 8, 7, 6, 5, 7, 6. What is the mode? What does the mode tell you about this dataset?

12. A histogram shows a bell-shaped curve with most data clustered in the middle. What type of distribution is this, and what does it indicate about the data?

Check Your Understanding

Q1: When should you use a bar graph instead of a line graph?

Use a bar graph when comparing discrete categories or groups that don't have a continuous relationship. Use a line graph when showing change over continuous time or a continuous relationship between variables.

Q2: What is the difference between correlation and causation?

Correlation means two variables are associated or change together. Causation means one variable directly causes changes in the other. Correlation does not prove causation: there may be confounding variables, or the relationship may be coincidental.

Q3: Why is the median sometimes preferred over the mean?

The median is preferred when data contains outliers or is skewed, because the mean can be significantly affected by extreme values. The median better represents the "typical" value in such cases.

Next Steps

  • Practice creating graphs from raw data using graphing software or by hand
  • Analyze graphs from scientific articles and identify their strengths and weaknesses
  • Move on to the next lesson: CER Writing