Data Collection and Statistics Review — Grade 8

Classify each as primary or secondary data:
1. A student surveys 30 classmates about their favourite sport.
2. A researcher uses rainfall data published by the Bureau of Meteorology.
3. A teacher records the test scores of every student in their class.
4. A journalist uses unemployment figures from the ABS website.

For each scenario, identify the population and the sample:
1. A factory tests 80 of every 2000 bottles for defects.
2. A city council surveys 600 of its 120 000 residents.

Classify each sampling method as random, systematic, stratified, or convenience:
1. Every 8th customer entering a store is surveyed.
2. Names are drawn from a hat.
3. A researcher surveys 20 students from each year level to match school proportions.
4. A student asks the first 10 friends she sees.

Identify one potential bias or misleading feature in each scenario:
1. A bar graph comparing sales uses a y-axis that starts at 950 instead of 0.
2. A pie chart shows four categories but the percentages add up to 105%.
3. A survey question asks: “Don’t you agree that our product is excellent?”

A histogram shows the distribution of students’ test scores. Which measure of centre (mean, median, or mode) would be most appropriate to report if the distribution is strongly skewed to the right? Explain your reasoning.

A company creates a graph where the pictograph symbols for “Year 2” are twice as tall AND twice as wide as those for “Year 1,” even though sales only doubled. Why is this misleading?

The data below shows the number of hours of weekly exercise reported by two groups of 10 adults.

Group A: 3, 5, 2, 7, 4, 6, 5, 3, 8, 2
Group B: 1, 10, 2, 9, 3, 8, 2, 10, 1, 9

Calculate the mean for each group.
Find the median for each group.
Find the mode for each group.
Find the range for each group.
The means are close (they differ by only 1 hour). Does this mean the groups exercise similarly? Explain using the range.
Which measure, the mean or the range, better reveals the difference between the two groups?
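
Optional check: once you have worked the answers by hand, a short Python sketch like the one below can confirm them. It uses the standard `statistics` module; the helper name `summarise` is just an illustrative choice, not part of any library. Note that `multimode` lists every value tied for most frequent, so a group may have more than one mode.

```python
from statistics import mean, median, multimode

group_a = [3, 5, 2, 7, 4, 6, 5, 3, 8, 2]
group_b = [1, 10, 2, 9, 3, 8, 2, 10, 1, 9]

def summarise(data):
    """Return the mean, median, mode(s) and range of a list of numbers."""
    return {
        "mean": mean(data),
        "median": median(data),
        "modes": sorted(multimode(data)),  # every value tied for most frequent
        "range": max(data) - min(data),    # largest value minus smallest value
    }

summary_a = summarise(group_a)
summary_b = summarise(group_b)
print("Group A:", summary_a)
print("Group B:", summary_b)
```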

A coin is flipped multiple times. The relative frequency of heads is recorded after each set of flips:
10 flips: 0.30 | 50 flips: 0.42 | 100 flips: 0.47 | 500 flips: 0.50 | 1000 flips: 0.499
1. What is the theoretical probability of heads?
2. Describe the trend in the relative frequency as the number of flips increases.
3. After 10 flips the relative frequency was 0.30. Does this mean the coin is unfair? Explain.
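
Optional extension: the coin experiment can be imitated on a computer. The sketch below assumes a perfectly fair simulated coin and uses Python's standard `random` module; the function name `relative_frequency_of_heads` and the seed value are arbitrary choices made for illustration. Each run size is a fresh set of flips, so the numbers will not match the table above, but the same kind of trend should be visible.

```python
import random

def relative_frequency_of_heads(num_flips, rng):
    """Flip a simulated fair coin num_flips times; return heads / flips."""
    heads = sum(1 for _ in range(num_flips) if rng.random() < 0.5)
    return heads / num_flips

# Fixed seed (an arbitrary choice) so that rerunning gives the same numbers.
rng = random.Random(8)

results = {n: relative_frequency_of_heads(n, rng) for n in (10, 50, 100, 500, 1000)}
for n, freq in results.items():
    print(f"{n:>5} flips: relative frequency of heads = {freq:.3f}")
```

Try changing the seed or adding a much larger number of flips, and compare how far the small and large experiments stray from the theoretical probability.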

A bag contains red and blue marbles. A student draws a marble, records its colour, replaces it, and repeats this process.
- After 5 draws: 4 red, 1 blue (relative frequency of red = 0.80)
- After 20 draws: 14 red, 6 blue (relative frequency of red = 0.70)
- After 100 draws: 66 red, 34 blue (relative frequency of red = 0.66)
1. What does the relative frequency appear to be approaching?
2. Estimate the actual proportion of red marbles in the bag.
3. If the bag contains 30 marbles, estimate how many are red.
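
Optional check: the arithmetic behind the marble questions can be sketched in Python as follows. It assumes that the relative frequency from the largest experiment (100 draws) is the best available estimate of the true proportion; the variable names are illustrative only.

```python
# (total draws, red draws) taken from the three results listed above
draws = [(5, 4), (20, 14), (100, 66)]

# Relative frequency of red after each stage of the experiment.
relative_frequencies = [red / total for total, red in draws]

# Assumption: the largest experiment gives the most reliable estimate.
best_estimate = relative_frequencies[-1]

bag_size = 30
# A count of marbles must be a whole number, so round to the nearest one.
estimated_red = round(best_estimate * bag_size)

print("Relative frequencies:", relative_frequencies)
print("Estimated red marbles in a bag of", bag_size, "is", estimated_red)
```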

A school wants to find out what students think about the school canteen. The principal surveys 20 students who are eating in the canteen at lunchtime.
1. Identify two problems with this sampling method.
2. Suggest a better sampling method that would give more representative results. Explain your choice.
3. If the school has 800 students and the principal wants a stratified sample of 80 students across Years 7–10 (200 students each), how many students from each year level should be surveyed?
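
Optional check: in a stratified sample, each group's share of the sample matches its share of the population. The Python sketch below shows one way to compute that; `stratified_allocation` is a made-up helper name, and it rounds down to whole students, so the totals are only exact when the shares divide evenly (as they do for this school).

```python
def stratified_allocation(stratum_sizes, sample_size):
    """Share sample_size across strata in proportion to each stratum's size."""
    population = sum(stratum_sizes.values())
    return {
        # Whole students only: integer division rounds down if the share is not exact.
        name: sample_size * size // population
        for name, size in stratum_sizes.items()
    }

year_levels = {"Year 7": 200, "Year 8": 200, "Year 9": 200, "Year 10": 200}
allocation = stratified_allocation(year_levels, 80)
print(allocation)
```

If the year levels had different sizes, the same function would give larger year levels a larger share of the sample.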

A researcher compares two data sets: the heights (cm) of students from two different schools.
School X (10 students): 155, 162, 158, 170, 163, 157, 168, 160, 165, 172
School Y (10 students): 148, 175, 152, 178, 150, 176, 149, 174, 153, 175
1. Calculate the mean and range for each school.
2. Which school has more consistent heights? How do you know?
3. The researcher plans to use these results to make conclusions about all students in Queensland. Identify one limitation of this approach.
