Practice Maths

L54 — Secondary Source Data

Rainfall Data Table — Riverside (fictional city)

Month          Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
Rainfall (mm)   85  102   78   56   41   28   19   23   37   63   74   90
Hot Tip: Correlation is not causation. Two things changing together does not mean one causes the other. Ice cream sales and drowning rates both rise in summer — but eating ice cream does not cause drowning. The real cause is warmer weather.

Worked Example

Use the Riverside rainfall table above: (a) In which month is it driest? (b) What is the total annual rainfall? (c) Identify one limitation.

(a) The smallest value in the table is 19 mm, so July is the driest month.

(b) Sum all 12 months: 85+102+78+56+41+28+19+23+37+63+74+90 = 696 mm

(c) This table shows only one year. Rainfall varies year to year, so a single year may not be typical.
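Parts (a) and (b) can also be checked with a few lines of code. The sketch below is one possible way to do it in Python; the variable names (rainfall_mm, driest_month, total_mm) are made up for this illustration, and the values are copied from the Riverside table.

```python
# Riverside monthly rainfall in mm, copied from the table above.
rainfall_mm = {
    "Jan": 85, "Feb": 102, "Mar": 78, "Apr": 56, "May": 41, "Jun": 28,
    "Jul": 19, "Aug": 23, "Sep": 37, "Oct": 63, "Nov": 74, "Dec": 90,
}

# (a) Driest month: the month whose rainfall value is smallest.
driest_month = min(rainfall_mm, key=rainfall_mm.get)

# (b) Total annual rainfall: add all twelve monthly values.
total_mm = sum(rainfall_mm.values())

print(driest_month, rainfall_mm[driest_month])  # Jul 19
print(total_mm)  # 696
```

Code cannot answer part (c): judging whether one year of data is typical still needs your own reasoning about the source.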

Key Terms

secondary data
data originally collected by someone else — e.g. ABS datasets, newspaper articles, textbooks, websites
credibility
how trustworthy a source is; a credible source has expertise, independence, and transparent methods
currency
how recent the data is; old data may not reflect current conditions
limitation
a factor that restricts how broadly you can apply a dataset's conclusions
correlation
a relationship in which two variables change together; does NOT mean one causes the other
causation
one variable directly causes a change in another; requires strong experimental evidence

What Is Secondary Data?

Secondary data is data collected by someone else, for a purpose that may differ from yours. Sources include government agencies (ABS), scientific journals, news organisations, and websites. Secondary data gives access to large national datasets and historical records you could not collect yourself — but you must evaluate it carefully.

Evaluating a Secondary Source

Ask four questions before trusting any secondary data:

  • Credible? — who collected it? Do they have the expertise and independence to do so reliably?
  • Current? — was it collected recently enough? Population data from 2005 may not reflect today.
  • Relevant? — does this dataset actually answer your question, or is it only loosely related?
  • Purpose? — was it collected to inform or to persuade? Data from a company promoting its own products may be biased.

Limitations of Secondary Data

When you use someone else's data, you inherit all their decisions — what to measure, who to sample, and how to record results. Every dataset has limitations:

  • A survey of 100 students in one Brisbane school cannot tell you about all students in Queensland
  • Data collected in the US may not apply to Australia
  • Old data may not reflect current conditions

Identifying limitations is a sign of strong statistical thinking.

Correlation vs Causation

Secondary data often shows that two things change together (correlation), but this does not mean one causes the other (causation).

  • Ice cream sales and shark attacks both increase in summer — hot weather causes both
  • Countries with higher chocolate consumption tend to have more Nobel Prize winners: this is a spurious correlation, not causation

Use language like "the data shows a relationship between..." rather than "X causes Y" unless there is strong experimental evidence.

Correlation vs causation — the key test: Ask yourself — could a third variable (a confounding factor) be causing both things to change? If yes, you have correlation, not causation. Controlled experiments that isolate variables are needed to establish causation.
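The third-variable idea can be sketched in code. In the Python example below, every number is invented for illustration (it is not real data), and the names temps_c, ice_cream_sales, shark_attacks and pearson_r are hypothetical. Sales and attacks are each calculated from temperature alone, so neither can be causing the other, yet the two still move together almost perfectly. The function pearson_r computes Pearson's correlation coefficient, a standard measure that is close to 1 when two lists rise together.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented monthly average temperatures (degrees C). This is the confounder.
temps_c = [14, 17, 21, 25, 29, 32]

# Both quantities below depend ONLY on temperature (made-up rules).
ice_cream_sales = [40 * t for t in temps_c]
shark_attacks = [(t - 12) // 4 for t in temps_c]

r = pearson_r(ice_cream_sales, shark_attacks)
print(round(r, 2))  # very close to 1, even though neither causes the other
```

A high correlation here says nothing about ice cream causing shark attacks; it only reflects that both were built from the same third variable.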
  1. City Population Table

    Population of five Australian cities (2021 Census, thousands)

    City                    Sydney  Melbourne  Brisbane  Perth  Adelaide
    Population (thousands)   5 231      5 078     2 560  2 192     1 402
    1. Which city has the largest population?
    2. Which city has the smallest population?
    3. What is the combined population of Sydney and Melbourne (in thousands)?
    4. How much larger is Brisbane's population than Perth's (in thousands)?
    5. What is the total population of all five cities (in thousands)?
  2. Subject Average Scores Table

    Average class score by subject — Year 7 Springvale School (out of 100)

    Subject        Maths  English  Science  History  Art  PE
    Average Score     72       68       75       61   84  79
    1. In which subject did students score highest on average?
    2. In which subject did students score lowest on average?
    3. What is the difference between the highest and lowest average scores?
    4. How many subjects had an average score above 70?
    5. What is the average of all six subject averages? (Round to 1 decimal place.)
  3. Evaluate Source Credibility

    Rate each data source on a scale of 1–3 (1 = least credible, 3 = most credible) and explain your rating.

    1. A dataset published by the Australian Bureau of Statistics, updated in 2024.
    2. A comment on a social media post saying "I heard that 90% of teenagers have anxiety."
    3. A peer-reviewed scientific journal article published in 2022 by university researchers.
    4. A company's own website advertising their product, claiming it is used by "millions of satisfied customers."
    5. A government health department report published in 2019 on childhood nutrition.
  4. Does the Data Support the Claim?

    Use the Springvale School data from Question 2. State whether each claim is supported by the data, is not supported by the data, or goes beyond the data. Explain.

    1. "Students at Springvale performed better in Art than in any other subject."
    2. "Year 7 students across Australia find Maths harder than Science."
    3. "The average Maths score was higher than the average History score."
    4. "Students dislike History because it is boring."
    5. "More than half of the subjects had an average score above 70."
  5. Statistical Inquiry with Secondary Data

    1. Write a statistical question that could be answered using data from the Australian Bureau of Statistics (ABS).
    2. Identify which specific ABS dataset you would look for and explain why it matches your question.
    3. Describe two limitations of using secondary data to answer your question.
    4. Identify one confounding factor that could make it difficult to draw a simple conclusion.
  6. Riverside Rainfall Table

    Rainfall Data Table — Riverside (fictional city)

    Month          Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec
    Rainfall (mm)   85  102   78   56   41   28   19   23   37   63   74   90
    1. In which month is rainfall highest?
    2. In which month is rainfall lowest?
    3. What is the total annual rainfall?
    4. What is the average monthly rainfall? (Round to 1 decimal place.)
    5. How many months received more than 60 mm of rain?
    6. What is the difference between the wettest and driest months?
    7. Which half of the year (Jan–Jun or Jul–Dec) had greater total rainfall?
    8. Describe the seasonal trend you see in the data.
  7. Correlation or Causation?

    Decide whether each relationship is likely correlation, causation, or coincidence. Explain your reasoning.

    1. Students who eat breakfast tend to score higher on morning tests than those who skip breakfast.
    2. In a town, ice cream sales and the number of shark attacks both increase during summer months.
    3. Cities with more libraries have higher literacy rates.
    4. Turning on a light switch causes the light to turn on.
    5. Countries with higher chocolate consumption tend to have more Nobel Prize winners per capita.
    6. People who exercise regularly have lower rates of heart disease.
  8. Limitations of Secondary Data

    1. A student uses data from the 2011 Census to draw conclusions about today's Australian population. Identify one limitation.
    2. A report on average wages uses data collected only from workers in the mining industry. What limitation does this create?
    3. A student uses data from a study conducted in the United States to answer a question about Australian teenagers. What problem might this cause?
    4. A dataset shows average temperatures recorded only at city weather stations. How might this limit studying rural Australia?
    5. Explain the difference between a limitation and a bias in a secondary data source.
  9. Screen Time Table

    Average screen time (hours per week) by age group — Oakfield Study, 2023

    Age Group                     6–9  10–12  13–15  16–18  19–25
    Avg screen time (hrs/week)     14     22     35     41     38
    1. Which age group has the highest average screen time?
    2. Which age group has the lowest average screen time?
    3. By how many hours does screen time increase from the 6–9 group to the 13–15 group?
    4. What is the average screen time across all five age groups?
    5. A parent says "Screen time always increases with age." Does this table support or contradict that claim? Explain.
    6. Identify one limitation of this secondary data source.
  10. Extended Analysis

    A student is investigating whether taller students tend to have larger shoe sizes. She finds a published study with the following data (50 Year 7 students):

    Height range (cm)  140–149  150–159  160–169  170–179
    Average shoe size      5.2      6.8      7.9      9.1
    1. Describe the trend between height and shoe size.
    2. Does this data show correlation, causation, or both? Explain your reasoning.
    3. Identify two limitations of this secondary data for drawing general conclusions.
    4. The student writes: "This proves that being tall makes your feet grow bigger." Evaluate this conclusion and write a better version.
    5. Suggest one additional piece of information you would want to know about this study before trusting its results.