L54 — Secondary Source Data
Rainfall Data Table — Riverside (fictional city)
| Month | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rainfall (mm) | 85 | 102 | 78 | 56 | 41 | 28 | 19 | 23 | 37 | 63 | 74 | 90 |
Worked Example
Use the Riverside rainfall table above: (a) In which month is it driest? (b) What is the total annual rainfall? (c) Identify one limitation.
(a) The smallest value in the table is 19 mm, so July is the driest month.
(b) Sum all 12 months: 85+102+78+56+41+28+19+23+37+63+74+90 = 696 mm
(c) This table shows only one year. Rainfall varies year to year, so a single year may not be typical.
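Parts (a) and (b) can also be checked with a few lines of Python. This is only a minimal sketch: the variable names (`rainfall_mm`, `driest_month`, `annual_total`) are made up for illustration, and the values are copied from the Riverside table.

```python
# Monthly rainfall (mm) for Riverside, copied from the table above.
rainfall_mm = {
    "Jan": 85, "Feb": 102, "Mar": 78, "Apr": 56, "May": 41, "Jun": 28,
    "Jul": 19, "Aug": 23, "Sep": 37, "Oct": 63, "Nov": 74, "Dec": 90,
}

# (a) Driest month: the month whose rainfall value is smallest.
driest_month = min(rainfall_mm, key=rainfall_mm.get)

# (b) Total annual rainfall: the sum of all twelve monthly values.
annual_total = sum(rainfall_mm.values())

print(driest_month, rainfall_mm[driest_month])  # Jul 19
print(annual_total)  # 696
```

The code can confirm the arithmetic, but it cannot answer part (c): judging whether one year of data is typical still requires thinking about how the data was collected.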
Key Terms
- secondary data: data originally collected by someone else, e.g. ABS datasets, newspaper articles, textbooks, websites
- credibility: how trustworthy a source is; a credible source has expertise, independence, and transparent methods
- currency: how recent the data is; old data may not reflect current conditions
- limitation: a factor that restricts how broadly you can apply a dataset's conclusions
- correlation: two variables that change together; does NOT mean one causes the other
- causation: one variable directly causes a change in another; requires strong experimental evidence
What Is Secondary Data?
Secondary data is data collected by someone else, for a purpose that may differ from yours. Sources include government agencies such as the Australian Bureau of Statistics (ABS), scientific journals, news organisations, and websites. Secondary data gives you access to large national datasets and historical records that you could not collect yourself, but you must evaluate it carefully before relying on it.
Evaluating a Secondary Source
Ask four questions before trusting any secondary data:
- Credible? — who collected it? Do they have the expertise and independence to do so reliably?
- Current? — was it collected recently enough? Population data from 2005 may not reflect today.
- Relevant? — does this dataset actually answer your question, or is it only loosely related?
- Purpose? — was it collected to inform or to persuade? Data from a company promoting its own products may be biased.
Limitations of Secondary Data
When you use someone else's data, you inherit all their decisions — what to measure, who to sample, and how to record results. Every dataset has limitations:
- A survey of 100 students in one Brisbane school cannot tell you about all students in Queensland
- Data collected in the US may not apply to Australia
- Old data may not reflect current conditions
Identifying limitations is a sign of strong statistical thinking.
Correlation vs Causation
Secondary data often shows that two things change together (correlation), but this does not mean one causes the other (causation).
- Ice cream sales and shark attacks both increase in summer. Hot weather drives both: more people buy ice cream and more people swim in the ocean
- Countries with higher chocolate consumption tend to have more Nobel Prize winners. This is a spurious correlation, not causation
Use language like "the data shows a relationship between..." rather than "X causes Y" unless there is strong experimental evidence.
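To see how two variables can move together without one causing the other, the sketch below computes a correlation coefficient (Pearson's r, which ranges from -1 to 1). The numbers are invented purely for illustration and do not come from any real dataset; the helper name `pearson_r` and the variable names are also assumptions made for this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# INVENTED values for six imaginary weeks (illustration only, not real data).
temperature_c = [18, 21, 24, 27, 30, 33]
ice_cream_sales = [20, 26, 31, 38, 44, 49]
shark_attacks = [1, 1, 2, 3, 3, 4]

# Both series rise with temperature, so they also rise with each other.
r = pearson_r(ice_cream_sales, shark_attacks)
print(f"r = {r:.2f}")
```

With these made-up numbers r comes out close to 1, which is a strong correlation. The calculation says nothing about why the two series move together, so it cannot justify a statement such as "ice cream causes shark attacks".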
City Population Table
Population of five Australian cities (2021 Census, thousands)
| City | Sydney | Melbourne | Brisbane | Perth | Adelaide |
|---|---|---|---|---|---|
| Population (thousands) | 5 231 | 5 078 | 2 560 | 2 192 | 1 402 |
- Which city has the largest population?
- Which city has the smallest population?
- What is the combined population of Sydney and Melbourne (in thousands)?
- How much larger is Brisbane's population than Perth's (in thousands)?
- What is the total population of all five cities (in thousands)?
Subject Average Scores Table
Average class score by subject — Year 7 Springvale School (out of 100)
| Subject | Maths | English | Science | History | Art | PE |
|---|---|---|---|---|---|---|
| Average Score | 72 | 68 | 75 | 61 | 84 | 79 |
- In which subject did students score highest on average?
- In which subject did students score lowest on average?
- What is the difference between the highest and lowest average scores?
- How many subjects had an average score above 70?
- What is the average of all six subject averages? (Round to 1 decimal place.)
Evaluate Source Credibility
Rate each data source on a scale of 1–3 (1 = least credible, 3 = most credible) and explain your rating.
- A dataset published by the Australian Bureau of Statistics, updated in 2024.
- A comment on a social media post saying "I heard that 90% of teenagers have anxiety."
- A peer-reviewed scientific journal article published in 2022 by university researchers.
- A company's own website advertising their product, claiming it is used by "millions of satisfied customers."
- A government health department report published in 2019 on childhood nutrition.
Does the Data Support the Claim?
Use the Springvale School data from Question 2. State whether each claim is supported by the data, is not supported by the data, or goes beyond the data. Explain your reasoning.
- "Students at Springvale performed better in Art than in any other subject."
- "Year 7 students across Australia find Maths harder than Science."
- "The average Maths score was higher than the average History score."
- "Students dislike History because it is boring."
- "More than half of the subjects had an average score above 70."
Statistical Inquiry with Secondary Data
- Write a statistical question that could be answered using data from the Australian Bureau of Statistics (ABS).
- Identify which specific ABS dataset you would look for and explain why it matches your question.
- Describe two limitations of using secondary data to answer your question.
- Identify one confounding factor that could make it difficult to draw a simple conclusion.
Riverside Rainfall Table
Rainfall Data Table — Riverside (fictional city)
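| Month | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Rainfall (mm) | 85 | 102 | 78 | 56 | 41 | 28 | 19 | 23 | 37 | 63 | 74 | 90 |
- In which month is rainfall highest?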
- In which month is rainfall lowest?
- What is the total annual rainfall?
- What is the average monthly rainfall? (Round to 1 decimal place.)
- How many months received more than 60 mm of rain?
- What is the difference between the wettest and driest months?
- Which half of the year (Jan–Jun or Jul–Dec) had greater total rainfall?
- Describe the seasonal trend you see in the data.
Correlation or Causation?
Decide whether each relationship is likely correlation, causation, or coincidence. Explain your reasoning.
- Students who eat breakfast tend to score higher on morning tests than those who skip breakfast.
- In a town, ice cream sales and the number of shark attacks both increase during summer months.
- Cities with more libraries have higher literacy rates.
- Turning on a light switch causes the light to turn on.
- Countries with higher chocolate consumption tend to have more Nobel Prize winners per capita.
- People who exercise regularly have lower rates of heart disease.
Limitations of Secondary Data
- A student uses the 2011 Census to draw conclusions about today's Australian population. Identify one limitation.
- A report on average wages uses data collected only from workers in the mining industry. What limitation does this create?
- A student uses data from a study conducted in the United States to answer a question about Australian teenagers. What problem might this cause?
- A dataset shows average temperatures recorded only at city weather stations. How might this limit studying rural Australia?
- Explain the difference between a limitation and a bias in a secondary data source.
Screen Time Table
Average screen time (hours per week) by age group — Oakfield Study, 2023
| Age Group | 6–9 | 10–12 | 13–15 | 16–18 | 19–25 |
|---|---|---|---|---|---|
| Avg screen time (hrs/week) | 14 | 22 | 35 | 41 | 38 |
- Which age group has the highest average screen time?
- Which age group has the lowest average screen time?
- By how many hours does screen time increase from the 6–9 group to the 13–15 group?
- What is the average screen time across all five age groups?
- A parent says "Screen time always increases with age." Does this table support or contradict that claim? Explain.
- Identify one limitation of this secondary data source.
Extended Analysis
A student is investigating whether taller students tend to have larger shoe sizes. She finds a published study of 50 Year 7 students with the following data:
| Height range (cm) | 140–149 | 150–159 | 160–169 | 170–179 |
|---|---|---|---|---|
| Average shoe size | 5.2 | 6.8 | 7.9 | 9.1 |
- Describe the trend between height and shoe size.
- Does this data show correlation, causation, or both? Explain your reasoning.
- Identify two limitations of this secondary data for drawing general conclusions.
- The student writes: "This proves that being tall makes your feet grow bigger." Evaluate this conclusion and write a better version.
- Suggest one additional piece of information you would want to know about this study before trusting its results.