Measures of Centre
Key Terms
- Mean (x̄)
- Arithmetic average: x̄ = ∑x / n. Affected by outliers, so it can be misleading for skewed data.
- Median
- The middle value when data is ordered. For even n, average the two middle values. Resistant to outliers.
- Mode
- The most frequently occurring value. A dataset may have no mode, one mode, or several modes.
- Outlier
- A value that lies far from the main cluster; can significantly affect the mean but has little effect on the median.
- Symmetric vs skewed
- Symmetric: mean ≈ median. Positively skewed: mean > median (tail to the right). Negatively skewed: mean < median (tail to the left).
- Choosing a measure
- Use median for skewed data or data with outliers; mean is preferred for symmetric data without outliers.
The Three Measures of Centre
A measure of centre describes a typical or representative value for a dataset. The three measures are mean, median and mode.
Mean: Sum all values, then divide by the count (x̅ = Σx ÷ n).
Median: The middle value when data is ordered.
• Odd n: middle value (position (n+1)/2)
• Even n: average of the two middle values
Mode: The most frequently occurring value.
A dataset may have no mode, one mode, two modes (bimodal), or more than two (multimodal).
When to Use Each Measure
| Measure | Best used when… | Weakness |
|---|---|---|
| Mean | Distribution is symmetric; no extreme outliers | Strongly affected by outliers — can be misleading |
| Median | Distribution is skewed; outliers are present | Ignores the actual numerical values of most data |
| Mode | Categorical data; finding the most popular value | May not exist, or may not be near the centre |
Weighted mean: x̅w = Σ(wi × xi) ÷ Σwi
Used when some values contribute more than others (e.g. subjects with different weightings).
Figure: Mean, Median and Mode on a Skewed Distribution
In a positively skewed distribution, the mean is pulled towards the long right tail (where any high outliers sit), while the median stays near the centre of the bulk of the data.
Worked Example 1 — Calculating All Three Measures
Dataset: 4, 7, 3, 9, 7, 5, 8, 7
Step 1 — Order the data: 3, 4, 5, 7, 7, 7, 8, 9
Mean: x̅ = (3+4+5+7+7+7+8+9) ÷ 8 = 50 ÷ 8 = 6.25
Median: n=8 (even), so median = average of 4th and 5th values = (7+7)/2 = 7
Mode: 7 appears 3 times — mode = 7
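The three results above can be checked with a short Python sketch. It assumes the standard-library `statistics` module; the variable names are illustrative and not part of the lesson.

```python
import statistics

# Dataset from Worked Example 1
data = [4, 7, 3, 9, 7, 5, 8, 7]

mean_value = statistics.mean(data)      # sum of the values divided by the count
median_value = statistics.median(data)  # sorts the data; averages the two middle values when n is even
mode_value = statistics.mode(data)      # most frequently occurring value

print(mean_value, median_value, mode_value)  # 6.25 7.0 7
```

The median prints as 7.0 rather than 7 because, for even n, it is computed as an average of two values.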
Worked Example 2 — Effect of an Outlier
Dataset without outlier: 5, 6, 7, 8, 8, 9, 10
Mean = (5+6+7+8+8+9+10) ÷ 7 = 53 ÷ 7 = 7.57
Ordered: 5, 6, 7, 8, 8, 9, 10 → Median = 8
Add outlier: 5, 6, 7, 8, 8, 9, 10, 42
New mean = (53 + 42) ÷ 8 = 95 ÷ 8 = 11.875
Ordered: 5, 6, 7, 8, 8, 9, 10, 42 → Median = (8+8)/2 = 8
Conclusion: The mean jumped from 7.57 to 11.88 and no longer represents a typical value, since seven of the eight values are 10 or less. The median stayed at 8, which shows that the median is robust to outliers.
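The comparison in this example can be reproduced with a small Python sketch, again assuming the standard-library `statistics` module. The variable names are chosen here for illustration only.

```python
import statistics

# Dataset from Worked Example 2, before and after adding the outlier 42
without_outlier = [5, 6, 7, 8, 8, 9, 10]
with_outlier = without_outlier + [42]

mean_before = statistics.mean(without_outlier)
mean_after = statistics.mean(with_outlier)
median_before = statistics.median(without_outlier)
median_after = statistics.median(with_outlier)

# The mean shifts noticeably; the median does not move.
print(f"mean:   {mean_before:.2f} -> {mean_after:.2f}")
print(f"median: {median_before} -> {median_after}")
```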
Full Lesson: Measures of Centre
What is a Measure of Centre?
A measure of centre is a single number that attempts to summarise an entire dataset by describing a “typical” or “central” value. Think of it as the answer to: “If I had to pick one number to represent this whole dataset, what would it be?” Different measures give different answers depending on the distribution.
The Mean in Depth
The mean (x̅, read “x-bar”) is calculated by adding all values and dividing by the count:
x̅ = Σx ÷ n
where Σx means “the sum of all x values” and n is the number of values.
The mean uses every single data point in its calculation. This makes it highly informative when all data points are legitimate — but it also makes it sensitive to extreme values (outliers). One very large or very small value can drag the mean far away from the centre of the bulk of the data.
The mean is most reliable for symmetric distributions with no outliers.
The Median in Depth
The median is the value that sits exactly in the middle of an ordered dataset. Half the values lie below it and half above it.
- If n is odd: median is the value at position (n+1)/2
- If n is even: median is the average of the values at positions n/2 and (n/2)+1
The median depends only on the order of the data and on the middle value(s), not on the size of the values far from the centre. This makes it robust to outliers: moving an extreme value even further from the centre has no effect on the median.
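The position rule for odd and even n can be sketched in Python as follows. The function name `median_by_position` is hypothetical, and the conversion from positions (counted from 1) to list indices (counted from 0) is spelled out in the comments.

```python
def median_by_position(values):
    """Return the median using the position rule for odd and even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty dataset is undefined")
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 counted from 1 is index (n - 1) // 2 counted from 0.
        return ordered[(n - 1) // 2]
    # Even n: positions n/2 and n/2 + 1 counted from 1 are indices n//2 - 1 and n//2.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(median_by_position([5, 6, 7, 8, 8, 9, 10]))        # odd n
print(median_by_position([5, 6, 7, 8, 8, 9, 10, 42]))    # even n
print(median_by_position([5, 6, 7, 8, 8, 9, 10, 4200]))  # extreme value moved further out
```

All three calls give a median of 8, which illustrates that pushing the largest value further out leaves the median unchanged.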
The median is most useful for skewed distributions or datasets with outliers.
The Mode in Depth
The mode is the most frequently occurring value. A dataset can have:
- No mode: if all values appear exactly once
- One mode (unimodal): most common situation
- Two modes (bimodal): suggests two distinct subgroups in the data
- Three or more modes (multimodal): rare; the mode is then usually not a useful summary
The mode is the only measure of centre that can be used for categorical (nominal) data — it tells you the most popular category.
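A minimal Python sketch of these cases is shown below. The function name `find_modes` and the small example lists are made up for illustration. It follows the convention used in this lesson that a dataset in which every value appears exactly once has no mode.

```python
from collections import Counter


def find_modes(values):
    """Return a list of modes; an empty list means the dataset has no mode."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Every value appears exactly once: treated as "no mode" in this lesson.
        return []
    return [value for value, count in counts.items() if count == highest]


print(find_modes([3, 4, 5, 7, 7, 7, 8, 9]))         # one mode
print(find_modes([1, 2, 3]))                        # no mode
print(find_modes([2, 2, 5, 9, 9]))                  # two modes (bimodal)
print(find_modes(["bus", "car", "bus", "walk"]))    # categorical data
```

Because the function only counts occurrences, it works for categories such as transport types as well as for numbers.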
The Weighted Mean
When different values contribute different amounts to an average, we use a weighted mean:
x̅w = Σ(wi × xi) ÷ Σwi
For example, if a subject has assessments worth different percentages, you cannot simply average the scores — you must weight each score by its importance.
- Assignment: score 72, weight 20%
- Mid-year test: score 65, weight 30%
- Final exam: score 80, weight 50%
x̅w = (20 × 72 + 30 × 65 + 50 × 80) ÷ (20 + 30 + 50)
= (1440 + 1950 + 4000) ÷ 100
= 7390 ÷ 100 = 73.9
A simple average would give (72+65+80)/3 ≈ 72.3. The weighted mean is higher because the final exam, the highest score, carries the most weight.
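The weighted mean formula can be sketched in Python as below. The function name `weighted_mean` is hypothetical, and the weights are entered as percentages (20, 30, 50), which is an assumption about how the data is stored. Any consistent scale gives the same result, because the formula divides by the total weight.

```python
def weighted_mean(scores, weights):
    """Return sum(w_i * x_i) / sum(w_i) for paired scores and weights."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, scores)) / total_weight


# Assessment example from the lesson
scores = [72, 65, 80]
weights = [20, 30, 50]

weighted = weighted_mean(scores, weights)  # 7390 / 100
simple = sum(scores) / len(scores)         # unweighted average, for comparison

print(round(weighted, 1), round(simple, 1))
```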
Choosing the Right Measure
The key questions to ask before choosing a measure of centre:
- Are there outliers? → Use the median
- Is the distribution skewed? → Use the median
- Is the data categorical? → Use the mode
- Is the distribution symmetric with no outliers? → Use the mean
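The checklist above can be written as a small decision function. This is only a sketch: the function name `suggest_measure` and its three yes/no inputs are hypothetical, and deciding whether data is skewed or contains outliers still requires looking at the data first.

```python
def suggest_measure(is_categorical, has_outliers, is_skewed):
    """Suggest a measure of centre from three yes/no answers about the data."""
    if is_categorical:
        # Categories cannot be added or ordered, so only the mode applies.
        return "mode"
    if has_outliers or is_skewed:
        # The median resists extreme values and long tails.
        return "median"
    # Symmetric numerical data with no outliers.
    return "mean"


print(suggest_measure(is_categorical=False, has_outliers=True, is_skewed=False))
print(suggest_measure(is_categorical=True, has_outliers=False, is_skewed=False))
print(suggest_measure(is_categorical=False, has_outliers=False, is_skewed=False))
```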
Mastery Practice
Fluency
For the dataset: 4, 7, 2, 9, 3, 7, 5, 8, 7, 6, find:
- The mean
- The median
- The mode
Fluency
Find the median for each dataset:
- 11, 14, 17, 20, 23, 26, 29 (7 values)
- 8, 13, 15, 18, 21, 25 (6 values)
- 3, 3, 3, 5, 7, 9, 11, 11 (8 values, with repeats)
Fluency
A student scores 72, 68, 85, 79, 68, 91, 75 on seven tests.
- Find the mean (to 1 decimal place).
- Find the median.
- Which measure better represents the student’s typical performance? Explain.
Fluency
A student’s subject results are weighted as follows:
| Subject | Weight | Score |
|---|---|---|
| Mathematics | 30% | 88 |
| English | 25% | 72 |
| Science | 20% | 81 |
| History | 15% | 65 |
| PE | 10% | 90 |
Calculate the student’s weighted average score.
Understanding
Consider the dataset: 5, 6, 7, 8, 8, 9, 10, 42
- Calculate the mean of the first seven values (ignoring 42).
- Calculate the mean including the value 42.
- Find the median of the first seven values.
- Find the median of all eight values.
- Which measure of centre is least affected by the outlier 42? Explain why.
Understanding
House sale prices in a suburb ($thousands): 420, 450, 480, 510, 520, 540, 560, 580, 620, 1200.
- Calculate the mean (to the nearest whole number).
- Calculate the median.
- Which measure better represents the typical house price in this suburb? Justify your answer.
Understanding
The table shows a grouped frequency distribution of test scores:
| Score | Frequency | Midpoint |
|---|---|---|
| 0–4 | 3 | 2 |
| 5–9 | 8 | 7 |
| 10–14 | 12 | 12 |
| 15–19 | 7 | 17 |
Estimate the mean score using the class midpoints.
Understanding
Three classes sat the same exam:
- Class A: 25 students, mean score 72
- Class B: 30 students, mean score 68
- Class C: 20 students, mean score 75
Calculate the overall mean score for all 75 students combined.
Problem Solving
A cricket batsman’s scores over 10 innings: 45, 12, 87, 0, 63, 34, 51, 112, 8, 76.
- Calculate the mean and median score (to 1 decimal place where needed).
- Which measure better represents the batsman’s “typical” innings? Justify your choice.
- In his 11th innings the batsman scores 200. Recalculate the mean and median.
- The median is described as “robust to outliers”. What does this mean in this context?
Problem Solving
Monthly salaries at a small company (8 employees): $3200, $3500, $3800, $4200, $4500, $5000, $5800, $18500.
- Calculate the mean salary (to the nearest dollar).
- Calculate the median salary.
- The manager claims “the average salary at this company is over $6000.” Is this mathematically true? Is it a fair representation of typical pay? Explain.
- Which measure of centre should employees use when negotiating a pay rise? Why?