Evaluating Data Displays — Solutions
-
Match display to context
- Daily maximum temperature over 30 days:
- Favourite colours of 25 students:
- Proportion of weekly time across activities:
- Heights of 20 students showing distribution shape:
- Quiz scores out of 10, every individual score shown:
- Goals scored per match over 15 matches:
- Exam results grouped into intervals:
- Monthly rainfall totals over one year:
-
Read and interpret data displays
- Students who read more than 2 books (3, 4, or 5 books):
- More students own dogs than cats:
- Students who ride a bike:
- Values: 23, 24, 27, 31, 35, 35, 38, 40, 42, 46, 53. Median (6th value): 35
- Largest increase in savings:
- Students who scored 70 or above:
- Total sales:
- Flathead caught:
-
Identify misleading features
- Truncated y-axis starting at $950 makes the difference between $980 and $1020 (only $40) appear very large visually.
- 3D perspective distorts the size of pie slices — front slices appear larger than equally-sized back slices due to angle and depth.
- Unequal time intervals on the x-axis but equal spacing between them makes the rate of change look consistent when it is not.
- Different bar widths mean bars cover different areas — a wide bar looks like it represents more data even if the height (frequency density) is the same.
- Y-axis from 60 to 70 makes the difference between 62% and 68% look much larger than 6 percentage points out of 100.
- Percentages totalling 110% is mathematically impossible for a pie chart; each sector is inflated to make the proportions appear more extreme.
- Omitting the y-axis scale makes it impossible to determine actual values, allowing any impression to be created by bar heights alone.
- An unlabelled axis break makes it appear the y-axis starts at 0 when it does not, exaggerating differences between values.
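The truncated-axis answers above can be checked with a little arithmetic. The sketch below assumes the values are drawn as bars measured from the bottom of the visible axis; the function names `apparent_ratio` and `true_ratio` are illustrative and do not come from the worksheet.

```python
def apparent_ratio(low, high, axis_start):
    """Ratio of the drawn bar heights when the y-axis begins at axis_start."""
    return (high - axis_start) / (low - axis_start)


def true_ratio(low, high):
    """Ratio of the actual values, i.e. the bar heights on an axis starting at 0."""
    return high / low


# $980 vs $1020 on an axis starting at $950: drawn heights are 30 and 70
print(round(apparent_ratio(980, 1020, 950), 2))  # 2.33
print(round(true_ratio(980, 1020), 2))           # 1.04

# 62% vs 68% on an axis running from 60 to 70: drawn heights are 2 and 8
print(round(apparent_ratio(62, 68, 60), 2))      # 4.0
print(round(true_ratio(62, 68), 2))              # 1.1
```

In both cases the drawn ratio is far larger than the true ratio, which is the exaggeration the answers describe.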
-
Compare two displays
- Display B (line graph) is more appropriate. Sunshine hours over 30 consecutive days shows change over time, which a line graph represents clearly. A dot plot does not show time order.
- Display B (pie chart) is more appropriate. A pie chart shows proportions of a whole budget clearly. A column graph is better for comparing independent categories, not parts of a total.
- Display B (stem-and-leaf plot) is more appropriate. With 30 scores ranging from 42 to 98, a dot plot would be very cluttered. A stem-and-leaf plot groups by tens, making the distribution easy to read.
- Display B (back-to-back stem-and-leaf plot) is more appropriate. It allows direct comparison of both groups on the same scale and preserves every data value.
- Display B (line graph) is more appropriate. The data shows change across 24 time periods. A pie chart would be meaningless here as hourly counts are not parts of a meaningful whole.
- Display B (column graph) is more appropriate. Letter grades are discrete categories. A histogram is for continuous data grouped into class intervals and would be misleading for labelled categories.
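The "groups by tens" idea behind a stem-and-leaf plot can be sketched in a few lines. This is a minimal illustration, not the worksheet's method: the function name `stem_and_leaf` is made up here, the scores are hypothetical (they are not the 30 scores from the question), and stems with no values are simply omitted.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return plot lines such as '6 | 1 1', using tens as stems and units as leaves."""
    groups = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 73 -> stem 7, leaf 3
        groups[stem].append(leaf)
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(groups.items())
    ]


# Hypothetical scores, for illustration only
scores = [42, 47, 55, 58, 61, 61, 73, 76, 79, 84, 98]
for line in stem_and_leaf(scores):
    print(line)
```

Each printed row keeps every original value readable while showing how many scores fall in each group of ten.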
-
Fix misleading displays
- Start the y-axis at 0 so that bars representing 97 and 99 are seen in context and the tiny difference (2 out of 100) is not visually exaggerated.
- Redraw as a flat (2D) bar chart. This removes depth distortion caused by 3D perspective and makes all bar heights directly comparable.
- Recalculate each segment so percentages sum to exactly 100%. Each slice should represent its true proportion of the whole dataset.
- Space the x-axis intervals proportionally to the actual time gaps (e.g., 1 year gap = 1 unit, 5 year gap = 5 units) so the gradient of the line reflects the true rate of change.
- Replace pictographs of different-sized people with equal-sized symbols or use a standard column graph where height or count alone represents the value.
- Use the same scale on both y-axes (or a single y-axis), so that equal changes in improvement are represented by equal visual distances on the graph.
-
Critique statistical claims
- The claim is not accurate. Sales increased from 4600 to 4900 units, a rise of 300 units. That is a 6.5% increase (300 ÷ 4600 × 100), not “double.” Because the bar graph’s y-axis is truncated (starting at 4500), the drawn bars are only 100 and 400 units tall, so the second bar looks several times the height of the first. This visually suggests a dramatic jump, but the suggestion is mathematically false.
- Problem 1: The sample is too small (n = 40) to represent an entire suburb or city population reliably. Problem 2: Surveying only shoppers at a shopping centre on a Tuesday morning introduces selection bias — this group is not representative of all residents (e.g. it excludes people who are working, housebound, or do not shop at that centre).
- The graph appears to show crime rising steeply because the y-axis runs only from 980 to 1020. In reality, crime increased from 988 to 1008 incidents, a rise of 20 incidents over six years, or roughly a 2% increase. The truncated y-axis makes a gradual, modest rise appear dramatic. The display should use a y-axis starting at 0 (or at minimum clearly label and justify the truncation), and the headline should be reworded to accurately reflect the small absolute and percentage change.
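The percentage changes quoted in these critiques can be verified with the usual formula, change ÷ original × 100. The helper name `percentage_change` is an illustrative choice, not something taken from the worksheet.

```python
def percentage_change(old, new):
    """Percentage change from old to new, relative to the old value."""
    return (new - old) / old * 100


# Sales: 4600 -> 4900 units
print(round(percentage_change(4600, 4900), 1))  # 6.5

# Crime: 988 -> 1008 incidents
print(round(percentage_change(988, 1008), 1))   # 2.0
```

A true doubling would give 100%, so neither change comes close to what the graphs visually suggest.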
-
True or False about data displays
- False — a histogram is used for continuous (or grouped numerical) data, not categorical data.
- False — a line graph shows change over time; a pie chart shows parts of a whole.
- True.
- False — a y-axis starting at 0 can still be misleading if, for example, the scale is compressed, intervals are unequal, or data points are selectively highlighted.
- False — a dot plot becomes cluttered with large datasets; stem-and-leaf plots or histograms are more appropriate for larger datasets.
- True.
- False — a pie chart shows parts of a whole better; a column graph compares separate categories.
- True.
-
Read a back-to-back stem-and-leaf plot
- Class A: 3 + 4 + 3 + 2 + 1 = 13; Class B: 3 + 4 + 4 + 3 + 2 = 16
- Class A highest: 92; Class B highest: 94
- Class A has 13 values; median is the 7th value. Listing in order: 55, 57, 59, 62, 63, 66, 68, 71, 75, 79, 84, 88, 92. Median = 68
- Class B has 16 values; median is average of 8th and 9th. Listing in order: 52, 54, 58, 61, 63, 65, 69, 70, 73, 76, 77, 82, 85, 88, 91, 94. Median = (70 + 73) ÷ 2 = 71.5
- Class B performed better overall. Its median (71.5) is higher than Class A’s median (68), and Class B has more scores in the 70s and 80s, indicating a higher concentration of scores above average.
- Class B scores of 70 or more: 70, 73, 76, 77, 82, 85, 88, 91, 94 = 9 scores
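The counts and medians above can be checked from the two ordered lists. This is a minimal sketch of the positional method the solutions use (middle value for an odd count, mean of the two middle values for an even count); the function name `median` and the variable names are illustrative.

```python
def median(values):
    """Middle value for an odd count; mean of the two middle values for an even count."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


class_a = [55, 57, 59, 62, 63, 66, 68, 71, 75, 79, 84, 88, 92]
class_b = [52, 54, 58, 61, 63, 65, 69, 70, 73, 76, 77, 82, 85, 88, 91, 94]

print(len(class_a), median(class_a))  # 13 68
print(len(class_b), median(class_b))  # 16 71.5

# Class B scores of 70 or more
print(sum(1 for score in class_b if score >= 70))  # 9
```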
-
Choose and justify a display
- Line graph — it shows how a continuous quantity (water level) changes over ordered time periods.
- Dot plot or stem-and-leaf plot — the dataset is small (30 students) and discrete (0–10), so individual values can be displayed; a stem-and-leaf plot also shows the shape of the distribution.
- Pie chart — the three categories are parts of a whole (they sum to 100%), which a pie chart represents most clearly.
- Back-to-back stem-and-leaf plot — it allows direct comparison of two datasets of the same size while preserving all original values.
- Column graph — the data is categorical (states) with a numerical count per category; a column graph allows easy comparison across all states.
-
Create and critique data displays
-
(i) Stem-and-leaf plot:
5 | 8
6 | 2 4 5 7
7 | 0 2 4 8
8 | 1 5
9 | 1
(ii) The distribution is slightly skewed right (most scores are in the 60s and 70s, with fewer high scores).
(iii) Starting the y-axis at 55 would make small differences in scores look enormous. For example, bars for scores of 58 and 91 would be drawn 3 and 36 units tall, so 91 would look 12 times as large as 58 when it is really only about 1.6 times as large. This is misleading because it exaggerates variation.
-
(i) Increase: 4.3 − 3.8 = 0.5 hours
(ii) Percentage increase: (0.5 ÷ 3.8) × 100 ≈ 13.2%
(iii) “Skyrocketed” is not justified. Usage increased by only 0.5 hours (13%) over 4 years — a gradual, modest rise. The truncated y-axis (3.5 to 4.5) makes this small increase appear dramatic.
(iv) The y-axis should start at 0 so that the bars or line segments are proportional to the actual values. This would make the 0.5-hour increase look appropriately small relative to a baseline of 0.
-
(i) Stem-and-leaf plot: