Hypothesis Testing for Population Mean — Worked Solutions — Grade 12 Specialist Maths

Q1 — Setting Up Hypotheses

A bottling plant claims it fills bottles to a mean volume of μ = 500 mL. State H₀ and H₁ for a quality inspector who suspects the mean fill is not exactly 500 mL.

Step 1: Define the parameter.

Let μ = the true mean fill volume (mL) of all bottles produced by the plant.

Step 2: State the null hypothesis.

H₀: μ = 500 (the plant fills to exactly 500 mL on average).

Step 3: State the alternative hypothesis.

H₁: μ ≠ 500 (the mean fill differs from 500 mL).

Step 4: Identify the test type.

This is a two-tailed test because the inspector suspects the mean is different in either direction (too much or too little), not specifically above or below 500 mL.

Note: H₀ always contains the equality sign (=). The alternative H₁ reflects the research question and uses ≠, < or > depending on the direction being investigated.

Q2 — Calculating the z Test Statistic

H₀: μ = 50, H₁: μ ≠ 50. Sample results: x̄ = 53, σ = 10, n = 25. Calculate the z test statistic.

Step 1: Identify the given values.

x̄ = 53, μ₀ = 50, σ = 10, n = 25.

Step 2: Calculate the standard error of the mean.

SE = σ / √n = 10 / √25 = 10 / 5 = 2

Step 3: Apply the z-test statistic formula.

z = (x̄ − μ₀) / SE = (53 − 50) / 2 = 3 / 2 = 1.5

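Interpretation: The sample mean of 53 is 1.5 standard errors above the hypothesised mean of 50. Under H₀, a z-value of 1.5 is not unusually extreme: the two-tailed p-value is 2(1 − Φ(1.5)) = 2(1 − 0.9332) = 0.1336, so H₀ would not be rejected at α = 0.05.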
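The calculation in Steps 2 and 3 can be checked with a few lines of Python. This is a minimal sketch, not part of the original solution; the function name z_statistic and its parameter names are illustrative choices.

```python
from math import sqrt


def z_statistic(sample_mean, mu0, sigma, n):
    """Return the z test statistic for a mean when sigma is known."""
    standard_error = sigma / sqrt(n)  # SE = sigma / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Values from Q2: x-bar = 53, mu0 = 50, sigma = 10, n = 25
z = z_statistic(sample_mean=53, mu0=50, sigma=10, n=25)
print(z)  # 1.5
```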

Q3 — One-Tailed p-Value from a z-Score

A one-tailed (upper) test gives z = 1.8. Find the p-value. Would you reject H₀ at α = 0.05?

Step 1: Identify the test direction.

Upper one-tailed test: H₁: μ > μ₀. The p-value is the area to the right of z = 1.8.

Step 2: Calculate the p-value.

p = P(Z ≥ 1.8) = 1 − Φ(1.8) = 1 − 0.9641 = 0.0359

Step 3: Compare to α.

Since p = 0.0359 < α = 0.05, we reject H₀.

Conclusion: There is sufficient evidence at the 5% significance level to support the alternative hypothesis that μ > μ₀.

Note: For a two-tailed test with the same z, the p-value would be 2 × 0.0359 = 0.0718 > 0.05 and we would fail to reject H₀. The direction of H₁ matters.
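The table lookup Φ(1.8) can also be reproduced in software. The sketch below assumes Python 3.8 or later, where the standard library provides statistics.NormalDist; the helper p_value and its tail argument are hypothetical names introduced here for illustration.

```python
from statistics import NormalDist


def p_value(z, tail):
    """Return the p-value for a z statistic.

    tail is 'upper' (H1: mu > mu0), 'lower' (H1: mu < mu0) or 'two' (H1: mu != mu0).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "upper":
        return 1 - std_normal.cdf(z)
    if tail == "lower":
        return std_normal.cdf(z)
    if tail == "two":
        return 2 * (1 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'upper', 'lower' or 'two'")


print(round(p_value(1.8, "upper"), 4))  # 0.0359
print(round(p_value(1.8, "two"), 4))    # 0.0719
```

The two-tailed value prints as 0.0719 rather than 0.0718 because the software doubles the unrounded tail area (about 0.03593), whereas the table method doubles the already-rounded 0.0359. The decision is unchanged.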

Q4 — Two-Tailed Test Decision

x̄ = 75, μ₀ = 70, σ = 20, n = 100. Conduct a two-tailed test at α = 0.01.

Step 1: State hypotheses.

H₀: μ = 70. H₁: μ ≠ 70. α = 0.01.

Step 2: Calculate the test statistic.

SE = 20 / √100 = 20 / 10 = 2.

z = (75 − 70) / 2 = 5 / 2 = 2.5

Step 3: Find the two-tailed p-value.

p = 2P(Z ≥ 2.5) = 2(1 − Φ(2.5)) = 2(1 − 0.9938) = 2(0.0062) = 0.0124

Step 4: Decision.

Since p = 0.0124 > α = 0.01, we fail to reject H₀.

Conclusion: There is insufficient evidence at the 1% significance level to conclude that the population mean differs from 70.

Note: At α = 0.05, we would reject H₀ since p = 0.0124 < 0.05. The conclusion depends on the chosen significance level.

Q5 — Critical Value Approach

Find the critical z-values for a two-tailed test at α = 0.01. For what values of x̄ would you reject H₀: μ = 100, given n = 49 and σ = 14?

Step 1: Identify the critical z-values.

For a two-tailed test at α = 0.01, each tail has area 0.005.

z_crit = ±z_0.005 = ±2.576

Reject H₀ if z < −2.576 or z > 2.576.

Step 2: Calculate the standard error.

SE = σ / √n = 14 / √49 = 14 / 7 = 2

Step 3: Convert critical z-values to x̄ values.

Upper boundary: x̄ = μ₀ + z_crit × SE = 100 + 2.576 × 2 = 100 + 5.152 = 105.15

Lower boundary: x̄ = μ₀ − z_crit × SE = 100 − 5.152 = 94.85

Rejection region: Reject H₀ if x̄ < 94.85 or x̄ > 105.15.

The critical value approach is equivalent to the p-value approach: x̄ falls in the rejection region exactly when p < α, so the two methods always lead to the same decision.
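The rejection boundaries can be reproduced without a table by inverting the standard normal CDF. This is a sketch assuming Python 3.8 or later (statistics.NormalDist.inv_cdf); the function name rejection_bounds is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def rejection_bounds(mu0, sigma, n, alpha):
    """Return (lower, upper) sample-mean boundaries for a two-tailed z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
    standard_error = sigma / sqrt(n)
    return mu0 - z_crit * standard_error, mu0 + z_crit * standard_error


# Values from Q5: mu0 = 100, sigma = 14, n = 49, alpha = 0.01
lower, upper = rejection_bounds(mu0=100, sigma=14, n=49, alpha=0.01)
print(round(lower, 2), round(upper, 2))  # 94.85 105.15
```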

Q6 — Reject or Fail to Reject at α = 0.05

A quality control test gives z = −2.3. Using a two-tailed test at α = 0.05, state the decision and conclusion in context. The test was checking whether a machine produces bolts of diameter μ = 10 mm.

Step 1: State hypotheses.

H₀: μ = 10 mm. H₁: μ ≠ 10 mm. α = 0.05.

Step 2: Find the two-tailed p-value.

p = 2P(Z ≤ −2.3) = 2 × Φ(−2.3)

Φ(−2.3) = 1 − Φ(2.3) = 1 − 0.9893 = 0.0107

p = 2 × 0.0107 = 0.0214

Step 3: Decision.

Since p = 0.0214 < α = 0.05, reject H₀.

Conclusion in context: There is sufficient evidence at the 5% significance level to conclude that the machine is not producing bolts of mean diameter 10 mm. Because z is negative, the sample mean diameter was below 10 mm. The quality control check has detected a statistically significant deviation, so the machine should be inspected and, if necessary, recalibrated.

Q7 — Hypothesis Test and 95% Confidence Interval

A manufacturer claims μ = 500 g. A sample of n = 36 gives x̄ = 496 g with σ = 12 g. (a) Conduct a two-tailed z-test at α = 0.05. (b) Construct a 95% CI. Compare the two approaches.

Part (a): Hypothesis Test

H₀: μ = 500. H₁: μ ≠ 500. α = 0.05.

SE = 12 / √36 = 12 / 6 = 2.

z = (496 − 500) / 2 = −4 / 2 = −2.0.

p = 2P(Z ≤ −2.0) = 2Φ(−2.0) = 2(1 − 0.9772) = 2(0.0228) = 0.0456.

Since p = 0.0456 < 0.05, reject H₀.

Part (b): 95% Confidence Interval

CI = x̄ ± z_0.025 × SE = 496 ± 1.96 × 2 = 496 ± 3.92

CI = (492.08, 499.92)

Comparison:

The value μ₀ = 500 is not contained in the 95% CI (492.08, 499.92). This corresponds exactly to rejecting H₀ in the two-tailed test at α = 0.05. The two approaches are equivalent: H₀ is rejected at level α if and only if μ₀ falls outside the (1 − α)100% confidence interval.
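The equivalence between the two approaches can be checked numerically. The sketch below assumes Python 3.8 or later; the function name and the returned tuple layout are illustrative, not a standard API.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_test_and_ci(sample_mean, mu0, sigma, n, alpha):
    """Return (p, ci, reject_h0, mu0_outside_ci) for a two-tailed z-test."""
    std_normal = NormalDist()
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (sample_mean - z_crit * standard_error,
          sample_mean + z_crit * standard_error)
    reject_h0 = p < alpha
    mu0_outside_ci = not (ci[0] <= mu0 <= ci[1])
    return p, ci, reject_h0, mu0_outside_ci


# Values from Q7: x-bar = 496, mu0 = 500, sigma = 12, n = 36, alpha = 0.05
p, ci, reject_h0, mu0_outside_ci = two_tailed_test_and_ci(496, 500, 12, 36, 0.05)
print(round(ci[0], 2), round(ci[1], 2))  # 492.08 499.92
print(reject_h0, mu0_outside_ci)         # True True
```

The software p-value is about 0.0455; the 0.0456 above comes from doubling the rounded table value 0.0228.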

Q8 — Interpreting Type I Error Risk

A hypothesis test at α = 0.05 rejects H₀. Explain what Type I error risk means in this context, and what it means if α is set at 0.01 instead.

Type I Error Definition:

A Type I error occurs when we reject H₀ when H₀ is actually true. This is a false positive.

At α = 0.05:

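By choosing α = 0.05, we accept a 5% probability of making a Type I error whenever H₀ is true. That is, in the long run, if H₀ is truly correct, we would wrongly reject it in 5% of repeated experiments. Rejecting H₀ at this level means the result is “statistically significant at the 5% level”. Note that α describes the long-run behaviour of the testing procedure, not this particular decision: it is neither the probability that H₀ is true nor the probability that this specific rejection is a false alarm.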

Effect of reducing α to 0.01:

A lower significance level (α = 0.01) demands stronger evidence before rejecting H₀. The Type I error risk drops to 1%, making false positives less likely. However, this comes at a cost: for a fixed sample size, we are less likely to detect a true difference (the Type II error rate β increases), which reduces the power of the test.

Practical note: In high-stakes decisions (e.g., medical approvals, safety regulations), a lower α is preferred to minimise false positives, even at the cost of requiring larger samples to maintain power.
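The long-run meaning of α can be illustrated by simulation: draw many samples from a population where H₀ really is true and count how often a two-tailed z-test rejects it. This is a sketch only. The population values (μ₀ = 50, σ = 10, n = 25), the number of trials and the seed are arbitrary assumptions chosen for illustration; they do not come from the question.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def simulated_type_i_rate(alpha, mu0=50.0, sigma=10.0, n=25,
                          trials=10_000, seed=1):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Sample from a population whose true mean equals mu0 (H0 is true).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / standard_error
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials


rate_05 = simulated_type_i_rate(alpha=0.05)
rate_01 = simulated_type_i_rate(alpha=0.01)
print(rate_05, rate_01)
```

The two printed proportions should land close to 0.05 and 0.01 respectively, with some random variation from run to run if the seed is changed.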

Q9 — Effect of Larger n on the Test

A test uses x̄ = 52, μ₀ = 50, σ = 10. Compare the test statistic and p-value for n = 25 and n = 100. What does this illustrate about the effect of sample size?

For n = 25:

SE = 10 / √25 = 10 / 5 = 2.

z = (52 − 50) / 2 = 1.0.

Two-tailed p-value = 2P(Z ≥ 1.0) = 2(1 − 0.8413) = 2(0.1587) = 0.3174.

At α = 0.05: p = 0.317 > 0.05 ⇒ Fail to reject H₀.

For n = 100:

SE = 10 / √100 = 10 / 10 = 1.

z = (52 − 50) / 1 = 2.0.

Two-tailed p-value = 2P(Z ≥ 2.0) = 2(1 − 0.9772) = 2(0.0228) = 0.0456.

At α = 0.05: p = 0.046 < 0.05 ⇒ Reject H₀.

Conclusion — Effect of larger n:

With the same sample mean and the same difference from μ₀, a larger sample size gives a smaller standard error, producing a larger z-statistic and a smaller p-value. This means larger samples have greater power to detect true departures from H₀, even when the effect size (x̄ − μ₀ = 2) is modest. With n = 100 the difference is statistically significant at α = 0.05; with n = 25 it is not.
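The comparison can be run as a short loop over the two sample sizes. This is a sketch assuming Python 3.8 or later; the helper name two_tailed_p is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(sample_mean, mu0, sigma, n):
    """Return (z, p) for a two-tailed z-test with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Values from Q9: x-bar = 52, mu0 = 50, sigma = 10, with n = 25 and n = 100
for n in (25, 100):
    z, p = two_tailed_p(sample_mean=52, mu0=50, sigma=10, n=n)
    decision = "reject H0" if p < 0.05 else "fail to reject H0"
    print(f"n={n}: z={z:.1f}, p={p:.4f}, {decision}")
```

The printed p-values (about 0.3173 and 0.0455) differ from the table-based figures in the last decimal place only because the table values are rounded before doubling.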

Q10 — Full Hypothesis Test in Context

A pharmaceutical company claims a new pain relief drug reduces pain scores by a mean of at least 15 points (on a 100-point scale). In a clinical trial, 64 patients show a mean reduction of 13.8 points. Historical standard deviation for such trials is σ = 4.8 points. Test at α = 0.05 whether there is evidence that the drug’s mean reduction falls short of the claim.

Step 1: Define the parameter and state hypotheses.

Let μ = true mean pain score reduction (points) across all patients using the drug.

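H₀: μ = 15 (the drug achieves the claimed mean reduction). The claim is μ ≥ 15, but the test is carried out at the boundary value μ = 15, since this is the case in the claim that is hardest to distinguish from H₁.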

H₁: μ < 15 (the drug falls short of the claimed reduction). [Lower one-tailed test]

α = 0.05.

Step 2: Calculate the test statistic.

SE = σ / √n = 4.8 / √64 = 4.8 / 8 = 0.6.

z = (x̄ − μ₀) / SE = (13.8 − 15) / 0.6 = −1.2 / 0.6 = −2.0.

Step 3: Find the p-value.

Lower one-tailed test: p = P(Z ≤ −2.0) = Φ(−2.0) = 1 − 0.9772 = 0.0228.

Step 4: Decision.

Since p = 0.0228 < α = 0.05, reject H₀.

Step 5: Conclusion in context.

There is sufficient evidence at the 5% significance level to conclude that the drug’s true mean pain score reduction is less than the claimed 15 points. The pharmaceutical company’s claim of a mean reduction of at least 15 points is not supported by this clinical trial data.

Note on effect size: The observed mean of 13.8 is 1.2 points below the claim. While statistically significant, the practical significance depends on the clinical context — whether a 1.2-point difference matters to patients should be considered alongside the statistical result.
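Steps 2 to 4 can be gathered into one function for a lower one-tailed test. This is a minimal sketch assuming Python 3.8 or later; the function name and the dictionary keys are hypothetical, chosen here for readability.

```python
from math import sqrt
from statistics import NormalDist


def lower_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Run a lower one-tailed z-test (H1: mu < mu0) with known sigma."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - mu0) / standard_error
    p = NormalDist().cdf(z)  # area to the left of z
    return {"se": standard_error, "z": z, "p": p, "reject_h0": p < alpha}


# Values from Q10: x-bar = 13.8, mu0 = 15, sigma = 4.8, n = 64, alpha = 0.05
result = lower_tailed_z_test(sample_mean=13.8, mu0=15, sigma=4.8, n=64, alpha=0.05)
print(round(result["z"], 2), round(result["p"], 4), result["reject_h0"])
# -2.0 0.0228 True
```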