Practice Maths

Hypothesis Testing for Population Mean — Worked Solutions

  1. Q1 — Setting Up Hypotheses

    A bottling plant claims it fills bottles to a mean volume of μ = 500 mL. State H0 and H1 for a quality inspector who suspects the mean fill is not exactly 500 mL.

    Step 1: Define the parameter.

    Let μ = the true mean fill volume (mL) of all bottles produced by the plant.

    Step 2: State the null hypothesis.

    H0: μ = 500 (the plant fills to exactly 500 mL on average).

    Step 3: State the alternative hypothesis.

    H1: μ ≠ 500 (the mean fill differs from 500 mL).

    Step 4: Identify the test type.

    This is a two-tailed test because the inspector suspects the mean is different in either direction (too much or too little), not specifically above or below 500 mL.

    Note: H0 always contains the equality sign (=). The alternative H1 reflects the research question.

  2. Q2 — Calculating the z Test Statistic

    H0: μ = 50, H1: μ ≠ 50. Sample results: x̄ = 53, σ = 10, n = 25. Calculate the z test statistic.

    Step 1: Identify the given values.

    x̄ = 53, μ0 = 50, σ = 10, n = 25.

    Step 2: Calculate the standard error of the mean.

    SE = σ / √n = 10 / √25 = 10 / 5 = 2

    Step 3: Apply the z-test statistic formula.

    z = (x̄ − μ0) / SE = (53 − 50) / 2 = 3 / 2 = 1.5

    Interpretation: The sample mean of 53 is 1.5 standard errors above the hypothesised mean of 50. Under H0, a z-value of 1.5 is not unusually extreme: the two-tailed p-value is 2(1 − 0.9332) = 0.1336, so H0 would not be rejected at α = 0.05.
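
    The two calculation steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original solution; the function name z_statistic is a hypothetical helper.

```python
import math

def z_statistic(sample_mean, mu0, sigma, n):
    """Return (standard error, z) for a one-sample z-test with known sigma.

    Hypothetical helper for illustration only.
    """
    se = sigma / math.sqrt(n)           # SE = sigma / sqrt(n)
    return se, (sample_mean - mu0) / se  # z = (x-bar - mu0) / SE

se, z = z_statistic(53, 50, 10, 25)
print(se, z)  # 2.0 1.5
```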

  3. Q3 — One-Tailed p-Value from a z-Score

    A one-tailed (upper) test gives z = 1.8. Find the p-value. Would you reject H0 at α = 0.05?

    Step 1: Identify the test direction.

    Upper one-tailed test: H1: μ > μ0. The p-value is the area to the right of z = 1.8.

    Step 2: Calculate the p-value.

    p = P(Z ≥ 1.8) = 1 − Φ(1.8) = 1 − 0.9641 = 0.0359

    Step 3: Compare to α.

    Since p = 0.0359 < α = 0.05, we reject H0.

    Conclusion: There is sufficient evidence at the 5% significance level to support the alternative hypothesis that μ > μ0.

    Note: For a two-tailed test with the same z, the p-value would be 2 × 0.0359 = 0.0718 > 0.05 and we would fail to reject H0. The direction of H1 matters.
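
    As a check on the table lookup, the tail areas can be computed with Python's standard-library NormalDist class. This is a sketch for verification only; the variable names are illustrative. The unrounded two-tailed value is about 0.0719, while the 0.0718 in the note comes from doubling the rounded table value 0.0359.

```python
from statistics import NormalDist

Z = NormalDist()          # standard normal: mean 0, standard deviation 1
z = 1.8
alpha = 0.05

p_upper = 1 - Z.cdf(z)    # area to the right of z (upper one-tailed test)
p_two = 2 * p_upper       # both tails (two-tailed test with the same z)

print(f"one-tailed p = {p_upper:.4f}, reject H0: {p_upper < alpha}")
print(f"two-tailed p = {p_two:.4f}, reject H0: {p_two < alpha}")
```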

  4. Q4 — Two-Tailed Test Decision

    x̄ = 75, μ0 = 70, σ = 20, n = 100. Conduct a two-tailed test at α = 0.01.

    Step 1: State hypotheses.

    H0: μ = 70.   H1: μ ≠ 70.   α = 0.01.

    Step 2: Calculate the test statistic.

    SE = 20 / √100 = 20 / 10 = 2.

    z = (75 − 70) / 2 = 5 / 2 = 2.5

    Step 3: Find the two-tailed p-value.

    p = 2P(Z ≥ 2.5) = 2(1 − Φ(2.5)) = 2(1 − 0.9938) = 2(0.0062) = 0.0124

    Step 4: Decision.

    Since p = 0.0124 > α = 0.01, we fail to reject H0.

    Conclusion: There is insufficient evidence at the 1% significance level to conclude that the population mean differs from 70.

    Note: At α = 0.05, we would reject H0 since p = 0.0124 < 0.05. The conclusion depends on the chosen significance level.
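
    The dependence of the decision on α can be illustrated by computing the p-value once and comparing it with both significance levels. This is a hedged sketch; two_tailed_p is a hypothetical helper name, not something defined in the question.

```python
import math
from statistics import NormalDist

def two_tailed_p(sample_mean, mu0, sigma, n):
    """Return (z, two-tailed p-value) for a z-test with known sigma.

    Hypothetical helper for illustration only.
    """
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))  # double the tail beyond |z|
    return z, p

z, p = two_tailed_p(75, 70, 20, 100)
for alpha in (0.01, 0.05):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"alpha = {alpha}: z = {z}, p = {p:.4f} -> {decision}")
```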

  5. Q5 — Critical Value Approach

    Find the critical z-values for a two-tailed test at α = 0.01. For what values of x̄ would you reject H0: μ = 100, given n = 49 and σ = 14?

    Step 1: Identify the critical z-values.

    For a two-tailed test at α = 0.01, each tail has area 0.005.

    zcrit = ±z0.005 = ±2.576, where z0.005 is the z-value with an upper-tail area of 0.005.

    Reject H0 if z < −2.576 or z > 2.576.

    Step 2: Calculate the standard error.

    SE = σ / √n = 14 / √49 = 14 / 7 = 2

    Step 3: Convert critical z-values to x̄ values.

    Upper boundary: x̄ = μ0 + zcrit × SE = 100 + 2.576 × 2 = 100 + 5.152 = 105.152 ≈ 105.15

    Lower boundary: x̄ = μ0 − zcrit × SE = 100 − 5.152 = 94.848 ≈ 94.85

    Rejection region: Reject H0 if x̄ < 94.85 or x̄ > 105.15.

    The critical value approach is equivalent to the p-value approach: for the same data and the same α, both always lead to the same decision.
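
    The critical value and the rejection boundaries can also be obtained from the inverse normal CDF rather than a table. This is a minimal sketch using Python's standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha, mu0, sigma, n = 0.01, 100, 14, 49

# Two-tailed test: put alpha/2 in each tail, so look up the 1 - alpha/2 quantile.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
se = sigma / math.sqrt(n)

lower = mu0 - z_crit * se   # reject H0 if x-bar falls below this value
upper = mu0 + z_crit * se   # reject H0 if x-bar falls above this value

print(round(z_crit, 3), round(lower, 2), round(upper, 2))  # 2.576 94.85 105.15
```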

  6. Q6 — Reject or Fail to Reject at α = 0.05

    A quality control test gives z = −2.3. Using a two-tailed test at α = 0.05, state the decision and conclusion in context. The test was checking whether a machine produces bolts of diameter μ = 10 mm.

    Step 1: State hypotheses.

    H0: μ = 10 mm.   H1: μ ≠ 10 mm.   α = 0.05.

    Step 2: Find the two-tailed p-value.

    p = 2P(Z ≤ −2.3) = 2 × Φ(−2.3)

    Φ(−2.3) = 1 − Φ(2.3) = 1 − 0.9893 = 0.0107

    p = 2 × 0.0107 = 0.0214

    Step 3: Decision.

    Since p = 0.0214 < α = 0.05, reject H0.

    Conclusion in context: There is sufficient evidence at the 5% significance level to conclude that the machine is not producing bolts of mean diameter 10 mm. The quality control check has detected a statistically significant deviation, so the machine's calibration should be checked.
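
    Step 2 relies on the symmetry Φ(−z) = 1 − Φ(z). A short numerical check, sketched with Python's standard library (variable names are illustrative), confirms that both routes give the same left-tail area:

```python
from statistics import NormalDist

Z = NormalDist()   # standard normal
z = -2.3

left_tail = Z.cdf(z)            # Φ(−2.3) computed directly
via_symmetry = 1 - Z.cdf(-z)    # 1 − Φ(2.3), as read from a positive-z table

p = 2 * left_tail               # two-tailed p-value
print(f"left tail = {left_tail:.4f}, two-tailed p = {p:.4f}")
```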

  7. Q7 — Hypothesis Test and 95% Confidence Interval

    A manufacturer claims μ = 500 g. A sample of n = 36 gives x̄ = 496 g with σ = 12 g. (a) Conduct a two-tailed z-test at α = 0.05. (b) Construct a 95% CI. Compare the two approaches.

    Part (a): Hypothesis Test

    H0: μ = 500.   H1: μ ≠ 500.   α = 0.05.

    SE = 12 / √36 = 12 / 6 = 2.

    z = (496 − 500) / 2 = −4 / 2 = −2.0.

    p = 2P(Z ≤ −2.0) = 2(1 − 0.9772) = 2(0.0228) = 0.0456.

    Since p = 0.0456 < 0.05, reject H0.

    Part (b): 95% Confidence Interval

    CI = x̄ ± z0.025 × SE = 496 ± 1.96 × 2 = 496 ± 3.92

    CI = (492.08, 499.92)

    Comparison:

    The value μ0 = 500 is not contained in the 95% CI (492.08, 499.92). This corresponds exactly to rejecting H0 in the two-tailed test at α = 0.05. The two approaches are equivalent: H0 is rejected at level α if and only if μ0 falls outside the (1 − α)100% confidence interval.
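
    The equivalence can be checked numerically by running both procedures on the same data. This is a sketch under the question's values, using only Python's standard library; the variable names are illustrative.

```python
import math
from statistics import NormalDist

Z = NormalDist()
xbar, mu0, sigma, n, alpha = 496, 500, 12, 36, 0.05
se = sigma / math.sqrt(n)

# Part (a): two-tailed z-test
z = (xbar - mu0) / se
p = 2 * (1 - Z.cdf(abs(z)))
reject = p < alpha

# Part (b): (1 - alpha) confidence interval around the sample mean
margin = Z.inv_cdf(1 - alpha / 2) * se
ci = (xbar - margin, xbar + margin)
mu0_outside_ci = not (ci[0] <= mu0 <= ci[1])

print(reject, mu0_outside_ci)  # True True
```

    Both flags agree because each procedure compares |x̄ − μ0| with the same margin, z0.025 × SE.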

  8. Q8 — Interpreting Type I Error Risk

    A hypothesis test at α = 0.05 rejects H0. Explain what Type I error risk means in this context, and what it means if α is set at 0.01 instead.

    Type I Error Definition:

    A Type I error occurs when we reject H0 when H0 is actually true. This is a false positive.

    At α = 0.05:

    By choosing α = 0.05, we accept a 5% probability of making a Type I error when H0 is true. That is, in the long run, if H0 is truly correct, we would wrongly reject it in 5% of repeated experiments. Rejecting H0 at this level means the result is “statistically significant at the 5% level”. Note that α is not the probability that this particular rejection is a false alarm; it is the long-run false-positive rate of the procedure when H0 holds.

    Effect of reducing α to 0.01:

    A lower significance level (α = 0.01) demands stronger evidence before rejecting H0. The Type I error risk drops to 1%, making false positives less likely. However, this comes at a cost: for a fixed sample size, we are less likely to detect a true difference (the Type II error rate β increases), which reduces the power of the test.

    Practical note: In high-stakes decisions (e.g., medical approvals, safety regulations), a lower α is preferred to minimise false positives, even at the cost of requiring larger samples to maintain power.
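
    The long-run meaning of α can be illustrated with a small simulation: draw many samples from a population in which H0 really is true and count how often a two-tailed z-test rejects it. The population values (μ0 = 50, σ = 10, n = 25), the number of trials and the random seed below are assumptions chosen for illustration; none of them come from the question.

```python
import math
import random
from statistics import NormalDist, fmean

def type_i_error_rate(alpha, trials=20_000, n=25, mu0=50.0, sigma=10.0, seed=1):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is true.

    All default values are illustrative assumptions, not data from the question.
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sigma / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        # Sample from a population whose true mean equals mu0 (H0 is true).
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / se
        if abs(z) > z_crit:
            rejections += 1   # a false positive, i.e. a Type I error
    return rejections / trials

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: estimated Type I error rate = {type_i_error_rate(alpha):.4f}")
```

    The estimated rates should land close to 0.05 and 0.01 respectively, with small differences due to simulation noise.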

  9. Q9 — Effect of Larger n on the Test

    A test uses x̄ = 52, μ0 = 50, σ = 10. Compare the test statistic and p-value for n = 25 and n = 100. What does this illustrate about the effect of sample size?

    For n = 25:

    SE = 10 / √25 = 10 / 5 = 2.

    z = (52 − 50) / 2 = 1.0.

    Two-tailed p-value = 2P(Z ≥ 1.0) = 2(1 − 0.8413) = 2(0.1587) = 0.3174.

    At α = 0.05: p = 0.317 > 0.05 ⇒ Fail to reject H0.

    For n = 100:

    SE = 10 / √100 = 10 / 10 = 1.

    z = (52 − 50) / 1 = 2.0.

    Two-tailed p-value = 2P(Z ≥ 2.0) = 2(1 − 0.9772) = 2(0.0228) = 0.0456.

    At α = 0.05: p = 0.046 < 0.05 ⇒ Reject H0.

    Conclusion — Effect of larger n:

    With the same sample mean and the same difference from μ0, a larger sample size gives a smaller standard error, producing a larger z-statistic and a smaller p-value. This means larger samples have greater power to detect true departures from H0, even when the effect size (x̄ − μ0 = 2) is modest. With n = 100 the difference is statistically significant at α = 0.05; with n = 25 it is not.
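
    The comparison can be reproduced by looping over the two sample sizes. This is a minimal sketch with illustrative variable names. The unrounded p-values are about 0.3173 and 0.0455, which differ from the table-based 0.3174 and 0.0456 only in the fourth decimal place.

```python
import math
from statistics import NormalDist

xbar, mu0, sigma, alpha = 52, 50, 10, 0.05
results = {}

for n in (25, 100):
    se = sigma / math.sqrt(n)                 # SE shrinks as n grows
    z = (xbar - mu0) / se                     # same difference, larger z
    p = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed p-value
    results[n] = (se, z, p)
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"n = {n}: SE = {se:.1f}, z = {z:.1f}, p = {p:.4f} -> {decision}")
```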

  10. Q10 — Full Hypothesis Test in Context

    A pharmaceutical company claims a new pain relief drug reduces pain scores by a mean of at least 15 points (on a 100-point scale). In a clinical trial, 64 patients show a mean reduction of 13.8 points. Historical standard deviation for such trials is σ = 4.8 points. Test at α = 0.05 whether there is evidence that the drug’s mean reduction falls short of the claim.

    Step 1: Define the parameter and state hypotheses.

    Let μ = true mean pain score reduction (points) across all patients using the drug.

    H0: μ = 15 (the drug achieves the claimed mean reduction; the claim μ ≥ 15 is tested at its boundary value of 15).

    H1: μ < 15 (the drug falls short of the claimed reduction). [Lower one-tailed test]

    α = 0.05.

    Step 2: Calculate the test statistic.

    SE = σ / √n = 4.8 / √64 = 4.8 / 8 = 0.6.

    z = (x̄ − μ0) / SE = (13.8 − 15) / 0.6 = −1.2 / 0.6 = −2.0.

    Step 3: Find the p-value.

    Lower one-tailed test: p = P(Z ≤ −2.0) = Φ(−2.0) = 1 − 0.9772 = 0.0228.

    Step 4: Decision.

    Since p = 0.0228 < α = 0.05, reject H0.

    Step 5: Conclusion in context.

    There is sufficient evidence at the 5% significance level to conclude that the drug’s true mean pain score reduction is less than the claimed 15 points. The company’s claim of a mean reduction of at least 15 points is not supported by this clinical trial data.

    Note on effect size: The observed mean of 13.8 is 1.2 points below the claim. While statistically significant, the practical significance depends on the clinical context — whether a 1.2-point difference matters to patients should be considered alongside the statistical result.
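
    The five steps can be collected into a single function for a lower one-tailed test. This is a sketch only; lower_tailed_z_test is a hypothetical helper name, and it assumes σ is known, as in the question.

```python
import math
from statistics import NormalDist

def lower_tailed_z_test(sample_mean, mu0, sigma, n, alpha):
    """Return (z, p, reject) for H0: mu = mu0 against H1: mu < mu0.

    Hypothetical helper for illustration; assumes a known population sigma.
    """
    se = sigma / math.sqrt(n)
    z = (sample_mean - mu0) / se
    p = NormalDist().cdf(z)      # lower tail: area to the left of z
    return z, p, p < alpha

z, p, reject = lower_tailed_z_test(13.8, 15, 4.8, 64, 0.05)
print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")
```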