Practice Maths

Solutions — Sample Proportions as Random Variables

  1. The sample proportion is p̂ = X/n, where X is the number of successes and n is the sample size.
    p̂ is a random variable because its value varies from sample to sample — different random samples from the same population will give different numbers of successes, and hence different values of p̂.
    The population proportion p is a fixed (unknown) parameter, while p̂ is a statistic that varies with each sample.
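A short simulation can make the sample-to-sample variation concrete. This is a minimal sketch: the population proportion `p_true = 0.3`, the sample size `n = 120`, the seed and the helper name `sample_proportion` are all assumptions chosen for illustration, not part of the question.

```python
import random

def sample_proportion(n, p, rng):
    """Draw one random sample of size n and return p-hat = X/n."""
    successes = sum(1 for _ in range(n) if rng.random() < p)
    return successes / n

rng = random.Random(1)   # fixed seed so the run is repeatable
p_true, n = 0.3, 120     # assumed values, for illustration only

# Five independent samples from the same population.
p_hats = [sample_proportion(n, p_true, rng) for _ in range(5)]
print(p_hats)
```

The population value `p_true` never changes, yet each entry of `p_hats` comes from a different sample and so will generally differ from the others.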
  2. p̂ = X/n = 36/120 = 0.3
    This means 30% of the sampled voters supported the policy. It is a point estimate of the population proportion p; the true value of p remains unknown.
  3. (a) E(p̂) = p = 0.6
    (b) Var(p̂) = p(1−p)/n = 0.6 × 0.4 / 50 = 0.24/50 = 0.0048
    (c) SD(p̂) = √0.0048 ≈ 0.0693
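The three results above can be checked with a few lines of Python. The function name `phat_moments` is a hypothetical helper introduced here; it simply applies E(p̂) = p and Var(p̂) = p(1−p)/n.

```python
from math import sqrt

def phat_moments(p, n):
    """Return (mean, variance, standard deviation) of p-hat."""
    mean = p
    var = p * (1 - p) / n
    return mean, var, sqrt(var)

mean, var, sd = phat_moments(0.6, 50)
print(mean, round(var, 4), round(sd, 4))   # 0.6 0.0048 0.0693
```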
  4. For larger n, Var(p̂) = p(1−p)/n decreases. Since variance decreases, the distribution of p̂ clusters more tightly around the true proportion p.
    Effect: Larger samples give more precise estimates of p. The standard deviation of p̂ is √[p(1−p)/n], which decreases as n increases. To halve the standard deviation, you must quadruple the sample size.
  5. Var(p̂) = p(1−p)/n
    (a) n=100: Var = 0.4 × 0.6 / 100 = 0.0024; SD = √0.0024 ≈ 0.0490
    (b) n=400: Var = 0.24/400 = 0.0006; SD = √0.0006 ≈ 0.0245
    (c) n=1600: Var = 0.24/1600 = 0.00015; SD = √0.00015 ≈ 0.01225
    Pattern: when n is multiplied by 4, SD is halved. SD is proportional to 1/√n.
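The halving pattern can be confirmed numerically. This sketch uses p = 0.4, as in the calculation above; the dictionary name `sds` is an illustrative choice.

```python
from math import sqrt

p = 0.4
# Standard deviation of p-hat for each sample size.
sds = {n: sqrt(p * (1 - p) / n) for n in (100, 400, 1600)}

for n, sd in sds.items():
    print(n, round(sd, 5))

# Each fourfold increase in n divides the SD by sqrt(4) = 2.
print(round(sds[100] / sds[400], 6), round(sds[400] / sds[1600], 6))
```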
  6. The condition for the normal approximation is:
    np ≥ 5 AND n(1−p) ≥ 5
    (a) n=20, p=0.4: np = 8 ≥ 5 ✓   n(1−p) = 12 ≥ 5 ✓   Satisfied
    (b) n=10, p=0.2: np = 2 < 5 ✗   n(1−p) = 8 ≥ 5 ✓   NOT satisfied (too few expected successes)
    (c) n=50, p=0.95: np = 47.5 ≥ 5 ✓   n(1−p) = 2.5 < 5 ✗   NOT satisfied (p is very close to 1, so too few expected failures)
    (d) n=100, p=0.07: np = 7 ≥ 5 ✓   n(1−p) = 93 ≥ 5 ✓   Satisfied
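The same check can be written as a small function. This is a sketch: `normal_approx_ok` is a hypothetical name, and the threshold of 5 follows the condition stated above (some courses use 10 instead, which is why it is left as a parameter).

```python
def normal_approx_ok(n, p, threshold=5):
    """True when both np and n(1 - p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

cases = [(20, 0.4), (10, 0.2), (50, 0.95), (100, 0.07)]
results = [normal_approx_ok(n, p) for n, p in cases]

for (n, p), ok in zip(cases, results):
    print(f"n={n}, p={p}: {'satisfied' if ok else 'NOT satisfied'}")
```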
  7. p = 0.35, n = 200
    Check conditions: np = 200 × 0.35 = 70 ≥ 5 ✓   n(1−p) = 130 ≥ 5 ✓
    E(p̂) = 0.35
    Var(p̂) = 0.35 × 0.65 / 200 = 0.2275/200 = 0.0011375
    SD(p̂) = √0.0011375 ≈ 0.03373
    By the normal approximation, p̂ is approximately N(0.35, 0.0011375) in N(μ, σ²) notation, or N(0.35, 0.001138) with the variance rounded to four significant figures.
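One way to work with this approximate distribution is Python's standard-library `statistics.NormalDist`, sketched below. Note that `NormalDist` takes the standard deviation, not the variance, as its second argument; the variable name `approx` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.35, 200
sigma = sqrt(p * (1 - p) / n)          # SD of p-hat

# NormalDist is parameterised by (mean, standard deviation).
approx = NormalDist(mu=p, sigma=sigma)

print(approx.mean, approx.variance, approx.stdev)
```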
  8. (a) p̂ is an unbiased estimator of p because E(p̂) = p. That is, the mean of p̂ over all possible random samples of size n equals the true population proportion.
    (b) Unbiasedness is important because it means our estimation method does not systematically overestimate or underestimate the true value. Even though any one sample gives a specific value of p̂ that may differ from p, on average we “get it right”. This is a fundamental property we want from estimation procedures.
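Unbiasedness can be illustrated by simulation: individual values of p̂ scatter around p, but their long-run average settles near p. The values `p_true = 0.3`, `n = 50`, `repeats = 2000` and the seed are assumptions chosen only for this sketch.

```python
import random

def sample_proportion(n, p, rng):
    """Draw one random sample of size n and return p-hat = X/n."""
    return sum(1 for _ in range(n) if rng.random() < p) / n

rng = random.Random(0)                  # fixed seed for repeatability
p_true, n, repeats = 0.3, 50, 2000      # assumed values, for illustration only

p_hats = [sample_proportion(n, p_true, rng) for _ in range(repeats)]
average = sum(p_hats) / repeats

print(min(p_hats), max(p_hats))   # single samples can miss p noticeably
print(round(average, 3))          # the average should sit close to p_true
```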
  9. p̂ = X/n where X ~ Bin(n, p)
    E(X) = np, so by linearity of expectation E(p̂) = E(X/n) = (1/n)E(X) = (1/n)(np) = p ✓
    Var(X) = np(1−p), and Var(aX) = a²Var(X) for any constant a, so Var(p̂) = Var(X/n) = (1/n²)Var(X) = (1/n²)(np(1−p)) = p(1−p)/n ✓
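The two identities can also be checked exactly for a specific case by summing over the binomial probabilities with rational arithmetic. This is a sanity check under assumed values (n = 12, p = 1/4), not a replacement for the proof; `exact_moments_of_phat` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def exact_moments_of_phat(n, p):
    """Exact mean and variance of p-hat = X/n for X ~ Bin(n, p)."""
    p = Fraction(p)
    # Binomial probabilities P(X = k) for k = 0..n, as exact fractions.
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(Fraction(k, n) * pmf[k] for k in range(n + 1))
    var = sum((Fraction(k, n) - mean) ** 2 * pmf[k] for k in range(n + 1))
    return mean, var

n, p = 12, Fraction(1, 4)               # assumed values, for illustration only
mean, var = exact_moments_of_phat(n, p)

print(mean == p)                        # E(p-hat) = p
print(var == p * (1 - p) / n)           # Var(p-hat) = p(1 - p)/n
```

Using `Fraction` avoids floating-point rounding, so the equality tests compare exact values.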
  10. p = 0.25, n = 80
    E(p̂) = 0.25;   Var(p̂) = 0.25 × 0.75/80 = 0.1875/80 ≈ 0.002344
    SD(p̂) = √0.002344 ≈ 0.04841
    Conditions: np = 20 ≥ 5 ✓   n(1−p) = 60 ≥ 5 ✓
    (a) The distribution of p̂ is approximately N(0.25, 0.002344).
    (b) The distribution is approximately symmetric and bell-shaped, centred at p = 0.25 with a relatively small spread (σ ≈ 0.048). About 95% of sample proportions will lie within two standard deviations (≈ 0.10) of 0.25, i.e. between about 0.15 and 0.35.
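The interval in part (b) can be checked against the normal approximation using the standard-library `statistics.NormalDist`. This sketch evaluates the approximating normal curve only; the exact binomial probability will differ slightly.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.25, 80
approx = NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

# Probability, under the normal approximation, that p-hat falls in [0.15, 0.35].
prob = approx.cdf(0.35) - approx.cdf(0.15)
print(round(prob, 3))
```

Because 0.10 is slightly more than two standard deviations here, the result comes out a little above 0.95.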