Expected Value and Variance of Continuous RVs

Key Terms

Expected value (mean): E(X) = ∫ x · f(x) dx over the domain
E(X²): ∫ x² · f(x) dx over the domain
Variance: Var(X) = E(X²) − [E(X)]²
Standard deviation: SD(X) = √Var(X)
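Mode: the value of x where f(x) is maximum (solve f′(x) = 0 for interior candidates, then compare with the endpoints of the domain)
Median m: the value satisfying ∫_a^m f(x) dx = 0.5, where a is the lower end of the domain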
Linear transformations: E(aX + b) = aE(X) + b; Var(aX + b) = a²Var(X)

Expected Value: E(X) = ∫_domain x · f(x) dx

Variance (shortcut): Var(X) = E(X²) − [E(X)]²
where E(X²) = ∫_domain x² · f(x) dx

Standard Deviation: SD(X) = √Var(X)

Linear Transformations:
E(aX + b) = aE(X) + b
Var(aX + b) = a²Var(X)
SD(aX + b) = |a| · SD(X)

Worked Example (Find E(X), Var(X), SD): Let f(x) = 3x² on [0, 1].

E(X):
E(X) = ∫₀¹ x · 3x² dx = ∫₀¹ 3x³ dx = [3x⁴/4]₀¹ = 3/4

E(X²):
E(X²) = ∫₀¹ x² · 3x² dx = ∫₀¹ 3x⁴ dx = [3x⁵/5]₀¹ = 3/5

Var(X):
Var(X) = 3/5 − (3/4)² = 3/5 − 9/16 = 48/80 − 45/80 = 3/80

SD(X):
SD(X) = √(3/80) = √3 / (4√5) ≈ 0.194
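As a sanity check on the worked example, the same four quantities can be approximated numerically. This is a minimal sketch, not part of the original solution: the helper name simpson and the choice of composite Simpson's rule with 1000 strips are assumptions made for illustration.

```python
import math

def simpson(g, a, b, n=1000):
    """Approximate the integral of g over [a, b] with composite Simpson's rule (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    return 3 * x**2  # pdf from the worked example, on [0, 1]

mean = simpson(lambda x: x * f(x), 0, 1)        # E(X), exact value 3/4
mean_sq = simpson(lambda x: x**2 * f(x), 0, 1)  # E(X^2), exact value 3/5
var = mean_sq - mean**2                         # shortcut formula, exact value 3/80
sd = math.sqrt(var)

print(round(mean, 4), round(mean_sq, 4), round(var, 4), round(sd, 4))
```

The rounded values should agree with the hand calculation: 3/4, 3/5, 3/80 = 0.0375 and SD ≈ 0.194.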

Hot Tip: Prefer the shortcut formula Var(X) = E(X²) − [E(X)]². It is usually faster than the definition Var(X) = E[(X − μ)²] = ∫(x − μ)²f(x)dx, because you never have to expand (x − μ)² inside the integral. For linear transformations, remember that adding a constant b shifts the mean but does NOT change the variance, since shifting does not change spread.

Why Integration Replaces Summation for E(X)

For a discrete random variable, the expected value is the probability-weighted sum: E(X) = ∑ x_k · P(X = x_k). Each value x_k is weighted by how probable it is. For a continuous random variable, every single value has probability zero, P(X = x) = 0, so there is nothing to sum directly. Instead, the probability density f(x) tells us how densely probability is packed near each value x, and f(x) dx plays the role of P(X = x_k). The continuous analogue of the sum is the integral:

E(X) = ∫ x · f(x) dx

Interpret this geometrically: E(X) is the balance point of the pdf curve. If you could cut out the region under the pdf graph from cardboard, E(X) is the x-coordinate where it would balance on a knife edge. For a symmetric distribution whose mean exists, E(X) equals the median and lies on the axis of symmetry.
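The step from sum to integral can be made concrete by chopping the domain into n thin strips and forming the discrete-style sum ∑ x_k · f(x_k) · Δx. The sketch below assumes the pdf f(x) = 3x² on [0, 1] from the worked example and uses strip midpoints for x_k; the function name riemann_mean is hypothetical.

```python
def f(x):
    return 3 * x**2  # example pdf on [0, 1], borrowed from the worked example

def riemann_mean(n):
    """Weighted sum over n strips: each midpoint x_k is weighted by f(x_k) * dx."""
    dx = 1.0 / n
    total = 0.0
    for k in range(n):
        x_k = (k + 0.5) * dx        # midpoint of strip k
        total += x_k * f(x_k) * dx  # value times approximate probability of the strip
    return total

for n in (10, 100, 1000):
    print(n, riemann_mean(n))
```

As n grows, the printed sums should settle toward the exact integral E(X) = 3/4.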

Variance as Average Squared Deviation

Variance measures how spread out the distribution is around its mean μ = E(X). By definition:

Var(X) = E[(X − μ)²] = ∫ (x − μ)² f(x) dx

Each squared deviation (x − μ)² is weighted by the density f(x). Large deviations far from the mean contribute heavily. The shortcut formula Var(X) = E(X²) − [E(X)]² follows from expanding the square and using linearity of integration — it is algebraically equivalent but computationally much faster.
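For reference, the expansion behind the shortcut can be written out in full. The only facts used are μ = E(X) = ∫ x f(x) dx and ∫ f(x) dx = 1, with all integrals taken over the domain:

```latex
\begin{aligned}
\operatorname{Var}(X) &= \int (x-\mu)^2 f(x)\,dx \\
&= \int x^2 f(x)\,dx \;-\; 2\mu \int x f(x)\,dx \;+\; \mu^2 \int f(x)\,dx \\
&= E(X^2) - 2\mu\cdot\mu + \mu^2\cdot 1 \\
&= E(X^2) - [E(X)]^2
\end{aligned}
```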

The standard deviation SD(X) = √Var(X) is in the same units as X, making it the more interpretable measure of spread.

The Mode

The mode of a continuous distribution is the value of x where the pdf is maximum, that is, the peak of the density curve. Find interior candidates by differentiating f(x) and setting f′(x) = 0, then confirm each is a maximum (f′′(x) < 0 or by inspection). For a pdf on a closed interval, also compare the values of f at the endpoints, since the maximum may occur there.

Note: for a uniform distribution (f(x) = constant), every point is equally a mode. For strictly increasing pdfs, the mode is at the right endpoint; for strictly decreasing pdfs, at the left endpoint.
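A brute-force check of a mode found by calculus is to evaluate the pdf on a fine grid that includes both endpoints and pick the largest value. This is only an illustrative sketch: the helper find_mode and the second pdf, 6x(1 − x) on [0, 1], are assumptions introduced here and do not come from the notes.

```python
def find_mode(f, a, b, n=10_000):
    """Return the grid point in [a, b], endpoints included, where f is largest."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    return max(xs, key=f)

# Strictly increasing pdf: the maximum sits at the right endpoint.
print(find_mode(lambda x: 3 * x**2, 0, 1))         # 1.0

# Hypothetical pdf with an interior peak, where f'(x) = 6 - 12x = 0 at x = 0.5.
print(find_mode(lambda x: 6 * x * (1 - x), 0, 1))  # 0.5
```

A grid search cannot replace the derivative argument, but it catches the common mistake of reporting a stationary point when the true maximum is at an endpoint.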

The Median vs the Mean

The median m splits the distribution equally: P(X ≤ m) = 0.5. The mean E(X) is the balance point. For symmetric distributions, median = mean. For skewed distributions:

Right-skewed: a long right tail typically pulls the mean to the right of the median
Left-skewed: a long left tail typically pulls the mean to the left of the median

The median is more robust to extreme values and is often a better measure of the “typical” value in skewed distributions.
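To see the mean and median separate on a skewed pdf, the sketch below solves ∫₀^m f(x) dx = 0.5 by bisection for f(x) = 3x² on [0, 1] and compares the result with E(X). The helper names simpson, cdf and median are hypothetical, and bisection is just one convenient root-finding choice.

```python
def simpson(g, a, b, n=200):
    """Approximate the integral of g over [a, b] with composite Simpson's rule (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    return 3 * x**2  # pdf on [0, 1]; most of the probability sits near x = 1

def cdf(m):
    return simpson(f, 0, m)  # P(X <= m)

def median(lo=0.0, hi=1.0, tol=1e-10):
    """Bisection on cdf(m) = 0.5; valid because the cdf never decreases."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

m = median()
mean = simpson(lambda x: x * f(x), 0, 1)
print(round(m, 4), round(mean, 4))  # median is about 0.7937, mean is 0.75
```

Here the cdf is m³, so the exact median is the cube root of 0.5. The mean lies to the left of the median, which matches the long left tail of this pdf.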

Linear Transformations: Effect on E(X) and Var(X)

If Y = aX + b (a linear function of X), then:

E(Y) = aE(X) + b. Multiplying X by a scales the mean by a; adding b shifts the mean by b.

Var(Y) = a²Var(X). Multiplying X by a scales the variance by a². Adding a constant b has no effect on variance — shifting all values left or right does not change how spread out they are.

SD(Y) = |a| · SD(X). Standard deviation scales by |a|, not a².
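The three rules can be checked numerically by integrating the transformed variable directly and comparing with the formulas. This sketch assumes the pdf f(x) = 3x² on [0, 1] and the illustrative constants a = 2, b = 5 (named scale and shift in the code); none of these choices come from the rules themselves.

```python
import math

def simpson(g, a, b, n=1000):
    """Approximate the integral of g over [a, b] with composite Simpson's rule (n must be even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def f(x):
    return 3 * x**2  # example pdf on [0, 1]

def expect(g):
    """E[g(X)]: integrate g(x) * f(x) over the domain [0, 1]."""
    return simpson(lambda x: g(x) * f(x), 0, 1)

scale, shift = 2, 5  # hypothetical a and b in Y = aX + b

mean_x = expect(lambda x: x)
var_x = expect(lambda x: x**2) - mean_x**2

# Direct computation for Y, without using the transformation rules.
mean_y = expect(lambda x: scale * x + shift)
var_y = expect(lambda x: (scale * x + shift) ** 2) - mean_y**2

print(mean_y, scale * mean_x + shift)                         # both close to 6.5
print(var_y, scale**2 * var_x)                                # both close to 0.15
print(math.sqrt(var_y), abs(scale) * math.sqrt(var_x))        # both close to 0.387
```

Each printed pair should agree up to rounding error, and changing shift alone should leave the second and third lines unchanged.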

Exam technique: When computing Var(X), find E(X) first, then E(X²) separately, then subtract. Computing ∫(x − μ)²f(x) dx directly is usually much slower. Check that your variance is not negative: a negative result means there is an error in E(X), in E(X²), or in the subtraction E(X²) − [E(X)]².

Mastery Practice

Given f(x) = 2x on [0, 1], find E(X).
Using f(x) = 2x on [0, 1], find Var(X) and SD(X).
For f(x) = 3x² on [0, 1], find the mode of the distribution.
For f(x) = 2x on [0, 1], find the median m.
If X has E(X) = 3 and Var(X) = 4, find E(2X + 5) and Var(2X + 5).
If X has E(X) = 5 and SD(X) = 2, find E(3X − 1) and SD(3X − 1).
Two distributions are proposed for modelling a waiting time (minutes):
Distribution A: E(X) = 4, SD(X) = 1
Distribution B: E(X) = 4, SD(X) = 3
Which distribution produces more reliable (consistent) waiting times? Explain.
A random variable X has pdf f(x) = (3/4)x(2 − x) on [0, 2]. Find E(X) and interpret it in context as the average position of a measurement along a 2 m beam.
A piecewise pdf is defined as:
f(x) = x for 0 ≤ x ≤ 1
f(x) = 2 − x for 1 < x ≤ 2
Find E(X) and Var(X).
The pdf of X is f(x) = k(1 − x²) on [−1, 1] for some constant k.
(a) Find k.
(b) Find E(X) (without integrating — use symmetry).
(c) Find Var(X).
