Continuous Random Variables and PDFs
Key Terms
- A pdf f(x) must satisfy: (1) f(x) ≥ 0 for all x in the domain, and (2) ∫ f(x) dx = 1 over the entire domain
- P(a ≤ X ≤ b) = ∫_a^b f(x) dx; probability equals area under the curve
- Finding k: set ∫ f(x) dx = 1 over the domain and solve for k
- CDF: F(x) = ∫_{lower bound}^{x} f(t) dt; gives P(X ≤ x)
- P(X > a) = 1 − F(a)
- Median m: solve F(m) = 0.5, i.e., ∫_{lower}^{m} f(x) dx = 0.5
- For a continuous random variable, P(X = a) = 0 for any specific value a
• f(x) ≥ 0 for all x in domain
• ∫_{domain} f(x) dx = 1
Probability: P(a ≤ X ≤ b) = ∫_a^b f(x) dx
CDF: F(x) = ∫_a^x f(t) dt where a is the lower bound of the domain
Complement: P(X > a) = 1 − F(a)
Median m: F(m) = 0.5 ⇒ ∫_a^m f(x) dx = 0.5
Step 1: For f(x) = kx on [0, 4], require ∫_0^4 kx dx = 1
k[x²/2]_0^4 = 1
k × 8 = 1
k = 1/8
Check: f(x) = x/8 ≥ 0 for x ∈ [0, 4] ✓
P(1 ≤ X ≤ 3):
= ∫_1^3 (x/8) dx = [x²/16]_1^3 = 9/16 − 1/16 = 8/16 = 1/2
Median m:
∫_0^m (x/8) dx = 0.5
[x²/16]_0^m = 0.5
m²/16 = 0.5
m² = 8
m = 2√2 ≈ 2.83 (taking the positive root, since m must lie in [0, 4])
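The worked example above can be cross-checked numerically. The sketch below is one possible check, not part of the original solution: the helper name simpson and the use of Simpson's rule are assumptions chosen for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) sub-intervals.
    Exact, up to rounding, for polynomials of degree 3 or less."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Shape of the pdf before normalising: f(x) = k * x on [0, 4].
def shape(x):
    return x

# Condition 2 (total area = 1) fixes k as 1 / (area under the shape).
k = 1 / simpson(shape, 0, 4)

def pdf(x):
    return k * x

# P(1 <= X <= 3) is the area under the pdf between 1 and 3.
prob_1_to_3 = simpson(pdf, 1, 3)

# The hand-derived median should leave half the area below it.
median = 2 * math.sqrt(2)
area_below_median = simpson(pdf, 0, median)

print(k, prob_1_to_3, area_below_median)
```

The three printed values should agree, to within rounding, with k = 1/8, P(1 ≤ X ≤ 3) = 1/2, and an area of 0.5 below the median.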
Why Integration Replaces Summation
In the discrete case, a random variable takes specific countable values. The probability that X equals some value x_k is a positive number, and all probabilities sum to 1: ∑ P(X = x_k) = 1. Probabilities for ranges are found by adding individual probabilities.
For a continuous random variable, the variable can take any value in an interval, such as the exact height of a randomly chosen person, or the precise time until a radioactive atom decays. There are uncountably many possible values. Asking “what is the probability of exactly 1.732 m?” is therefore not useful: the probability of any single specific value is zero.
Instead, a continuous random variable is described by a probability density function (pdf) f(x). The pdf does not give probabilities directly; it gives a density. The probability that X falls in the interval [a, b] is the area under the curve:
P(a ≤ X ≤ b) = ∫_a^b f(x) dx
Integration replaces summation because the continuous variable “smears” probability over an interval rather than concentrating it at discrete points. The integral of the pdf over the entire domain plays the same role as ∑ P(X = x_k) = 1 in the discrete case.
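One way to see the link between the sum and the integral is to chop the domain into narrow bins, treat each bin as a discrete outcome with approximate probability f(x_i) × width, and add them up. The sketch below does this for f(x) = x/8 on [0, 4]; the function name discretised_total and the choice of left endpoints are assumptions made for illustration.

```python
def pdf(x):
    return x / 8  # pdf on the domain [0, 4]

def discretised_total(n, lo=0.0, hi=4.0):
    """Split [lo, hi] into n equal bins and sum the approximate
    bin probabilities f(left endpoint) * bin width."""
    width = (hi - lo) / n
    return sum(pdf(lo + i * width) * width for i in range(n))

# As the bins get narrower, the discrete-style sum approaches
# the value of the integral, which is 1 for a valid pdf.
for n in (4, 40, 4000):
    print(n, discretised_total(n))
```

With only 4 bins the total falls noticeably short of 1; with thousands of bins it is very close to 1, which is the sense in which the integral is the limit of the sum.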
The Two PDF Requirements
For f(x) to be a valid pdf, two conditions must hold:
1. Non-negativity: f(x) ≥ 0 for all x in the domain. Probability density cannot be negative, since a negative density would give some interval a negative probability. This can usually be verified by inspection, by checking the sign of each factor of the formula on the stated domain.
2. Total area = 1: ∫_{domain} f(x) dx = 1. The random variable must take some value from the domain with certainty. The total probability must be 1, and for a continuous variable this is expressed as a definite integral equalling 1.
When a constant k is unknown, find k by imposing condition 2: set up the integral, evaluate it in terms of k, and solve the equation “expression in k = 1” for k.
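Because k multiplies the whole formula, condition 2 reduces to k = 1 / (area under the formula without k). The sketch below applies this to a hypothetical pdf f(x) = kx³ on [0, 2], which is not taken from the text; the helper names simpson and normalising_constant are likewise assumptions.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) sub-intervals.
    Exact, up to rounding, for polynomials of degree 3 or less."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def normalising_constant(shape, lo, hi):
    """Return k such that k * shape(x) integrates to 1 on [lo, hi]."""
    area = simpson(shape, lo, hi)
    if area <= 0:
        raise ValueError("shape must have positive area on the domain")
    return 1 / area

# Hypothetical example: f(x) = k * x**3 on [0, 2].
# By hand: the integral of x**3 from 0 to 2 is 16/4 = 4, so k = 1/4.
k = normalising_constant(lambda x: x**3, 0, 2)
print(k)
```

The numerical value should match the hand calculation k = 1/4 to within rounding.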
The PDF Graph and Its Meaning
A pdf graph shows how probability density varies across the domain. A tall peak at some value x = c means the random variable is relatively more likely to fall near c than in regions where the curve is low. But the actual probability of falling in any specific region is the area under the curve over that region, not the height of the curve.
This is why the pdf value f(x) can exceed 1 — f(x) is a density, not a probability. What cannot exceed 1 is the area under the curve over any subset of the domain.
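A small illustration of a density exceeding 1, assuming a hypothetical uniform pdf f(x) = 2 on [0, 0.5] (this example is not from the text). The height is 2 everywhere on the domain, yet every area stays at or below 1.

```python
LO, HI, HEIGHT = 0.0, 0.5, 2.0  # hypothetical uniform pdf: f(x) = 2 on [0, 0.5]

def pdf(x):
    return HEIGHT if LO <= x <= HI else 0.0

def probability(a, b):
    """P(a <= X <= b) for the constant density: height * width,
    with the interval clipped to the domain."""
    lo, hi = max(a, LO), min(b, HI)
    return HEIGHT * max(hi - lo, 0.0)

print(pdf(0.25))              # density value, greater than 1
print(probability(0.0, 0.5))  # total area over the whole domain
print(probability(0.1, 0.3))  # area over a sub-interval
```

The density value is 2, while the total area is 1 and the area over [0.1, 0.3] is 0.4.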
The Cumulative Distribution Function (CDF)
The CDF F(x) = P(X ≤ x) accumulates probability from the lower bound up to x. It is found by integrating the pdf:
F(x) = ∫_{lower bound}^{x} f(t) dt
Key properties of the CDF:
- F(lower bound) = 0 and F(upper bound) = 1
- F(x) is always non-decreasing (probability accumulates as x increases)
- P(a ≤ X ≤ b) = F(b) − F(a)
- P(X > a) = 1 − F(a)
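The properties listed above can be checked numerically for a specific pdf. The sketch below uses f(x) = x/8 on [0, 4] and builds the CDF by numerical integration; the helper simpson and the 101-point grid used for the monotonicity check are assumptions, and a grid check is a spot check rather than a proof.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

LO, HI = 0.0, 4.0

def pdf(x):
    return x / 8

def cdf(x):
    """F(x) = P(X <= x): accumulated area from the lower bound up to x."""
    if x <= LO:
        return 0.0
    if x >= HI:
        return 1.0
    return simpson(pdf, LO, x)

# Spot-check that F never decreases on an evenly spaced grid.
grid = [LO + i * (HI - LO) / 100 for i in range(101)]
values = [cdf(x) for x in grid]
is_non_decreasing = all(u <= v for u, v in zip(values, values[1:]))

prob_1_to_3 = cdf(3) - cdf(1)   # P(a <= X <= b) = F(b) - F(a)
prob_above_2 = 1 - cdf(2)       # P(X > a) = 1 - F(a)

print(cdf(LO), cdf(HI), is_non_decreasing, prob_1_to_3, prob_above_2)
```

For this pdf the closed form is F(x) = x²/16, so the last two values should be close to 1/2 and 3/4.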
The Median
The median of a continuous distribution is the value m that splits the distribution in half: half the probability lies below m and half above. It satisfies F(m) = 0.5, or equivalently:
∫_{lower}^{m} f(x) dx = 0.5
To find m, set up the integral, evaluate it, and solve the resulting equation for m. This often means taking a square root or cube root. Always check that m lies within the stated domain, and discard any root that does not.
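When F(m) = 0.5 is awkward to solve by hand, the median can be located numerically, because F is non-decreasing. The sketch below uses bisection on a hypothetical pdf f(x) = x²/9 on [0, 3], whose CDF is F(x) = x³/27; the pdf, the function name median_by_bisection, and the tolerance are all assumptions for illustration.

```python
def cdf(x):
    return x**3 / 27  # F(x) for the assumed pdf f(x) = x**2 / 9 on [0, 3]

def median_by_bisection(cdf, lo, hi, tol=1e-12):
    """Find m in [lo, hi] with cdf(m) = 0.5, assuming cdf is non-decreasing
    with cdf(lo) <= 0.5 <= cdf(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < 0.5:
            lo = mid   # median lies to the right of mid
        else:
            hi = mid   # median lies at or to the left of mid
    return (lo + hi) / 2

m = median_by_bisection(cdf, 0.0, 3.0)
print(m)
```

By hand, m³/27 = 0.5 gives m = 3 × (0.5)^(1/3) ≈ 2.381, which lies inside [0, 3] and should match the numerical result.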
Piecewise PDFs
Some pdfs are defined by different formulas over different sub-intervals. When verifying a piecewise pdf, you must check non-negativity on each piece separately, and split the total integral accordingly. The sum of the integrals over each piece must equal 1.
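As a sketch of the piecewise check, consider a hypothetical triangular pdf (not from the text): f(x) = x for 0 ≤ x ≤ 1 and f(x) = 2 − x for 1 < x ≤ 2. Each piece is integrated over its own sub-interval and the areas are added. The helper names and the sampling-based sign check are assumptions; sampling is a spot check, not a proof of non-negativity.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule with n (even) sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Each piece: (formula, lower limit, upper limit).
pieces = [
    (lambda x: x, 0.0, 1.0),
    (lambda x: 2 - x, 1.0, 2.0),
]

def is_non_negative(f, lo, hi, samples=200):
    """Spot-check f(x) >= 0 at evenly spaced points in [lo, hi]."""
    return all(f(lo + i * (hi - lo) / samples) >= 0 for i in range(samples + 1))

# Condition 1: check each piece separately on its own sub-interval.
all_non_negative = all(is_non_negative(f, lo, hi) for f, lo, hi in pieces)

# Condition 2: the areas of the pieces must add up to 1.
piece_areas = [simpson(f, lo, hi) for f, lo, hi in pieces]
total_area = sum(piece_areas)

print(all_non_negative, piece_areas, total_area)
```

Each piece contributes an area of 1/2 here, so the total is 1 and both conditions hold for this example.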
Mastery Practice
- Show that f(x) = 3x² on [0, 1] is a valid pdf by verifying both conditions. Then find k if instead f(x) = kx² on [0, 2].
- Given f(x) = (3/4)(1 − x²) on [−1, 1], find P(0 ≤ X ≤ 0.5).
- For f(x) = 3x² on [0, 1], find the CDF F(x) for x ∈ [0, 1].
- Using f(x) = 3x² on [0, 1] (with CDF F(x) = x³), find:
(a) P(X > 0.7)
(b) The median of the distribution.
- Given f(x) = k(4 − x) on [0, 4]:
(a) Find k
(b) Find P(1 ≤ X ≤ 3)
(c) Find P(X > 2.5)
- A piecewise pdf is defined by:
f(x) = 2x for 0 ≤ x ≤ 1
f(x) = 0 otherwise
Verify this is a valid pdf, then find the CDF F(x) and use it to find P(0.4 ≤ X ≤ 0.8).
- Find the constant k so that f(x) = kx(1 − x) is a valid pdf on [0, 1]. Then find the median of this distribution.
- The pdf of X is f(x) = c/x² for x ∈ [1, 5].
(a) Find c
(b) Find P(X > 3)
(c) Find the CDF F(x)
- A random variable X has pdf f(x) = ax + b on [0, 2], where f(0) = 0.2 and f(2) = 0.8.
(a) Find a and b
(b) Verify that f(x) is a valid pdf
(c) Find P(X ≤ 1) and the median.
- A machine produces components with a fault occurring at position X (in mm from one end) modelled by f(x) = k(6x − x²) on [0, 6].
(a) Find k
(b) Find P(2 ≤ X ≤ 4)
(c) A component is rejected if the fault position is less than 1 mm or greater than 5 mm from the end. Find the probability that a component is rejected.