The Fundamental Theorem of Calculus (FTC) bridges the concepts of differentiation and integration, serving as a cornerstone in mathematical analysis. In probability theory, this theorem establishes a crucial relationship between probability density functions (PDFs) and cumulative distribution functions (CDFs). In this section, we will explore this relationship and demonstrate how the FTC applies to probability distributions.
The FTC consists of two parts:
1. If \( f \) is continuous on \([a, b]\) and \( F \) is an antiderivative of \( f \) on \([a, b]\), then:

   \( \displaystyle \int_a^b f(x) \, dx = F(b) - F(a) \)

2. If \( f \) is continuous on an open interval \( I \) and \( a \) is any point in \( I \), then for every \( x \) in \( I \):

   \( \displaystyle \frac{d}{dx} \left( \int_a^x f(t) \, dt \right) = f(x) \)
In probability theory, the cumulative distribution function (CDF) of a continuous random variable \( X \) is defined as:
\( F_X(x) = P(X \leq x) = \displaystyle \int_{-\infty}^x f_X(t) \, dt \)
where \( f_X(x) \) is the probability density function (PDF) of \( X \).
Using the FTC, we can establish the relationship between PDFs and CDFs:
1. The CDF is the integral of the PDF:

   \( F_X(x) = \displaystyle \int_{-\infty}^x f_X(t) \, dt \)

2. The PDF is the derivative of the CDF, at every point where \( f_X \) is continuous:

   \( f_X(x) = \dfrac{d}{dx} F_X(x) \)
Consider the standard normal distribution with PDF:
\( f_X(x) = \dfrac{1}{\sqrt{2\pi}} e^{-x^2/2} \)
The CDF is:
\( F_X(x) = \displaystyle \int_{-\infty}^x \dfrac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt \)
Although this CDF has no closed-form expression in terms of elementary functions, its derivative still recovers the PDF, exactly as the second part of the FTC guarantees:
\( \dfrac{d}{dx} F_X(x) = f_X(x) \)
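We can verify this numerically. The sketch below, assuming SciPy is available, compares a centered finite difference of the standard normal CDF against the PDF:

```python
import numpy as np
from scipy.stats import norm

x = np.linspace(-3, 3, 13)
h = 1e-5

# Centered finite-difference approximation of dF/dx.
cdf_slope = (norm.cdf(x + h) - norm.cdf(x - h)) / (2 * h)

# The slope of the CDF matches the PDF up to discretization error.
print(np.max(np.abs(cdf_slope - norm.pdf(x))))  # on the order of 1e-11
```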
In this exercise, we will:

1. Define a discrete random variable and compute its theoretical mean and variance.
2. Generate random samples from it using inverse transform sampling.
3. Show graphically how the empirical distribution converges to the theoretical distribution as the sample size grows.
4. Compute the empirical mean and variance incrementally with Welford's algorithm and compare them with the theoretical values.
Let's define a discrete random variable \( X \) that takes values in \( \{1, 2, 3, 4, 5\} \) with the following arbitrary probabilities:
| \( x \) | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| \( P(X = x) \) | 0.1 | 0.2 | 0.4 | 0.2 | 0.1 |
Mean (\( \mu \)):
\( \mu = E[X] = \displaystyle\sum_{i=1}^{5} x_i P(X = x_i) = (1)(0.1) + (2)(0.2) + (3)(0.4) + (4)(0.2) + (5)(0.1) = 0.1 + 0.4 + 1.2 + 0.8 + 0.5 = 3 \)
Variance (\( \sigma^2 \)):
\( \begin{align*} \sigma^2 &= E[(X - \mu)^2] = \sum_{i=1}^{5} (x_i - \mu)^2 P(X = x_i) \\ &= (1 - 3)^2(0.1) + (2 - 3)^2(0.2) + (3 - 3)^2(0.4) + (4 - 3)^2(0.2) + (5 - 3)^2(0.1) \\ &= (4)(0.1) + (1)(0.2) + (0)(0.4) + (1)(0.2) + (4)(0.1) \\ &= 0.4 + 0.2 + 0 + 0.2 + 0.4 = 1.2 \end{align*} \)
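As a quick sanity check, the same two sums in NumPy:

```python
import numpy as np

values = np.array([1, 2, 3, 4, 5])
probs = np.array([0.1, 0.2, 0.4, 0.2, 0.1])

mu = np.sum(values * probs)               # E[X] = 3.0
var = np.sum((values - mu) ** 2 * probs)  # E[(X - mu)^2] = 1.2
print(mu, var)
```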
We will use inverse transform sampling to generate random samples from this distribution: draw \( U \sim \mathrm{Uniform}(0, 1) \) and return the smallest value \( x \) whose cumulative probability \( F_X(x) \) reaches \( U \).
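A minimal sketch of such a sampler, assuming NumPy (the helper name `sample_discrete` is our own, not a library function):

```python
import numpy as np

def sample_discrete(n, values, probs, seed=None):
    """Draw n samples via inverse transform sampling."""
    rng = np.random.default_rng(seed)
    cdf = np.cumsum(probs)      # discrete CDF: [0.1, 0.3, 0.7, 0.9, 1.0]
    u = rng.uniform(size=n)     # U ~ Uniform(0, 1)
    # Index of the smallest value whose cumulative probability exceeds u.
    idx = np.searchsorted(cdf, u, side="right")
    return np.asarray(values)[idx]

samples = sample_discrete(10, [1, 2, 3, 4, 5], [0.1, 0.2, 0.4, 0.2, 0.1], seed=0)
print(samples)
```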
We will generate samples of increasing sizes (\( N = 10^2, 10^3, 10^4, 10^5 \)) and graphically show how the empirical distribution converges to the theoretical distribution.
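A minimal, text-only sketch of that convergence check (a bar plot of empirical versus theoretical probabilities for each \( N \) conveys the same idea graphically):

```python
import numpy as np

rng = np.random.default_rng(0)
values = np.array([1, 2, 3, 4, 5])
probs = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
cdf = np.cumsum(probs)

print("N        empirical P(X = 1..5)")
for n in (10**2, 10**3, 10**4, 10**5):
    samples = values[np.searchsorted(cdf, rng.uniform(size=n), side="right")]
    freq = np.bincount(samples, minlength=6)[1:] / n  # relative frequencies
    print(f"{n:<8d} {np.round(freq, 3)}")
print("theory  ", probs)
```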
Welford's algorithm provides a numerically stable method for computing the mean and variance incrementally, in a single pass: for each new sample \( x \) it sets \( \delta = x - \bar{x} \), updates \( \bar{x} \leftarrow \bar{x} + \delta/n \), and accumulates \( M_2 \leftarrow M_2 + \delta\,(x - \bar{x}) \) using the updated mean; the variance is then \( M_2 / n \).
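One way to code this up (the class name `Welford` is our own choice):

```python
class Welford:
    """Running mean and variance via Welford's single-pass updates."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # note: uses the *updated* mean

    @property
    def variance(self):
        # Population variance; use (n - 1) in the denominator for the
        # unbiased sample variance.
        return self.m2 / self.n if self.n else float("nan")
```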
We will feed the generated samples through Welford's algorithm and compare the resulting empirical mean and variance with the theoretical values \( \mu = 3 \) and \( \sigma^2 = 1.2 \), discussing how the estimates converge to those values as the sample size grows.
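Putting the pieces together, a short end-to-end sketch (it assumes the `Welford` class defined above):

```python
import numpy as np

rng = np.random.default_rng(1)
values = np.array([1, 2, 3, 4, 5])
cdf = np.cumsum([0.1, 0.2, 0.4, 0.2, 0.1])

w = Welford()  # the class sketched above
for x in values[np.searchsorted(cdf, rng.uniform(size=10**5), side="right")]:
    w.update(x)

print(f"empirical mean     {w.mean:.4f}  (theory 3)")
print(f"empirical variance {w.variance:.4f}  (theory 1.2)")
```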