
Exploring Random Variables


Probability theory is a foundational element of statistics, data science, and numerous applied mathematics fields. At its core are mass functions, distribution functions, and density functions, which collectively describe the behavior of random variables. This article provides a comprehensive exploration of these functions, their properties, and their real-world applications.


1. Random Variables: An Introduction

A random variable is a numerical outcome of a random process or experiment, serving as a link between probabilistic concepts and real-world observations. Random variables are broadly classified into:

1.1 Discrete Random Variables

Discrete random variables take on a countable set of distinct values. For example, the number of heads obtained in a series of coin flips is a discrete random variable.

1.2 Continuous Random Variables

Continuous random variables can assume an infinite number of values within a given range. For instance, the height of individuals in a population is modeled as a continuous random variable.

To characterize random variables, three essential functions are used: the probability mass function (PMF), the cumulative distribution function (CDF), and the probability density function (PDF). These functions will be discussed in detail below.


2. Probability Mass Function (PMF)

The probability mass function (PMF) quantifies the probability of each possible value of a discrete random variable. It is defined as:

$$P(X = x) = f_X(x), \quad \text{for } x \in \text{Range}(X),$$

where $X$ is a discrete random variable and $f_X$ is its PMF, which assigns a probability to each value in the range (support) of $X$.

2.1 Properties of the PMF

  1. Non-Negativity: $f_X(x) \geq 0$ for all $x$.
  2. Normalization: $\sum_{x \in \text{Range}(X)} f_X(x) = 1$.
  3. Support: The PMF is defined only for values within the range of $X$.

2.2 Example: Rolling a Die

Consider a fair six-sided die. The random variable $X$ represents the outcome of the roll. The PMF is:

$$f_X(x) = \begin{cases} \frac{1}{6}, & x \in \{1, 2, 3, 4, 5, 6\}, \\ 0, & \text{otherwise}. \end{cases}$$
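As a quick sketch, the die PMF above can be encoded as a dictionary mapping each face to its probability (using exact fractions to avoid floating-point noise), with the non-negativity and normalization properties checked directly:

```python
from fractions import Fraction

# PMF of a fair six-sided die: each face carries probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def f_X(x):
    """Return P(X = x); 0 for values outside the support."""
    return pmf.get(x, Fraction(0))

# Property checks: non-negativity and normalization.
assert all(p >= 0 for p in pmf.values())
assert sum(pmf.values()) == 1

print(f_X(3))  # 1/6
print(f_X(7))  # 0
```

The dictionary-based representation generalizes to any finite discrete distribution by changing the keys and weights.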

3. Cumulative Distribution Function (CDF)

The cumulative distribution function (CDF) describes the probability that a random variable $X$ is less than or equal to a certain value $x$. It is defined as:

$$F_X(x) = P(X \leq x).$$

3.1 CDF for Discrete Random Variables

$$F_X(x) = \sum_{t \leq x} f_X(t).$$

3.2 CDF for Continuous Random Variables

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt.$$

3.3 Properties of the CDF

  1. Monotonicity: $F_X(x)$ is a non-decreasing function.
  2. Limits: $\lim_{x \to -\infty} F_X(x) = 0$ and $\lim_{x \to \infty} F_X(x) = 1$.
  3. Right-Continuity: The CDF is continuous from the right.

3.4 Example: Rolling a Die

For the six-sided die example, the CDF is:

$$F_X(x) = \begin{cases} 0, & x < 1, \\ \frac{1}{6}, & 1 \leq x < 2, \\ \frac{2}{6}, & 2 \leq x < 3, \\ \frac{3}{6}, & 3 \leq x < 4, \\ \frac{4}{6}, & 4 \leq x < 5, \\ \frac{5}{6}, & 5 \leq x < 6, \\ 1, & x \geq 6. \end{cases}$$
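A small sketch of the discrete formula $F_X(x) = \sum_{t \leq x} f_X(t)$: sum the die PMF over all support points at or below $x$, which reproduces the staircase values in the piecewise expression above:

```python
from fractions import Fraction

# PMF of a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def F_X(x):
    """CDF: sum the PMF over all support points t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

print(F_X(0.5))  # 0
print(F_X(3))    # 1/2  (three of six faces are <= 3)
print(F_X(6))    # 1
```

Note that `F_X` is defined for any real input, not just the faces 1 through 6, matching the piecewise definition of the CDF.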

4. Probability Density Function (PDF)

The probability density function (PDF) describes the relative likelihood of a continuous random variable taking values near a given point. Unlike the PMF, the PDF does not give probabilities directly; probabilities are obtained by integrating the density over an interval, and $P(X = x) = 0$ for any single value $x$.

The relationship between the PDF $f_X(x)$ and the CDF $F_X(x)$ is:

$$F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt, \qquad f_X(x) = \frac{d}{dx} F_X(x).$$
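The derivative relationship can be verified numerically. As an illustration (using the standard normal distribution, which is not discussed above but has a closed-form CDF via the error function), a central-difference quotient of the CDF should closely match the PDF at every point:

```python
import math

def Phi(x):
    # Standard normal CDF, expressed via the error function.
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi(x):
    # Standard normal PDF.
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

# f_X(x) ≈ (F_X(x+h) - F_X(x-h)) / (2h): central-difference derivative.
h = 1e-5
for x in (-1.0, 0.0, 2.0):
    approx = (Phi(x + h) - Phi(x - h)) / (2 * h)
    assert abs(approx - phi(x)) < 1e-8
```

The same check works for any distribution whose CDF is available in closed form or can be evaluated accurately.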

4.1 Properties of the PDF

  1. Non-Negativity: $f_X(x) \geq 0$ for all $x$.
  2. Normalization: $\int_{-\infty}^{\infty} f_X(x)\, dx = 1$.
  3. Support: The PDF is defined over the entire range of $X$.

4.2 Example: Uniform Distribution

Consider a random variable $X$ uniformly distributed over $[a, b]$. The PDF is:

$$f_X(x) = \begin{cases} \frac{1}{b-a}, & a \leq x \leq b, \\ 0, & \text{otherwise}. \end{cases}$$
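A sketch of the normalization property for this uniform density, with illustrative values $a = 2$, $b = 5$: a midpoint Riemann sum over $[a, b]$ should come out to 1.

```python
# Uniform(a, b) PDF and a numeric check that it integrates to 1.
a, b = 2.0, 5.0  # illustrative endpoints

def f_X(x):
    return 1.0 / (b - a) if a <= x <= b else 0.0

# Midpoint Riemann sum over [a, b].
n = 100_000
dx = (b - a) / n
total = sum(f_X(a + (i + 0.5) * dx) for i in range(n)) * dx
assert abs(total - 1.0) < 1e-9
```

Because the density is constant on its support, the sum is exact up to floating-point rounding; for non-uniform densities the same check holds approximately as `n` grows.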

5. Relationships Between PMF, CDF, and PDF

The PMF, CDF, and PDF provide complementary perspectives on random variables.

5.1 Discrete Case

For a discrete random variable, the CDF is the running sum of the PMF, $F_X(x) = \sum_{t \leq x} f_X(t)$, and the PMF can be recovered from the jumps of the CDF at each support point.

5.2 Continuous Case

For a continuous random variable, the CDF is the integral of the PDF, $F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt$, and the PDF is the derivative of the CDF wherever that derivative exists.

5.3 Key Transitions

Summation in the discrete case corresponds to integration in the continuous case: sums over the PMF become integrals over the PDF, while the CDF provides a unified description that applies to both types of random variables.


6. Moments and Expectation

Moments provide insights into the distribution of a random variable. Common moments include:

6.1 Expected Value (Mean)

The expected value is the probability-weighted average of the values of $X$:

$$E[X] = \sum_{x} x\, f_X(x) \quad \text{(discrete)}, \qquad E[X] = \int_{-\infty}^{\infty} x\, f_X(x)\, dx \quad \text{(continuous)}.$$

6.2 Variance

$$\text{Var}(X) = E[(X - E[X])^2] = E[X^2] - (E[X])^2.$$
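Applying these definitions to the fair-die PMF from earlier gives a concrete check: $E[X] = 7/2$ and, via the shortcut formula, $\text{Var}(X) = E[X^2] - (E[X])^2 = 35/12$.

```python
from fractions import Fraction

# PMF of a fair six-sided die.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

# E[X] = sum of x * f_X(x); E[X^2] computed the same way.
mean = sum(x * p for x, p in pmf.items())
second_moment = sum(x * x * p for x, p in pmf.items())
variance = second_moment - mean ** 2

print(mean)      # 7/2
print(variance)  # 35/12
```

Using exact fractions makes the identity $E[X^2] - (E[X])^2$ hold exactly rather than only up to rounding.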

6.3 Higher-Order Moments

Higher-order moments, such as skewness (based on the third central moment) and kurtosis (based on the fourth), describe the asymmetry and tail behavior of a distribution.


7. Real-World Applications

7.1 Discrete Random Variables

Discrete models describe count data, such as the number of heads in a series of coin flips or the outcome of a die roll.

7.2 Continuous Random Variables

Continuous models describe measurements on a continuous scale, such as the heights of individuals in a population.

7.3 Combined Applications

Many practical problems involve both types of random variables, for example, modeling a count of events together with a continuous measurement attached to each event.


Understanding mass, distribution, and density functions is essential for analyzing random variables. These functions form a framework for describing and interpreting random processes in both theoretical and practical contexts. By mastering these concepts, one gains a powerful toolset for exploring the stochastic nature of the world and applying probabilistic models across diverse fields.

Check out my full article on Medium

