
Introduction to Random Vectors


When analyzing real-world data, whether financial time series, engineering measurements, public health metrics, or e-commerce behaviors, we rarely focus on a single variable. More often, we observe multiple attributes or features at once: daily returns of several stocks, temperature and humidity at a weather station, or user behaviors across different pages on a website. Random vectors are the probability-theory construct that lets us handle these variables jointly, ensuring we capture both individual behaviors and the collective interactions among them.



1. Introduction: The Need for Random Vectors

1.1 Beyond One-Dimensional Analysis

Traditionally, introductory probability courses start with a single random variable, describing one measurement at a time (e.g., an individual's height, or the outcome of a single die roll). However, real-world problems typically involve multiple measurements at once: the daily returns of several stocks in a portfolio, temperature and humidity readings from the same weather station, or a user's actions across different pages of a website.

In all these scenarios, we gather data across multiple variables, creating an inherently multidimensional space of possible outcomes. Random vectors formalize this concept by treating each "dimension" as a component of a single, larger probabilistic entity.

1.2 Random Vectors in Formal Terms

A random vector $\mathbf{X} = (X_1, X_2, \dots, X_n)$ is a function from a sample space $\Omega$ into $\mathbb{R}^n$. Each outcome $\omega \in \Omega$ maps to a point $\mathbf{x} \in \mathbb{R}^n$. If we focus solely on $X_1$, that's just one random variable, but the vector $\mathbf{X}$ lets us track all $n$ variables simultaneously and preserve their interdependencies.
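
To make this concrete, here is a minimal NumPy sketch (the bivariate normal and its mean and covariance are illustrative choices, not something specified in this article): each draw is one outcome $\omega$ mapped to a point in $\mathbb{R}^2$, i.e., a realization of the two-component random vector $\mathbf{X} = (X_1, X_2)$.

```python
import numpy as np

# One realization of a two-component random vector X = (X_1, X_2).
# The bivariate normal with this mean/covariance is purely illustrative.
rng = np.random.default_rng(0)
mean = [0.0, 0.0]
cov = [[1.0, 0.6],
       [0.6, 1.0]]                        # X_1 and X_2 positively correlated

x = rng.multivariate_normal(mean, cov)    # a single point in R^2, shape (2,)
samples = rng.multivariate_normal(mean, cov, size=10_000)

print(x)                                  # one outcome mapped to R^2
print(samples.shape)                      # (10000, 2): rows = outcomes, columns = components
print(np.corrcoef(samples.T))             # empirical correlation close to 0.6
```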


2. Joint Distributions: The Full Multidimensional Story

2.1 Defining the Joint Distribution

To describe how a random vector behaves, we use its joint distribution. In one dimension, we have a single PMF (probability mass function) or PDF (probability density function). In multiple dimensions, we have a joint PMF $P_{X_1,\dots,X_n}(x_1,\dots,x_n)$ in the discrete case, or a joint PDF $f_{X_1,\dots,X_n}(x_1,\dots,x_n)$ in the continuous case, assigning probability (or density) to every combination of component values at once.
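
As a small, concrete sketch (the numbers below are made up purely for illustration), a discrete joint PMF over two variables can be stored as a 2-D array whose non-negative entries sum to 1:

```python
import numpy as np

# Hypothetical joint PMF of X1 in {0, 1} and X2 in {0, 1, 2}:
# joint[i, j] = P(X1 = i, X2 = j).  Any non-negative array that sums
# to 1 defines a valid joint PMF over this grid of outcomes.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.05, 0.30]])

assert np.all(joint >= 0)
assert np.isclose(joint.sum(), 1.0)       # total probability is 1
print(joint[1, 2])                        # P(X1 = 1, X2 = 2) = 0.30
```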

2.2 Marginal Distributions

From the joint distribution, we can derive marginal distributions by summing (in the discrete case) or integrating (in the continuous case) out the other variables. For example:

$$f_{X_1}(x_1) = \int_{-\infty}^{\infty} f_{X_1, X_2}(x_1, x_2)\, dx_2$$

Marginals describe each variable's individual behavior, but they say nothing about how the variables move together.
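
In the discrete case, "integrating out" becomes summing over an axis of the joint table. A minimal sketch, reusing the same illustrative joint PMF as above:

```python
import numpy as np

# Marginalize by summing out the other variable -- the discrete analogue
# of integrating f_{X1,X2}(x1, x2) over x2.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.05, 0.30]])    # joint[i, j] = P(X1 = i, X2 = j)

p_x1 = joint.sum(axis=1)                  # marginal of X1: sum out X2 -> [0.40, 0.60]
p_x2 = joint.sum(axis=0)                  # marginal of X2: sum out X1 -> [0.35, 0.25, 0.40]

print(p_x1, p_x1.sum())                   # each marginal still sums to 1
print(p_x2, p_x2.sum())
```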

2.3 Why Joint Distributions Matter

  1. They capture correlations and dependencies (e.g., among stock returns or health indicators).
  2. They underlie multivariate methods such as multiple regression, PCA, and Bayesian networks.
  3. They are essential for simulation and generative modeling, which require sampling from a joint distribution.

3. Conditional Distributions

3.1 Formula

Discrete:

$$P_{X_1 \mid X_2}(x_1 \mid x_2) = \frac{P_{X_1, X_2}(x_1, x_2)}{P_{X_2}(x_2)}$$

Continuous:

$$f_{X_1 \mid X_2}(x_1 \mid x_2) = \frac{f_{X_1, X_2}(x_1, x_2)}{f_{X_2}(x_2)}$$
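
With the same illustrative discrete joint PMF used earlier, conditioning on $X_2 = x_2$ amounts to taking the corresponding column of the joint table and renormalizing it by the marginal $P_{X_2}(x_2)$:

```python
import numpy as np

# Conditional PMF of X1 given X2 = x2, computed from the joint table.
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.05, 0.30]])    # joint[i, j] = P(X1 = i, X2 = j)

x2 = 1                                    # condition on X2 = 1 (second column)
p_x2 = joint[:, x2].sum()                 # P(X2 = 1) = 0.25
cond = joint[:, x2] / p_x2                # P(X1 = i | X2 = 1)

print(cond)                               # [0.8, 0.2] -- renormalized column
print(cond.sum())                         # 1.0, as every conditional PMF must
```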

3.2 Applications

Conditional distributions are how we update beliefs once part of the data is observed, for example, predicting one component of a random vector after seeing the others.

3.3 Chain Rule

$$P(x_1,\dots,x_n) = P(x_1)\, P(x_2 \mid x_1) \cdots P(x_n \mid x_1,\dots,x_{n-1})$$
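
One practical payoff of the chain rule is sequential (ancestral) sampling: draw $x_1$ from its marginal, then $x_2$ from $P(x_2 \mid x_1)$, and so on. A sketch with the same illustrative two-variable PMF:

```python
import numpy as np

# Sequential sampling via the chain rule: P(x1, x2) = P(x1) * P(x2 | x1).
rng = np.random.default_rng(1)
joint = np.array([[0.10, 0.20, 0.10],
                  [0.25, 0.05, 0.30]])        # joint[i, j] = P(X1 = i, X2 = j)

p_x1 = joint.sum(axis=1)                      # P(X1 = i) = [0.40, 0.60]
cond_x2 = joint / p_x1[:, None]               # row i holds P(X2 = j | X1 = i)

def sample_pair():
    x1 = rng.choice(2, p=p_x1)                # step 1: draw x1 ~ P(x1)
    x2 = rng.choice(3, p=cond_x2[x1])         # step 2: draw x2 ~ P(x2 | x1)
    return x1, x2

draws = [sample_pair() for _ in range(100_000)]
emp = np.zeros_like(joint)
for i, j in draws:
    emp[i, j] += 1
emp /= len(draws)
print(np.round(emp, 3))                       # empirical frequencies approximate the joint table
```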

4. Independence

4.1 Definition

Two random variables $X_1$ and $X_2$ are independent if their joint distribution factors into the product of the marginals:

$$f_{X_1, X_2}(x_1, x_2) = f_{X_1}(x_1)\, f_{X_2}(x_2)$$

4.2 Pairwise vs Mutual Independence

Mutual independence is stronger than pairwise independence: it requires every subset of the variables to factorize jointly, not just every pair. A classic counterexample is two fair coin flips together with their exclusive-or (XOR), which are pairwise independent but not mutually independent, as the sketch below illustrates.
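
A quick numerical illustration of the gap (a standard textbook construction, not specific to this article): two fair coin flips and their exclusive-or are pairwise independent, yet the three variables are not mutually independent.

```python
import numpy as np

# X1, X2: fair coin flips; X3 = X1 XOR X2 is also a fair coin flip.
rng = np.random.default_rng(2)
n = 200_000
x1 = rng.integers(0, 2, n)
x2 = rng.integers(0, 2, n)
x3 = x1 ^ x2

# Pairwise independence holds: P(X1=1, X3=1) ~ P(X1=1) * P(X3=1) ~ 0.25
print(np.mean((x1 == 1) & (x3 == 1)), np.mean(x1 == 1) * np.mean(x3 == 1))

# Mutual independence fails: (X1, X2, X3) = (1, 1, 1) is impossible,
# so its probability is 0 rather than 0.5**3 = 0.125.
print(np.mean((x1 == 1) & (x2 == 1) & (x3 == 1)))
```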

4.3 Independence vs Uncorrelatedness

Zero correlation does not imply independence: correlation only measures linear association, while independence demands that the full joint distribution factorize. A standard counterexample is $X \sim \mathcal{N}(0,1)$ and $Y = X^2$, which are uncorrelated yet clearly dependent.
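
A short numerical sketch of the counterexample just mentioned: the sample correlation between $X$ and $Y = X^2$ is near zero, yet $Y$ is a deterministic function of $X$.

```python
import numpy as np

# X ~ N(0, 1) and Y = X^2: Cov(X, Y) = E[X^3] = 0, so they are uncorrelated,
# but Y is completely determined by X -- the opposite of independence.
rng = np.random.default_rng(3)
x = rng.standard_normal(1_000_000)
y = x ** 2

print(np.corrcoef(x, y)[0, 1])            # sample correlation ~ 0
# Dependence is visible through conditioning: Y is much larger on average
# when |X| is large than it is overall.
print(y.mean(), y[np.abs(x) > 2].mean())  # ~1.0 vs. well above 4
```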

4.4 Applications


5. Extended Topics


6. Case Studies


7. Pitfalls


8. Conclusion

Random vectors and their distributions are foundational for multivariate analysis:

  1. Random vectors handle multiple variables jointly in $\mathbb{R}^n$.
  2. Joint distributions capture the relationships among components.
  3. Conditional distributions update beliefs as new information arrives.
  4. Independence simplifies analysis but rarely holds perfectly in practice.

By mastering these, you gain a robust toolkit for analyzing real-world multivariate data.

Check out my full article on Medium


