A Visual Journey in Random Variables

Srijit Mukherjee
Jun 15, 2020

Random Variables have no special color. But Random Variables do have a special geometry, and so does their expectation. This article delves into the geometry of Random Variables, where we explore the visual secret behind covariance and variance, along with the geometry of conditional expectation.

You can jump to the summary section for spoilers, or read the story in order. The choice is yours.

Random Variables are the basic building blocks for modeling real events under the umbrella of probability theory. How are they defined?

A Real-Valued Random Variable is a function, say

X: set of outcomes → Real Numbers

For example

Suppose that in a gamble you win Rs 100 if a dice throw shows a prime number and are fined Rs 130 otherwise. Naturally, you want to know how much you win on average, because that decides whether you play this game or not. To model the scenario, we need a random variable, as follows.

X = Win Function : {1,2,3,4,5,6} → Real Numbers, where

X(1)= X(4) = X(6) = -130

X(2)= X(3) = X(5) = 100

Now we can analyze X to decide whether to play or not.
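To make the decision concrete, here is a minimal sketch in Python, assuming the dice is fair (each face has probability 1/6; the story leaves this implicit):

```python
# A minimal sanity check, assuming a fair dice: each face has probability 1/6.
outcomes = [1, 2, 3, 4, 5, 6]
X = {1: -130, 2: 100, 3: 100, 4: -130, 5: 100, 6: -130}  # the win function defined above

expected_win = sum(X[o] * (1 / 6) for o in outcomes)
print(expected_win)  # -15.0: on average you lose Rs 15 per game
```

So, under the fair-dice assumption, E(X) = -15 and the game is not worth playing.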

Visual Secret 1

Random Variables look just like vectors. Do you mean an arrow with direction and magnitude? Yes. But that picture is really meant for kids.

You must look at it from the perspective of Vector / Linear Spaces.

What is Vector / Linear Space?

It is a set of objects (called vectors) that is closed under scaling and addition: scale a vector, or add it to another vector, and the result is still inside the space. We also need an origin (the zero vector) to define a coordinate system (a frame of reference).

Examples

  1. Real Numbers
  2. The Euclidean Plane (2D Plane)
  3. The 3D Space

Random Variables form a Vector / Linear Space.

This is easy to see. Why?

Take any real-valued random variable. Multiply it by a constant, and it is still a random variable. Add it to another real-valued random variable, and it is still a random variable. The zero random variable is the origin.

The Set of Real-Valued Random Variables with finite variance forms a Vector Space. Each such Random Variable is a Vector.

Now you can think of a Random Variable as an arrow sticking out from the origin, with its head held high.
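Here is a minimal sketch of this idea on the finite outcome space {1, ..., 6} of the dice example, where a random variable is just a vector in R^6 (one coordinate per outcome); the face-value variable Y below is my own illustrative addition:

```python
import numpy as np

# Sketch: on the finite outcome space {1, ..., 6}, a real-valued random variable
# is just a vector in R^6, one coordinate per outcome.
X = np.array([-130, 100, 100, -130, 100, -130])  # the win function from the example
Y = np.array([1, 2, 3, 4, 5, 6])                 # the face-value random variable: Y(i) = i

# Closure: scaling a random variable or adding two of them gives another random variable.
Z = 2 * X + 3 * Y
origin = np.zeros(6)   # the zero random variable plays the role of the origin
print(Z)               # [-257  206  209 -248  215 -242]
```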

Exercise 1: Show that the constant random variables form a one-dimensional subspace of the whole linear space of Random Variables.

We will connect this space to the expectation of a random variable.

Visual Secret 2

The set of Expectations of real-valued random variables is the same as the set of Real Numbers.

Therefore, we can represent the expectation of a real-valued random variable to be a point on the Real number line.

Note: For real two-dimensional vector-valued random variables, the set of expectations will be the same as the 2D Plane. This can be generalized.

The expectation has beautiful linear properties.

E(aX+bY) = aE(X) + bE(Y)

So E is a linear map from the Linear Space of Random Variables to the Real Number Line, which in turn has a one-to-one correspondence with the subspace of constant random variables, right?

This means we can write that

Expectation (E) is a linear map from the linear space of random variables to itself. Moreover, the expectation is a projection map.

Why projection?

E(E(X)) = E(X). Applying the expectation map twice is the same as applying it once (here E(X) is identified with the constant random variable that takes the value E(X)).
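Here is a small numerical sketch of this projection property, again using the fair-dice example (an assumption) and identifying E(X) with a constant vector:

```python
import numpy as np

p = np.full(6, 1 / 6)                                         # fair-dice probabilities (assumption)
X = np.array([-130, 100, 100, -130, 100, -130], dtype=float)  # the win function

def E(v):
    """Map a random variable (vector) to the constant random variable with value E[v]."""
    return np.full_like(v, np.dot(p, v))

print(E(X))                          # [-15. -15. -15. -15. -15. -15.]
print(E(E(X)))                       # identical: applying E twice equals applying it once
assert np.allclose(E(E(X)), E(X))    # the projection (idempotence) property
```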

Visual Secret 3

What is the geometry behind the covariance Cov(X, Y)?

Observe that Var(X) = Cov(X, X). Thus, the geometry of Cov(X, Y) will reveal the geometry of Var(X).

Observe the following properties of Covariance.

  1. Cov(aX, Y) = aCov(X, Y)
  2. Cov(X+Z, Y) = Cov(X, Y) + Cov(Z, Y)
  3. Cov(X, X) = Var(X) is always non-negative.

Do you know why these properties are special?

They are exactly the properties satisfied by the Dot Product of two vectors. In general, such a product is called an Inner Product.

So, what does Inner Product measure?

The inner product lets us measure the angle between two vectors and the length of a vector. So does the Covariance. How?

Think of Covariance as the Dot Product of two Vectors. Strictly speaking, it is the dot product of the centered versions X - E(X) and Y - E(Y), since constants contribute nothing to covariance.

Let's see more of its properties; a short numerical sketch follows the list.

  1. Cov(X, Y) = 0 if and only if X and Y are uncorrelated. So zero correlation is the same as orthogonality of the two random variables.
  2. Var(X) = Cov(X, X) = the squared length of X (as a vector).
  3. Cor(X, Y) measures the amount of linear association between X and Y. Cor(X, Y) = cos(θ), where θ is the angle between X and Y.
    Thus, -1 ≤ Cor(X, Y) ≤ +1.
  4. Cor(X, Y) = ±1 if and only if cos(θ) = ±1, if and only if the angle between X and Y is either 0 or 180°. In that case, X and Y are linearly dependent.
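Here is the promised numerical sketch of these facts for the fair-dice example (an assumption); the angle is measured between the centered vectors X - E(X) and Y - E(Y):

```python
import numpy as np

# Fair dice, uniform weights: covariance is the dot product of the *centered*
# random variables, and correlation is the cosine of the angle between them.
X = np.array([-130, 100, 100, -130, 100, -130], dtype=float)   # the win function
Y = np.array([1, 2, 3, 4, 5, 6], dtype=float)                  # the face value

Xc, Yc = X - X.mean(), Y - Y.mean()            # centre: subtract the expectation

cov = np.dot(Xc, Yc) / len(X)                  # Cov(X, Y) as an inner product
var_X = np.dot(Xc, Xc) / len(X)                # Var(X) = squared length of the centred vector
cos_theta = np.dot(Xc, Yc) / (np.linalg.norm(Xc) * np.linalg.norm(Yc))

print(cov, var_X, cos_theta)                   # -19.17..., 13225.0, -0.0976...
print(np.corrcoef(X, Y)[0, 1])                 # same as cos_theta
```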

Visual Secret 4

We will now discover the geometry of E(Y|X).

Intuitively, what does it mean? It means the part of Y that is explained by X. Why?

We have already seen that the Expectation is a projection linear map. But, what about Conditional Expectation? Is it also a projection?

The Geometry

We know that

E(E(Y|X)) = E(Y)

The residual Y - E(Y|X) is orthogonal to the subspace generated by X; this is a consequence of the smoothing property (see the exercises below).

Therefore, E(Y|X) is the projection of Y onto the subspace generated by X, and Y splits into two orthogonal pieces:

Y = [Y - E(Y|X)] + [E(Y|X)]

Exercise: Prove that E(Y|X) is a projection linear map from the linear space of random variables to the random variables spanned by X.

This is because X and Y - E(Y|X) are perpendicular to each other. This means

Cov(X, Y) = Cov(X, E(Y|X))

Exercise: Prove that Cov(E(Y|X), Y - E(Y|X)) = 0.
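Here is a Monte Carlo sketch of this orthogonality under a toy model of my own choosing, Y = X^2 + X + noise, for which E(Y|X) = X^2 + X is known in closed form (the model is an illustrative assumption, not part of the article):

```python
import numpy as np

# Toy model (assumption): Y = X**2 + X + noise, so E(Y|X) = X**2 + X exactly.
rng = np.random.default_rng(0)
X = rng.normal(size=100_000)
Y = X**2 + X + rng.normal(size=100_000)

E_Y_given_X = X**2 + X             # the conditional expectation under this model
residual = Y - E_Y_given_X         # the part of Y not explained by X

# The residual is (numerically) orthogonal to the projection E(Y|X) ...
print(np.cov(E_Y_given_X, residual)[0, 1])                # approximately 0
# ... and the covariance with X is carried entirely by the projection:
print(np.cov(X, Y)[0, 1], np.cov(X, E_Y_given_X)[0, 1])   # approximately equal (about 1)
```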

Summary

  1. The Space of Random Variables with finite variance is a Linear Space.
  2. E[X] is a projection linear map.
  3. Cov(X, Y) is an inner product on the Linear Space of Random Variables.
  4. Cor(X, Y) = cos(θ), where θ is the angle between X and Y
  5. X and Y are orthogonal means they are uncorrelated (zero covariance).
  6. E[. |X] is a projection linear map onto the space of random variables spanned by X.
  7. E[Y|X] is the projection of Y onto the subspace spanned by X.
  8. X and Y - E(Y|X) are orthogonal to each other. This follows from the smoothing property E(E(Y|X)) = E(Y).
  9. Y = [Y - E(Y|X)] + [E(Y|X)]
    The orthogonal decomposition of Y with respect to X.
  10. Cov(X, Y) = Cov(X, E(Y|X)). Intuitively, the portion of Y explained by X is the projection of Y onto the subspace spanned by X, i.e. E(Y|X).

Note: This geometry only explains the linear character of Random Variables. For example, Cov(X, Y) = 0 does not mean that X and Y are independent.

I hope this post has given you some new insights into the space of randomness and chaos.

If you like this, please clap and share it among your enthusiastic peers.

©Srijit Mukherjee 2020.
