
An intuitive understanding of Random Variables

  • Writer: Nisha Mandal
  • Jan 21, 2022
  • 4 min read

What is a Random Variable? A random variable is a function that maps each outcome of a random process to a numeric value.

To understand this better, let’s take the example of a coin toss. It has two possible outcomes: heads and tails. To run experiments and record observations, we use a random variable X, defined as,

Definition of the random variable X

Here, we mapped the outcomes of a random process (flipping a coin) to numerical values (0 and 1), and this mapping is the random variable X.
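One common way to write such a definition is shown below. It assumes that heads maps to 1 and tails to 0; the assignment is only a convention, and the reverse works equally well.

```latex
X(\omega) =
\begin{cases}
1, & \omega = \text{heads} \\
0, & \omega = \text{tails}
\end{cases}
```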

Why do we need Random Variables? Through this mapping, random variables quantify the outcomes of a random phenomenon.

We will now see how the different types of random variables arise as a natural consequence of stochastic processes. Suppose I am waiting at the bus stop for a taxi. I wonder what the probability is that an arriving vehicle is a taxi. After some observation, and taking other factors into consideration, I decide that the probability of an arriving vehicle being a taxi is p. Hence, the probability of the arriving vehicle being anything other than a taxi is 1-p. Each arrival is a Bernoulli trial, because it has only two possible outcomes and the probabilities of the two outcomes, success and failure, remain constant. Now 5 of my friends arrive, so I need 6 taxis. I wonder what the probability is of getting 6 taxis among the next 50 vehicles. The single Bernoulli trial now becomes a sequence of 50 independent Bernoulli trials, which is called a Bernoulli process.

Hence, we obtain the Binomial Probability Distribution
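As a rough sketch, the probability of exactly 6 taxis among 50 vehicles can be computed from the standard binomial formula C(n, k) p^k (1-p)^(n-k). The function name `binomial_pmf` and the value p = 0.1 are assumptions made only for illustration.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent Bernoulli(p) trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical value: assume 1 in 10 arriving vehicles is a taxi.
p_taxi = 0.1
prob_six_of_fifty = binomial_pmf(6, 50, p_taxi)
print(f"P(6 taxis among 50 vehicles) = {prob_six_of_fifty:.4f}")
```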

Now I wonder how many vehicles will pass before the first taxi arrives. What is the probability that 25 vehicles pass before the first taxi arrives, so that the 26th vehicle is the first taxi?


This is the Geometric Probability Distribution
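A minimal sketch of this calculation: the first success comes right after a given number of failures with probability (1-p)^failures · p. The helper name `first_success_after` and the value p = 0.1 are hypothetical.

```python
def first_success_after(failures: int, p: float) -> float:
    """P(`failures` non-taxis pass, then the next vehicle is a taxi)."""
    return (1 - p) ** failures * p


# Hypothetical value: assume 1 in 10 arriving vehicles is a taxi.
p_taxi = 0.1
prob_wait_25 = first_success_after(25, p_taxi)
print(f"P(25 vehicles pass before the first taxi) = {prob_wait_25:.4f}")
```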

Now that the first person has got into a taxi, I want to know the probability that 10 vehicles will pass before the second taxi arrives. Since the arrivals are independent, the fact that the first taxi has already arrived makes no difference to the waiting time for the second taxi. Hence the waiting time for the second taxi also has a geometric distribution.

Now, instead of wondering how many taxis will arrive among the first 30 vehicles, I wonder what the probability is that exactly 30 vehicles must pass to get 6 taxi arrivals. This means that the 30th vehicle is the 6th taxi.


Negative Binomial Distribution
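The event "the 30th vehicle is the 6th taxi" means 5 taxis among the first 29 vehicles, followed by a taxi. A sketch under that reading, with a hypothetical function name and a hypothetical p = 0.1:

```python
from math import comb


def negative_binomial_pmf(n: int, r: int, p: float) -> float:
    """P(the r-th success occurs exactly on trial n)."""
    if n < r:
        return 0.0
    # r - 1 successes somewhere in the first n - 1 trials, then a success.
    return comb(n - 1, r - 1) * p**r * (1 - p) ** (n - r)


# Hypothetical value: assume 1 in 10 arriving vehicles is a taxi.
p_taxi = 0.1
prob_sixth_on_30th = negative_binomial_pmf(30, 6, p_taxi)
print(f"P(30th vehicle is the 6th taxi) = {prob_sixth_on_30th:.5f}")
```

With r = 1 the formula reduces to the geometric case above.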

Now, suppose that while I am waiting for the taxis, a parade passes by. I talk to the organizing committee and find out that there are going to be 50 males and 40 females in the parade. People in the parade are passing by one by one, in random order, with a flag in their hand. What is the probability that exactly 5 of the first 20 people are male?


Hypergeometric Probability Distribution
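A small sketch of the parade calculation, assuming the 90 marchers pass in a uniformly random order so that the first 20 form a random sample without replacement. The function and argument names are illustrative only.

```python
from math import comb


def hypergeometric_pmf(k: int, draws: int, successes: int, population: int) -> float:
    """P(exactly k successes among `draws` items sampled without replacement)."""
    failures = population - successes
    return comb(successes, k) * comb(failures, draws - k) / comb(population, draws)


# 50 males and 40 females; look at the first 20 people.
prob_five_males = hypergeometric_pmf(5, 20, 50, 90)
print(f"P(5 males among the first 20) = {prob_five_males:.5f}")
```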

Now suppose we look at this situation from a different perspective. I wonder what the probability is of getting 6 taxis in the next hour. Instead of counting the vehicles that pass before I get 6 taxis, I am looking at the amount of time that passes, so we are moving from a Bernoulli process to a Poisson process.

For an initial estimate, I observed the road for several hours and concluded that the average rate of taxis crossing the bus stop is λ taxis per hour.

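This can be modeled with a Binomial distribution by treating each of the n = 60 minutes as a trial with success probability p = λ/60.

Then the probability of getting 6 taxis in 60 minutes (according to the Binomial model) is

$$P(X = 6) = \binom{60}{6}\left(\frac{\lambda}{60}\right)^{6}\left(1-\frac{\lambda}{60}\right)^{54}$$

Binomial Probability Distribution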

But the problem here is that we are defining a success as one taxi arriving in a given minute, whereas more than one taxi could pass the bus stop in the same minute.

So we make the process more granular: instead of minutes, we do the calculation in seconds, taking n = 3600 and p = λ/3600. But even here, more than one speedy taxi could arrive at the bus stop within the same second, so we make it even more granular.

Essentially, we are breaking the 1-hour interval into smaller and smaller sub-intervals. As the sub-intervals shrink, their number tends to infinity, and we can confidently ask the question: what is the probability that we get 6 taxis in a given amount of time?

The number of taxi arrivals then has a binomial distribution with n tending to infinity while np = λ stays fixed, and this converges to the Poisson distribution.


Poisson Distribution Function
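The refinement from minutes to seconds and beyond can be checked numerically. The sketch below compares the binomial probability of 6 arrivals, for finer and finer slots, with the Poisson formula e^(-λ) λ^k / k!. The rate λ = 4 taxis per hour is a hypothetical value, as are the function names.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent Bernoulli(p) trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(exactly k arrivals) when the expected number of arrivals is lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 4.0  # hypothetical average rate: 4 taxis per hour

# Slots per hour: minutes, seconds, and sixtieths of a second.
for n in (60, 3600, 216000):
    print(f"n = {n:>6}: binomial P(X = 6) = {binomial_pmf(6, n, lam / n):.6f}")
print(f"Poisson limit P(X = 6) = {poisson_pmf(6, lam):.6f}")
```

The binomial values approach the Poisson value as the number of slots grows.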

Now the question becomes: how long will it take to get the first taxi? This waiting time can be modeled with an exponential distribution whose parameter λ is the average rate of taxi arrivals. The exponential distribution is the continuous analog of the geometric distribution.


Exponential Distribution
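To illustrate the link to the geometric distribution, the sketch below compares the exponential probability of waiting at most t hours, 1 - e^(-λt), with the geometric version obtained by slicing the hour into one-second slots. The rate λ = 4 per hour and both function names are assumptions for illustration.

```python
from math import exp


def exponential_cdf(t: float, rate: float) -> float:
    """P(waiting time <= t) for an Exponential(rate) waiting time."""
    return 1.0 - exp(-rate * t) if t > 0 else 0.0


def geometric_wait_cdf(t: float, rate: float, slots_per_hour: int) -> float:
    """Discrete version: P(at least one taxi within the slots covering t hours)."""
    p = rate / slots_per_hour          # success probability per slot
    slots = int(t * slots_per_hour)    # number of whole slots in t hours
    return 1.0 - (1.0 - p) ** slots


rate = 4.0  # hypothetical: 4 taxis per hour
t = 0.25    # a quarter of an hour
print(f"exponential: {exponential_cdf(t, rate):.5f}")
print(f"geometric, 1-second slots: {geometric_wait_cdf(t, rate, 3600):.5f}")
```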

The next question is: how long will it take to get the 6th taxi after the 5th? Since taxi arrivals in disjoint time intervals are independent, the time between the 5th and 6th taxis has the same distribution as the time to the first taxi. Hence, the inter-arrival time is also Exponential with parameter λ, the rate at which taxis arrive.

Now I ask: what is the distribution of the time it takes to get the first 6 taxis? This time is the sum of the waiting time for the first taxi and the five inter-arrival times that follow, each of which is Exponential with parameter λ. A sum of 6 independent exponential random variables with the same rate has a Gamma distribution with shape 6 and rate λ.


Gamma Distribution
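A quick simulation can make this concrete. The sketch below adds up 6 exponential inter-arrival times and compares the fraction of runs finishing within t hours against the closed form for an integer-shape Gamma (Erlang) distribution, which equals the probability that a Poisson count with mean λt is at least 6. The rate, the time t, the seed, and the function name are all hypothetical choices.

```python
import random
from math import exp, factorial


def erlang_cdf(t: float, shape: int, rate: float) -> float:
    """P(sum of `shape` iid Exponential(rate) times <= t).

    Equals P(Poisson(rate * t) >= shape) for integer shape.
    """
    mean = rate * t
    return 1.0 - sum(exp(-mean) * mean**k / factorial(k) for k in range(shape))


rate = 4.0        # hypothetical: 4 taxis per hour
t = 1.5           # hours
trials = 100_000
rng = random.Random(0)  # fixed seed so the run is repeatable

# Count the runs in which the 6th taxi arrives within t hours.
hits = sum(
    sum(rng.expovariate(rate) for _ in range(6)) <= t
    for _ in range(trials)
)
print(f"simulated: {hits / trials:.4f}, closed form: {erlang_cdf(t, 6, rate):.4f}")
```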

We also have the case where, given that the 6th (= 4 + 2) taxi arrived at the 50-minute mark, we ask for the probability that the 4th taxi had arrived within the first 30 minutes. The fraction of the 50 minutes taken up by the first 4 arrivals can be modeled with a Beta distribution with parameters α = 4 and β = 2.


Beta Distribution
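The sketch below reads the condition as the 6th taxi arriving at the 50-minute mark, so the question is whether the 4th taxi took at most 30/50 = 0.6 of that time. It relies on a standard fact: the time to the 4th arrival divided by the time to the 6th follows Beta(4, 2), whatever the rate. The rate, seed, and function name are hypothetical.

```python
import random


def beta_4_2_cdf(x: float) -> float:
    """CDF of Beta(4, 2): the integral of 20 u^3 (1 - u) from 0 to x."""
    return 5 * x**4 - 4 * x**5


rate = 4.0        # hypothetical; the ratio below does not depend on it
trials = 100_000
rng = random.Random(1)  # fixed seed so the run is repeatable

hits = 0
for _ in range(trials):
    time_to_4th = sum(rng.expovariate(rate) for _ in range(4))
    time_to_6th = time_to_4th + sum(rng.expovariate(rate) for _ in range(2))
    if time_to_4th / time_to_6th <= 0.6:
        hits += 1

print(f"simulated: {hits / trials:.4f}, Beta(4, 2) CDF at 0.6: {beta_4_2_cdf(0.6):.4f}")
```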

The aim of this article is to give you an intuitive understanding of the different random variables and the situations in which they naturally arise. A lot of the content is inspired by the classes on Khan Academy and Cheenta Academy, so feel free to go through their wonderful resources.

Hope you enjoyed learning!

© 2023 by Train of Thoughts