

Statistics is an important branch of mathematics with many practical applications. In statistics, it is essential to know the mean and variance of random variables. Let us try to understand what random variables are, how they arise, and how to do calculations with them. A random variable is a variable that assigns a numerical value to each outcome of a random experiment. Random variables are mainly divided into discrete and continuous random variables. A random variable is termed a continuous random variable when it can take infinitely many values; here the value of X is not limited to integers. Most real-life applications make use of continuous random variables. For example, if we are doing experiments with quantities such as height, weight, or age, then these are continuous random variables.
In this article, we cover the basics of random variables and explain the ideas of mean and variance and their applications in detail.
Probability Distributions
A probability distribution helps us make sense of large amounts of collected data by plotting it against a random variable. Depending on the nature of the random variable, the probability distribution also changes. Some important probability distributions are the binomial distribution, the Poisson distribution, and the Gaussian distribution. These are classified as continuous or discrete distributions based on the random variable involved.
The functions plotted in these probability distributions are called probability distribution functions, or PDFs. Each PDF is characterized by a mean and a variance. Mean is often used synonymously with average, though its meaning may vary slightly according to the nature of the random variable. Variance is the spread of the curve, or in other words, the deviation of the data from the mean value.
Mean of Discrete Random Variables
In the case of a discrete random variable, the mean implies a weighted average. To understand this let us discuss a couple of examples.
Let us take our previous coin-tossing example. Here the experiment is repeated n times, and if we are asked to predict the most probable or mean outcome, we would answer that there is a 50 percent chance of obtaining either a head or a tail. This is because the probability of each outcome is equal, or in other words, each outcome carries equal weight.
Now consider the following example:
Suppose a person is playing poker. The possible outcomes are to lose 2 coins, break even, gain 2 coins, or gain 5 coins. If the probability distribution is given, what will be our mean outcome using this data?
Here we have to bring in the concept of a weighted average. Suppose we have n measured values, each occurring with a different probability. The sum of those values multiplied by their probabilities is called the mean or weighted average. For example, suppose from an experiment we obtain three energies E1, E2, E3 of values 0, 5, and 3 units with probabilities \[\frac{1}{5}\], \[\frac{3}{5}\] and \[\frac{1}{5}\] respectively. The mean/average energy = 0 * \[\frac{1}{5}\] + 5 * \[\frac{3}{5}\] + 3 * \[\frac{1}{5}\] = \[\frac{18}{5}\] units.
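The weighted-average calculation above can be checked with a short Python snippet (the values and probabilities are the ones from the energy example):

```python
# Weighted average (expected value) of a discrete random variable:
# the sum of each value multiplied by its probability.

values = [0, 5, 3]               # energies E1, E2, E3 in units
probabilities = [1/5, 3/5, 1/5]  # must sum to 1

mean = sum(x * p for x, p in zip(values, probabilities))
print(mean)  # 18/5 = 3.6 units
```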
Now let us return to our earlier example of the poker game.
The mean outcome/expectation value of the poker game is
E(X) = -2*0.30 + 0*0.40 + 2*0.20 + 5*0.10 = 0.3 coins
Hence let us write a formula for the mean of a discrete random variable.
Suppose pi is the probability of the value xi. Then
E(X) = ∑xipi
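The same formula applied to the poker game, using the outcome values and probabilities from the expectation calculation above, can be sketched in Python as:

```python
# Expected value E(X) = sum(x_i * p_i) for the poker example:
# lose 2 coins, break even, gain 2 coins, or gain 5 coins.

outcomes = [-2, 0, 2, 5]
probs = [0.30, 0.40, 0.20, 0.10]

expected = sum(x * p for x, p in zip(outcomes, probs))
print(round(expected, 2))  # 0.3 coins
```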
Mean and Variance of Continuous Random Variable
When our data is continuous, the corresponding random variable and probability distribution are also continuous. The principle behind mean and variance remains the same. However, we cannot use the same formulas: when the discrete variable becomes continuous, the summation becomes an integration. Hence our formulas for mean and variance take the form
E(X) = \[\int_{-\infty}^{\infty} xf_{X} (x)dx\] , where \[f_{X} (x)\] is the probability density function.
Var(X) = \[\int_{-\infty}^{\infty} (x-E(X))^{2} f_{X} (x)dx\]
Here the most important term is our probability density function. It contains all the data of our system. Even the shape of our distribution can vary based on the probability distribution function.
It is difficult to study all the continuous distributions under one heading, as each has its own specific characteristics and properties and requires individual attention.
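To see the integral formulas in action, here is a minimal sketch that approximates both integrals numerically with the midpoint rule, using the uniform distribution on [0, 1] as a test case (its density is f(x) = 1 on that interval, with known mean 1/2 and variance 1/12):

```python
# Approximating E(X) and Var(X) for a continuous random variable by
# numerical integration (composite midpoint rule). The uniform
# distribution on [0, 1] is used because the exact answers are known:
# E(X) = 1/2 and Var(X) = 1/12.

def pdf(x):
    # Density of the uniform distribution on [0, 1]
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

n = 100_000
dx = 1.0 / n
midpoints = [(i + 0.5) * dx for i in range(n)]

mean = sum(x * pdf(x) * dx for x in midpoints)
var = sum((x - mean) ** 2 * pdf(x) * dx for x in midpoints)

print(mean, var)  # close to 0.5 and 1/12 ≈ 0.08333
```

In practice a library integrator would replace the hand-rolled midpoint loop, but the loop makes the correspondence with the integral formulas explicit.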
FAQs on Mean and Variance of Random Variable
1. What is the mean of a random variable, and what does it represent?
The mean of a random variable, also known as its expected value (E(X)), represents the long-term average outcome of a random experiment if it were repeated many times. It is a weighted average of all possible values the variable can take, with the weights being their respective probabilities. For a discrete random variable X, it is calculated as E(X) = Σ xᵢpᵢ, and it acts as a central point around which the outcomes tend to cluster.
2. What are the formulas for calculating the mean and variance of a discrete random variable?
For a discrete random variable X that takes values x₁, x₂, ..., xₙ with probabilities p₁, p₂, ..., pₙ:
- Mean (Expected Value) E(X) or μ: It is calculated by summing the product of each value and its probability. The formula is:
μ = E(X) = Σ xᵢpᵢ
- Variance Var(X) or σ²: It measures the spread of the data points from the mean. The formula is:
σ² = Var(X) = Σ (xᵢ - μ)²pᵢ
- An alternative, often simpler, formula for variance is:
σ² = Var(X) = E(X²) - [E(X)]², where E(X²) = Σ xᵢ²pᵢ.
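Both variance formulas always agree, which a short Python check can illustrate (the values and probabilities below are arbitrary illustrative numbers):

```python
# The definitional variance sum((x - mu)^2 * p) and the shortcut
# E(X^2) - [E(X)]^2 give the same result for any discrete distribution.

xs = [1, 2, 3]
ps = [0.2, 0.5, 0.3]  # illustrative probabilities summing to 1

mu = sum(x * p for x, p in zip(xs, ps))
var_def = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))
var_shortcut = sum(x * x * p for x, p in zip(xs, ps)) - mu ** 2

print(mu, var_def, var_shortcut)  # 2.1, 0.49, 0.49
```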
3. What is the variance of a random variable, and how is it related to standard deviation?
The variance of a random variable is a numerical measure of how much the values of the variable are spread out from their mean or expected value. A low variance indicates that the values are clustered closely around the mean, while a high variance suggests they are widely dispersed. The standard deviation (σ) is simply the positive square root of the variance (σ²). It is often preferred as a measure of spread because it is expressed in the same units as the random variable itself.
4. How would you calculate the mean and variance for the outcome of rolling a single fair six-sided die?
Let X be the random variable for the outcome of rolling a fair die. The possible values are {1, 2, 3, 4, 5, 6}, each with a probability of 1/6.
1. Calculate the Mean (E(X)):
E(X) = (1 × 1/6) + (2 × 1/6) + (3 × 1/6) + (4 × 1/6) + (5 × 1/6) + (6 × 1/6) = (1+2+3+4+5+6)/6 = 21/6 = 3.5.
2. Calculate E(X²):
E(X²) = (1² × 1/6) + (2² × 1/6) + (3² × 1/6) + (4² × 1/6) + (5² × 1/6) + (6² × 1/6) = (1+4+9+16+25+36)/6 = 91/6 ≈ 15.17.
3. Calculate the Variance (Var(X)):
Var(X) = E(X²) - [E(X)]² = 91/6 - (3.5)² = 15.167 - 12.25 ≈ 2.917.
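The three steps of the die calculation translate directly into a few lines of Python:

```python
# Fair six-sided die: E(X) = 3.5, E(X^2) = 91/6,
# Var(X) = 91/6 - 3.5^2 = 35/12 ≈ 2.917.

faces = range(1, 7)
p = 1 / 6  # each face is equally likely

mean = sum(x * p for x in faces)
mean_sq = sum(x ** 2 * p for x in faces)
variance = mean_sq - mean ** 2

print(mean, variance)  # 3.5 and about 2.9167
```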
5. Why is variance important, and what does a high or low variance tell you about a probability distribution?
Variance is fundamentally important because it quantifies uncertainty or risk. It measures the dispersion of outcomes around the expected value.
- A low variance indicates that the outcomes of an experiment are very consistent and tend to be close to the mean. This implies a lower level of uncertainty and higher predictability.
- A high variance indicates that the outcomes are spread out over a wider range of values. This implies a higher level of uncertainty and less predictability. For example, in finance, two investments might have the same average return (mean), but the one with higher variance is considered riskier.
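The finance point can be made concrete with a small sketch comparing two hypothetical investments (the returns and probabilities below are made-up numbers for illustration only):

```python
# Two hypothetical investments with the same expected return (mean)
# but very different variances: the second is far riskier.

def mean_var(returns, probs):
    mu = sum(r * p for r, p in zip(returns, probs))
    var = sum((r - mu) ** 2 * p for r, p in zip(returns, probs))
    return mu, var

safe = mean_var([4, 5, 6], [0.25, 0.50, 0.25])       # outcomes near 5
risky = mean_var([-10, 5, 20], [0.25, 0.50, 0.25])   # widely spread

print(safe)   # (5.0, 0.5)
print(risky)  # (5.0, 112.5)
```

Both have a mean return of 5, yet the second has a variance 225 times larger, which is exactly what "riskier" means here.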
6. What is the key difference between finding the mean and variance for discrete vs. continuous random variables?
The fundamental difference lies in the mathematical operation used. For a discrete random variable, which has countable outcomes, we use summation (Σ) to sum the products of values and their probabilities. For a continuous random variable, which can take any value within a given range, we use integration (∫) over the range of the variable with its probability density function (f(x)). The concept remains the same—finding a weighted average—but the tool changes from discrete summation to continuous integration.
7. If a random variable X is transformed to Y = aX + b, how does this change its mean and variance?
A linear transformation affects the mean and variance in specific, predictable ways. Let E(X) be the mean and Var(X) be the variance of X. For the new variable Y = aX + b:
- The new mean is E(Y) = aE(X) + b. The mean is scaled by 'a' and shifted by 'b'.
- The new variance is Var(Y) = a²Var(X). The variance is scaled by the square of 'a' and is completely unaffected by the constant 'b', because adding 'b' shifts the entire distribution without changing its spread.
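These transformation rules can be verified numerically; the sketch below uses the fair die from the earlier FAQ with illustrative constants a = 2 and b = 3:

```python
# Checking E(Y) = a*E(X) + b and Var(Y) = a^2 * Var(X) for Y = aX + b,
# using a fair die (E(X) = 3.5, Var(X) = 35/12) with a = 2, b = 3.

xs = [1, 2, 3, 4, 5, 6]
p = 1 / 6
a, b = 2, 3

ex = sum(x * p for x in xs)
varx = sum((x - ex) ** 2 * p for x in xs)

ys = [a * x + b for x in xs]
ey = sum(y * p for y in ys)
vary = sum((y - ey) ** 2 * p for y in ys)

print(ey, a * ex + b)       # both 10.0
print(vary, a ** 2 * varx)  # both about 11.667 (the shift b drops out)
```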
8. Can the variance of a random variable ever be negative? Explain your reasoning.
No, the variance of a random variable can never be negative. The reasoning lies in its definition: Var(X) = Σ (xᵢ - μ)²pᵢ. In this formula:
- The term (xᵢ - μ)² is a squared number, so it must be zero or positive.
- The probability pᵢ is, by definition, also zero or positive.
Since every term in the sum is non-negative, their sum is non-negative as well. The variance equals zero only in the special case where the random variable takes a single value with probability 1, i.e., when it is constant.

















