CDF and Survival Function

In the previous section, we discussed the Probability Density Function (PDF), which roughly describes how likely each outcome is. The Cumulative Distribution Function (CDF) gives the probability that the outcome is less than or equal to a given value.

Formal Definition

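The Cumulative Distribution Function (CDF) of a random variable \(X\) is defined as the function \( F(X = x) = P(X \leq x) \), i.e. the probability that \(X\) takes a value of at most \(x\).

The Survival Function of a random variable \(X\) is defined as the function \( S(X = x) = P(X > x) = 1 - F(X = x) \), i.e. the probability that \(X\) takes a value greater than \(x\).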
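To make the two definitions concrete, here is a minimal sketch for a discrete case. The fair six-sided die and the helper names `pmf`, `cdf` and `survival` are assumptions made for illustration only; they do not come from any library.

```python
from fractions import Fraction

# Hypothetical example: a fair six-sided die, each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}


def cdf(x):
    """F(X = x) = P(X <= x): add up the probability of every outcome <= x."""
    return sum(p for outcome, p in pmf.items() if outcome <= x)


def survival(x):
    """S(X = x) = P(X > x) = 1 - F(X = x)."""
    return 1 - cdf(x)


print(cdf(2))       # 1/3
print(survival(2))  # 2/3
print(cdf(6))       # 1
```

Exact fractions are used here so that the identity \( S(X = x) = 1 - F(X = x) \) can be checked without floating-point rounding.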

Normal Distribution CDF and Survival

The normal distribution is used to model random variables which have a continuous outcome. Many phenomena roughly follow this distribution. For example, the height or weight of students in a school is approximately normally distributed. Mathematically, the density of this distribution is \( \rm{Normal(\mu, \sigma)}(X = x) = \frac{1}{\sqrt{2\pi} \sigma} e^{- \frac{(x-\mu)^2}{2\sigma^2}} \) where \(\mu\) and \(\sigma\) are the population mean and standard deviation. The normal distribution with \(\mu = 0\) and \(\sigma = 1\) is called the Standard Normal Distribution.
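The density formula can be written out directly in plain Python, as in the sketch below. The helper names `normal_pdf` and `normal_cdf` are hypothetical, and the height figures (mean 170 cm, standard deviation 10 cm) are assumed values chosen only for illustration. The CDF uses the standard identity \( F(X = x) = \frac{1}{2}\left[1 + \mathrm{erf}\left(\frac{x-\mu}{\sigma\sqrt{2}}\right)\right] \), which relies on the error function from Python's `math` module.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density, term by term as in the formula above."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * sigma)


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Normal CDF expressed through the error function erf."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2))))


# Standard normal distribution: mu = 0, sigma = 1
print(round(normal_pdf(0.0), 4))  # 0.3989
print(normal_cdf(0.0))            # 0.5

# Assumed heights: mean 170 cm, standard deviation 10 cm.
# Probability that a student is at most 180 cm tall.
print(round(normal_cdf(180, mu=170, sigma=10), 4))  # 0.8413
```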

In Python, we can use scipy.stats to get the PDF (or PMF, for discrete distributions) and the CDF of various probability distributions, or to simulate data from them. In the following code, we use scipy.stats to find the CDF and survival function of the standard normal distribution. Can you guess what the loc and scale parameters are in this code?


import numpy as np
from scipy.stats import norm

# find the CDF of the standard normal distribution at x=0
cdf_value = norm.cdf(x=0, loc=0, scale=1)
# cdf_value is 0.5

# find the survival of the standard normal distribution at x=3
survival = 1 - norm.cdf(x=3, loc=0, scale=1)
# survival is approximately 0.00135
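
The survival value at x=3 can be cross-checked without scipy. The sketch below uses `statistics.NormalDist` from the Python standard library (available since Python 3.8) and the symmetry of the standard normal distribution, for which the upper tail beyond 3 equals the lower tail below -3. The variable names are chosen here for illustration.

```python
from statistics import NormalDist

# NormalDist() with no arguments is the standard normal distribution.
std_normal = NormalDist()

# Survival from the definition: S(X = 3) = 1 - F(X = 3)
survival_at_3 = 1 - std_normal.cdf(3)

# By symmetry around 0, P(X > 3) equals P(X <= -3)
lower_tail = std_normal.cdf(-3)

print(round(survival_at_3, 5))  # 0.00135
print(round(lower_tail, 5))     # 0.00135
```

scipy.stats also exposes the survival function directly as `norm.sf`, which avoids the subtraction from 1 and can be more accurate far out in the tail.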