Python Cumulative Distribution Function
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
Example 1 Manual Calculation
data = [2, 3, 3, 5, 7]
sorted_data = np.sort(data)
data_len = len(sorted_data)
cdf_values = []
for i in range(data_len):
    # Calculate the CDF as the proportion of data points less than or equal to sorted_data[i].
    # (sorted_data <= sorted_data[i]) is a boolean array: each element is True if the
    # corresponding element in sorted_data is less than or equal to sorted_data[i], and False otherwise.
    cdf_value = np.sum(sorted_data <= sorted_data[i]) / data_len
    cdf_values.append(cdf_value)
print(cdf_values)
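
The loop above can also be written without an explicit Python loop. The sketch below is one possible vectorized variant, assuming the same five-element sample. It uses np.searchsorted with side='right' to count, for each value, how many sorted elements are less than or equal to it. The names counts and ecdf_values are introduced here for illustration only.

```python
import numpy as np

data = [2, 3, 3, 5, 7]
sorted_data = np.sort(data)

# For each value, searchsorted(..., side='right') returns the number of
# elements in sorted_data that are less than or equal to that value.
counts = np.searchsorted(sorted_data, sorted_data, side='right')

# Divide by the sample size to turn the counts into cumulative proportions.
ecdf_values = counts / len(sorted_data)
print(ecdf_values.tolist())  # [0.2, 0.6, 0.6, 0.8, 1.0]
```

The repeated value 3 gets the same cumulative proportion (0.6) both times, which matches what the loop version computes.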

Much Easier Examples
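A cumulative distribution function (CDF) can be evaluated either at a raw value from the distribution, by supplying the distribution's mean and standard deviation, or at a Z-score against the standard normal distribution, depending on the context. The examples below use 1,000 simulated draws from a standard normal distribution: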
np.random.seed(12)
mean = 0
std_dev = 1
size = 1000
data = np.random.normal(loc=mean, scale=std_dev, size=size)
Example 2 CDF at a Single Point
cdf_neg_one = norm.cdf(-1, loc=data.mean(), scale=data.std())
print(cdf_neg_one)

cdf_one = norm.cdf(1, loc=data.mean(), scale=data.std())
print(cdf_one)
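
To illustrate the point about raw values versus Z-scores, here is a minimal sketch, assuming the same seed and sample as above. It evaluates the CDF once at the raw value -1 with the sample mean and standard deviation, and once at the corresponding Z-score with the default standard normal parameters. The variable names x, mu, sigma, and z_score are illustrative and not part of the original examples.

```python
import numpy as np
from scipy.stats import norm

np.random.seed(12)
data = np.random.normal(loc=0, scale=1, size=1000)

x = -1
mu = data.mean()
sigma = data.std()

# CDF at the raw value, using the sample mean and standard deviation.
cdf_from_value = norm.cdf(x, loc=mu, scale=sigma)

# CDF at the Z-score, using the standard normal (loc=0 and scale=1 by default).
z_score = (x - mu) / sigma
cdf_from_z = norm.cdf(z_score)

print(cdf_from_value, cdf_from_z)
print(np.isclose(cdf_from_value, cdf_from_z))
```

Both calls should agree up to floating-point rounding, since standardizing the value and passing loc and scale describe the same calculation.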

Example 3 CDF Range
upper_cdf = norm.cdf(2, loc=data.mean(), scale=data.std())
lower_cdf = norm.cdf(-2, loc=data.mean(), scale=data.std())
cdf_range = upper_cdf - lower_cdf
print(cdf_range)
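
As a rough sanity check on the range calculation, the following sketch compares the model-based probability with the fraction of simulated points that actually fall between -2 and 2. It assumes the same seed and sample as above; model_prob and empirical_prob are illustrative names. The two numbers should be close but not identical, because one comes from the fitted normal distribution and the other from the finite sample.

```python
import numpy as np
from scipy.stats import norm

np.random.seed(12)
data = np.random.normal(loc=0, scale=1, size=1000)

mu = data.mean()
sigma = data.std()

# Probability of falling between -2 and 2 under the fitted normal distribution.
model_prob = norm.cdf(2, loc=mu, scale=sigma) - norm.cdf(-2, loc=mu, scale=sigma)

# Fraction of the simulated sample that actually falls in the same interval.
empirical_prob = np.mean((data >= -2) & (data <= 2))

print(model_prob, empirical_prob)
```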

Example 4 CDF Right Side (Probability of a Value Greater Than 2)
value_greater_2 = 1 - norm.cdf(2, loc=data.mean(), scale=data.std())
print(value_greater_2)
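
SciPy also exposes the right-tail probability directly through the survival function, norm.sf, which is defined as 1 - cdf. The sketch below, assuming the same seed and sample as above, shows that both approaches give the same result. The names right_tail_from_cdf and right_tail_from_sf are illustrative.

```python
import numpy as np
from scipy.stats import norm

np.random.seed(12)
data = np.random.normal(loc=0, scale=1, size=1000)

mu = data.mean()
sigma = data.std()

# Right-tail probability computed as the complement of the CDF.
right_tail_from_cdf = 1 - norm.cdf(2, loc=mu, scale=sigma)

# The same probability from the survival function (sf = 1 - cdf).
right_tail_from_sf = norm.sf(2, loc=mu, scale=sigma)

print(right_tail_from_cdf, right_tail_from_sf)
print(np.isclose(right_tail_from_cdf, right_tail_from_sf))
```

For values far out in the tail, norm.sf can be numerically more accurate than subtracting the CDF from 1, so it may be the safer choice there.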

Example 5 Graph Seaborn
sns.ecdfplot(data, label='CDF')
plt.title('CDF of Normally Distributed Data')
plt.xlabel('Data Values')
plt.ylabel('Cumulative Probability')
plt.legend()
plt.grid(True)
plt.show()

Ryan is a Data Scientist at a fintech company, where he focuses on fraud prevention in underwriting and risk. Before that, he worked as a Data Analyst at a tax software company. He holds a degree in Electrical Engineering from UCF.