Excel 2019 for Marketing Statistics in pandas

Excel 2019 for Marketing Statistics in pandas
data science
statistics
marketing
Author

Oren Bochman

Published

Friday, September 24, 2021

Modified

Sunday, May 1, 2022

TLDR

Is there a port of Excel 2019 for Marketing Statistics ?

Sample Size, Mean, Standard Deviation, and Standard Error of the Mean

Let’s start with some data:


# importing pandas as pd
import pandas as pd

# Creating the data frame
df = pd.DataFrame({
    "A":[12, 4, 5, 44, 1],
    "B":[5, 2, 54, 3, 2],
    "C":[20, 16, 7, 3, 8],
    "D":[14, 3, 17, 2, 6],
    })

# Print the data frame
df

Sample Size

We want a replacement of Excel’s =COUNT() function to get the sample size n - we need to get the row count in the data frame. Pandas also has a count() function but we should use the first item in the shape property.

df.shape[0]

pitfall:

df.count() will only return the count of non-NA/NaN rows for each column. ## Mean We want a replacement of Excel’s =AVERAGE() function to calculate the mean \bar X according to: \bar X = \frac{\sum X}{n}


# Print the data frame
df.mean(axis = 0)

Standard Deviation

We want a replacement of Excel’s =STDEV() function to calculate the mean \bar X according to:

SD = \sqrt { \frac{\sum {(X-\bar X)^2}}{n-1}}

df.std(axis = 0) 

Standard Error of the Mean

In this case there is no built in formula in excel for this. It would be calculated with an ugly formula that looks like: =STDEV()\SQRT(COUNT()) to calculate the SE according to: SE = \frac{SD}{\sqrt{n}}


# find standard error of the mean of all the columns
df.sem(axis = 0)

Citation

BibTeX citation:
@online{bochman2021,
  author = {Bochman, Oren},
  title = {Excel 2019 for {Marketing} {Statistics} in Pandas},
  date = {2021-09-24},
  url = {https://orenbochman.github.io/posts/2021/2021-09-24-statistics-for-marketing-in-python/2021-09-24-statistics-for-marketing-in-python.html},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2021. “Excel 2019 for Marketing Statistics in Pandas.” September 24, 2021. https://orenbochman.github.io/posts/2021/2021-09-24-statistics-for-marketing-in-python/2021-09-24-statistics-for-marketing-in-python.html.