Post With Code

Categories: news, code, analysis
Author: Oren Bochman

Published: Sunday, January 28, 2024

Modified: Wednesday, February 14, 2024

This is an obligatory post with executable code.

1 + 1  # (1) this is an annotation
2

And this is a figure with a caption:

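import numpy as np
import matplotlib.pyplot as plt

# Radius runs from 0 up to (but not including) 2; the angle makes one full turn per unit of radius
r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r

# Create a single set of polar axes
fig, ax = plt.subplots(
    subplot_kw={'projection': 'polar'}
)
ax.plot(theta, r)                # draws a spiral
ax.set_rticks([0.5, 1, 1.5, 2])  # radial tick positions
ax.grid(True)
plt.show()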
Figure 1: A line plot on a polar axis

It’s also useful to have a small example that prints a table from a pandas DataFrame, together with a quick-reference block of fluent (method-chained) pandas wrangling.

import numpy as np                                           # (1)
import pandas as pd
from itables import show
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
import xgboost as xgb

df = (
    pd.read_csv('./data/Salary Data.csv')                    # (2)
    .dropna()                                                # (3)
    .drop_duplicates()                                       # (4)
    .assign(
        is_male=lambda x: (x['Gender'] == 'Male').astype(int),             # (5)
        is_PhD=lambda x: (x['Education Level'] == 'PhD').astype(int),      # (6)
        is_BA=lambda x: (x['Education Level'] == "Bachelor's").astype(int),
        is_MA=lambda x: (x['Education Level'] == "Master's").astype(int),
    )
    .rename(columns={'Years of Experience': 'xp'})           # (7)
    .drop(['Gender', 'Education Level', 'Job Title'], axis=1)  # (8)
)

# Left disabled: a label-encoding alternative. It would need sklearn's LabelEncoder
# and the 'Education Level' and 'Job Title' columns, which are dropped above.
#df['Education Level'] = edu_label_encoder.fit_transform(df['Education Level'])
#job_title_encoder = LabelEncoder()
#df['Job Title']=job_title_encoder.fit_transform(df['Job Title'])
show(df)                                                     # (9)
1. import the usual suspects
2. load the salary dataset
3. remove rows with missing values
4. remove duplicate entries
5. recode gender to is_male
6. recode categorical education level to dummies
7. rename columns
8. drop columns
9. peek at the data
Table 1
Age xp Salary is_male is_PhD is_BA is_MA

raw Salary DataSet
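
The wrangling chain above depends on a local CSV file and on itables for display, so it cannot be run in isolation. As a minimal sketch, the same recoding steps can be tried on a tiny in-memory frame. The rows below are invented purely for illustration and are not taken from the salary dataset; the helper name `recode` and the variables `toy` and `tidy` are likewise hypothetical. Only the column names and the chain itself follow the post.

```python
import pandas as pd

# Hypothetical rows, made up for illustration; NOT from the salary dataset.
toy = pd.DataFrame({
    'Age': [32, 45, 28, None],
    'Gender': ['Male', 'Female', 'Female', 'Male'],
    'Education Level': ["Bachelor's", 'PhD', "Master's", "Bachelor's"],
    'Job Title': ['Engineer', 'Scientist', 'Analyst', 'Engineer'],
    'Years of Experience': [5, 15, 3, 2],
    'Salary': [90000, 150000, 65000, 50000],
})


def recode(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the same cleaning and recoding chain as the post (hypothetical helper)."""
    return (
        raw
        .dropna()            # remove rows with missing values
        .drop_duplicates()   # remove duplicate entries
        .assign(
            # 0/1 indicator columns built from the categorical fields
            is_male=lambda x: (x['Gender'] == 'Male').astype(int),
            is_PhD=lambda x: (x['Education Level'] == 'PhD').astype(int),
            is_BA=lambda x: (x['Education Level'] == "Bachelor's").astype(int),
            is_MA=lambda x: (x['Education Level'] == "Master's").astype(int),
        )
        .rename(columns={'Years of Experience': 'xp'})
        .drop(['Gender', 'Education Level', 'Job Title'], axis=1)
    )


tidy = recode(toy)
# Plain-text stand-in for itables' show()
print(tidy.to_string(index=False))
```

Wrapping the chain in a function keeps it reusable: the same steps could be applied to the real CSV by passing the result of `pd.read_csv` instead of `toy`. The row with the missing age is dropped, and the remaining columns come out in the same order as the header of Table 1.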

Citation

BibTeX citation:
@online{bochman2024,
  author = {Bochman, Oren},
  title = {Post {With} {Code}},
  date = {2024-01-28},
  url = {https://orenbochman.github.io/posts/2024/2024-02-12-post-with-code/},
  langid = {en}
}
For attribution, please cite this work as:
Bochman, Oren. 2024. “Post With Code.” January 28, 2024. https://orenbochman.github.io/posts/2024/2024-02-12-post-with-code/.