by Devon Ankar for class DSE 6000 @ Wayne State University
Salary information for all mayoral appointees
Anyone who lives or works in Detroit pays taxes to the city, so we want to know where that money is going. We want to find key facts about where our taxpayer dollars are going.
Question: Which departments have the highest mean salaries?
Initial hypothesis: Mayor's Office will have the highest mean salary
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
salaries = pd.read_csv('Mayoral_Appointee_Salaries.csv', encoding = "ISO-8859-1")
salaries.head(10)
salaries.columns
print(salaries['CITY SALARY'].head(10))
salaries['CITY SALARY'] = [float(x.strip("$")) for x in salaries['CITY SALARY']]
print(salaries['CITY SALARY'].head(10))
print(salaries['ADDITIONAL GRANT SUPPORT'].head(10))
salaries['ADDITIONAL GRANT SUPPORT'] = [float(x.strip("$")) for x in salaries['ADDITIONAL GRANT SUPPORT']]
print(salaries['ADDITIONAL GRANT SUPPORT'].head(10))
n = 20
filtered_salaries = salaries.filter(['DEPARTMENT/AREA','CITY SALARY'])
grouped_by_dept = filtered_salaries.groupby(['DEPARTMENT/AREA'],as_index=True).mean().sort_values(['CITY SALARY'], ascending=False);
plt.ticklabel_format(style='plain', axis='x')
data=grouped_by_dept.nlargest(n,'CITY SALARY')
sns.barplot(y=data.index, x=data['CITY SALARY'])
grouped_by_dept['CITY SALARY'] = grouped_by_dept.apply(lambda x: "{:,}".format(round(x['CITY SALARY'])), axis=1)
grouped_by_dept.head(n)
Initial hypothesis was Mayor's Office will have the highest mean salary - this is definitely false! Highest mean salaries are in Police, Office of the CFO, Health Dept.
Additional question: Which departments have the highest total additional grant support?
n = 4
filtered_salaries = salaries.filter(['DEPARTMENT/AREA','ADDITIONAL GRANT SUPPORT'])
grouped_by_dept = filtered_salaries.groupby(['DEPARTMENT/AREA'],as_index=True).sum().sort_values(['ADDITIONAL GRANT SUPPORT'], ascending=False);
plt.ticklabel_format(style='plain', axis='x')
data=grouped_by_dept.nlargest(n,'ADDITIONAL GRANT SUPPORT')
sns.barplot(y=data.index, x=data['ADDITIONAL GRANT SUPPORT'])
grouped_by_dept['ADDITIONAL GRANT SUPPORT'] = grouped_by_dept.apply(lambda x: "{:,}".format(round(x['ADDITIONAL GRANT SUPPORT'])), axis=1)
grouped_by_dept.head(n)
Detroit Economic Growth has the overwhelming majority of the grant support. (The Detroit Economic Growth Corporation (DEGC) is responsible for several aspects of economic development, including commercial/industrial development and small business promotion, including Motor City Match.) This is to be expected because it is an entire corporation/association and not an individual department. Does Detroit Economic Growth receive a city salary?
salaries[salaries['DEPARTMENT/AREA'] == 'Detroit Economic Growth Corporation/Association']['CITY SALARY']
It appears that Detroit Economic Growth does not receive a city salary. Then it is expected that it receives grant support instead.
Additional question: Which titles have the highest mean city salaries?
n = 10
filtered_salaries = salaries.filter(['TITLE','CITY SALARY'])
grouped_by_dept = filtered_salaries.groupby(['TITLE'],as_index=True).mean().sort_values(['CITY SALARY'], ascending=False);
plt.ticklabel_format(style='plain', axis='x')
data=grouped_by_dept.nlargest(n,'CITY SALARY')
sns.barplot(y=data.index, x=data['CITY SALARY'])
grouped_by_dept['CITY SALARY'] = grouped_by_dept.apply(lambda x: "{:,}".format(round(x['CITY SALARY'])), axis=1)
grouped_by_dept.head(n)
Chief of Police and CFO have the highest salaries.
Initial hypothesis was Mayor's Office will have the highest mean salary - this is definitely false! Highest mean salaries are in Police, Office of the CFO, Health Dept.
Detroit Economic Growth has the overwhelming majority of the grant support. It appears that Detroit Economic Growth does not receive a city salary.
Chief of Police and CFO have the highest salaries based on title.
Unfortunately, there are no variables here about which we can investigate possible correlations.