페이지

5/17/2016

Python code: Chi-square & Post hoc Test

“”“
S1Q10A: Total personal income in last 12M (Quantitative)
SINTOCAT: Total personal imcome in last 12M (Categorical, Low: 0~30%, Medium: 31~70%, High: 71%~max)
DMAJORDEPSNI12: Major depression in last 12M(Illness-induced ruled out) (0:N, 1:Y)
@author: Daehwan, Kim
”“”
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import scipy.stats
data = pd.read_csv(“nesarc_pds.csv”, low_memory=False)
LvIncome = [0, data.S1Q10A.quantile(0.3), data.S1Q10A.quantile(0.7), data.S1Q10A.max()]
splitIntoCat = pd.cut(data.S1Q10A, LvIncome, labels=[‘low’, 'medium’, 'high’])
sub = data.copy()
sub['SINTOCAT’] = splitIntoCat
ct = pd.crosstab(sub.DMAJORDEPSNI12, sub.SINTOCAT, colnames=['Income Level’], rownames=['Depression’])
print(ct)
pct=ct/(ct.sum(axis=0))
print(pct)
sns.factorplot(x=“SINTOCAT”, y=“DMAJORDEPSNI12”, data=sub, ci=None, kind=“bar”)
plt.xlabel('Personal income in last 12M’)
plt.ylabel('Pct of people have depression’)
chi2, p_val, df, exct =scipy.stats.chi2_contingency(ct)
print(’*** chi-squre value, p-value, dof’)
print('chi2: %0.2f\np-value: %s\ndof: %d\n\n’ % (chi2, p_val, df) )
comp_lvm = sub[['SINTOCAT’, 'DMAJORDEPSNI12’]].query('SINTOCAT == “low” or SINTOCAT == “medium”’)
ct_lvm = pd.crosstab(comp_lvm.DMAJORDEPSNI12, comp_lvm.SINTOCAT).drop('high’, 1)
print(ct_lvm)
pct_lvm=ct_lvm/(ct_lvm.sum(axis=0))
print(pct_lvm)
chi2, p_val, df, exct =scipy.stats.chi2_contingency(ct_lvm)
print(’*** chi-squre value, p-value, dof’)
print('chi2: %0.2f\np-value: %s\ndof: %d\n’ % (chi2, p_val, df) )
comp_lvm = sub[['SINTOCAT’, 'DMAJORDEPSNI12’]].query('SINTOCAT == “low” or SINTOCAT == “high”’)
ct_lvm = pd.crosstab(comp_lvm.DMAJORDEPSNI12, comp_lvm.SINTOCAT).drop('medium’, 1)
print(ct_lvm)
pct_lvm=ct_lvm/(ct_lvm.sum(axis=0))
print(pct_lvm)
chi2, p_val, df, exct =scipy.stats.chi2_contingency(ct_lvm)
print(’*** chi-squre value, p-value, dof’)
print('chi2: %0.2f\np-value: %s\ndof: %d\n’ % (chi2, p_val, df) )
comp_lvm = sub[['SINTOCAT’, 'DMAJORDEPSNI12’]].query('SINTOCAT == “medium” or SINTOCAT == “high”’)
ct_lvm = pd.crosstab(comp_lvm.DMAJORDEPSNI12, comp_lvm.SINTOCAT).drop('low’, 1)
print(ct_lvm)
pct_lvm=ct_lvm/(ct_lvm.sum(axis=0))
print(pct_lvm)
chi2, p_val, df, exct =scipy.stats.chi2_contingency(ct_lvm)
print(’*** chi-squre value, p-value, dof’)
print('chi2: %0.2f\np-value: %s\ndof: %d\n’ % (chi2, p_val, df) )

댓글 없음:

댓글 쓰기

블로그 보관함

Facebook

Advertising

Disqus Shortname

Popular Posts