
Fine Dust Concentration by District in Seoul


๋ฐ์ดํ„ฐ ๋ถ„์„ ๊ณผ์ •

1.
๊ฐ€์„ค ์„ค์ •
2.
๋ฐ์ดํ„ฐ ์ˆ˜์ง‘
3.
๋ฐ์ดํ„ฐ ๊ฐ€๊ณต
4.
๋ฐ์ดํ„ฐ ๋ถ„์„
5.
๊ฐ€์„ค ๊ฒ€์ •
6.
๊ฒฐ๋ก  ๋„์ถœ
• Hypothesis testing: the process of judging whether a hypothesis is statistically significant
• t-test: an analysis technique that compares the difference between the means of two data sets
1. Set a hypothesis
"The mean fine dust concentrations of Jung-gu (중구) and Seongbuk-gu (성북구) in Seoul do not differ."
2. Collect data
• Seoul Metropolitan Government air quality information site (서울특별시 대기환경 정보)
• [Air Quality Statistics]
• Measurement period: May 2023, fine dust, all
• [Download Excel]
3. Process data
a. Preprocess in Excel
• Delete rows 1-4
• Unmerge the "구분" (category) cells
• Change row 1 from date format to text format
• Select the cell data → copy → paste below (with the transpose option)
• Change "1일" (day 1) to the date format 2023-05-01
• Autofill down through day 31
• Save the Excel file as dust.xlsx
b. Extract the required data
# Extract the required data
# Keep only the Seongbuk-gu and Jung-gu data
# Check for missing values
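The Excel preprocessing and extraction steps above can also be sketched in pandas. This is only an illustration under assumptions: the `raw` table is a made-up miniature of the downloaded sheet (one row per district, one column per day), and the values are not real measurements. The column names "구분" and "날짜" follow the names used elsewhere in these notes.

```python
import pandas as pd

# Hypothetical miniature of the downloaded sheet: one row per district,
# one column per day ("1일" = day 1). Values are made up for illustration.
raw = pd.DataFrame(
    {
        "구분": ["성북구", "중구", "종로구"],
        "1일": [31.0, 29.0, 30.0],
        "2일": [45.0, None, 44.0],
        "3일": [38.0, 36.0, 37.0],
    }
)

# Transpose so that each row is a day and each column is a district
# (the same effect as Excel's "paste with transpose").
tidy = raw.set_index("구분").T
tidy.columns.name = None

# Turn the "1일", "2일", ... labels into real dates in May 2023.
day_numbers = tidy.index.str.replace("일", "", regex=False).astype(int)
tidy = tidy.reset_index(drop=True)
dates = pd.to_datetime([f"2023-05-{int(d):02d}" for d in day_numbers])
tidy.insert(0, "날짜", dates)

# Keep only the two districts under study and count missing values.
selected = tidy[["날짜", "성북구", "중구"]]
missing = selected.isna().sum()
print(selected)
print(missing)
```

Doing the reshaping in code keeps the steps repeatable when a different month is downloaded, at the cost of having to know the exact layout of the raw sheet.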
4. Analyze data
5. Test the hypothesis
• Hypothesis testing: "the process of judging whether a hypothesis is reasonable"
• The process of using statistical inference to verify whether the hypothesis being claimed is consistent with the data
• Functions for hypothesis testing (scipy.stats)
  ◦ bartlett(data1, data2) or levene(data1, data2): compares the variances of two independent groups
    ◦ p-value <= 0.05: there is a significant difference ("the variances differ")
    ◦ p-value > 0.05: there is no significant difference ("the variances are equal") → "equal variance"
  ◦ ttest_ind(data1, data2, equal_var=whether the variances are equal): compares the means of two independent groups
    ◦ p-value <= 0.05: there is a significant difference ("the means differ")
    ◦ p-value > 0.05: there is no significant difference ("the means are equal")
• Hypothesis testing procedure
a. Set the null and alternative hypotheses
• Null hypothesis: "the claim has no effect, or there is no difference"
• Alternative hypothesis: "the claim has an effect, or there is a difference"
b. Set the significance level
• The p-value threshold at or below which the null hypothesis is rejected
• Usually 0.05
c. Choose the test method
• When comparing a difference in means: t-test, using the ttest_ind() function
d. Obtain the p-value
• p-value (probability value): the probability, assuming the null hypothesis is true, of obtaining a statistic at least as extreme as the one computed; it indicates how compatible the data are with the null hypothesis
• The p-value is returned by the function for the chosen analysis technique
e. Draw a conclusion
• Greater than 0.05: the larger the p-value, the more compatible the statistic is with the null hypothesis, so the null hypothesis is not rejected
• 0.05 or less: the difference is judged statistically significant → reject the null hypothesis and adopt the alternative hypothesis
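Steps b through e can be sketched as one small function. This is a minimal sketch, not the only way to organize the test: `compare_means` is a hypothetical helper name, only Levene's test is used for the variance check, and the two samples are random numbers rather than the Seoul data.

```python
import numpy as np
from scipy.stats import levene, ttest_ind

ALPHA = 0.05  # significance level (step b)


def compare_means(group_a, group_b, alpha=ALPHA):
    """Hypothetical helper: variance check, then independent two-sample t-test."""
    # Variance check: p > alpha means no evidence against equal variances.
    _, variance_p = levene(group_a, group_b)
    equal_var = bool(variance_p > alpha)

    # Steps c and d: equal_var=False makes ttest_ind run Welch's t-test.
    t_stat, p_value = ttest_ind(group_a, group_b, equal_var=equal_var)

    # Step e: reject the null hypothesis only when p <= alpha.
    return {
        "equal_var": equal_var,
        "t": float(t_stat),
        "p": float(p_value),
        "reject_null": bool(p_value <= alpha),
    }


# Made-up daily readings for two districts (31 days each).
rng = np.random.default_rng(0)
district_a = rng.normal(loc=40.0, scale=8.0, size=31)
district_b = rng.normal(loc=41.0, scale=8.0, size=31)

result = compare_means(district_a, district_b)
print(result)
```

Passing the outcome of the variance check into `equal_var` keeps the t-test consistent with what the data suggest about the variances instead of always assuming they are equal.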
6. Draw a conclusion
• Null hypothesis: "The mean fine dust concentrations of Jung-gu and Seongbuk-gu in Seoul do not differ."
• t-test (ttest_ind())
  ◦ p-value = 0.1115 > 0.05 → the null hypothesis is not rejected → "The mean fine dust concentrations of Jung-gu and Seongbuk-gu in Seoul do not differ."
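To see where a t-test p-value comes from, the equal-variance t statistic can be computed by hand and compared with `ttest_ind`. The two lists below are invented readings, not the May 2023 measurements, and `pooled_t_test` is a hypothetical helper written only for this illustration.

```python
import math
from statistics import mean, variance

from scipy import stats


def pooled_t_test(a, b):
    """Equal-variance two-sample t-test computed step by step."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2  # degrees of freedom
    # Pooled sample variance (statistics.variance uses the n - 1 denominator).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Difference in means divided by its standard error.
    t_stat = (mean(a) - mean(b)) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    # Two-sided p-value from the t distribution with df degrees of freedom.
    p_value = 2 * stats.t.sf(abs(t_stat), df)
    return t_stat, p_value


# Invented readings for two districts.
a = [38.0, 42.0, 35.0, 50.0, 44.0, 39.0]
b = [36.0, 40.0, 33.0, 41.0, 37.0, 35.0]

t_manual, p_manual = pooled_t_test(a, b)
t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=True)
print("manual:", t_manual, p_manual)
print("scipy :", t_scipy, p_scipy)
```

The manual and library values should agree up to floating-point rounding, which shows that the p-value is a tail probability of the t distribution for the observed statistic.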

Practice code

• Seongbuk-gu vs Jung-gu
# Fine dust concentration by district in Seoul
# Install the required libraries
# !pip install pandas openpyxl scipy matplotlib
import os
import pandas as pd

program_path = os.path.abspath(__file__)
path = os.path.dirname(program_path)
input_file = os.path.join(path, 'dust.xlsx')

# Read the Excel file
dust_data = pd.read_excel(input_file)

# Keep only the Seongbuk-gu (성북구) and Jung-gu (중구) data
dust_data_select = dust_data[["날짜", "성북구", "중구"]]
print(dust_data_select.head())  # print the first 5 rows of the DataFrame
print('--------------------------------')

# Check for missing values
print('Missing values : ')
print(dust_data_select.isna().sum())
print('--------------------------------')

# Descriptive statistics of the fine dust concentration per district
print('Seongbuk-gu fine dust descriptive statistics : ')
print(dust_data_select["성북구"].describe())
print('--------------------------------')
print('Jung-gu fine dust descriptive statistics : ')
print(dust_data_select["중구"].describe())
print('--------------------------------')

# Equal-variance test and mean-difference test per district
from scipy.stats import bartlett, levene, f_oneway

# Test for equal variances with Bartlett's test and Levene's test
print('Jung-gu vs Seongbuk-gu: equal-variance test')
bartlett_statistic, bartlett_p_value = bartlett(dust_data_select["중구"], dust_data_select["성북구"])
levene_statistic, levene_p_value = levene(dust_data_select["중구"], dust_data_select["성북구"])
print("Bartlett's test - Statistic:", bartlett_statistic, "p-value:", bartlett_p_value)
print("Levene's test - Statistic:", levene_statistic, "p-value:", levene_p_value)
print('--------------------------------')

# Treat the variances as equal only if neither test rejects equality
equal_variances = bartlett_p_value > 0.05 and levene_p_value > 0.05

# f_oneway() (One-Way ANOVA):
#  - performs a one-way ANOVA (Analysis of Variance)
#  - used to compare mean differences among three or more groups
#    (with two groups it gives the same p-value as the equal-variance t-test)
# If the variances are equal, run ANOVA to test the mean difference between groups
if equal_variances:
    print('Jung-gu vs Seongbuk-gu: mean-difference test')
    f_statistic, f_p_value = f_oneway(dust_data_select["중구"], dust_data_select["성북구"])
    print("F-statistic:", f_statistic, "p-value:", f_p_value)
else:
    print("Equal variances are not satisfied, so ANOVA cannot be performed.")
print('----------------------------------------------------------------')

# ttest_ind() (Independent Samples t-test):
#  - performs an independent-samples t-test
#  - tests whether the mean difference between two groups is statistically significant
#  * equal_var = [True: assume equal variances, False: assume unequal variances]
#
# Test the mean difference between the districts
from scipy.stats import ttest_ind

t_statistic, p_value = ttest_ind(dust_data_select["중구"], dust_data_select["성북구"], equal_var=equal_variances)
print("T-statistic:", t_statistic)
print("p-value:", p_value)
print('----------------------------------------------------------------')

print('*** Conclusion ***')
if p_value > 0.05:
    print('Null hypothesis not rejected - "The mean fine dust concentrations of Jung-gu and Seongbuk-gu in Seoul do not differ."')
else:
    print('Alternative hypothesis adopted - "The mean fine dust concentrations of Jung-gu and Seongbuk-gu in Seoul differ."')

# Draw a box plot of the fine dust concentration in Seongbuk-gu and Jung-gu
import matplotlib.pyplot as plt

plt.rcParams['font.family'] = 'Malgun Gothic'  # font with Korean glyphs for the column labels
plt.rcParams['axes.unicode_minus'] = False
plt.figure(figsize=(8, 6))
dust_data_select.boxplot(column=["성북구", "중구"])
plt.title("finedust")
plt.xlabel("AREA")
plt.ylabel("FINEDUST_PM")
plt.show()
• Test by entering the districts
# Fine dust concentration by district in Seoul
# Install the required libraries
# !pip install pandas openpyxl scipy matplotlib
import os
import pandas as pd

# Example results:
#   서초구 (Seocho-gu) vs 중랑구 (Jungnang-gu) : alternative hypothesis
#   강남구 (Gangnam-gu) vs 종로구 (Jongno-gu) : null hypothesis
program_path = os.path.abspath(__file__)
path = os.path.dirname(program_path)
input_file = os.path.join(path, 'dust.xlsx')

# Read the Excel file
dust_data = pd.read_excel(input_file)

# Enter each district exactly as its column is named in dust.xlsx (e.g. 중구)
a_group = input('District A : ')
b_group = input('District B : ')

# Keep only the data for districts A and B
dust_data_select = dust_data[["날짜", a_group, b_group]]
print(dust_data_select.head())  # print the first 5 rows of the DataFrame
print('--------------------------------')

# Check for missing values
print('Missing values : ')
print(dust_data_select.isna().sum())
print('--------------------------------')

# Descriptive statistics of the fine dust concentration per district
print('{} fine dust descriptive statistics : '.format(a_group))
print(dust_data_select[a_group].describe())
print('--------------------------------')
print('{} fine dust descriptive statistics : '.format(b_group))
print(dust_data_select[b_group].describe())
print('--------------------------------')

# Equal-variance test and mean-difference test per district
from scipy.stats import bartlett, levene, f_oneway

# Test for equal variances with Bartlett's test and Levene's test
print('{} vs {}: equal-variance test'.format(a_group, b_group))
bartlett_statistic, bartlett_p_value = bartlett(dust_data_select[b_group], dust_data_select[a_group])
levene_statistic, levene_p_value = levene(dust_data_select[b_group], dust_data_select[a_group])
print("Bartlett's test - Statistic:", bartlett_statistic, "p-value:", bartlett_p_value)
print("Levene's test - Statistic:", levene_statistic, "p-value:", levene_p_value)
print('--------------------------------')

# Treat the variances as equal only if neither test rejects equality
equal_variances = bartlett_p_value > 0.05 and levene_p_value > 0.05

# f_oneway() (One-Way ANOVA):
#  - performs a one-way ANOVA (Analysis of Variance)
#  - used to compare mean differences among three or more groups
#    (with two groups it gives the same p-value as the equal-variance t-test)
# If the variances are equal, run ANOVA to test the mean difference between groups
if equal_variances:
    print('{} vs {}: mean-difference test'.format(a_group, b_group))
    f_statistic, f_p_value = f_oneway(dust_data_select[b_group], dust_data_select[a_group])
    print("F-statistic:", f_statistic, "p-value:", f_p_value)
else:
    print("Equal variances are not satisfied, so ANOVA cannot be performed.")
print('----------------------------------------------------------------')

# ttest_ind() (Independent Samples t-test):
#  - performs an independent-samples t-test
#  - tests whether the mean difference between two groups is statistically significant
#  * equal_var = [True: assume equal variances, False: assume unequal variances]
#
# Test the mean difference between the districts
from scipy.stats import ttest_ind

t_statistic, p_value = ttest_ind(dust_data_select[b_group], dust_data_select[a_group], equal_var=equal_variances)
print("T-statistic:", t_statistic)
print("p-value:", p_value)
print('----------------------------------------------------------------')

print('*** Conclusion ***')
if p_value > 0.05:
    print('Null hypothesis not rejected - "The mean fine dust concentrations of {} and {} in Seoul do not differ."'.format(a_group, b_group))
else:
    print('Alternative hypothesis adopted - "The mean fine dust concentrations of {} and {} in Seoul differ."'.format(a_group, b_group))

# Draw a box plot of the fine dust concentration in districts A and B
import matplotlib.pyplot as plt

plt.rcParams['font.family'] = 'Malgun Gothic'  # font with Korean glyphs for the column labels
plt.rcParams['axes.unicode_minus'] = False
plt.figure(figsize=(8, 6))
dust_data_select.boxplot(column=[a_group, b_group])
plt.title("finedust")
plt.xlabel("AREA")
plt.ylabel("FINEDUST_PM")
plt.savefig(os.path.join(path, 'Seoul fine dust - {}vs{}.png'.format(a_group, b_group)))
plt.show()
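Because the script above indexes the DataFrame with whatever the user types, a misspelled district name ends in a KeyError. One possible guard is sketched below; `resolve_district` is a hypothetical helper, and the column list is a made-up stand-in for the columns of dust.xlsx.

```python
def resolve_district(name, available):
    """Hypothetical helper: return the trimmed name if it is a known column."""
    cleaned = name.strip()
    if cleaned not in available:
        choices = ", ".join(available)
        raise ValueError(f"Unknown district {cleaned!r}; choose one of: {choices}")
    return cleaned


# Stand-in for list(dust_data.columns); the real file has one column per district.
columns = ["날짜", "성북구", "중구", "강남구"]
districts = [c for c in columns if c != "날짜"]

print(resolve_district(" 중구 ", districts))  # prints 중구
```

In the script, each `input()` result could be passed through such a check before the columns are selected, so that a typo produces a readable message instead of a traceback.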