🔍 Data Analysis/04. Data Analysis

๋ชจํ‰๊ท ์— ๋Œ€ํ•œ ์œ ์˜์„ฑ ๊ฒ€์ •์œผ๋กœ t-test ๊ฒ€์ •์„ ์‹ค์‹œํ•œ๋‹ค. ๋‹ค์Œ๊ณผ ๊ฐ™์ด ํฌ๊ฒŒ ์„ธ๊ฐ€์ง€์˜ ๋ฐฉ๋ฒ•์ด ์žˆ๋‹ค. 1. ๋‹จ์ผํ‘œ๋ณธ t-๊ฒ€์ •(One-sample t-test) 2. ๋…๋ฆฝํ‘œ๋ณธ t-๊ฒ€์ •(Independent-tw-sample t-test) 3. ๋Œ€์‘ํ‘œ๋ณธ t-๊ฒ€์ •(Paired-two-sample t-test) ๋‹จ์ผํ‘œ๋ณธ t-๊ฒ€์ • : ๊ด€์‹ฌ์žˆ๋Š” ์—ฐ์†ํ˜• ๋ณ€์ˆ˜์˜ ํ‰๊ท ๊ฐ’์„ ํŠน์ • ๊ธฐ์ค€๊ฐ’๊ณผ ๋น„๊ตํ•˜์—ฌ ๊ทธ ์ฐจ์ด๊ฐ€ ํ†ต๊ณ„์ ์œผ๋กœ ์œ ์˜ํ•œ๊ฐ€๋ฅผ ํŒ๋‹จํ•˜๋Š” ๋ฐฉ๋ฒ•. p-value๊ฐ€ ์œ ์˜์ˆ˜์ค€(์ผ๋ฐ˜์ ์œผ๋กœ 0.05)๋ณด๋‹ค ์ž‘์œผ๋ฉด, ๊ท€๋ฌด๊ฐ€์„ค ๊ธฐ๊ฐ 1. ํŒจํ‚ค์ง€ ์ž„ํฌํŠธ from sklearn.datasets import load_iris import pandas as pd import numpy as np from scipy.stats import tte..
2022.03.14 - [Data Analysis/04. Data Analysis] - [Python] Exploring variables through EDA (exploratory data analysis) xod22.tistory.com
Continuing from the previous post, I'd like to study the Visualization side of EDA!
Examining the density of the data values
cols=iris_dataframe.columns[:4]
densityplot=iris_datafra..
I'd like to do some basic data analysis with the iris data built into sklearn!
1. Importing packages and loading the data
from sklearn.datasets import load_iris
import pandas as pd
import numpy as np
iris=load_iris()
iris_dataframe=pd.DataFrame(iris.data, columns=iris.feature_names)
# add the y column
iris_dataframe['group']=pd.Series([iris.target_names[k] for k in iris.target], dtype="category")
iris_dataframe
2. Checking the statistics
~Mean~
numeric_only : int/float columns..
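The `numeric_only` option mentioned in the excerpt above can be illustrated on a small frame. This is a sketch assuming pandas is installed. The four rows are made-up stand-ins for `iris_dataframe` (not the real dataset), and the variable names `df` and `column_means` are hypothetical.

```python
import pandas as pd

# Made-up rows that mimic the layout of iris_dataframe: numeric features
# plus a categorical 'group' column.
df = pd.DataFrame({
    "sepal length (cm)": [5.1, 4.9, 7.0, 6.4],
    "petal length (cm)": [1.4, 1.4, 4.7, 4.5],
    "group": pd.Series(
        ["setosa", "setosa", "versicolor", "versicolor"], dtype="category"
    ),
})

# numeric_only=True restricts the mean to numeric columns,
# so the categorical 'group' column is skipped.
column_means = df.mean(numeric_only=True)
print(column_means)
```

Without `numeric_only=True`, recent pandas versions are expected to raise an error on the categorical column instead of silently dropping it, which is why the flag is worth passing explicitly.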
xod22
Posts in the '🔍 Data Analysis/04. Data Analysis' category (Page 3)