House Price Prediction Analysis...2

Kirok Kim 2022. 2. 8. 21:09
Visualizing Numeric and Nominal Data
# Numeric features (select_dtypes matches every int/float width; comparing dtypes to int can miss int64 columns on some platforms)
numeric_feature = data.select_dtypes(include="number").columns
# Categorical features
categorical_feature = data.columns[data.dtypes=='O']
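
A minimal sketch of the numeric/categorical split on a toy DataFrame. The column names and values here are hypothetical stand-ins, not the competition data, and `select_dtypes` is used as one possible way to do the split rather than the only one.

```python
import pandas as pd

# Toy frame with hypothetical columns; the real `data` is loaded earlier in this series.
toy = pd.DataFrame({
    "area": [50, 72, 64],            # integer column
    "price": [1.2, 2.5, 1.9],        # float column
    "quality": ["Gd", "TA", "Ex"],   # string column
})

# Numeric columns: every integer and float dtype.
numeric_cols = toy.select_dtypes(include="number").columns.tolist()
# Everything that is not numeric is treated as categorical here.
categorical_cols = toy.select_dtypes(exclude="number").columns.tolist()

print(numeric_cols)
print(categorical_cols)
```

Treating "everything non-numeric" as categorical is a simplification that works for this toy frame; a dataset with datetime or boolean columns would need a more careful rule.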

import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use("ggplot")

feature = numeric_feature

# Use boxplots to examine the distribution of each numeric feature.
plt.figure(figsize=(20,15))
plt.suptitle("Boxplots", fontsize=40)

for i in range(len(feature)):
    # 4-row by 3-column grid, filled from position 1 onward
    plt.subplot(4,3,i+1) # There are 11 numeric features, so a 4*3=12-slot grid is needed.
    # Subplot title
    plt.title(feature[i])
    # Draw the boxplot
    plt.boxplot(data[feature[i]])
# Show the figure
plt.show()
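
The 4x3 grid above is hard-coded for 11 features. As a small sketch, the grid size could instead be derived from the number of features; `grid_shape` is a hypothetical helper introduced only for illustration.

```python
import math

def grid_shape(n_plots, n_cols=3):
    """Return (rows, cols) with enough cells to hold n_plots subplots."""
    n_rows = math.ceil(n_plots / n_cols)
    return n_rows, n_cols

# 11 numeric features, as in the boxplot loop above.
rows, cols = grid_shape(11)
print(rows, cols)
```

The result could then be passed to `plt.subplot(rows, cols, i+1)` so the layout still fits if the number of numeric columns changes.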

# Use bar plots to examine the distribution of each categorical feature.
feature = categorical_feature

plt.figure(figsize=(20,10))
plt.suptitle("Bar Plot", fontsize=40)

for i in range(len(feature)):
    # 1-row by 3-column grid, filled from position 1 onward
    plt.subplot(1,3,i+1)
    plt.title(feature[i], fontsize=20)
    temp = data[feature[i]].value_counts()
    plt.bar(temp.keys(), temp.values, width=0.5, color='b', alpha=0.5)
    plt.xticks(temp.keys(), fontsize=12)
plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()
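
To make the bar-plot inputs concrete, here is a minimal sketch of what `value_counts()` returns for one categorical column. The labels are hypothetical and stand in for a column of `data`.

```python
import pandas as pd

# Hypothetical category labels standing in for one categorical column.
col = pd.Series(["TA", "Gd", "TA", "Ex", "TA", "Gd"])

counts = col.value_counts()      # one row per category, sorted by count (descending)
labels = list(counts.index)      # category names, used as x positions for the bars
heights = list(counts.values)    # occurrence counts, used as bar heights

print(labels, heights)
```

`counts.keys()` in the loop above returns the same index as `counts.index`, so each bar is labeled with its category and its height is the number of rows in that category.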
