Pandas statistics example (measuring graph similarity)

Kirok Kim 2022. 2. 11. 01:56
# ๋™๋ณ„ 0์„ธ~100์„ธ์ด์ƒ ์นผ๋Ÿผ๋งŒ ์ถ”์ถœ
# ๊ธฐ์ค€๋™ ์ œ์™ธ ์กฐ๊ฑด
์ „์ฒด์—ฐ๋ น = j.loc[ ~j["์‹œ๊ตฐ๊ตฌ"].str.endswith("๋…์‚ฐ์ œ1๋™")  , "2021๋…„12์›”_๊ณ„_0์„ธ":"2021๋…„12์›”_๊ณ„_100์„ธ ์ด์ƒ"]
#์ „์ฒด์—ฐ๋ น์˜ index๋Š” ์ˆœ์ฐจ๋ฒˆํ˜ธ

์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ = 999999999999999
์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉidx = -1

# ์ „์ฒด์—ฐ๋ น์˜ ๋ชจ๋“  ํ–‰์„ ๋ฐ˜๋ณต
for idx in ์ „์ฒด์—ฐ๋ น.index:
  #idx๋ฒˆ์งธ ํ–‰ ์ถ”์ถœ
  ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ = ((๋…์‚ฐ์ œ1๋™์—ฐ๋ น - ์ „์ฒด์—ฐ๋ น.loc[idx, :] )**2).loc[296].sum() 
  if ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ < ์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ:
    ์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ = ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ
    ์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉidx = idx

print("์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ",์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉ)
print("์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉidx",์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉidx)
print("์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์˜ ๋™์ด๋ฆ„",j.loc[์ตœ์†Œ์—ฐ๋ น๋ณ„์ฐจ์ด์ œ๊ณฑ์˜ํ•ฉidx, "์‹œ๊ตฐ๊ตฌ"])
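The row-by-row loop above can also be written as a single vectorized expression. The following is a minimal sketch on made-up data, not the post's dataset: the frame `ages`, its column names, and the row labels `ref`, `near`, and `far` are all hypothetical.

```python
import pandas as pd

# Hypothetical toy data: rows are districts, columns are age-bucket counts.
ages = pd.DataFrame(
    {"age_0": [10, 12, 30], "age_1": [20, 19, 5], "age_2": [30, 33, 1]},
    index=["ref", "near", "far"],
)

reference = ages.loc["ref"]        # the row to compare against
others = ages.drop(index="ref")    # every other row

# Subtract the reference row from each row (aligned on column labels),
# square the differences, and sum across the columns of each row.
ssd = others.sub(reference, axis="columns").pow(2).sum(axis="columns")

# Label of the row with the smallest sum of squared differences.
closest = ssd.idxmin()
print(ssd)
print("closest:", closest)
```

Because `ssd` keeps one value per row, the same Series can be sorted or ranked later without rerunning the comparison.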
# Find the store whose weekly sales pattern is most similar to that of 삼일대로점

# Including cancelled-order sales
def store(i):
  # Weekly sum of total_amount for store i
  l = order_all[order_all['store_name'] == i].pivot_table(
    index='week',
    values='total_amount',
    aggfunc='sum'
  )
  return l
# Excluding cancelled-order sales
def store1(i):
  # Both conditions are combined into one mask so the filter matches order_all's index
  l = order_all[~order_all['status_name'].isin(['주문 취소']) & (order_all['store_name'] == i)].pivot_table(
    index='week',
    values='total_amount',
    aggfunc='sum'
  )
  return l
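# v is not defined in this snippet; it is presumably the weekly pivot for 삼일대로점, e.g. store('삼일대로점')
삼일대로점 = v['total_amount']
mindiffsum = float("inf")
diffsumsn = None
# Loop over every store name
for i in order_all['store_name'].unique():
  if i == '삼일대로점':
    continue
  # Sum of squared weekly differences between 삼일대로점 and store i.
  # If the two stores do not cover the same weeks the result is NaN,
  # which never compares as smaller, so that store is skipped.
  diffsum = ((삼일대로점 - store(i)['total_amount'])**2).values.sum()
  if diffsum < mindiffsum:
    mindiffsum = diffsum
    diffsumsn = i

print("mindiffsum", mindiffsum)
print("Most similar store", diffsumsn)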

# Loop over every store name again, this time
# excluding cancelled-order sales
# v1 is not defined in this snippet; it is presumably store1('삼일대로점')
삼일대로점1 = v1['total_amount']
mindiffsum1 = float("inf")
diffsumsn1 = None
for i in order_all['store_name'].unique():
  if i == '삼일대로점':
    continue
  # Sum of squared weekly differences, cancelled orders excluded
  diffsum1 = ((삼일대로점1 - store1(i)['total_amount'])**2).values.sum()
  if diffsum1 < mindiffsum1:
    mindiffsum1 = diffsum1
    diffsumsn1 = i

print("mindiffsum excluding cancelled orders", mindiffsum1)
print("Most similar store excluding cancelled orders", diffsumsn1)
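As a sketch of the same comparison without a Python loop, all stores can be pivoted into one week-by-store table and compared in one step. The data below is invented, and `fill_value=0` is an assumption of this sketch: it treats a week with no orders as zero sales, whereas the loop above ends up skipping stores whose weeks do not line up.

```python
import pandas as pd

# Hypothetical order data; the column names mirror the ones used above.
orders = pd.DataFrame({
    "store_name": ["A", "A", "B", "B", "C"],
    "week": [1, 2, 1, 2, 1],
    "total_amount": [100, 200, 110, 190, 500],
})

# One row per week, one column per store, summed sales in each cell.
# Assumption: a missing week counts as zero sales (fill_value=0).
weekly = orders.pivot_table(
    index="week", columns="store_name",
    values="total_amount", aggfunc="sum", fill_value=0,
)

target = weekly["A"]               # the store to compare against
others = weekly.drop(columns="A")  # every other store

# Sum of squared weekly differences between each store and the target.
ssd = others.sub(target, axis="index").pow(2).sum()
most_similar = ssd.idxmin()
print(ssd)
print("most similar:", most_similar)
```

Whether zero is the right fill for a missing week depends on the data; dropping incomplete stores is the other obvious choice.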

The most different shapes

# Find the 10 dongs most similar to 독산제1동
# Collect each row's sum of squared age differences in a list,
# then store the list as a new column

전체연령 = j.loc[:, "2021년12월_계_0세":"2021년12월_계_100세 이상"]
# The index of 전체연령 is a sequential row number

연령별차이제곱의합_리스트 = []

# Iterate over every row of 전체연령
for idx in 전체연령.index:
  # Sum of squared differences between row idx and the reference row (296) of 독산제1동연령
  연령별차이제곱의합 = ((독산제1동연령 - 전체연령.loc[idx, :])**2).loc[296].sum()
  연령별차이제곱의합_리스트.append(연령별차이제곱의합)

j["연령별차이제곱의합"] = 연령별차이제곱의합_리스트

# Sorted ascending: head() plots the 5 most similar dongs (head(10) would give ten),
# tail() plots the 5 least similar
j.set_index('시군구').sort_values(by="연령별차이제곱의합").head().loc[:, "2021년12월_계_0세":"2021년12월_계_100세 이상"].T.plot(figsize=(20,4))
j.set_index('시군구').sort_values(by="연령별차이제곱의합").tail().loc[:, "2021년12월_계_0세":"2021년12월_계_100세 이상"].T.plot(figsize=(20,4))
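Sorting and then taking `head()` or `tail()` is one way to rank rows by distance. A possible alternative, sketched here on invented scores, is `Series.nsmallest` and `Series.nlargest`; the labels `d1` to `d4` and their values are hypothetical.

```python
import pandas as pd

# Hypothetical sum-of-squared-difference scores, one per district.
scores = pd.Series({"d1": 0.0, "d2": 14.0, "d3": 1466.0, "d4": 90.0})

# The two smallest distances, in ascending order (most similar first).
most_similar = scores.nsmallest(2)

# The two largest distances, in descending order (least similar first).
least_similar = scores.nlargest(2)

print(most_similar)
print(least_similar)
```

A score of 0 here corresponds to comparing a row with itself, which is what happens above when the reference dong is left in the table.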

# Extract the store with the most different sales pattern and draw it as a line graph
order_all['diffsum'] = 0.0
for i in order_all['store_name'].unique():
  # Sum of squared weekly differences between 삼일대로점 and store i
  diffsum = ((삼일대로점 - store(i)['total_amount'])**2).values.sum()
  order_all.loc[order_all['store_name'] == i, 'diffsum'] = diffsum
# Take the store with the largest diffsum and plot it next to 삼일대로점
for i in order_all.sort_values(by="diffsum", ascending=False).head(1)['store_name']:
  a = v.merge(store(i), on='week', how='outer', suffixes=(' 삼일대로점', i))
  a.plot()