c o d i n g . . ๐Ÿ‰/Python

[pandas + openpyxl] excel ํŒŒ์ผ ๋‹ค๋ฃจ๊ธฐ

H J 2022. 9. 25. 15:13

๋žฉ์‹ค์—์„œ (ํ˜ผ์ž ใ… ) ํ•˜๋Š” ํŒŒ์ด์ฌ ์Šคํ„ฐ๋”” ๋งˆ์ง€๋ง‰ ์ฃผ ๊ณผ์ œ ์ค‘์— ์ฒ˜์Œ์œผ๋กœ ์•Œ๊ณ ๋ฆฌ์ฆ˜์ด ์•„๋‹Œ(?) excel ํŒŒ์ผ ์ฝ๊ธฐ๋ฅผ ํ•ด์„œ ๋”ฐ๋กœ ์ •๋ฆฌํ•˜๋ ค๊ณ  ํ•œ๋‹ค!

 

๋ฐ”๋กœ ์„œ์šธ์‹œ ๋ฒ„์Šค๋…ธ์„  ๋ฐ์ดํ„ฐ ์—‘์…€ ํŒŒ์ผ์„ ์‚ฌ์šฉํ•ด์„œ ์ •๋ฅ˜์žฅ ์ด๋ฆ„์„ ๊ฒ€์ƒ‰ํ•˜๋ฉด ํ•ด๋‹น ์ •๋ฅ˜์žฅ์— ์ •์ฐจํ•˜๋Š” ๋ฒ„์Šค ๋ฒˆํ˜ธ๋ฅผ ์ถœ๋ ฅํ•˜๊ณ , ๋ฒ„์Šค ๋ฒˆํ˜ธ๋ฅผ ๊ฒ€์ƒ‰ํ•˜๋ฉด ์ •์ฐจํ•˜๋Š” ์ •๋ฅ˜์žฅ ์ด๋ฆ„๋“ค์„ ์ถœ๋ ฅํ•˜๋Š” ๋ฌธ์ œ์ด๋‹ค

 


๊ฐ€์žฅ ๋จผ์ € pandas ๋ชจ๋“ˆ์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด ์„ค์น˜ํ•ด์ฃผ๊ธฐ

pip install pandas

๋‚˜๋Š” ์ด๋•Œ ์˜ค๋ฅ˜๊ฐ€ ๋ฐœ์ƒํ–ˆ๋‹ค. ํ•ด๊ฒฐ๋ฐฉ๋ฒ•์€ ์•„๋ž˜์— 

๋”๋ณด๊ธฐ
pip install --upgrade pip
pip install --upgrade pandas

์ด๋ ‡๊ฒŒ ์—…๊ทธ๋ ˆ์ด๋“œ๋ฅผ ๋‹ค ํ•ด์ฃผ์—ˆ๋Š”๋ฐ๋„ ์˜ค๋ฅ˜๊ฐ€ ๋– ์„œ ๋” ์„œ์น˜ ํ•ด๋ณธ ํ›„

python3 -m pip install pandas

์ด๋ ‡๊ฒŒ ํ•ด์ฃผ์—ˆ๋”๋‹ˆ ์˜ค๋ฅ˜ ํ•ด๊ฒฐ ๐Ÿ˜–

 

 

๋‹ค์Œ์œผ๋กœ pandas์˜ read_exel์„ ์‚ฌ์šฉํ•˜๊ธฐ ์œ„ํ•ด ๋ชจ๋“ˆ import ํ•˜๊ธฐ

import pandas as pd

 

 

ํŒŒ์ผ ๋ถˆ๋Ÿฌ์˜ค๊ธฐ

pd.read_excel('๊ฒฝ๋กœ/ํŒŒ์ผ๋ช….xlsx')

๊ฐ™์€ ํŒŒ์ผ ๋‚ด์— ์žˆ๋‹ค๋ฉด ๊ฒฝ๋กœ๋Š” ํ•„์š” ์—†์Œ !

 

 

๊ทผ๋ฐ ์š”์ฆ˜์€ openpyxl์„ ์‚ฌ์šฉํ•ด์„œ ์—‘์…€ ํŒŒ์ผ์„ ์‚ฌ์šฉํ•˜๊ธฐ ๋•Œ๋ฌธ์— ์•„๋ž˜์™€ ๊ฐ™์ด engine ์˜ต์…˜์„ ์ถ”๊ฐ€ํ•ด์ฃผ์–ด์•ผ ํ•œ๋‹ค(๊ณ  ํ•œ๋‹ค)

data = pd.read_excel(file, engine="openpyxl")

( ๋‚˜๋Š” ๊ณผ์ œ ์ฝ”๋“œ๋ฅผ ์งœ๋Š” ์ค‘์ด๋ผ data๋Š” ๋‚ด๊ฐ€ ๊ทธ๋ƒฅ ๋ถˆ๋Ÿฌ์˜จ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๋ณ€์ˆ˜์ด๊ณ  file์€ excel ํŒŒ์ผ์˜ ๊ฒฝ๋กœ์ด๋‹ค ! )

 

 

์ด๋•Œ openpyxl์ด ์„ค์น˜๋˜์ง€ ์•Š์•˜๋‹ค๋ฉด ์„ค์น˜ํ•ด์ฃผ์–ด์•ผ ํ•œ๋‹ค

pip install openpyxl

๋‚˜๋Š” pip๋กœ ์„ค์น˜๋ฅผ ํ•˜๋ ค๊ณ  ํ–ˆ๋Š”๋ฐ ์˜ค๋ฅ˜๊ฐ€ ๋–ด๋‹ค

 

๋‚˜๋Š” ์–ด๋–ค ๊ฒฝ์šฐ์˜€๋ƒ๋ฉด ..

๋”๋ณด๊ธฐ
Defaulting to user installation because normal site-packages is not writeable

์ด๋ผ๋Š” ์˜ค๋ฅ˜ ๊ตฌ๋ฌธ์ด ๊ณ„์† ๋–ด๋‹ค.. ์ฒ˜์Œ์—” ์ œ๋Œ€๋กœ ์ฝ์–ด๋ณด์ง€ ์•Š๊ณ  ๊ทธ๋ƒฅ ์„ค์น˜๊ฐ€ ๋๋Š”๋ฐ ๊ณ„์† ์ œ๋Œ€๋กœ ์„ค์น˜๊ฐ€ ์•ˆ ๋˜๋Š” ๊ฑฐ์ธ ์ค„ ์•Œ์•˜๋Š”๋ฐ ์•Œ๊ณ  ๋ณด๋‹ˆ ์—ฌ๋Ÿฌ ๋ฒ„์ „์˜ ํŒŒ์ด์ฌ์ด ๊น”๋ ค์žˆ๋Š” ๊ฒฝ์šฐ ์ด๋Ÿฐ ๋ฌธ์ œ๊ฐ€ ์ƒ๊ธด๋‹ค๊ณ  ํ•œ๋‹ค

ํŒŒ์ด์ฌ์˜ ๋ฒ„์ „์„ ๋ช…์‹œํ•ด์„œ ์„ค์น˜ํ•ด์ฃผ๋ฉด ์„ค์น˜ ์™„๋ฃŒ! โ˜†

python3 -m pip install openpyxl

 

๊ทธ๋ฆฌ๊ณ  ๋‚ด ๊ณผ์ œ์—์„œ๋Š” ๋ชจ๋“  ์—ด์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์— usecols๋ฅผ ์‚ฌ์šฉํ•ด์„œ ํ•„์š”ํ•œ ๋ฐ์ดํ„ฐ๋“ค๋งŒ ๊ฐ€์ ธ์™€์คฌ๋‹ค

data = pd.read_excel(file, engine="openpyxl", usecols=["๋…ธ์„ ๋ช…", "์ •๋ฅ˜์†Œ๋ช…"])

 

 

๊ณผ์ œ๊ฐ€ ๋ฐ์ดํ„ฐ๋งŒ ๋ถˆ๋Ÿฌ์˜ค๋ฉด ๊ฐ„๋‹จํ•˜๊ฒŒ ๊ฒ€์ƒ‰ํ•ด์„œ ์ถœ๋ ฅํ•˜๋Š” ๊ฑฐ๋ผ ๋ฐ”๋กœ ๋ !

for๋ฌธ์„ ์‚ฌ์šฉํ•ด์„œ data๋งŒํผ ๋Œ๋ ค์ฃผ๊ณ  in์„ ํ†ตํ•ด์„œ ํ•ด๋‹น ๋ฌธ์ž์—ด์ด ํฌํ•จ๋˜๋ฉด ์ถœ๋ ฅ๋˜๊ฒŒ ํ•ด์ฃผ์—ˆ๋‹ค

 


์ด๋ฒˆ ์ฃผ ๊ณผ์ œ 1๋ฒˆ ๋ฌธ์ œ ๋ !!

 

๋”๋ณด๊ธฐ