
์ƒ์„ฑํ˜• AI์™€ Python์„ ํ™œ์šฉํ•œ ์›น ์Šคํฌ๋ž˜ํ•‘ - ์ปค๋จธ์Šค ๋ฐ์ดํ„ฐ ์ˆ˜์ง‘

xod22 2024. 8. 24. 19:02

Today I'd like to introduce a way to crawl data with the help of generative AI!

 


1. Generative AI

There are many kinds of generative AI, starting with the well-known ChatGPT.

Of these, the speaker stressed that Perplexity is the one best suited to web scraping.

 

 

2. Diagramming the web scraping process with ChatGPT

Give ChatGPT a prompt like the following and it returns Mermaid code. (Mermaid is a diagramming tool.)

List the web scraping process in 10 steps, and add feedback for the steps that fail.
Write the process above as mermaid code.

Enter the code ChatGPT returns on the left and it renders the web scraping process as a diagram.

This is handy when making presentation slides or process diagrams like this one!
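Mermaid flowchart code is plain text, so its shape is easy to see by generating a small one from Python. This is only a sketch and not ChatGPT's actual output: the step names are hypothetical placeholders (five instead of ten, for brevity), and `to_mermaid` is a helper made up for illustration.

```python
def to_mermaid(steps, retry_from, retry_to):
    """Build Mermaid flowchart text: a top-down chain of steps plus one feedback edge."""
    lines = ["flowchart TD"]
    # One node per step, with ids S1, S2, ...
    for i, label in enumerate(steps, start=1):
        lines.append(f'    S{i}["{label}"]')
    # Connect consecutive steps
    for i in range(1, len(steps)):
        lines.append(f"    S{i} --> S{i + 1}")
    # Feedback edge from a failing step back to an earlier one (1-based numbers)
    lines.append(f"    S{retry_from} -->|fail| S{retry_to}")
    return "\n".join(lines)


# Hypothetical step names, only to show the shape of the output
steps = [
    "Pick target page",
    "Inspect network requests",
    "Send request",
    "Parse response",
    "Save data",
]
mermaid_code = to_mermaid(steps, retry_from=3, retry_to=2)
print(mermaid_code)
```

The printed text can be pasted into a Mermaid editor the same way as the code ChatGPT returns.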

 

3. Crawling e-commerce data

Because of package download issues, I wrote the code in Colab.

The site to crawl is the women's clothing category page of 29CM, a fashion platform I like.

Best - 29CM, a select shop for refined taste

From fashion and lifestyle to culture, discover 29CM's own refined selection.

shop.29cm.co.kr

 

3-1) Open the site and press Option+Command+I to open the developer tools.

Go to the Network tab at the top, clear all entries, and refresh the page.

Network -> Clear -> Refresh

3-2) Find the request (in the Name column) that contains the data you want to crawl.

(This is the most important step, and checking each request's Preview makes it easier to find.)

 

 

3-3) Copy the Headers, the Payload, and one item from the Response.
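The copied Request URL already carries the payload as a query string. As a minimal sketch, the URL that appears in the prompt in 3-4) can be split into an endpoint and its parameters with the standard library; the variable names here are illustrative only.

```python
from urllib.parse import urlsplit, parse_qsl

# Request URL as copied from the Headers tab
request_url = (
    "https://recommend-api.29cm.co.kr/api/v4/best/items"
    "?categoryList=268100100&periodSort=NOW&limit=100&offset=0"
)

parts = urlsplit(request_url)
# Endpoint without the query string
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"
# Query string as a dict; all values come back as strings
params = dict(parse_qsl(parts.query))

print(base_url)
print(params)
```

The resulting `params` should match what the Payload tab shows, which is a quick way to confirm the right request was copied.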

 

3-4) Write the prompt

Referring to the following information, write Python code that collects the data, save the data to a SQLite DB under the name "29cm", and also write code that reads it back with pandas to check it.

Combine this prompt text with the values copied in 3-3) to build the full prompt.

 

[Prompt]

Referring to the following information, write Python code that collects the data, save the data to a SQLite DB under the name "29cm", and also write code that reads it back with pandas to check it.

 

Request URL: https://recommend-api.29cm.co.kr/api/v4/best/items?categoryList=268100100&periodSort=NOW&limit=100&offset=0
Request Method: GET
Status Code: 200 OK
Remote Address: 15.164.247.68:443
Referrer Policy: strict-origin-when-cross-origin

categoryList: 268100100
periodSort: NOW
limit: 100
offset: 0

 

{ "result": "SUCCESS", "data": { "content": [ { "itemNo": 2163860, "itemName": "[29CM 단독] CLASSIC TWEED JACKET MELANGE GREY_UDJA4C203G2", "itemDescription": null, "imageUrl": "/item/202408/11ef5a216b184cfb8362158ed367c2eb.jpg", "frontBrandNo": 13310, "frontBrandNameKor": "던스트", "frontBrandNameEng": "DUNST", "consumerPrice": 289000, "lastSalePrice": 228888, "lastSalePercent": 21, "heartCount": 11372, "heartOn": false, "reviewCount": 228, "reviewAveragePoint": 5.0, "isCoupon": true, "isFreeShipping": false, "isNew": false, "isSoldOut": false, "subjectDescriptions": [ "[베스트오더 -9/1]" ], "saleInfoV2": { "consumerPrice": 289000, "sellPrice": 260100, "saleRate": 10, "couponSaleRate": 12, "isCoupon": true, "totalSellPrice": 228888, "totalSaleRate": 21 }, "frontCategoryInfo": [ { "category1Code": 268100100, "category1Name": "여성의류", "category2Code": 268102100, "category2Name": "아우터", "category3Code": 268102104, "category3Name": "재킷" }, { "category1Code": 268100100, "category1Name": "여성의류", "category2Code": 268116100, "category2Name": "EXCLUSIVE", "category3Code": 268116103, "category3Name": "아우터" } ] },

[Python code it generated]

import requests
import pandas as pd
import sqlite3

def fetch_data():
    url = "https://recommend-api.29cm.co.kr/api/v4/best/items"
    params = {
        "categoryList": "268100100",
        "periodSort": "NOW",
        "limit": 100,
        "offset": 0
    }
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'
    }
    
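    response = requests.get(url, params=params, headers=headers, timeout=10)
    response.raise_for_status()  # fail fast on HTTP errors instead of parsing an error page
    data = response.json()

    return data['data']['content']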

# Collect the data
items = fetch_data()

# Build a DataFrame
df = pd.DataFrame(items)

# Keep only the columns we need
columns_to_keep = [
    'itemNo', 'itemName', 'frontBrandNameKor', 'frontBrandNameEng', 'consumerPrice', 
    'lastSalePrice', 'lastSalePercent', 'heartCount', 'reviewCount', 'reviewAveragePoint',
    'isCoupon', 'isFreeShipping', 'isNew', 'isSoldOut'
]
df = df[columns_to_keep]

# Save to a SQLite database
db_filename = '29cm.db'
conn = sqlite3.connect(db_filename)
df.to_sql('best_items', conn, if_exists='replace', index=False)
conn.close()

# Read the data back from the database
conn = sqlite3.connect(db_filename)
df_from_db = pd.read_sql_query("SELECT * FROM best_items", conn)
conn.close()

# Check the result
print(df_from_db.head())
print(f"\nCollected {len(df_from_db)} product records in total.")
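The request above fetches a single page (`limit=100`, `offset=0`). If more rows are needed, one possible approach is to step `offset` forward until a page comes back short. The sketch below assumes that `offset` counts items rather than pages and that the endpoint serves more than one page; neither is confirmed by the source. `fetch_all` and `fake_fetch_page` are hypothetical names, and the fake fetcher stands in for the real request so the example runs offline.

```python
def fetch_all(fetch_page, limit=100, max_pages=5):
    """Collect items page by page until a page comes back short or empty."""
    items = []
    for page in range(max_pages):
        # Assumption: offset is an item index, so page N starts at N * limit
        batch = fetch_page(limit=limit, offset=page * limit)
        items.extend(batch)
        if len(batch) < limit:
            break  # last page reached
    return items


def fake_fetch_page(limit, offset):
    """Stand-in for the real API call: pretends the server holds 250 items."""
    total = 250
    return [{"itemNo": n} for n in range(offset, min(offset + limit, total))]


all_items = fetch_all(fake_fetch_page)
print(len(all_items))
```

In real use, `fetch_page` would wrap the `requests.get` call with the given `limit` and `offset`; `max_pages` caps the number of requests so a loop cannot run away.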

 

It worked in Colab!
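The sample response in the prompt also has nested fields (`saleInfoV2`, `frontCategoryInfo`) that the generated code leaves out. SQLite columns hold scalar values, so such fields have to be flattened before they can be stored. A minimal sketch in plain Python follows; `flatten_item` is a hypothetical helper, and keeping only the first category entry is an arbitrary choice made for illustration.

```python
def flatten_item(item):
    """Copy scalar fields and pull a few nested values up to top-level keys."""
    # Drop dict/list values; they cannot be stored in a SQLite column as-is
    flat = {k: v for k, v in item.items() if not isinstance(v, (dict, list))}
    # Prefix sale info fields so they do not collide with top-level names
    for key, value in (item.get("saleInfoV2") or {}).items():
        flat[f"saleInfoV2_{key}"] = value
    # Keep only the first category entry; an item can list several
    categories = item.get("frontCategoryInfo") or []
    if categories:
        flat["category2Name"] = categories[0].get("category2Name")
        flat["category3Name"] = categories[0].get("category3Name")
    return flat


# Trimmed copy of the sample item from the prompt
sample = {
    "itemNo": 2163860,
    "frontBrandNameEng": "DUNST",
    "consumerPrice": 289000,
    "subjectDescriptions": ["[베스트오더 -9/1]"],
    "saleInfoV2": {"sellPrice": 260100, "saleRate": 10, "totalSellPrice": 228888},
    "frontCategoryInfo": [
        {"category2Name": "아우터", "category3Name": "재킷"},
        {"category2Name": "EXCLUSIVE", "category3Name": "아우터"},
    ],
}
row = flatten_item(sample)
print(row)
```

A list of such rows can be passed to `pd.DataFrame` in place of the raw items.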

 

๋‹ค์Œ ์ฝ”๋“œ๋ฅผ ์‹คํ–‰์‹œํ‚ค๋ฉด ๋ฐ์ดํ„ฐ๊ฐ€ ์ž˜ ์ˆ˜์ง‘๋œ ๊ฒƒ์„ ํ™•์ธํ•  ์ˆ˜ ์žˆ๊ณ , ์š” ๋ฐ์ดํ„ฐ๋ฅผ .db ํ™•์žฅ์ž๋กœ ์ €์žฅ๊นŒ์ง€ ํ–ˆ๋‹ค๋ฉด? 

ํ•ด๋‹น ์‚ฌ์ดํŠธ์—์„œ db๋ฅผ ์„ค์น˜ํ•˜์ง€ ์•Š์•„๋„ ์›น์œผ๋กœ ๋ฐ์ดํ„ฐ๋ฅผ ํ™•์ธํ•ด๋ณผ ์ˆ˜ ์žˆ๋‹ค.

 

SQLite Viewer Web App


sqliteviewer.app
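The same check can also be done locally with Python's built-in `sqlite3` module. The sketch below lists every table in a SQLite file with its row count; `summarize_db` is a hypothetical helper, and the demo builds a small throwaway database instead of reading the real `29cm.db`.

```python
import os
import sqlite3
import tempfile


def summarize_db(path):
    """Return {table_name: row_count} for every table in a SQLite file."""
    conn = sqlite3.connect(path)
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


# Throwaway database with the same table name as the generated code uses
demo_path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(demo_path)
conn.execute("CREATE TABLE best_items (itemNo INTEGER, itemName TEXT)")
conn.executemany("INSERT INTO best_items VALUES (?, ?)", [(1, "a"), (2, "b")])
conn.commit()
conn.close()

summary = summarize_db(demo_path)
print(summary)
```

Pointing `summarize_db` at `29cm.db` after running the crawler should report the `best_items` table and its row count.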


There was a lot of trial and error, but having succeeded once, I can really feel how convenient this tool is.

Just three years ago, when I crawled data for a competition, I racked my brain fixing code errors. Now I can crawl this easily with generative AI, without writing the code myself. It is a little bittersweet, and it makes me think I need to adapt quickly to a fast-changing world!

 

 

*This post covers what I learned at a seminar hosted by the Python community PyLadies.

 

✨ Web Scraping with Generative AI and Python: Hands-on Practice and Sharing Experiences ✨, Saturday, August 24, 2024, PM 1:0

Hello! The web scraping seminar that 조은 presented last time drew a lot of interest, so that you can practice hands-on and share experiences

www.meetup.com
