Machine Learning | Deep Learning / Recommender Systems

In this post, I'd like to introduce a technique from a paper that makes it possible to train MF on Implicit Feedback data! It is based on Bayesian inference, and it reportedly became well known because its approach is fun and clever! Introduction 1) Logs of user clicks, purchases, and so on are Implicit Feedback data 2) The data is binary (0/1) -> preference strength is not considered; the task is usually to predict the probability that a user will click on or purchase an item 3) The method learns the MF parameters from information of the form "what if the user likes item j more than item i?" => For unobserved data, we have to consider whether the user is not interested in the item or is actually interested but simply does not know about it yet... Personalized R..
In this post, I'd like to introduce Matrix Factorization and its optimization techniques! First, let's define the Matrix Factorization technique once more. What is Matrix Factorization? The goal is to infer R̂ so that it is as close as possible to R. In other words, with explicit feedback (ratings from 1 to 5), the objective function minimizes the difference between the true rating and our predicted rating. Objective Function 1) The actual rating of user u for item i in the training data 2) The latent vector of user u 3) The latent vector of item i -> 2) and 3) are the parameters updated by solving the optimization problem 4) Lambda te..
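As a rough sketch of the objective this excerpt describes, the usual regularized squared-error form is written out below. The symbols are assumptions for illustration, not notation taken from the post: $r_{ui}$ is the actual rating, $p_u$ and $q_i$ are the user and item latent vectors, $\mathcal{K}$ is the set of observed (user, item) pairs, and $\lambda$ weights a regularization term. Since the excerpt is cut off at item 4), the exact form of the lambda term here is an assumption.

```latex
\min_{P,\,Q} \sum_{(u,i) \in \mathcal{K}}
  \left( r_{ui} - p_u^{\top} q_i \right)^2
  + \lambda \left( \lVert p_u \rVert^2 + \lVert q_i \rVert^2 \right)
```

The predicted rating is $\hat{r}_{ui} = p_u^{\top} q_i$, so the first term is the squared gap between the true and predicted ratings, and only $p_u$ and $q_i$ are updated during optimization.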
# Latent Factor Model? Simply put, it means embedding! The diverse and complex characteristics of users and items are represented compactly with a few vectors -> users and items are represented as vectors of the same dimension -> the similarity between users and items can be inspected visually in the same vector space. Example) # What is traditional SVD? It is the core idea behind Matrix Factorization! The Rating Matrix R is decomposed into three matrices: 1) a user latent factor matrix 2) a diagonal matrix of latent factors 3) an item latent factor matrix. => This is just a concept from linear algebra, so what if we want to embed users and items in a fixed number of dimensions? # Truncated SVD? Only the k singular values that serve as representative values are used. ..
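The decomposition into three matrices, and the idea of keeping only k singular values, can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions: the rating matrix `R` is made up, missing ratings are simply treated as 0 (real rating data would need a better way to handle them), and `truncated_svd` is a hypothetical helper name.

```python
import numpy as np


def truncated_svd(R, k):
    """Return the top-k factors (U_k, s_k, Vt_k) of R (hypothetical helper)."""
    # Full (thin) SVD: R = U @ diag(s) @ Vt, singular values sorted descending.
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return U[:, :k], s[:k], Vt[:k, :]


# Made-up 5 users x 4 items rating matrix; 0 stands in for "no rating" (assumption).
R = np.array(
    [
        [5, 3, 0, 1],
        [4, 0, 0, 1],
        [1, 1, 0, 5],
        [1, 0, 0, 4],
        [0, 1, 5, 4],
    ],
    dtype=float,
)

U_k, s_k, Vt_k = truncated_svd(R, k=2)

# Rank-2 approximation of R built from the truncated factors.
R_hat = U_k @ np.diag(s_k) @ Vt_k

# Each user (row of U_k) and each item (column of Vt_k) is now a 2-dimensional vector.
print(U_k.shape, Vt_k.shape)
print(np.round(R_hat, 2))
```

With k equal to the number of singular values the product reconstructs R exactly; smaller k trades reconstruction accuracy for a more compact embedding.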
So far, we have studied Neighborhood-based CF, such as User-based and Item-based CF, which makes recommendations by computing the similarity between users or items! In this post, I'd like to write about Model-based CF, which uses machine learning models! # Model-based collaborative filtering? - SVD (Singular Value Decomposition) - MF (Matrix Factorization) / variants include SGD, ALS, BPR, and so on.. Techniques that rely on user/item similarity are vulnerable to data sparsity (empty entries in the data?) and require a lot of computation every time recommendations are generated! - Model-based CF uses a machine learning model, and the data..
The CF (collaborative filtering) methods covered in the previous post (https://xod22.tistory.com/12, "[K-Data x Learning Spoons] 2-2. How Collaborative Filtering (CF) Works") require similarity computation! Today, I'd like to write in detail about several ways to compute similarity! Let's get started! # 1) Cosine Similarity : a similarity measure computed from the angle between two vectors. Intuitively, the direction in which the two vectors point ..
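Cosine similarity as described in this excerpt can be sketched in a few lines of NumPy. This is only an illustration: `cosine_similarity` is a hypothetical helper, and returning 0.0 when either vector has zero length is a convention chosen here, not something the post states.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between vectors a and b (hypothetical helper)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Assumption: define similarity with a zero vector as 0.
        return 0.0
    return float(np.dot(a, b) / denom)


# Same direction gives a value near 1, orthogonal vectors give 0,
# and opposite directions give a value near -1.
print(cosine_similarity([1, 2, 3], [2, 4, 6]))
print(cosine_similarity([1, 0], [0, 1]))
print(cosine_similarity([1, 0], [-1, 0]))
```

Because only the angle matters, two users who rate on different scales but in the same proportions still come out as highly similar.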
In the previous post, we studied CB (Content-based Recommendation)! This time, I'd like to write about the widely used collaborative filtering (CF)! # Collaborative filtering? : CF (Collaborative Filtering) => Recommends items preferred by users whose tastes are similar to user A's. => Achieves high performance even without using item attributes! # 1) User-based Collaborative Filtering : How similar are the items that two users prefer? After computing the similarity between users, it recommends the items preferred by the users who are most similar to me! Example) When we want to predict the rating User B gave to Star Wars, the ratings that Users A and B gave to each movie..
# Content-based recommendation? : CB (Content-based Recommendation) Uses the metadata of items that user A preferred in the past to recommend similar items to user A. => Examples of item metadata) - Movies : actors, director, genre - Music : artist, genre, rhythm, mood - Blogs / news : text (sentences, words) with similar topics or content - People : other people who share many mutual friends # Item Profile A Profile must be built for each item that is a candidate for recommendation. A Profile consists of the features of the item, and these attributes are represented as a Vector. * For documents, Item Profile : represented as a set of important words; a score indicating the importance of each word..
# Offline Test - The first step needed to validate a new recommendation model - The data collected from users is split into Train/Valid/Test to evaluate model performance => the recommendation model can be refined # Online A/B Test - This is not a comparison of performance before and after a change to the recommender system - The control group (A) and the experimental group (B) are evaluated simultaneously (they must be evaluated at the same time and in the same environment) - The final decision is made based on results obtained from the live service => check whether a meaningful difference appears consistently over time # Evaluation metrics by recommender system type 1) Ranking => mainly NDCG@k / Recall@k : recommends the Top K items that suit the user; there is no need to compute the user's preference for each item, items the consumer is likely to prefer ..
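The two ranking metrics named in this excerpt, Recall@k and NDCG@k, can be sketched as below. This is a minimal version under assumptions not stated in the post: relevance is binary (an item is either relevant or not), the recommendation list contains no duplicates, and `recall_at_k` / `ndcg_at_k` are hypothetical helper names.

```python
import math


def recall_at_k(recommended, relevant, k):
    """Fraction of relevant items that appear in the top-k recommendations."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(recommended, relevant, k):
    """NDCG@k with binary relevance: DCG of the list divided by the ideal DCG."""
    relevant = set(relevant)
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(recommended[:k])
        if item in relevant
    )
    # Ideal ordering places all relevant items (up to k of them) first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Made-up example: top-3 list for one user and the items they actually liked.
recommended = ["a", "b", "c"]
relevant = {"a", "c", "d"}
print(recall_at_k(recommended, relevant, k=3))
print(ndcg_at_k(recommended, relevant, k=3))
```

Recall@k only counts how many relevant items made it into the top k, while NDCG@k also rewards placing them nearer the top of the list.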
xod22
Posts in the 'Machine Learning | Deep Learning / Recommender Systems' category (Page 2)