AI-Based Next-Generation Item Bank System
An AI system for multimodal-embedding-based search, automatic classification, and image-referenced item generation over educational content (test items).
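| Feature | Description | Status |
|---|---|---|
| Natural-language item search | Semantic search based on Qwen3-VL | API complete |
| Similar item recommendation | Recommendation based on multimodal embeddings | API complete |
| RAG question answering | Item Q&A based on LangChain | API complete |
| Automatic classification | Classification by curriculum, achievement standard, and difficulty | Planned |
| Image item generation | Fact Graph-based generation that guards against hallucination | Planned |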
┌─────────────────────────────────────────────────────────────┐
│                     Client Applications                     │
└─────────────────────────┬───────────────────────────────────┘
                          │
┌─────────────────────────▼───────────────────────────────────┐
│                   itembank-api (FastAPI)                    │
│  ┌─────────────┬─────────────┬─────────────┐                │
│  │ /search/*   │ /rag/*      │ /health     │                │
│  │ NL search   │ RAG answers │ Health check│                │
│  └──────┬──────┴──────┬──────┴─────────────┘                │
└─────────┼─────────────┼─────────────────────────────────────┘
          │             │
┌─────────▼─────────────▼─────────────────────────────────────┐
│                        Service Layer                        │
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│  │ Qwen3VLService  │  │ EmbeddingService│  │ LLMService   │ │
│  │ (NL encoding)   │  │ (vector search) │  │ (LangChain)  │ │
│  └────────┬────────┘  └────────┬────────┘  └──────┬───────┘ │
└───────────┼────────────────────┼──────────────────┼─────────┘
            │                    │                  │
┌───────────▼────────┐  ┌────────▼────────┐  ┌──────▼─────────┐
│ Qwen3-VL-Embedding │  │ In-Memory NPZ   │  │ OpenAI API     │
│ (GPU ~4.3GB)       │  │ (176K vectors)  │  │                │
└────────────────────┘  └────────┬────────┘  └────────────────┘
                                 │
                        ┌────────▼────────┐
                        │ PostgreSQL      │
                        │ + pgvector      │
                        └─────────────────┘
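The diagram shows EmbeddingService answering queries from an in-memory set of vectors. The README does not say how that lookup is implemented, so the following is only a minimal, dependency-free sketch of one common approach, brute-force cosine similarity. The function names, the toy item IDs, and the 3-dimensional vectors are all hypothetical; a real service would more plausibly hold the vectors in a NumPy array loaded from the NPZ file.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector is treated as matching nothing
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    index: Dict[str, Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the best top_k."""
    scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy index with hypothetical item IDs (not real item bank data).
toy_index = {
    "item-a": [1.0, 0.0, 0.0],
    "item-b": [0.7, 0.7, 0.0],
    "item-c": [0.0, 0.0, 1.0],
}
results = top_k_similar([1.0, 0.1, 0.0], toy_index, top_k=2)
print([item_id for item_id, _ in results])
```

A brute-force scan like this is linear in the number of stored vectors. Whether the project relies on such a scan or on a pgvector index for larger workloads is not stated here.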
iosys-generative/
├── itembank-api/                # REST API server
│   ├── api/
│   │   ├── core/                # Configuration, dependencies
│   │   ├── models/              # Pydantic schemas
│   │   ├── routers/             # API endpoints
│   │   ├── services/            # Business logic
│   │   │   ├── qwen3vl.py       # Natural-language embedding (Qwen3-VL)
│   │   │   ├── embedding.py     # Vector search
│   │   │   ├── database.py      # DB integration
│   │   │   └── llm.py           # LLM service
│   │   └── main.py
│   ├── API.md                   # API documentation
│   ├── HANDOVER.md              # Handover document
│   └── README.md                # API server guide
├── poc/                         # POC implementation (experiments/evaluation)
│   ├── models/                  # Downloaded models
│   ├── results/                 # Embedding results (npz)
│   ├── scripts/                 # Execution scripts
│   ├── POC-Report.md            # POC final report
│   └── HANDOVER.md              # POC handover
├── preprocessing/               # Preprocessing pipeline
├── data/                        # Source data
├── docs/                        # Project documentation
│   ├── 01 IOSYS-ITEMBANK-AI-001.md  # Master plan
│   ├── 05 ...-R02.md            # Qwen3-VL research
│   └── 06 ...-POC.md            # POC plan
└── CLAUDE.md                    # AI assistant guidelines
| Model | Purpose | Size | Dimensions |
|---|---|---|---|
| Qwen3-VL-Embedding-2B | Multimodal embedding (Primary) | 4.3GB | 2048 |
| Qwen3-VL-Reranker-2B | Reranking (planned) | 4GB | - |
| KURE-v1 | Korean text embedding (Fallback) | 2.2GB | 1024 |
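The embedding dimension determines how much RAM the in-memory vector store needs. As a rough worked estimate, the snippet below multiplies the 176,443 embeddings reported in this README by each model's dimension. It assumes vectors are stored as 4-byte float32 values and that a fallback vector would exist for every item; neither assumption is confirmed by this document, and the helper name is hypothetical.

```python
def embedding_matrix_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw size of a dense num_vectors x dim matrix (float32 assumed by default)."""
    return num_vectors * dim * bytes_per_value


NUM_ITEMS = 176_443  # embedding count reported in this README

primary_bytes = embedding_matrix_bytes(NUM_ITEMS, 2048)   # Qwen3-VL-Embedding-2B
fallback_bytes = embedding_matrix_bytes(NUM_ITEMS, 1024)  # KURE-v1

print(f"2048-dim store: {primary_bytes / 2**30:.2f} GiB")
print(f"1024-dim store: {fallback_bytes / 2**30:.2f} GiB")
```

Under these assumptions the primary store comes to roughly 1.35 GiB, which fits comfortably in the 32GB of RAM listed in the requirements.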
| Component | Technology | Version |
|---|---|---|
| API Framework | FastAPI | 0.128.0 |
| Vector DB | PostgreSQL + pgvector | 16 + 0.7 |
| LLM Framework | LangChain | 1.2.7 |
| Runtime | Python | 3.12.3 |
| Deep Learning | PyTorch + CUDA | 2.5.1 + 12.1 |
| ML Library | Transformers | 5.0.0 |
| Container | Docker Compose | - |
GPU: NVIDIA RTX 2070 8GB or better
RAM: 32GB or more
Storage: 50GB or more
Python: 3.12+
# Change to the itembank-api directory
cd itembank-api
# Activate the virtual environment
source .venv/bin/activate
# Start the server
uvicorn api.main:app --host 0.0.0.0 --port 8000
# Test the API
curl http://localhost:8000/health
Natural-language search test
# Search for similar items in natural language (the query means "Find the area of the triangle")
curl -X POST http://localhost:8000/search/text \
  -H "Content-Type: application/json" \
  -d '{"query_text": "삼각형의 넓이를 구하시오", "top_k": 5}'
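The same request can be issued from Python. The sketch below uses only the standard library and mirrors the curl call above: the `/search/text` path and the `query_text` and `top_k` fields come from that command, while the function names and the 60-second timeout are illustrative choices. The response schema is not described in this README, so the JSON is returned as-is without interpreting it.

```python
import json
import urllib.request


def build_search_request(
    query_text: str,
    top_k: int = 5,
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST request for /search/text without sending it."""
    payload = json.dumps({"query_text": query_text, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/search/text",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search_text(query_text: str, top_k: int = 5, timeout: float = 60.0):
    """Send the search request and return the decoded JSON response.

    The generous default timeout is an assumption that leaves room for a
    slow first query while the model loads.
    """
    request = build_search_request(query_text, top_k)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example call (requires the server to be running):
#   search_text("삼각형의 넓이를 구하시오", top_k=5)
```

Separating request construction from sending keeps the payload easy to inspect or unit-test without a running server.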
# Change to the POC directory
cd poc
source .venv/bin/activate
# Generate embeddings
python scripts/generate_qwen_embeddings.py
# Evaluate search
python scripts/evaluate_search.py --model all
| Method | Endpoint | Description |
|---|---|---|
| POST | /search/text | Natural-language search (Qwen3VL model) |
| POST | /search/similar | Similar item search (by item_id or natural language) |
| GET | /search/items/{id} | Item detail lookup |
| Method | Endpoint | Description |
|---|---|---|
| POST | /rag/query | RAG-based question answering |
| POST | /rag/generate | Similar item generation |
Detailed API documentation: itembank-api/API.md
API server (itembank-api)
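| Metric | Value | Notes |
|---|---|---|
| Natural-language search (first query) | ~25 s | Includes model loading |
| Natural-language search (subsequent) | 0.8-1.0 s | Target met |
| Number of embeddings | 176,443 | Items across all subjects |
| GPU memory | ~4.3GB | RTX 2070 SUPER |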
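The gap between the first query (~25 s) and later ones is what you would expect if the model is loaded on first use and then reused. How the API server actually manages the model is not described here; the sketch below only illustrates the general load-once pattern with a stand-in class. `DummyEncoder`, `get_encoder`, and the 0.2-second simulated load are all hypothetical.

```python
import time
from functools import lru_cache


class DummyEncoder:
    """Stand-in for a real embedding model (hypothetical)."""

    def __init__(self, load_seconds: float) -> None:
        time.sleep(load_seconds)  # simulates loading weights onto the GPU

    def encode(self, text: str) -> list[float]:
        return [float(len(text))]  # placeholder "embedding"


@lru_cache(maxsize=1)
def get_encoder() -> DummyEncoder:
    """Create the encoder on first call, then return the cached instance."""
    return DummyEncoder(load_seconds=0.2)


def timed_encode(text: str) -> tuple[list[float], float]:
    """Encode text and report how long the call took, in seconds."""
    start = time.perf_counter()
    vector = get_encoder().encode(text)
    return vector, time.perf_counter() - start


_, first_latency = timed_encode("warm-up query")  # pays the loading cost
_, later_latency = timed_encode("real query")     # reuses the cached model
print(f"first: {first_latency:.3f}s, later: {later_latency:.3f}s")
```

With this pattern, calling `get_encoder()` once at server startup would move the loading cost out of the first user request, at the price of a slower start.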
| Metric | Target | Result | Status |
|---|---|---|---|
| P95 Latency | ≤200ms | 30.5ms | ✅ |
| VRAM Usage | ≤8GB | 4.3GB | ✅ |
| MRR | ≥0.65 | 0.74 | ✅ |
| Top-5 Recall | ≥80% | 40.4% | ⚠️ |
The Top-K Recall shortfall is attributed to limitations of the automatically generated ground truth.
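To make the two retrieval metrics concrete, here is a small sketch of how MRR and Recall@K are commonly computed. It uses the textbook definitions, which may differ from what `scripts/evaluate_search.py` does; the function names and the toy queries are hypothetical.

```python
from typing import Dict, List, Set


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none is retrieved."""
    for rank, item_id in enumerate(ranked, start=1):
        if item_id in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def evaluate(
    results: Dict[str, List[str]],
    ground_truth: Dict[str, Set[str]],
    k: int = 5,
) -> Dict[str, float]:
    """Average both metrics over every query in the ground truth."""
    n = len(ground_truth)
    mrr = sum(reciprocal_rank(results[q], ground_truth[q]) for q in ground_truth) / n
    recall = sum(recall_at_k(results[q], ground_truth[q], k) for q in ground_truth) / n
    return {"mrr": mrr, "recall": recall}


# Toy data: q1 has four labelled relevant items but only one is retrieved.
toy_results = {"q1": ["a", "b", "c"], "q2": ["x", "y", "z"]}
toy_truth = {"q1": {"a", "d", "e", "f"}, "q2": {"y"}}
metrics = evaluate(toy_results, toy_truth, k=5)
print(metrics)
```

In the toy data MRR is 0.75 while recall is 0.625: a relevant item ranks near the top for each query, but q1's ground truth lists more items than the search returns. A ground truth that over-lists relevant items can depress recall in this way without affecting MRR.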
Phase 1: Item vectorization and search ✅ Complete
Phase 2: Automatic classification system (planned)
Phase 3: Image item generation (planned)
Phase 4: Integrated platform
API server (itembank-api)
| Date | Changes |
|---|---|
| 2026-01-29 | itembank-api: Integrated Qwen3-VL natural-language search, upgraded to Python 3.12 |
| 2026-01-29 | itembank-api: Built the FastAPI server, added documentation |
| 2026-01-27 | POC: Completed Qwen3-VL-Embedding evaluation |
Private - IOSYS Internal Use Only