feat(backend): add provider/model metrics for LLM A/B experiment #60
Merged
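📌 Overview
This PR lays the observability groundwork for comparing LLM performance in an A/B test.
Previously, LLM execution time was measured,
but the metrics could not distinguish which provider (OpenAI / Bedrock) or which model was used.
With this change, every core METRIC log line carries provider and model information,
so LLM performance can be compared and analyzed accurately in CloudWatch.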
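As a rough illustration of the comparison this enables, the sketch below parses pipe-delimited METRIC lines and averages span duration per (provider, model) pair. It assumes the `METRIC|key=value|...` layout used in this PR's examples. The helper names `parse_metric` and `mean_ms_by_llm` and the `"unknown"` fallback are hypothetical and not part of this PR. In practice the same grouping would more likely be done with CloudWatch queries.

```python
from collections import defaultdict
from statistics import mean


def parse_metric(line: str) -> dict:
    """Parse 'METRIC|key=value|...' into a dict; return {} for any other line."""
    if not line.startswith("METRIC|"):
        return {}
    fields = {}
    for part in line.strip().split("|")[1:]:
        key, sep, value = part.partition("=")
        if sep:  # skip fragments that are not key=value
            fields[key] = value
    return fields


def mean_ms_by_llm(lines, span_name="ask_llm_total"):
    """Average span duration in ms, grouped by (provider, model)."""
    buckets = defaultdict(list)
    for line in lines:
        fields = parse_metric(line)
        if fields.get("event") == "span" and fields.get("name") == span_name and "ms" in fields:
            # "unknown" is an assumed fallback for lines logged before this change.
            key = (fields.get("provider", "unknown"), fields.get("model", "unknown"))
            buckets[key].append(float(fields["ms"]))
    return {key: mean(values) for key, values in buckets.items()}
```

The result maps each (provider, model) pair to its mean latency, which is the basic figure an A/B comparison would start from.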
✨ Key Changes
Backend: extended LLM metrics structure
1️⃣ Add provider/model tags to the ask_llm_total span
The ask_llm_total span now includes provider and model information.
Example:
METRIC|event=span|name=ask_llm_total|ms=...|provider=openai|model=gpt-4o-mini
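A minimal sketch of how a span like this could be emitted, assuming a Python backend that writes METRIC lines through the standard `logging` module. The `format_metric` and `span` helpers and the `"metrics"` logger name are hypothetical. The PR's actual implementation is not shown here and may differ.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("metrics")  # assumed logger name


def format_metric(event: str, **fields: object) -> str:
    """Build a pipe-delimited line: METRIC|event=...|key=value|..."""
    parts = [f"event={event}"] + [f"{key}={value}" for key, value in fields.items()]
    return "METRIC|" + "|".join(parts)


@contextmanager
def span(name: str, provider: str, model: str):
    """Time the wrapped block and log one span line tagged with provider/model."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Logged even if the block raises, so failed calls are still measured.
        elapsed_ms = int((time.perf_counter() - start) * 1000)
        logger.info(
            format_metric("span", name=name, ms=elapsed_ms, provider=provider, model=model)
        )


# Usage (the LLM call itself is outside the scope of this sketch):
# with span("ask_llm_total", provider="openai", model="gpt-4o-mini"):
#     ...
```

Passing provider and model as explicit arguments keeps the tags next to the call they describe, which should make it harder to log a span without them.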
2️⃣ Extend request-level METRIC logs
The METRIC|event=request log now includes provider/model information.
Example:
METRIC|event=request|path=/api/chat/...|provider=openai|model=gpt-4o-mini|ms_total=...
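One way the request-level line could learn which LLM served the request is a per-request context variable that the LLM call sets and the request logger reads. This is only a sketch under that assumption. `record_llm`, `request_metric_line`, and the `"none"` fallback are hypothetical names and choices, not taken from this PR.

```python
import contextvars

# Hypothetical per-request slot holding (provider, model) for the LLM that was used.
_llm_used = contextvars.ContextVar("llm_used", default=None)


def record_llm(provider: str, model: str) -> None:
    """Called where the LLM is invoked, so later log lines can include the tags."""
    _llm_used.set((provider, model))


def request_metric_line(path: str, ms_total: int) -> str:
    """Build the request-level METRIC line, including provider/model when known."""
    # Assumed fallback for requests that never reached an LLM.
    provider, model = _llm_used.get() or ("none", "none")
    return (
        f"METRIC|event=request|path={path}"
        f"|provider={provider}|model={model}|ms_total={ms_total}"
    )
```

A context variable avoids threading provider and model through every function signature between the LLM call and the request logger, at the cost of a less explicit data flow.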
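3️⃣ Prepare the structure for A/B testing
🎯 Purpose of the Change
🧪 Testing
🚀 Next Steps
ℹ️ Notes
This PR is a preparatory step that sets up the observability structure for the LLM A/B experiment.
Actual Bedrock LLM calls and provider-branching logic will be added in a follow-up PR.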