This project implements a RAG (Retrieval-Augmented Generation) system using Quarkus, LangChain4j and OpenAI to create an intelligent chatbot that can answer questions based on ingested documents.
rag/
├── frontend/ # Vue.js 3 + Vuetify frontend
├── src/main/java/ # Quarkus backend (Java 21)
├── src/main/resources/
│ ├── rag/ # Sample documents for ingestion
│ └── db/migration/ # Flyway migrations (PostgreSQL)
├── docs/ # Documentation (MCP, TestPlan)
└── pom.xml
The src/main/resources/rag directory contains sample documents for
ingestion and system testing. You can add your own documents to this
directory to expand the chatbot's knowledge.
- Java 21 - Programming language
- Quarkus 3.31.4 - Framework for cloud-native Java applications
- LangChain4j - Framework for AI integration
- OpenAI - LLM (gpt-4o-mini) and embeddings (text-embedding-3-small)
- PostgreSQL + PGVector - Vector database for embeddings (via LangChain4j)
- Redis - Cache and memory management
- Maven - Dependency management
- Java 21 or higher
- Maven 3.8+
- Docker (for PostgreSQL+PGVector and Redis via Quarkus Dev Services)
- OPENAI_API_KEY - Required for AI endpoints
# Required: OpenAI API key for chat and embeddings
export OPENAI_API_KEY=your-openai-api-key
# Clone the repository
git clone https://github.com/rodrigoprestesmachado/rag.git
cd rag

Note: Frontend environment variables (for example, VITE_GOOGLE_CLIENT_ID) are embedded in the code during the build.
# Navigate to the frontend directory
cd frontend
# Configure environment variables (create .env file if it doesn't exist)
# See frontend/README.md for details on required variables
# Install dependencies (if you haven't already)
npm install
# Build the frontend
npm run build

Note: The build generates static files in src/main/resources/META-INF/resources/ that are served by Quarkus. If you modify environment variables in frontend/.env, you must rebuild the frontend for the changes to take effect.
# Return to project root
cd ..
# Run the application in development mode
./mvnw quarkus:dev

Note: PostgreSQL (with the PGVector extension) and Redis are started automatically via Quarkus Dev Services. Docker must be installed and running for Quarkus to create and manage these containers.
If you want to test the chat interface, simply press the w key in the
terminal when the application is running and Quarkus will open the
web interface at: http://localhost:8081/.
Package and run the application:
# Compile
./mvnw package
# Run JAR
java -jar target/quarkus-app/quarkus-run.jar
# Or create and run uber-jar
./mvnw package -Dquarkus.package.jar.type=uber-jar
java -jar target/*-runner.jar

# Build the application
./mvnw package
# Build Docker image
docker build -f src/main/docker/Dockerfile.jvm -t rag:jvm .
# Run container (requires OPENAI_API_KEY and PostgreSQL/Redis URLs)
docker run -i --rm -p 8081:8081 \
-e OPENAI_API_KEY=$OPENAI_API_KEY \
-e QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://host.docker.internal:5432/orion_rag \
-e QUARKUS_DATASOURCE_REACTIVE_URL=postgresql://host.docker.internal:5432/orion_rag \
-e QUARKUS_REDIS_HOSTS=redis://host.docker.internal:6379 \
rag:jvm

# Native build
./mvnw package -Dnative -Dquarkus.native.container-build=true
# Build Docker image
docker build -f src/main/docker/Dockerfile.native -t rag:native .
# Run container (same env vars as JVM)
docker run -i --rm -p 8081:8081 \
-e OPENAI_API_KEY=$OPENAI_API_KEY \
-e QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://host.docker.internal:5432/orion_rag \
-e QUARKUS_DATASOURCE_REACTIVE_URL=postgresql://host.docker.internal:5432/orion_rag \
-e QUARKUS_REDIS_HOSTS=redis://host.docker.internal:6379 \
rag:native

Create a docker-compose.yml file:
version: '3.8'
services:
  rag:
    build:
      context: .
      dockerfile: src/main/docker/Dockerfile.jvm
    ports:
      - "8081:8081"
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}
      - QUARKUS_DATASOURCE_JDBC_URL=jdbc:postgresql://postgres:5432/orion_rag
      - QUARKUS_DATASOURCE_REACTIVE_URL=postgresql://postgres:5432/orion_rag
      - QUARKUS_REDIS_HOSTS=redis://redis:6379
    depends_on:
      - redis
      - postgres
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
  postgres:
    image: pgvector/pgvector:pg17
    environment:
      POSTGRES_USER: quarkus
      POSTGRES_PASSWORD: quarkus
      POSTGRES_DB: orion_rag
    ports:
      - "5432:5432"

Run with:
docker-compose up -d

To deploy the application to Fly.io, use the Dockerfile (JVM build) and the fly.toml in the project root. The complete guide (secrets, Postgres, Redis, health check) is in docs/DEPLOY-FLY.md.
fly launch
fly secrets set OPENAI_API_KEY=... QUARKUS_DATASOURCE_JDBC_URL=... # see DEPLOY-FLY.md
fly deploy

The project follows Hexagonal Architecture (Clean Architecture) principles, organizing the code in well-defined layers:
src/main/java/dev/orion/rag/
├── domain/ # Application core
│ ├── model/ # Domain entities
│ ├── port/ # Interfaces/Contracts
│ └── usecase/ # Use cases/Business rules
├── application/ # Application layer
│ ├── mcp/ # MCP tools (course-aware Q&A)
│ └── rest/ # REST controllers
└── infrastructure/ # Infrastructure layer
├── adapter/ # External adapters (AI, etc.)
├── repository/ # Repository implementations
├── service/ # Infrastructure services
└── util/ # Utilities
Hexagonal Architecture (also known as Ports and Adapters) is an architectural pattern that promotes separation of concerns and loose coupling between application layers. This project implements the following concepts:
The application core, containing pure business logic independent of external frameworks:
- Domain Entities: Classes such as ChatMessage, RagQuery, RagResponse, ConversationMemory
- Use Cases: Orchestrate business logic (ChatbotUseCase, AskQuestionUseCase)
- Ports (Interfaces): Contracts that define how the domain communicates with the external world
domain/
├── model/
│ ├── AIRequest.java # AI request
│ ├── ChatMessage.java # Chat message
│ ├── ConversationMemory.java # Conversation memory
│ ├── RagQuery.java # RAG query
│ └── RagResponse.java # RAG response
├── port/
│ ├── AIService.java # Interface for AI services
│ ├── EmbeddingRepository.java # Interface for embedding repository
│ └── MemoryService.java # Interface for memory services
└── usecase/
├── AskQuestionUseCase.java # Use case: questions
├── ChatbotUseCase.java # Use case: chatbot
└── IngestDocumentsUseCase.java # Use case: ingestion
Coordinates interaction between the domain and the external world:
- REST Controllers: Create API endpoints (RagController)
Implements technical details and external integrations:
- Adapters: Concrete implementations of ports (AIServiceAdapter, LangChainAIService)
- Repositories: Implement data persistence. In this application, repositories are responsible for storing and retrieving embedding information and conversation history (EmbeddingRepositoryImpl, MemoryServiceImpl)
- Services: Implementations of external application services. The system requires Apache PDFBox (PDFExtractorService) for proper PDF extraction.
- Testability: The domain can be tested in isolation through port mocks
- Flexibility: Easy switching of AI providers (Ollama → OpenAI → Azure)
- Maintainability: Framework changes do not affect business logic
- Independence: The application core depends on port interfaces rather than on concrete providers or infrastructure details
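The testability point can be illustrated without a mocking library. The sketch below is a simplified stand-in, not code from this project: `AnswerPort`, `QuestionUseCase` and `FakeAnswerPort` are hypothetical names, and the real `AIService` port streams a reactive `Multi<String>` rather than returning a plain `String`. It only shows how a use case that depends on an interface can be exercised with a hand-written fake.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified port (the project's AIService returns Multi<String>).
interface AnswerPort {
    String answer(String prompt);
}

// Hypothetical use case: depends only on the port, never on a concrete provider.
final class QuestionUseCase {
    private final AnswerPort port;

    QuestionUseCase(AnswerPort port) {
        this.port = port;
    }

    String ask(String question) {
        if (question == null || question.isBlank()) {
            throw new IllegalArgumentException("question must not be blank");
        }
        // Business rule under test: the prompt is trimmed before it reaches the port.
        return port.answer(question.strip());
    }
}

// Hand-written fake: records the prompts it receives and returns a canned reply.
final class FakeAnswerPort implements AnswerPort {
    final List<String> received = new ArrayList<>();

    @Override
    public String answer(String prompt) {
        received.add(prompt);
        return "canned reply";
    }
}

final class PortFakeDemo {
    public static void main(String[] args) {
        FakeAnswerPort fake = new FakeAnswerPort();
        QuestionUseCase useCase = new QuestionUseCase(fake);
        System.out.println(useCase.ask("  What is RAG?  "));
        System.out.println(fake.received);
    }
}
```

Because the fake lives entirely in memory, a test like this needs no OpenAI key, database, or running container.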
HTTP Request → REST Controller → Use Case → Domain Logic → Port Interface → Adapter → External Service
↓
HTTP Response ← REST Controller ← Use Case ← Domain Logic ← Port Interface ← Adapter ← External Service
Hexagonal architecture is especially valuable in RAG systems due to the evolutionary and experimental nature of AI:
- Model Experimentation: Facilitates testing with different LLMs (Ollama, OpenAI, Claude) without changing business logic
- Multiple Embedding Strategies: Allows comparing different vectorization algorithms (sentence-transformers, OpenAI embeddings, etc.)
- Interchangeable Vector Databases: Easy support for PostgreSQL+PGVector, Pinecone, Weaviate or Qdrant
- Chunking Strategies: Implementation of different approaches for document splitting
- Adaptable Memory: Switching between Redis, relational database or in-process memory
- Document Processing: Extensibility for PDF, Word, HTML, etc.
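As an illustration of swappable chunking strategies, here is a minimal sketch of one possible strategy hidden behind an interface. `ChunkingStrategy` and `FixedSizeChunking` are hypothetical names introduced for this example, and splitting by character count with overlap is an assumption, not necessarily how this project (or LangChain4j's own document splitters) divides documents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical port: lets ingestion logic swap splitting strategies.
interface ChunkingStrategy {
    List<String> split(String text);
}

// Fixed-size character windows that overlap, so context is not cut at chunk borders.
final class FixedSizeChunking implements ChunkingStrategy {
    private final int size;
    private final int overlap;

    FixedSizeChunking(int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        this.size = size;
        this.overlap = overlap;
    }

    @Override
    public List<String> split(String text) {
        List<String> chunks = new ArrayList<>();
        int step = size - overlap; // each window starts this far after the previous one
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return chunks;
    }
}
```

A sentence-based or token-based strategy could implement the same interface, which is what would let ingestion code compare approaches without being rewritten.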
Hexagonal RAG System:
┌─────────────────────────┐
│ REST Controllers │ ← Interface Layer
│ MCP Tools │
├─────────────────────────┤
│ Use Cases │ ← RAG Orchestration
│ • ChatbotUseCase │
│ • AskQuestionUseCase │
│ • IngestUseCase │
├─────────────────────────┤
│ Ports │ ← Contracts
│ • AIService │
│ • EmbeddingRepository │
│ • MemoryService │
├─────────────────────────┤
│ Adapters │ ← Implementations
│ • OpenAIAdapter │
│ • PGVector (Embedding) │
│ • RedisAdapter │
└─────────────────────────┘
- ChatbotUseCase: Implements conversations with context memory
- AskQuestionUseCase: Answers questions based on documents
- IngestDocumentsUseCase: Processes and indexes documents
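The retrieval step behind document-based answers can be pictured as ranking stored chunk embeddings by similarity to the query embedding. In this project that work is delegated to PostgreSQL+PGVector through LangChain4j; the sketch below is only an in-memory stand-in that uses cosine similarity. `EmbeddedChunk` and `InMemoryRetriever` are hypothetical names, and the two-dimensional vectors in the usage note are toy values (the configured embedding model produces 1536 dimensions).

```java
import java.util.Comparator;
import java.util.List;

// Hypothetical record: a text chunk paired with its embedding vector.
record EmbeddedChunk(String text, double[] vector) {}

// Hypothetical in-memory stand-in for a vector store (assumption: cosine ranking).
final class InMemoryRetriever {
    private final List<EmbeddedChunk> chunks;

    InMemoryRetriever(List<EmbeddedChunk> chunks) {
        this.chunks = List.copyOf(chunks);
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 if either vector has zero norm.
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the texts of the `limit` chunks most similar to the query vector, best first.
    List<String> findSimilar(double[] query, int limit) {
        return chunks.stream()
                .sorted(Comparator.comparingDouble(
                        (EmbeddedChunk c) -> cosine(query, c.vector())).reversed())
                .limit(limit)
                .map(EmbeddedChunk::text)
                .toList();
    }
}
```

For example, with chunks embedded as `{1, 0}`, `{0, 1}` and `{1, 1}`, a query vector of `{1, 0.1}` ranks the first chunk highest and the third second. A linear scan like this only suits small test data; a vector database exists precisely to avoid it.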
// Domain defines the contract (Port)
public interface AIService {
    Multi<String> generateResponse(String prompt, List<ChatMessage> context);
}
// Infrastructure implements different adapters
@ApplicationScoped
public class LangChainAIService implements AIService { ... } // OpenAI, Azure, etc.
// Use case remains unchanged
@ApplicationScoped
public class ChatbotUseCase {
    @Inject AIService aiService; // Injection by interface
}

// Unit test using port mock
@Test
void shouldGenerateResponseWithMemory() {
    // Given
    AIService mockAI = Mockito.mock(AIService.class);
    MemoryService mockMemory = Mockito.mock(MemoryService.class);
    ChatbotUseCase useCase = new ChatbotUseCase(null, mockAI, mockMemory);
    // When & Then - test only business logic
    // without external dependencies
}

// New feature: add support for multiple embeddings
public interface EmbeddingRepository {
    // Existing method
    List<Document> findSimilar(String query, int limit);
    // New method - declared as default so existing implementations keep compiling
    default List<Document> findSimilarWithMetadata(String query, int limit, Map<String, Object> filters) {
        // Fallback ignores the filters; adapters that support metadata override it
        return findSimilar(query, limit);
    }
}

# Conversation with context maintained per session
curl "http://localhost:8081/ai/chatbot?session=user123&prompt=Hello,%20how%20can%20you%20help%20me?"

# Query based on ingested documents
curl "http://localhost:8081/ai/ask?session=user123&prompt=What%20is%20Vue.js?"

# Get conversation history
curl "http://localhost:8081/ai/memory?session=user123"

# Retrieve semantically relevant course content for a query (used by MCP tools)
curl "http://localhost:8081/ai/context?session=user123&prompt=What%20are%20variables%20in%20JavaScript?&maxResults=5"

// JavaScript/Frontend
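// Encode the prompt so spaces and special characters are safe in the query string
const prompt = encodeURIComponent('Explain generative AI');
const response = await fetch(
  `http://localhost:8081/ai/chatbot?session=user123&prompt=${prompt}`
);
const reader = response.body.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // stream: true keeps multi-byte characters intact when they span two chunks
  const chunk = decoder.decode(value, { stream: true });
  console.log(chunk); // Streaming response
}

# Application port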
quarkus.http.port=8081
# OpenAI (LLM + embeddings)
quarkus.langchain4j.openai.api-key=${OPENAI_API_KEY}
quarkus.langchain4j.openai.chat-model.model-name=gpt-4o-mini
quarkus.langchain4j.openai.embedding-model.model-name=text-embedding-3-small
quarkus.langchain4j.embedding-model.provider=openai
# PostgreSQL + PGVector (vector database for embeddings)
quarkus.datasource.db-kind=postgresql
quarkus.langchain4j.pgvector.dimension=1536
# Dev Services starts PostgreSQL+PGVector automatically when Docker is available
# Redis (Cache/Memory)
quarkus.redis.devservices.enabled=true
# RAG configuration
rag.location=src/main/resources/rag
rag.context=JavaScript
# Memory Management
memory.default.max-messages=100
memory.ttl.hours=48

# Run all tests
./mvnw test
# Integration tests
./mvnw verify -Dskip.integration.tests=false
# Tests with native profile
./mvnw verify -Dnative

The Quarkus backend includes an MCP (Model Context Protocol) server for integrating the RAG system with LLM platforms (e.g., Claude, Cursor). It exposes tools for course-aware question answering via HTTP/SSE.
Endpoint: http://localhost:8081/mcp/sse (or /mcp for Streamable HTTP)
Tools:
- retrieve_course_context: Retrieves semantically relevant course content for a query
- ask_course_question: Asks a question and returns the RAG-generated answer
See docs/MCP.md for Cursor/Claude configuration.
- Fork the project
- Create a branch for your feature (git checkout -b feature/new-feature)
- Commit your changes (git commit -m 'Add new feature')
- Push to the branch (git push origin feature/new-feature)
- Open a Pull Request
This project contains confidential and proprietary information. Unauthorized copying, distribution, or use of this file or its contents is strictly prohibited.
© 2025 Rodrigo Prestes Machado. All rights reserved.