A production-grade RAG (Retrieval-Augmented Generation) pipeline implemented identically on AWS, Azure, and GCP using a shared adapter pattern. Same core/ logic, three cloud backends, zero code duplication.
core/
    interfaces.py   # Embedder, Retriever, Generator ABCs
    chunker.py      # tiktoken 512-token chunks, 50 overlap
    prompt.py       # shared system prompt + build_prompt()
adapters/
    aws/            # BedrockEmbedder, OpenSearchRetriever, BedrockGenerator
    azure/          # AzureFoundryEmbedder, AISearchRetriever, AzureFoundryGenerator
    gcp/            # VertexEmbedder, FirestoreRetriever, GeminiGenerator
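As a rough illustration of the adapter pattern and the chunking rule in the layout above, here is a minimal sketch. The method names (`embed`, `retrieve`, `generate`) and the helper `chunk_tokens` are assumptions made for illustration, not the actual contents of `core/`. The chunker works on any token sequence, so it does not depend on tiktoken being installed.

```python
from abc import ABC, abstractmethod
from typing import Sequence


class Embedder(ABC):
    """Turns text into a vector (hypothetical signature)."""

    @abstractmethod
    def embed(self, text: str) -> list[float]: ...


class Retriever(ABC):
    """Returns the top_k stored chunks closest to a query vector (hypothetical signature)."""

    @abstractmethod
    def retrieve(self, query_vector: Sequence[float], top_k: int) -> list[str]: ...


class Generator(ABC):
    """Produces an answer from a fully built prompt (hypothetical signature)."""

    @abstractmethod
    def generate(self, prompt: str) -> str: ...


def chunk_tokens(tokens: Sequence[int], size: int = 512, overlap: int = 50) -> list[list[int]]:
    """Split a token sequence into windows of `size` tokens sharing `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # each window starts 462 tokens after the previous one
    chunks: list[list[int]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):  # this window already reached the end
            break
    return chunks
```

Each cloud adapter would subclass the three ABCs, so code that calls `embed`, `retrieve`, and `generate` never needs to know which backend it is talking to.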
| Cloud | Ingest | Embed | Vector store | Query | Region |
|---|---|---|---|---|---|
| AWS | Glue Python Shell | Bedrock Titan V2 (1024-dim) | OpenSearch Serverless | Lambda + API GW | ap-southeast-2 |
| Azure | Azure ML Scheduled Job | AI Foundry text-embedding-3-large (3072-dim) | AI Search (Basic) | Azure Functions | australiaeast |
| GCP | Vertex AI Custom Training | text-embedding-004 (768-dim) | Firestore vector search | Cloud Functions 2nd gen | australia-southeast1 |
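Because each cloud uses a different embedding size, a vector produced by one backend cannot be written to or queried against another backend's index. A small guard like the following sketch can catch a mismatch early. The names `EXPECTED_DIMS` and `check_embedding` are hypothetical; the dimensions are taken from the table above.

```python
from typing import Sequence

# Embedding sizes from the comparison table above.
EXPECTED_DIMS = {"aws": 1024, "azure": 3072, "gcp": 768}


def check_embedding(cloud: str, vector: Sequence[float]) -> None:
    """Raise ValueError if the vector length does not match the cloud's model."""
    expected = EXPECTED_DIMS[cloud]
    if len(vector) != expected:
        raise ValueError(
            f"{cloud} expects {expected}-dim vectors, got {len(vector)}"
        )
```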
Azure AI Search — 100 documents indexed in mcrag-docs:

Streamlit app (local) — Azure RAG answering a question:

Deployed on Hugging Face Spaces — Azure:

Deployed on Hugging Face Spaces — GCP:

- Python 3.12+
- Terraform >= 1.7
- Cloud CLIs: `aws` (configured), `az` (logged in), `gcloud` (logged in)
- `pip install -r requirements.txt` for the Streamlit app
cp .env.example .env # fill in values as you deploy each cloud

aws configure # enter Access Key ID, Secret, region ap-southeast-2
# or use a named profile:
aws configure --profile mcrag
export AWS_PROFILE=mcrag

Required IAM permissions for the deploying user/role: `AdministratorAccess` (easiest for initial deploy), or scoped to: `iam:*`, `lambda:*`, `apigateway:*`, `glue:*`, `s3:*`, `aoss:*`, `bedrock:*`, `cloudwatch:*`, `events:*`

cd aws/terraform
terraform init
terraform apply -var-file=envs/dev.tfvars

Creates: OpenSearch Serverless collection, Glue job, Lambda, API Gateway, CloudWatch alarms.

python aws/scripts/update_env.py

Sets OPENSEARCH_ENDPOINT, RAG_API_ENDPOINT, RAG_API_KEY.

Ingest runs automatically via EventBridge daily at 02:00 UTC.
To trigger a run manually, open the AWS Console → Glue → Jobs → mcrag-* → Run.

python scripts/query_cli.py --cloud aws --question "What are recent advances in LLMs?"

aws/scripts/update_env.py writes the three deployment values automatically after terraform apply. The rest are defaults you can override in .env.

| Variable | Source | Description |
|---|---|---|
| `OPENSEARCH_ENDPOINT` | auto (`update_env.py`) | AOSS collection endpoint |
| `RAG_API_ENDPOINT` | auto (`update_env.py`) | API Gateway invoke URL |
| `RAG_API_KEY` | auto (`update_env.py`) | API Gateway API key |
| `AWS_REGION` | manual | AWS region (default: ap-southeast-2) |
| `OPENSEARCH_INDEX` | manual | Index name (default: rag-docs) |
| `GENERATION_MODEL_ID` | manual | Bedrock model (default: amazon.nova-pro-v1:0) |
| `TOP_K` | manual | Number of chunks to retrieve (default: 5) |
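
The manual variables fall back to the defaults listed in the table when they are not set. A minimal sketch of how such a loader might look, assuming the values arrive through the process environment; `AwsSettings` and `load_aws_settings` are hypothetical names, not functions from this repository:

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AwsSettings:
    region: str
    index: str
    generation_model_id: str
    top_k: int


def load_aws_settings(env: Mapping[str, str] = os.environ) -> AwsSettings:
    """Read the manual AWS variables, using the documented defaults when unset."""
    return AwsSettings(
        region=env.get("AWS_REGION", "ap-southeast-2"),
        index=env.get("OPENSEARCH_INDEX", "rag-docs"),
        generation_model_id=env.get("GENERATION_MODEL_ID", "amazon.nova-pro-v1:0"),
        top_k=int(env.get("TOP_K", "5")),  # env values are strings, so convert
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise without touching the real environment.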
az login
az account set --subscription <subscription-id>
az account show # confirm correct subscription

The deploying identity needs the Contributor role on the subscription (or resource group), plus Cognitive Services Contributor for AI Foundry model deployments.

Register required resource providers if not already enabled:
az provider register --namespace Microsoft.MachineLearningServices
az provider register --namespace Microsoft.CognitiveServices
az provider register --namespace Microsoft.Search
az provider register --namespace Microsoft.Web
az provider register --namespace Microsoft.Insights

The query function needs Linux .so wheels. Build them locally (works on Windows too):

python azure/scripts/build_function.py

This cross-compiles manylinux2014_x86_64 wheels and zips the function into azure/build/.
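
One common way to get Linux wheels on a non-Linux host is to ask pip for prebuilt manylinux binaries rather than compiling anything locally. The sketch below only builds such a pip command. Whether `build_function.py` works this way is an assumption, and the helper name, the example paths, and the default Python version are all hypothetical.

```python
import sys


def build_pip_command(requirements: str, target: str, python_version: str = "3.12") -> list[str]:
    """Return a pip invocation that fetches prebuilt manylinux2014 wheels into `target`."""
    return [
        sys.executable, "-m", "pip", "install",
        "--requirement", requirements,
        "--target", target,                    # install into a folder, not site-packages
        "--platform", "manylinux2014_x86_64",  # wheels for Linux x86_64
        "--implementation", "cp",              # CPython wheels
        "--python-version", python_version,    # assumed Function runtime version
        "--only-binary=:all:",                 # pip requires this when --platform is set
    ]


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the call.
    print(" ".join(build_pip_command("requirements.txt", "azure/build/packages")))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`; the target folder is then zipped together with the function code.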
cd azure/terraform
terraform init
terraform apply

Creates: Resource Group, Storage Account, AI Foundry (text-embedding-3-large + gpt-4o), AI Search (Basic), Azure ML workspace + compute cluster, Azure Function App.

python azure/scripts/update_env.py

Sets AZURE_AI_FOUNDRY_ENDPOINT, AZURE_AI_FOUNDRY_KEY, AZURE_SEARCH_ENDPOINT, AZURE_SEARCH_KEY, AZURE_FUNCTION_ENDPOINT, AZURE_FUNCTION_CODE.

python azure/scripts/create_schedule.py

Schedules the Azure ML job to run daily at 02:00 UTC, fetching arXiv papers and indexing them into AI Search.

python scripts/query_cli.py --cloud azure --question "What are recent advances in LLMs?"

azure/scripts/update_env.py writes all deployment values automatically after terraform apply.

| Variable | Source | Description |
|---|---|---|
| `AZURE_AI_FOUNDRY_ENDPOINT` | auto (`update_env.py`) | Cognitive Services endpoint |
| `AZURE_AI_FOUNDRY_KEY` | auto (`update_env.py`) | Cognitive Services API key |
| `AZURE_SEARCH_ENDPOINT` | auto (`update_env.py`) | AI Search endpoint |
| `AZURE_SEARCH_KEY` | auto (`update_env.py`) | AI Search admin key |
| `AZURE_FUNCTION_ENDPOINT` | auto (`update_env.py`) | Function App URL |
| `AZURE_FUNCTION_CODE` | auto (`update_env.py`) | Function-level auth key |
| `AZURE_EMBED_DEPLOYMENT` | manual | Embedding deployment name (default: text-embedding-3-large) |
| `AZURE_CHAT_DEPLOYMENT` | manual | Chat deployment name (default: gpt-4o) |
| `AZURE_SEARCH_INDEX` | manual | Index name (default: mcrag-docs) |
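
Azure Functions accepts a function-level key either as a `code` query parameter or in an `x-functions-key` header. The sketch below shows the query-parameter form, combining `AZURE_FUNCTION_ENDPOINT` and `AZURE_FUNCTION_CODE` into one URL. Which form this project's client uses is an assumption, and `with_function_code` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_function_code(endpoint: str, code: str) -> str:
    """Return the endpoint URL with the function key set as the `code` parameter."""
    parts = urlsplit(endpoint)
    query = dict(parse_qsl(parts.query))  # keep any parameters already present
    query["code"] = code
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Sending the key in the `x-functions-key` header instead keeps it out of URLs that may end up in logs.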
gcloud auth application-default login
gcloud config set project <your-project-id>

cd gcp/terraform
terraform init
terraform apply -var-file=envs/dev.tfvars

Edit gcp/terraform/envs/dev.tfvars to set your project_id first.
Creates: Firestore (Native mode) + vector index, Vertex AI staging bucket, Cloud Function 2nd gen, service account + IAM bindings.

python gcp/scripts/update_env.py

Sets GCP_PROJECT, GCP_REGION, GCP_FIRESTORE_COLLECTION, GCP_FUNCTION_URL.

python gcp/scripts/run_vertex_job.py --max-results 1000

Submits a Vertex AI Custom Training job that fetches arXiv papers, embeds them via text-embedding-004, and writes them to Firestore. Monitor at the GCP Console → Vertex AI → Training.

python scripts/query_cli.py --cloud gcp --question "What are recent advances in LLMs?"

gcp/scripts/update_env.py writes all deployment values automatically after terraform apply.

| Variable | Source | Description |
|---|---|---|
| `GCP_PROJECT` | auto (`update_env.py`) | GCP project ID |
| `GCP_REGION` | auto (`update_env.py`) | Region (default: australia-southeast1) |
| `GCP_FIRESTORE_COLLECTION` | auto (`update_env.py`) | Firestore collection (default: mcrag-docs) |
| `GCP_EMBED_MODEL` | auto (`update_env.py`) | Embedding model (default: text-embedding-004) |
| `GCP_CHAT_MODEL` | auto (`update_env.py`) | Gemini model (default: gemini-2.5-flash) |
| `GCP_FUNCTION_URL` | auto (`update_env.py`) | Cloud Function invoke URL |

The Streamlit app (app/) is deployed on HF Spaces as a public portfolio demo.
- Go to huggingface.co/new-space
- Name it (e.g. `multi-cloud-rag-demo`)
- Under SDK, select Docker → Streamlit
- Set visibility to Public
- Click Create Space
HF_TOKEN=hf_... # Settings → Access Tokens → Write token
HF_SPACE=<username>/multi-cloud-rag-demo # <username>/<space-name>
pip install -r requirements-dev.txt
python scripts/upload_to_hf_space.py

Reads HF_TOKEN and HF_SPACE from .env, uploads app/ to the Space repo. HF detects the push and rebuilds automatically (1–2 minutes).
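
For orientation, an upload of this kind can be done with `HfApi.upload_folder` from the `huggingface_hub` package. The sketch below is an assumption about how such a script might look, not the contents of `scripts/upload_to_hf_space.py`; `read_hf_settings` and `upload_app` are hypothetical names, and the values are taken from a mapping rather than parsed from `.env`.

```python
import os
from typing import Mapping


def read_hf_settings(env: Mapping[str, str]) -> tuple[str, str]:
    """Return (token, space_id), validating that both values are present and well formed."""
    token = env.get("HF_TOKEN", "").strip()
    space = env.get("HF_SPACE", "").strip()
    if not token:
        raise ValueError("HF_TOKEN is not set")
    parts = space.split("/")
    if len(parts) != 2 or not all(parts):
        raise ValueError("HF_SPACE must look like <username>/<space-name>")
    return token, space


def upload_app(folder: str = "app", env: Mapping[str, str] = os.environ) -> None:
    """Push the contents of `folder` to the Space repo named in HF_SPACE."""
    # Third-party dependency, imported lazily so the validation above works without it.
    from huggingface_hub import HfApi

    token, space = read_hf_settings(env)
    HfApi(token=token).upload_folder(
        folder_path=folder,
        repo_id=space,
        repo_type="space",  # Spaces are a distinct repo type on the Hub
    )
```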
Go to your Space → Settings → Variables and secrets and add:
| Secret | Value | Used by |
|---|---|---|
| `RAG_API_ENDPOINT` | API Gateway invoke URL | pages/1_AWS.py |
| `RAG_API_KEY` | API Gateway API key | pages/1_AWS.py |
| `AZURE_FUNCTION_ENDPOINT` | Azure Function URL | pages/2_Azure.py |
| `AZURE_FUNCTION_CODE` | Azure Function auth key | pages/2_Azure.py |
| `GCP_FUNCTION_URL` | Cloud Function URL | pages/3_GCP.py |

Click Restart Space after adding secrets.
Open https://huggingface.co/spaces/<username>/multi-cloud-rag-demo:
- Each cloud page shows a green "Endpoint configured" notice when its secret is set
- Type a question and press Enter — the app queries the cloud backend and displays the answer with arXiv source IDs
- Pages without secrets set show a warning and stop gracefully
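
The per-page behaviour described above amounts to checking which required secrets are present before doing anything else. A minimal sketch of that check, using the secret-to-page mapping from the table in this section; `PAGE_SECRETS` and `missing_secrets` are hypothetical names, and the real pages may be structured differently:

```python
import os
from typing import Mapping

# Secrets each page needs, taken from the "Used by" column above.
PAGE_SECRETS = {
    "pages/1_AWS.py": ("RAG_API_ENDPOINT", "RAG_API_KEY"),
    "pages/2_Azure.py": ("AZURE_FUNCTION_ENDPOINT", "AZURE_FUNCTION_CODE"),
    "pages/3_GCP.py": ("GCP_FUNCTION_URL",),
}


def missing_secrets(page: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required secrets that are unset or empty for `page`."""
    return [name for name in PAGE_SECRETS[page] if not env.get(name)]


# In a Streamlit page this could drive the notice, for example:
#   missing = missing_secrets("pages/3_GCP.py")
#   if missing:
#       st.warning(f"Missing: {', '.join(missing)}")
#       st.stop()
#   st.success("Endpoint configured")
```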
| Variable | Description |
|---|---|
| `HF_TOKEN` | HF write token (`hf_...`) |
| `HF_SPACE` | Space repo ID (e.g. `<username>/multi-cloud-rag-demo`) |

pip install -r requirements.txt
streamlit run app/Home.py

Reads from .env automatically. Each cloud page is independent: pages for clouds without .env values set will show a warning.