multi-cloud-serverless-rags

A production-grade RAG (Retrieval-Augmented Generation) pipeline implemented identically on AWS, Azure, and GCP using a shared adapter pattern. Same core/ logic, three cloud backends, zero code duplication.

Architecture

core/
  interfaces.py   # Embedder, Retriever, Generator ABCs
  chunker.py      # tiktoken 512-token chunks, 50 overlap
  prompt.py       # shared system prompt + build_prompt()
adapters/
  aws/            # BedrockEmbedder, OpenSearchRetriever, BedrockGenerator
  azure/          # AzureFoundryEmbedder, AISearchRetriever, AzureFoundryGenerator
  gcp/            # VertexEmbedder, FirestoreRetriever, GeminiGenerator
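
To make the adapter pattern concrete, here is a minimal sketch of what the three ABCs in core/interfaces.py might look like, along with a cloud-agnostic query function. The method names (`embed`, `retrieve`, `generate`), the `answer` helper, and the in-memory fake adapters are assumptions for illustration only; the repository's actual signatures may differ.

```python
from abc import ABC, abstractmethod


class Embedder(ABC):
    """Turns text into a vector (hypothetical method name)."""

    @abstractmethod
    def embed(self, text: str) -> list[float]: ...


class Retriever(ABC):
    """Returns the stored chunks nearest to a query vector."""

    @abstractmethod
    def retrieve(self, vector: list[float], top_k: int) -> list[str]: ...


class Generator(ABC):
    """Produces an answer from a fully built prompt."""

    @abstractmethod
    def generate(self, prompt: str) -> str: ...


def answer(question: str, embedder: Embedder, retriever: Retriever,
           generator: Generator, top_k: int = 5) -> str:
    # The core logic only talks to the interfaces, never to a cloud SDK.
    chunks = retriever.retrieve(embedder.embed(question), top_k)
    context = "\n\n".join(chunks)
    prompt = f"Context:\n{context}\n\nQuestion: {question}"
    return generator.generate(prompt)


# In-memory stand-ins for a cloud backend, used here only for demonstration.
class FakeEmbedder(Embedder):
    def embed(self, text: str) -> list[float]:
        return [float(len(text))]


class FakeRetriever(Retriever):
    def __init__(self, docs: list[str]) -> None:
        self.docs = docs

    def retrieve(self, vector: list[float], top_k: int) -> list[str]:
        return self.docs[:top_k]


class EchoGenerator(Generator):
    def generate(self, prompt: str) -> str:
        return prompt.upper()
```

Under this layout, swapping clouds means passing a different trio of adapter instances to `answer`; the function itself does not change.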

Cloud	Ingest	Embed	Vector store	Query	Region
AWS	Glue Python Shell	Bedrock Titan V2 (1024-dim)	OpenSearch Serverless	Lambda + API GW	ap-southeast-2
Azure	Azure ML Scheduled Job	AI Foundry text-embedding-3-large (3072-dim)	AI Search (Basic)	Azure Functions	australiaeast
GCP	Vertex AI Custom Training	text-embedding-004 (768-dim)	Firestore vector search	Cloud Functions 2nd gen	australia-southeast1
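
The architecture tree describes core/chunker.py as producing 512-token chunks with a 50-token overlap. The sketch below shows one way such a sliding window could work on an already tokenized document. The function name `chunk_tokens` and the choice to stop once the window reaches the end of the input are assumptions, not the repository's code.

```python
def chunk_tokens(tokens: list[int], size: int = 512, overlap: int = 50) -> list[list[int]]:
    """Split a token list into windows of `size` tokens that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap  # each window starts this many tokens after the previous one
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already covers the tail of the document
    return chunks
```

With tiktoken installed, the token list could come from something like `tiktoken.get_encoding("cl100k_base").encode(text)`; which encoding the project actually uses is not stated here.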

Screenshots

Azure AI Search — 100 documents indexed in mcrag-docs:

Streamlit app (local) — Azure RAG answering a question:

Deployed on Hugging Face Spaces — Azure:

Deployed on Hugging Face Spaces — GCP:

Prerequisites

Python 3.12+
Terraform >= 1.7
Cloud CLIs: aws (configured), az (logged in), gcloud (logged in)
pip install -r requirements.txt for the Streamlit app

cp .env.example .env   # fill in values as you deploy each cloud

AWS Setup

1. Configure credentials

aws configure          # enter Access Key ID, Secret, region ap-southeast-2
# or use a named profile:
aws configure --profile mcrag
export AWS_PROFILE=mcrag

Required IAM permissions for the deploying user/role:

AdministratorAccess (easiest for initial deploy), or scoped to: iam:*, lambda:*, apigateway:*, glue:*, s3:*, aoss:*, bedrock:*, cloudwatch:*, events:*

2. Deploy infrastructure

cd aws/terraform
terraform init
terraform apply -var-file=envs/dev.tfvars

Creates: OpenSearch Serverless collection, Glue job, Lambda, API Gateway, CloudWatch alarms.

3. Populate `.env`

python aws/scripts/update_env.py

Sets OPENSEARCH_ENDPOINT, RAG_API_ENDPOINT, RAG_API_KEY.

4. Run ingest

Ingest runs automatically via EventBridge daily at 02:00 UTC.
To trigger it manually, go to AWS Console → Glue → Jobs → mcrag-* → Run.

5. Query

python scripts/query_cli.py --cloud aws --question "What are recent advances in LLMs?"
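
For readers who want to call the deployed API directly rather than through query_cli.py, the following sketch builds the HTTP request using only the standard library. API Gateway API keys are sent in the `x-api-key` header; the JSON body shape (`question`, `top_k`) and the function name `build_aws_request` are assumptions, since the Lambda's request contract is not documented here.

```python
import json
import urllib.request


def build_aws_request(endpoint: str, api_key: str, question: str,
                      top_k: int = 5) -> urllib.request.Request:
    """Build (but do not send) a POST request for the RAG API."""
    # Assumed payload shape; adjust to whatever the Lambda handler expects.
    body = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "x-api-key": api_key,  # standard API Gateway API-key header
        },
    )
```

To send it, pass the result to `urllib.request.urlopen(request, timeout=30)` with `RAG_API_ENDPOINT` and `RAG_API_KEY` taken from .env.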

AWS `.env` variables

aws/scripts/update_env.py writes the three deployment values automatically after terraform apply. The rest are defaults you can override in .env.

Variable	Source	Description
`OPENSEARCH_ENDPOINT`	auto — `update_env.py`	AOSS collection endpoint
`RAG_API_ENDPOINT`	auto — `update_env.py`	API Gateway invoke URL
`RAG_API_KEY`	auto — `update_env.py`	API Gateway API key
`AWS_REGION`	manual	AWS region (default: `ap-southeast-2`)
`OPENSEARCH_INDEX`	manual	Index name (default: `rag-docs`)
`GENERATION_MODEL_ID`	manual	Bedrock model (default: `amazon.nova-pro-v1:0`)
`TOP_K`	manual	Number of chunks to retrieve (default: `5`)
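
As an illustration of how the manual variables and their defaults from the table could be read in code, here is a small sketch. The `AwsSettings` dataclass and `load_aws_settings` function are hypothetical names; only the variable names and default values come from the table above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AwsSettings:
    region: str
    index: str
    model_id: str
    top_k: int


def load_aws_settings(env: Optional[Mapping[str, str]] = None) -> AwsSettings:
    """Read the manual AWS variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return AwsSettings(
        region=env.get("AWS_REGION", "ap-southeast-2"),
        index=env.get("OPENSEARCH_INDEX", "rag-docs"),
        model_id=env.get("GENERATION_MODEL_ID", "amazon.nova-pro-v1:0"),
        top_k=int(env.get("TOP_K", "5")),  # environment values are always strings
    )
```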

Azure Setup

1. Configure credentials

az login
az account set --subscription <subscription-id>
az account show   # confirm correct subscription

The deploying identity needs the Contributor role on the subscription (or resource group), plus Cognitive Services Contributor for AI Foundry model deployments.

Register required resource providers if not already enabled:

az provider register --namespace Microsoft.MachineLearningServices
az provider register --namespace Microsoft.CognitiveServices
az provider register --namespace Microsoft.Search
az provider register --namespace Microsoft.Web
az provider register --namespace Microsoft.Insights

2. Build the function package

The query function depends on packages with compiled extensions, so it needs Linux wheels (.so binaries). Build the package locally; this also works on Windows:

python azure/scripts/build_function.py

This cross-compiles manylinux2014_x86_64 wheels and zips the function into azure/build/.
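
One common way to obtain manylinux2014_x86_64 wheels on a non-Linux machine is to ask pip for prebuilt binaries targeting that platform. The sketch below assembles such a command; it is not taken from build_function.py, and the function name, the requirements path, and the target directory passed in are assumptions. The pip flags themselves are standard.

```python
import sys


def pip_linux_wheels_cmd(requirements: str, target: str,
                         python_version: str = "3.12") -> list[str]:
    """Build a pip command that installs Linux wheels into `target` from any host OS."""
    return [
        sys.executable, "-m", "pip", "install",
        "--requirement", requirements,
        "--target", target,
        "--platform", "manylinux2014_x86_64",
        "--implementation", "cp",
        "--python-version", python_version,
        "--only-binary=:all:",  # required by pip when --platform is given
        "--upgrade",
    ]
```

Running it would look like `subprocess.run(pip_linux_wheels_cmd("requirements-azure.txt", "azure/build/pkg"), check=True)`, where both paths are placeholders.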

3. Deploy infrastructure

cd azure/terraform
terraform init
terraform apply

Creates: Resource Group, Storage Account, AI Foundry (text-embedding-3-large + gpt-4o), AI Search (Basic), Azure ML workspace + compute cluster, Azure Function App.

4. Populate `.env`

python azure/scripts/update_env.py

Sets AZURE_AI_FOUNDRY_ENDPOINT, AZURE_AI_FOUNDRY_KEY, AZURE_SEARCH_ENDPOINT, AZURE_SEARCH_KEY, AZURE_FUNCTION_ENDPOINT, AZURE_FUNCTION_CODE.

5. Create the daily ingest schedule

python azure/scripts/create_schedule.py

Schedules the Azure ML job to run daily at 02:00 UTC, fetching arXiv papers and indexing them into AI Search.
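
A daily 02:00 UTC schedule corresponds to the five-field cron expression `0 2 * * *`. To check when the next run should fire, a small helper like the one below can be useful; `next_daily_run` is a hypothetical name and is not part of create_schedule.py.

```python
from datetime import datetime, time, timedelta, timezone


def next_daily_run(now: datetime, at: time = time(2, 0)) -> datetime:
    """Return the next occurrence of `at` (UTC) strictly after `now`.

    `now` must be timezone-aware; it is converted to UTC before comparing.
    """
    now = now.astimezone(timezone.utc)
    candidate = datetime.combine(now.date(), at, tzinfo=timezone.utc)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has already passed
    return candidate
```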

6. Query

python scripts/query_cli.py --cloud azure --question "What are recent advances in LLMs?"
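
Azure Functions accepts a function-level key either as a `code` query parameter or in the `x-functions-key` header. The sketch below takes the query-parameter route, which matches the name `AZURE_FUNCTION_CODE`; whether query_cli.py does the same is an assumption, and `with_function_code` is a hypothetical helper.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_function_code(endpoint: str, code: str) -> str:
    """Return `endpoint` with the function key set as the `code` query parameter."""
    parts = urlsplit(endpoint)
    # Drop any existing code parameter so the key is never duplicated.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "code"]
    query.append(("code", code))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Function keys often end in `=` characters, which is why the value is URL-encoded rather than concatenated by hand.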

Azure `.env` variables

azure/scripts/update_env.py writes all deployment values automatically after terraform apply.

Variable	Source	Description
`AZURE_AI_FOUNDRY_ENDPOINT`	auto — `update_env.py`	Cognitive Services endpoint
`AZURE_AI_FOUNDRY_KEY`	auto — `update_env.py`	Cognitive Services API key
`AZURE_SEARCH_ENDPOINT`	auto — `update_env.py`	AI Search endpoint
`AZURE_SEARCH_KEY`	auto — `update_env.py`	AI Search admin key
`AZURE_FUNCTION_ENDPOINT`	auto — `update_env.py`	Function App URL
`AZURE_FUNCTION_CODE`	auto — `update_env.py`	Function-level auth key
`AZURE_EMBED_DEPLOYMENT`	manual	Embedding deployment name (default: `text-embedding-3-large`)
`AZURE_CHAT_DEPLOYMENT`	manual	Chat deployment name (default: `gpt-4o`)
`AZURE_SEARCH_INDEX`	manual	Index name (default: `mcrag-docs`)

GCP Setup

1. Authenticate

gcloud auth application-default login
gcloud config set project <your-project-id>

2. Deploy infrastructure

cd gcp/terraform
terraform init
terraform apply -var-file=envs/dev.tfvars

Before running terraform apply, edit gcp/terraform/envs/dev.tfvars to set your project_id.
Creates: Firestore (Native mode) + vector index, Vertex AI staging bucket, Cloud Function 2nd gen, service account + IAM bindings.

3. Populate `.env`

python gcp/scripts/update_env.py

Sets GCP_PROJECT, GCP_REGION, GCP_FIRESTORE_COLLECTION, GCP_FUNCTION_URL.

4. Run bulk ingest

python gcp/scripts/run_vertex_job.py --max-results 1000

Submits a Vertex AI Custom Training job that fetches arXiv papers, embeds them with text-embedding-004, and writes them to Firestore. Monitor progress in the GCP Console under Vertex AI → Training.
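
Bulk ingest jobs like this usually send chunks to the embedding model in batches rather than one request per chunk. The sketch below shows that batching step in isolation. The names `batched` and `embed_all`, the `embed_batch` callable, and the batch size of 16 are all assumptions; the real job's batching strategy is not described here.

```python
from typing import Callable, Iterator, Sequence


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(chunks: Sequence[str],
              embed_batch: Callable[[list[str]], list[list[float]]],
              batch_size: int = 16) -> list[list[float]]:
    """Embed every chunk, calling `embed_batch` once per batch and keeping order."""
    vectors: list[list[float]] = []
    for batch in batched(chunks, batch_size):
        vectors.extend(embed_batch(list(batch)))
    return vectors
```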

5. Query

python scripts/query_cli.py --cloud gcp --question "What are recent advances in LLMs?"
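
Behind every query, the retriever returns the stored chunks whose embeddings are closest to the question's embedding. The managed vector stores do this server-side; the brute-force version below only illustrates the idea. Cosine similarity is an assumed distance measure, and `cosine` and `top_k_ids` are hypothetical names.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_ids(query: list[float], docs: dict[str, list[float]], k: int = 5) -> list[str]:
    """Return the ids of the k documents most similar to the query vector."""
    scored = sorted(docs.items(), key=lambda item: cosine(query, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```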

GCP `.env` variables

gcp/scripts/update_env.py writes all deployment values automatically after terraform apply.

Variable	Source	Description
`GCP_PROJECT`	auto — `update_env.py`	GCP project ID
`GCP_REGION`	auto — `update_env.py`	Region (default: `australia-southeast1`)
`GCP_FIRESTORE_COLLECTION`	auto — `update_env.py`	Firestore collection (default: `mcrag-docs`)
`GCP_EMBED_MODEL`	auto — `update_env.py`	Embedding model (default: `text-embedding-004`)
`GCP_CHAT_MODEL`	auto — `update_env.py`	Gemini model (default: `gemini-2.5-flash`)
`GCP_FUNCTION_URL`	auto — `update_env.py`	Cloud Function invoke URL

Hugging Face Spaces

The Streamlit app (app/) is deployed on HF Spaces as a public portfolio demo.

1. Create a Space

Go to huggingface.co/new-space
Name it (e.g. multi-cloud-rag-demo)
Under SDK, select Docker → Streamlit
Set visibility to Public
Click Create Space

2. Configure `.env`

HF_TOKEN=hf_...                          # Settings → Access Tokens → Write token
HF_SPACE=<username>/multi-cloud-rag-demo  # <username>/<space-name>

3. Install dev dependencies

pip install -r requirements-dev.txt

4. Upload the app

python scripts/upload_to_hf_space.py

Reads HF_TOKEN and HF_SPACE from .env, uploads app/ to the Space repo. HF detects the push and rebuilds automatically (1–2 minutes).
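
The upload step can be approximated in a few lines with the `huggingface_hub` client. This is a sketch rather than the contents of upload_to_hf_space.py: the `parse_env` and `upload_app` helpers are hypothetical, and the real script may load .env differently (for example with python-dotenv).

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, ignoring blank lines, comments, and trailing ' # ...' notes."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.split(" #", 1)[0].strip().strip("'\"")
        values[key.strip()] = value
    return values


def upload_app(env: dict[str, str], folder: str = "app"):
    """Push the local app folder to the Space repo named in HF_SPACE."""
    # Imported lazily so parse_env stays usable without the third-party package.
    from huggingface_hub import HfApi

    api = HfApi(token=env["HF_TOKEN"])
    return api.upload_folder(
        folder_path=folder,
        repo_id=env["HF_SPACE"],
        repo_type="space",
    )
```

A typical call would be `upload_app(parse_env(open(".env", encoding="utf-8").read()))`.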

5. Add secrets

Go to your Space → Settings → Variables and secrets and add:

Secret	Value	Used by
`RAG_API_ENDPOINT`	API Gateway invoke URL	`pages/1_AWS.py`
`RAG_API_KEY`	API Gateway API key	`pages/1_AWS.py`
`AZURE_FUNCTION_ENDPOINT`	Azure Function URL	`pages/2_Azure.py`
`AZURE_FUNCTION_CODE`	Azure Function auth key	`pages/2_Azure.py`
`GCP_FUNCTION_URL`	Cloud Function URL	`pages/3_GCP.py`

Click Restart Space after adding secrets.

6. Verify

Open https://huggingface.co/spaces/<username>/multi-cloud-rag-demo:

Each cloud page shows a green "Endpoint configured" notice when its secret is set
Type a question and press Enter — the app queries the cloud backend and displays the answer with arXiv source IDs
Pages without secrets set show a warning and stop gracefully
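
The per-page check described above can be reduced to a small pure function. The secret names come from the table in step 5; the `REQUIRED` mapping and the `missing_secrets` function are hypothetical, and the pages' actual implementation may differ.

```python
import os
from typing import Mapping, Optional

# Secrets each cloud page needs, as listed in the table in step 5.
REQUIRED = {
    "AWS": ("RAG_API_ENDPOINT", "RAG_API_KEY"),
    "Azure": ("AZURE_FUNCTION_ENDPOINT", "AZURE_FUNCTION_CODE"),
    "GCP": ("GCP_FUNCTION_URL",),
}


def missing_secrets(cloud: str, env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the names of required secrets that are unset or empty for `cloud`."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED[cloud] if not env.get(name)]
```

In a Streamlit page, an empty result could lead to `st.success("Endpoint configured")`, while a non-empty one could call `st.warning(...)` followed by `st.stop()`.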

HF `.env` variables

Variable	Description
`HF_TOKEN`	HF write token (`hf_...`)
`HF_SPACE`	Space repo ID (e.g. `<username>/multi-cloud-rag-demo`)

Local development

pip install -r requirements.txt
streamlit run app/Home.py

Reads from .env automatically. Each cloud page is independent — pages for clouds without .env values set will show a warning.
