Add Azure Blob object storage backend #297
b92e884 to e8454a4
Follow-up pushed after live deployment validation: Blob mode now routes repo storage, startup/reindex listing, and /health/ready through the active Azure Blob backend instead of requiring a configured S3 endpoint. This avoids a hidden S3-compatible storage dependency when BIFROST_OBJECT_STORAGE_PROVIDER=azure_blob is selected. Additional local verification:
42e69e6 to 740118d
Having some issues with storage, will revise upstream PR once smoked out and resolved.
if not account_url or not container:
    logger.debug("Azure Blob not configured, skipping Blob fallback")
    _blob_available = False

if auth == "account_key":
    if not account_key:
        logger.debug("Azure Blob account key missing, skipping Blob fallback")
        _blob_available = False

    credential = DefaultAzureCredential()
else:
    logger.warning(f"Unsupported Azure Blob auth mode: {auth}")
    _blob_available = False
service_client = BlobServiceClient(account_url, credential=credential)
_blob_container_client = service_client.get_container_client(container)
_blob_available = True

    _blob_available = True
    return _blob_container_client
except Exception as e:
    _blob_available = False
def test_blob_not_found_logs_debug_not_warning(self, caplog):
    """BlobNotFound should be a cache miss, not a noisy warning."""
    import src.core.module_cache_sync as mod
Did you give up?
No, agent misfire.
@jackmusick - I had something like 60 worktrees on my machine that I'm cleaning up today. Composer 2.5 took "clean up anything stale from our fork" a bit too literally once or twice, including this branch.
No, not Codex! Edit: Composer
Cursor mega-struggling with my vague language today.
This should now be legitimately ready for review @jackmusick
Resolve files router conflict by adopting FILE_LOCATION_DESCRIPTION from main for request model field docs. Co-authored-by: Cursor <cursoragent@cursor.com>
)

assert config == {
    "api_key": "decrypted-api-key",
Worker subprocesses already lower-case BIFROST_OBJECT_STORAGE_PROVIDER when choosing Azure vs S3, but the API compared the raw env value. Mixed-case values like AZURE_BLOB could leave the API on S3 while workers used Azure.
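A minimal sketch of the normalization this commit message describes, so that the API and the workers agree on the provider. The helper name `resolve_storage_provider`, the `"s3"` return value, and the fall-back-to-S3 default are assumptions made for illustration. They are not taken from the PR diff.

```python
import os
from typing import Mapping, Optional

AZURE_BLOB = "azure_blob"
S3 = "s3"  # assumed name for the default provider


def resolve_storage_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the provider name, compared case-insensitively.

    Hypothetical helper: one shared function for the API and the workers,
    so both sides apply the same normalization.
    """
    env = os.environ if env is None else env
    raw = env.get("BIFROST_OBJECT_STORAGE_PROVIDER", "")
    # Strip whitespace and lower-case before comparing, so AZURE_BLOB,
    # Azure_Blob and azure_blob all select the same backend.
    provider = raw.strip().lower()
    # Assumption: anything other than azure_blob falls back to S3.
    return AZURE_BLOB if provider == AZURE_BLOB else S3
```

Calling the same helper from both the API and the worker entry points would remove the chance of the two sides disagreeing on mixed-case values.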
jackmusick left a comment
Good foundation here and I want to land it, but a few things are blocking and one of them bugs me.
First, the thing I had to go check: this CI run is green but it isn't testing Azure, or S3. The contract suite in test_object_storage_contract.py only adds the S3 and Azure backends to the param list when BIFROST_STORAGE_CONTRACT_* env vars are set (the _storage_backends() helper drops them with return None otherwise). CI doesn't set them, so the whole suite runs against the in-memory fake. So "contract tests pass" currently means "a dict behaves like a dict." We need either Azurite wired into CI, or a real Azure container the suite runs against — otherwise the Azure path has no actual coverage.
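One way to make that gap visible is to keep every backend in the parameter list and attach a skip reason, instead of dropping unconfigured backends silently. The sketch below is a rough illustration, not the suite's actual code. `BackendCase`, `storage_backend_cases`, and `covered_backends` are hypothetical names, and the two environment variable names are placeholders, since only the `BIFROST_STORAGE_CONTRACT_*` prefix is known here.

```python
from dataclasses import dataclass
from typing import List, Mapping, Optional

# Placeholder variable names: only the BIFROST_STORAGE_CONTRACT_* prefix
# is known, so these suffixes are hypothetical.
REQUIRED_ENV = {
    "s3": "BIFROST_STORAGE_CONTRACT_S3_PLACEHOLDER",
    "azure_blob": "BIFROST_STORAGE_CONTRACT_AZURE_PLACEHOLDER",
}


@dataclass(frozen=True)
class BackendCase:
    name: str
    skip_reason: Optional[str] = None  # None means the case will really run


def storage_backend_cases(env: Mapping[str, str]) -> List[BackendCase]:
    """List every backend; unconfigured ones carry a skip reason."""
    cases = [BackendCase("memory")]  # the in-memory fake always runs
    for name, var in REQUIRED_ENV.items():
        reason = None if env.get(var) else f"{var} is not set"
        cases.append(BackendCase(name, reason))
    return cases


def covered_backends(cases: List[BackendCase]) -> List[str]:
    """Names of the backends that would actually be exercised."""
    return [case.name for case in cases if case.skip_reason is None]
```

In a pytest suite, each case could become `pytest.param(case.name, marks=pytest.mark.skipif(case.skip_reason is not None, reason=case.skip_reason or ""))`, so an unconfigured backend shows up as skipped in the report rather than vanishing. A CI job could additionally assert that `covered_backends(...)` contains `azure_blob` and fail when it does not.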
Second, even with real-backend tests, two subsystems would slip through because they never go through the abstraction:
app_storage.py is untouched: the whole _apps/preview/publish path calls aiobotocore directly (create_client/copy_object/put_object/get_object/list_objects_v2). In Azure mode this still goes to S3.
app_bundler/__init__.py::_write_live() reaches into app_storage._get_client()/_bucket and does put_object itself. Same problem for live bundles, and it's poking AppStorageService's privates.
Both need to route through the provider-selected client, and the publish path needs a test under the real backend (the contract suite structurally can't catch these since they bypass it).
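As a rough sketch of what routing through the provider-selected client could look like, the caller can accept a small storage interface instead of reaching into another service's private client. Everything here is hypothetical: the `ObjectStore` protocol, `InMemoryStore`, the `write_live` signature, and the `live/` key layout are illustrations, not the project's real abstraction.

```python
import asyncio
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface that S3 and Azure backends would both satisfy."""

    async def put_object(self, key: str, data: bytes) -> None: ...

    async def get_object(self, key: str) -> bytes: ...


class InMemoryStore:
    """Stand-in backend for the sketch; a real one would wrap S3 or Azure Blob."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    async def put_object(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    async def get_object(self, key: str) -> bytes:
        return self._objects[key]


async def write_live(store: ObjectStore, app_id: str, filename: str, data: bytes) -> str:
    """Write a live bundle file through whichever store the caller selected."""
    # Assumed key layout, for illustration only.
    key = f"_apps/{app_id}/live/{filename}"
    await store.put_object(key, data)
    return key
```

Usage under these assumptions: `asyncio.run(write_live(InMemoryStore(), "demo", "bundle.js", b"..."))`. Because the store is passed in, the same publish-path test can run against the fake and against a real backend.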
Last, a scope question: git_repo_manager.py shells out to aws s3 sync — no Azure equivalent. Fine if repo-manager is out of scope for Azure, but let's say so in a comment rather than leave it as a landmine.
Needs a rebase too (conflicts against main).
Summary
BIFROST_OBJECT_STORAGE_PROVIDER=azure_blob
Verification
git diff --check
ruff check api/tests/contract/test_object_storage_contract.py api/tests/unit/services/test_file_storage_backend_selection.py
PYTHONPATH=api BIFROST_SECRET_KEY=0123456789abcdef0123456789abcdef python -m pytest api/tests/unit/services/test_file_storage_backend_selection.py api/tests/contract/test_object_storage_contract.py -q
Notes
api/tests/unit/routers/test_files_signed_url.py still does not collect in this workstation Python environment because aio_pika is missing.