Security: fix SSRF in POST /api/imports — validate VideoUrl against YouTube allowlist#178
Open
davidortinau wants to merge 4 commits into
Open
Security: fix SSRF in POST /api/imports — validate VideoUrl against YouTube allowlist#178davidortinau wants to merge 4 commits into
davidortinau wants to merge 4 commits into
Conversation
- Orchestration log: DX24 vocab crash fix workflow (commit c9b1d0a) - Session log: ExposureCount NULL→0 backfill, verified on device - Decision merge: inbox → decisions.md (3 entries: lexical patch, validation strategy, directive) - Deleted: inbox files + jayne-locale-screenshots (deduped) - Wash history: append CRITICAL RULE for SQLite migration defense-in-depth (DEFAULT + idempotent backfill) - No archival needed (no entries >30d old in decisions.md) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Orchestrated multi-agent session for new generic data import feature architecture: - Zoe (Lead): Full MVP architecture covering UX flows, content detection (heuristic-first, AI fallback), data model (no new tables), parsing pipeline, AI integration, service layer, scope phasing. Placement revision: separate `/import-content` page (not tab on existing `/import`), per Captain directive. - Wash (Data Layer Scout): Surveyed existing YouTube pipeline pattern, found file import UI/service gap, confirmed no schema changes needed, identified dedup inconsistency for standardization. - River (AI Strategy): Designed 5 AI prompt tasks with confidence thresholds, heuristic fallbacks, structured DTOs via `SendPrompt<T>`, permissive grading principles. - Kaylee (UI Scout): Documented form patterns, file picker abstractions, resource lookup UI, navigation placement. Recommendations for ContentImportService, InputFile, preview table reuse. - Copilot Directive: Confirmed scope separation — Video Subscriptions (existing) separate from generic content import (new) to keep features focused and growth paths independent. Artifacts: - 5 orchestration logs (one per agent) in .squad/orchestration-log/ - Session log in .squad/log/2026-04-24T22:31-data-import-architecture-plan.md - 6 decisions merged into .squad/decisions.md (168KB total) - All agent histories updated with import-plan context Next: Implementation team begins ContentImportService + ImportContent.razor page. River engineers 3 prompt templates (Format Inference, Content Classification, Extraction). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Captain resolved 7 open questions from architecture proposal: 1. Dedup default: Skip and reuse existing (safest, matches YouTube pipeline) 2. Source attribution: Deferred to v2 3. Single-column imports: Use AI to translate on import (new mvp-single-column-translate work item) 4. Mobile clipboard paste: Not needed for MVP 5. Replace-all mode: Deferred to v2 6. Renamed YouTube page: 'Media Import' at /media-import (not 'Video Subscriptions') 7. Separate-page placement: Confirmed — /import-content independent from /media-import Updated decisions.md with final rulings appended and prior superseded mentions annotated. All prior architecture content preserved; only updated with Captain's decisions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…n POST /api/imports Add a YouTube URL allowlist guard in StartImport before the import record is created or the pipeline is started. Any URL that is not an https:// URL on www.youtube.com, youtube.com, or youtu.be is rejected with 400 Bad Request. - ImportEndpoints.cs: insert 6-line guard after the null check (lines 67-72) - TestJwtGenerator.cs: add optional userProfileId parameter so integration tests can supply the claim required by the /api/imports endpoint - ImportEndpointsSsrfTests.cs: 13 new tests covering Azure IMDS, RFC-1918 addresses, loopback, non-https YouTube, non-YouTube HTTPS hosts, unparseable input, and three valid YouTube URLs that must pass the guard Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
An authenticated user could submit any URL to
POST /api/importsand the server would make an outbound HTTP request to it before performing any host or scheme validation. This includes Azure IMDS (http://169.254.169.254/metadata/instance), RFC-1918 addresses, and any other internal network resource — potentially exposing managed-identity credentials or internal services.Fix
Added a YouTube allowlist guard in
StartImport(before theVideoImportrecord is created or the pipeline is started). AnyVideoUrlthat does not satisfy all three conditions is rejected immediately with400 Bad Request:httpswww.youtube.com,youtube.com, oryoutu.beThe fix is API-boundary-only — no changes to
VideoImportPipelineServiceor downstream services were needed.Changes
src/SentenceStudio.Api/ImportEndpoints.cstests/SentenceStudio.Api.Tests/ImportEndpointsSsrfTests.cstests/SentenceStudio.Api.Tests/Infrastructure/TestJwtGenerator.csuserProfileIdparam added toGenerateTokenTests
ImportEndpointsSsrfTestscovers:Rejection (400) cases:
http://169.254.169.254/metadata/instance,http://169.254.169.254/http://10.0.0.1/internal,http://192.168.1.1/adminhttp://localhost/evilhttp://www.youtube.com/...https://evil.com/...ftp://www.youtube.com/...not-a-url-at-allAcceptance (non-400) cases:
https://www.youtube.com/watch?v=...https://youtube.com/watch?v=...https://youtu.be/...Verification
IChatClient/AiServiceDI error that exists onmainbefore this branch — not caused by this change