feat(shell): stage shell-task File inputs under NAMED_DIR#1229
Open
k1sauce wants to merge 2 commits into
Open
Conversation
Follow-up to flyteorg#1220 (which added the file_input_layout knob to ContainerTask but did not wire it into shell tasks). Shell tasks reference inputs through a glob (/var/inputs/<name>/*), so a single File must be staged inside a per-input directory under its original basename — otherwise CoPilot stages it extensionless and tools that sniff format by extension reject it (e.g. salmon on *.fastq.gz staged as "0"). - shell/_runtime.py: _ShellContainerTask now passes file_input_layout="NAMED_DIR" to ContainerTask, so shell tasks always stage File/list[File] into per-input dirs under their original basenames (the renderer already globs them). - _container.py: local docker staging is now layout-aware so `--local` matches remote CoPilot — NAMED_DIR keeps original basenames (index-prefix dedup on collision), DIRECT uses bare indices (0, 1, ...). Replaces the prior unconditional name-preserving behavior. - pyproject.toml: bump flyteidl2 pin to 2.0.24 — shell tasks always request NAMED_DIR, and the emit raises on older flyteidl2 that lacks DataLoadingConfig.file_input_layout. - tests: lock DIRECT->indices for raw container list[File]; exercise name/extension preservation and collision dedup via the public file_input_layout="NAMED_DIR" param; assert single File renders to a glob; add data_loading_config emit tests (DIRECT unset, shell emits NAMED_DIR). Signed-off-by: Kyle Hazen <kyle@union.ai>
samhita-alla
approved these changes
Jun 19, 2026
machichima
reviewed
Jun 23, 2026
Comment on lines
+290
to
+291
| if target.exists(): | ||
| target = pathlib.Path(local_dir) / f"{i}_{base}" |
Member
There was a problem hiding this comment.
I'm thinking about the collision here. If we have following files:
- file.txt
- 2_file.txt (somehow user's file name is like this)
- file.txt -> already exists, change to 2_file.txt (collision again, but we do not handle it)
Then 3. will produce duplicate target
Contributor
There was a problem hiding this comment.
you're right. would you mind fixing this, please?
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Follow-up to #1220 (which added the file_input_layout knob to ContainerTask but did not wire it into shell tasks). Shell tasks reference inputs through a glob (/var/inputs//*), so a single File must be staged inside a per-input directory under its original basename — otherwise CoPilot stages it extensionless and tools that sniff format by extension reject it (e.g. salmon on *.fastq.gz staged as "0").
--localmatches remote CoPilot — NAMED_DIR keeps original basenames (index-prefix dedup on collision), DIRECT uses bare indices (0, 1, ...). Replaces the prior unconditional name-preserving behavior.