feat(shell): stage shell-task File inputs under NAMED_DIR #1229

Open
k1sauce wants to merge 2 commits into flyteorg:main from k1sauce:kyle/fix-files-staging

k1sauce commented Jun 19, 2026

Contributor

Follow-up to #1220, which added the file_input_layout knob to ContainerTask but did not wire it into shell tasks. Shell tasks reference inputs through a glob (/var/inputs/<name>/*), so a single File must be staged inside a per-input directory under its original basename. Otherwise CoPilot stages it without an extension, and tools that sniff format by extension reject it (e.g. salmon on *.fastq.gz staged as "0").
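To illustrate why an extensionless staged name is a problem, here is a minimal sketch of extension-based format sniffing. The `sniff_format` helper and the example paths are hypothetical and stand in for whatever a downstream tool does; this is not salmon's actual detection logic.

```python
from pathlib import PurePosixPath


def sniff_format(path: str) -> str:
    """Hypothetical extension-based sniffing (illustration only)."""
    suffixes = PurePosixPath(path).suffixes
    # A gzipped FASTQ file is recognised purely by its trailing suffixes.
    if suffixes[-2:] == [".fastq", ".gz"]:
        return "gzipped FASTQ"
    return "unknown"


# Staged under its original basename: the format is recognised.
print(sniff_format("/var/inputs/reads/sample_R1.fastq.gz"))  # gzipped FASTQ
# Staged as a bare index: there is no extension left to inspect.
print(sniff_format("/var/inputs/reads/0"))  # unknown
```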

  • shell/_runtime.py: _ShellContainerTask now passes file_input_layout="NAMED_DIR" to ContainerTask, so shell tasks always stage File/list[File] into per-input dirs under their original basenames (the renderer already globs them).
  • _container.py: local docker staging is now layout-aware so --local matches remote CoPilot — NAMED_DIR keeps original basenames (index-prefix dedup on collision), DIRECT uses bare indices (0, 1, ...). Replaces the prior unconditional name-preserving behavior.
  • pyproject.toml: bump flyteidl2 pin to 2.0.24 — shell tasks always request NAMED_DIR, and the emit raises on older flyteidl2 that lacks DataLoadingConfig.file_input_layout.
  • tests: lock DIRECT->indices for raw container list[File]; exercise name/extension preservation and collision dedup via the public file_input_layout="NAMED_DIR" param; assert single File renders to a glob; add data_loading_config emit tests (DIRECT unset, shell emits NAMED_DIR).
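The layout behavior described in the list above can be modelled roughly as follows. This is a simplified sketch based only on the description, not the code in _container.py: the `staged_names` function is hypothetical, it works on name strings rather than real files, and it applies the single index-prefix rename exactly as described.

```python
from pathlib import PurePosixPath


def staged_names(sources: list[str], layout: str) -> list[str]:
    """Return the name each source would get inside its input directory.

    Illustrative model only; the function name and signature are assumptions.
    """
    names: list[str] = []
    for i, src in enumerate(sources):
        if layout == "DIRECT":
            # DIRECT: bare indices, so the original name and extension are dropped.
            name = str(i)
        elif layout == "NAMED_DIR":
            # NAMED_DIR: keep the original basename, prefixing the index on collision.
            name = PurePosixPath(src).name
            if name in names:
                name = f"{i}_{name}"
        else:
            raise ValueError(f"unknown layout: {layout!r}")
        names.append(name)
    return names


sources = ["/data/a/reads.fastq.gz", "/data/b/reads.fastq.gz"]
print(staged_names(sources, "DIRECT"))     # ['0', '1']
print(staged_names(sources, "NAMED_DIR"))  # ['reads.fastq.gz', '1_reads.fastq.gz']
```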

k1sauce and others added 2 commits June 18, 2026 17:23
Follow-up to flyteorg#1220 (which added the file_input_layout knob to ContainerTask
but did not wire it into shell tasks). Shell tasks reference inputs through a
glob (/var/inputs/<name>/*), so a single File must be staged inside a per-input
directory under its original basename — otherwise CoPilot stages it
extensionless and tools that sniff format by extension reject it (e.g. salmon
on *.fastq.gz staged as "0").

- shell/_runtime.py: _ShellContainerTask now passes file_input_layout="NAMED_DIR"
  to ContainerTask, so shell tasks always stage File/list[File] into per-input
  dirs under their original basenames (the renderer already globs them).
- _container.py: local docker staging is now layout-aware so `--local` matches
  remote CoPilot — NAMED_DIR keeps original basenames (index-prefix dedup on
  collision), DIRECT uses bare indices (0, 1, ...). Replaces the prior
  unconditional name-preserving behavior.
- pyproject.toml: bump flyteidl2 pin to 2.0.24 — shell tasks always request
  NAMED_DIR, and the emit raises on older flyteidl2 that lacks
  DataLoadingConfig.file_input_layout.
- tests: lock DIRECT->indices for raw container list[File]; exercise
  name/extension preservation and collision dedup via the public
  file_input_layout="NAMED_DIR" param; assert single File renders to a glob;
  add data_loading_config emit tests (DIRECT unset, shell emits NAMED_DIR).

Signed-off-by: Kyle Hazen <kyle@union.ai>
k1sauce marked this pull request as ready for review June 19, 2026 00:24
Comment on lines +290 to +291

    if target.exists():
        target = pathlib.Path(local_dir) / f"{i}_{base}"

Member

I'm thinking about the collision here. Suppose we have the following files:

  1. file.txt
  2. 2_file.txt (the user's file happens to be named like this)
  3. file.txt -> already exists, so it is renamed to 2_file.txt (this collides again, but we do not handle it)

Then item 3 produces a duplicate target.
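The scenario can be reproduced with a small model of the single-pass rename. This is a sketch under assumptions: `naive_targets` is a hypothetical stand-in that tracks names in a set instead of touching the filesystem, and it assumes `i` is the 0-based position from `enumerate`, which the two quoted lines do not show.

```python
def naive_targets(basenames: list[str]) -> list[str]:
    """Model of a one-shot index-prefix rename (illustration only)."""
    taken: set[str] = set()
    targets: list[str] = []
    for i, base in enumerate(basenames):
        target = base
        if target in taken:
            # Renamed once, but the new name is never re-checked.
            target = f"{i}_{base}"
        taken.add(target)
        targets.append(target)
    return targets


print(naive_targets(["file.txt", "2_file.txt", "file.txt"]))
# ['file.txt', '2_file.txt', '2_file.txt']
```

The third entry lands on the same name as the second, which is the duplicate target described above.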

Contributor

You're right. Would you mind fixing this, please?
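One possible shape for such a fix is to keep renaming until the candidate is unused. This is only a sketch, not the change that was made in the PR: `unique_targets` is a hypothetical helper, it tracks names in a set rather than calling `target.exists()`, and the `{i}_{attempt}_` prefix format is an arbitrary choice.

```python
def unique_targets(basenames: list[str]) -> list[str]:
    """Assign each basename a target name that no earlier file already uses."""
    taken: set[str] = set()
    targets: list[str] = []
    for i, base in enumerate(basenames):
        target = base
        attempt = 0
        # Loop instead of a single `if`, so a renamed candidate is re-checked.
        while target in taken:
            prefix = str(i) if attempt == 0 else f"{i}_{attempt}"
            target = f"{prefix}_{base}"
            attempt += 1
        taken.add(target)
        targets.append(target)
    return targets


print(unique_targets(["file.txt", "2_file.txt", "file.txt"]))
# ['file.txt', '2_file.txt', '2_1_file.txt']
```

Because every candidate is checked against all names assigned so far, the loop always ends with a name that is unique within the input directory.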
