[e2e] Add nightly e2e test for submitting examples to flink standalone cluster #708

Open
matrixsparse wants to merge 1 commit into apache:main from matrixsparse:feature/e2e-test-flink-standalone

@matrixsparse
Contributor

Purpose of change

Add automated e2e test for submitting Java/Python quickstart examples to a Flink standalone cluster, replacing the current manual verification process before each release.

Closes #642

Changes

  • e2e-test/test-scripts/test_submit_examples_to_flink.sh: Test script that installs Flink via install.sh, starts a standalone cluster, submits all 6 examples (3 Java + 3 Python), verifies submission success, and cleans up.
  • .github/workflows/nightly-e2e.yml: Nightly GitHub Actions workflow that runs the test daily at UTC 00:00, with manual trigger support.

Key design decisions

  • Uses tools/install.sh --non-interactive (from "[tools] Import Wizard for Installation Setup", #599) for Flink installation
  • Validates job submission success (not full execution), since examples depend on LLM APIs
  • Each example tested independently; one failure doesn't block others
  • Flink logs archived as artifacts on failure for debugging
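
A minimal sketch of how the "each example tested independently" decision could look in bash. This is illustrative only: `submit_example`, `run_all`, `RESULT_STATUS`, and the example names are hypothetical stand-ins, not the names used in the PR's script, and the real submission step (presumably a Flink job submission) is replaced by a stub that fails for one example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: function and example names are illustrative only.
set -e

RESULT_NAMES=()
RESULT_STATUS=()

# Stand-in for the real submission step; deliberately fails for one
# example so the isolation behaviour is visible.
submit_example() {
    [[ "$1" != "example_b" ]]
}

run_all() {
    local name
    for name in "$@"; do
        RESULT_NAMES+=("$name")
        # Calling the function as an `if` condition keeps `set -e`
        # from aborting the whole loop when one submission fails.
        if submit_example "$name"; then
            RESULT_STATUS+=("PASS")
        else
            RESULT_STATUS+=("FAIL")
        fi
    done
}

run_all example_a example_b example_c

for i in "${!RESULT_NAMES[@]}"; do
    printf '%s: %s\n' "${RESULT_NAMES[$i]}" "${RESULT_STATUS[$i]}"
done
```

The point of the sketch is that a failed submission is recorded rather than propagated, so the later examples still run and the overall pass/fail decision can be made once at the end.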

@matrixsparse
Contributor Author

Hi @wenjin272, this PR implements the CI pipeline for #642 as discussed. Could you PTAL when you have time?

@matrixsparse force-pushed the feature/e2e-test-flink-standalone branch from 8189bc8 to 704e45c on May 26, 2026 17:23
@github-actions (Bot) added the doc-label-missing, fixVersion/0.3.0, and priority/major labels on May 26, 2026
Collaborator

@weiqingy weiqingy left a comment

Thanks for taking this on — script reads cleanly. A few questions inline.

on:
  schedule:
    - cron: '0 0 * * *'
  workflow_dispatch:
Collaborator

Nightly + manual dispatch means a regression in examples/**, python/flink_agents/examples/**, or tools/install.sh can sit undetected for up to 24h. Would a path-filtered pull_request: trigger for those paths make sense here, with the cron staying as the safety net for transitive-dep changes? The Flink download + full build is non-trivial wall time per PR, so the nightly-only choice is defensible too — curious which trade-off you prefer.

failed=$((failed + 1))
fi
done
printf "\nTotal: %d Passed: %d Failed: %d\n" "$total" "$passed" "$failed"
Collaborator

If install_flink, build_project, stage_dist_jars, or start_cluster dies under set -e, no result is ever recorded, so print_summary walks an empty RESULT_NAMES and prints Total: 0 Passed: 0 Failed: 0 before cleanup propagates the original non-zero exit code. The CI job still fails on the exit code, but a person scanning the log sees a "zero failures" summary right before the red X, which is misleading when triaging a 45-minute nightly run.

One way it could read, if useful:

if (( total == 0 )); then
    log_error "Test setup failed before any example was submitted"
    return
fi

right above the existing if (( failed > 0 )) check.
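
For context, a rough sketch of how such a guard might sit inside a summary function. Only `RESULT_NAMES`, `log_error`, `print_summary`, and the `printf` line come from the script under review; the `RESULT_STATUS` array, the `log_error` body, and the counting loop are assumptions made so the example runs on its own. The sketch also places the guard before the totals line, so the zero-count summary is never printed; that placement is a choice of this sketch, not something the script is confirmed to do.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: RESULT_STATUS and the log_error body are assumed.
set -e

RESULT_NAMES=()
RESULT_STATUS=()

log_error() { printf 'ERROR: %s\n' "$*" >&2; }

print_summary() {
    local total=${#RESULT_NAMES[@]} passed=0 failed=0 status
    # Guard: if setup died before any example ran, say so instead of
    # printing a misleading "Total: 0 Passed: 0 Failed: 0" line.
    if (( total == 0 )); then
        log_error "Test setup failed before any example was submitted"
        return
    fi
    for status in "${RESULT_STATUS[@]}"; do
        if [[ "$status" == "PASS" ]]; then
            passed=$((passed + 1))
        else
            failed=$((failed + 1))
        fi
    done
    printf "\nTotal: %d Passed: %d Failed: %d\n" "$total" "$passed" "$failed"
}

print_summary   # no results recorded: logs the setup-failure message

RESULT_NAMES=(example_a example_b)
RESULT_STATUS=(PASS FAIL)
print_summary   # results recorded: prints the usual totals line
```

With the guard in place, a setup failure produces one explicit error line in the log, and the non-zero exit code from the original failure is still what fails the CI job.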

Labels

doc-label-missing: The Bot applies this label either because none or multiple labels were provided.
fixVersion/0.3.0: The feature or bug should be implemented/fixed in the 0.3.0 version.
priority/major: Default priority of the PR or issue.

Development

Successfully merging this pull request may close these issues.

[Tech Debt] Add e2e test for submitting example to flink standalone cluster.

2 participants