Add 6.1.0 and FIRST_CHECK support #467

Open
denys-fridman wants to merge 7 commits into mlcommons:master from denys-fridman:dfridman/first-eval-samples

@denys-fridman denys-fridman commented Jun 30, 2026

Contributor
  • Adds 6.1.0 folder for compliance_checker and rcp_checker
  • Removes llama 405b and dlrm from 6.1.0
  • Adds FIRST_CHECK to mlp_compliance.py, a check that runs only on the first occurrence of a key. closed_deepseekv3_671b.yaml uses it to require that the samples_count of the first eval_accuracy equals GBS*floor(42+24576/GBS). Related PR in the training repo: Deepseek: skip first N evals training#891
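
The samples_count rule can be worked through by hand. The following is a minimal sketch, assuming GBS is the global batch size and a positive integer; the function name `expected_first_eval_samples` is hypothetical and is not taken from this PR.

```python
def expected_first_eval_samples(gbs: int) -> int:
    """Return GBS * floor(42 + 24576 / GBS) for a positive integer GBS."""
    if gbs <= 0:
        raise ValueError("gbs must be a positive integer")
    # floor(42 + 24576 / GBS) == 42 + 24576 // GBS because 42 is an integer;
    # integer division avoids floating-point rounding.
    return gbs * (42 + 24576 // gbs)


# A few illustrative batch sizes (chosen for this example, not from the PR).
for gbs in (1024, 2048, 1000):
    print(gbs, expected_first_eval_samples(gbs))
```

For GBS = 1024 this gives 1024 * 66 = 67584. When GBS does not divide 24576 evenly (for example 1000), the floor drops the fractional part, giving 1000 * 66 = 66000.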

… rule

Add FIRST_CHECK to mlp_compliance.py, a check that runs only on the first
occurrence of a key. Use it in closed_deepseekv3_671b.yaml to require that the
first eval_accuracy samples_count equals GBS*floor(42+24576/GBS).
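
The "first occurrence only" behaviour described here could look roughly like the sketch below. This is not the code in mlp_compliance.py: the event shape, the `run_first_checks` helper, and the way checks are registered are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical simplified log record: (key, metadata dict).
Event = Tuple[str, dict]


def run_first_checks(
    events: Iterable[Event],
    first_checks: Dict[str, Callable[[dict], bool]],
) -> List[str]:
    """Apply each registered check only to the first event seen for its key."""
    seen: Set[str] = set()
    failures: List[str] = []
    for index, (key, metadata) in enumerate(events):
        if key in seen:
            continue  # later occurrences of a key are never checked
        seen.add(key)
        check = first_checks.get(key)
        if check is not None and not check(metadata):
            failures.append(f"first-occurrence check failed for {key!r} at event {index}")
    return failures


gbs = 1024  # assumed global batch size for this example
expected = gbs * (42 + 24576 // gbs)  # GBS*floor(42+24576/GBS)
checks = {"eval_accuracy": lambda m: m.get("samples_count") == expected}

log = [
    ("eval_accuracy", {"samples_count": expected}),      # first: checked
    ("eval_accuracy", {"samples_count": expected + 1}),  # later: ignored
]
print(run_first_checks(log, checks))
```

Only the first eval_accuracy event is validated, so a later event with a different samples_count does not produce a failure in this sketch.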
@denys-fridman denys-fridman requested review from a team as code owners June 30, 2026 05:38
@github-actions

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

…ck there

Copy training_6.0.0 to training_6.1.0. Remove FIRST_CHECK for first_eval_samples
from 6.0.0 closed DeepSeek V3 config, keeping it only in 6.1.0.
Add 6.1.0 to mlp_parser dispatch, rcp_checker supported set, package_checker
version sets, repo_checker choices, result_summarizer config, and add
verify_for_v6.1_training.sh script.
Drop config and RCP files for both removed benchmarks (llama 405b and dlrm),
remove them from the closed/open common benchmark allowlists, and fix the
common.yaml POST actions to reference training_6.1.0 instead of training_6.0.0.
@denys-fridman denys-fridman changed the title Add FIRST_CHECK support and DeepSeek V3 first_eval_samples compliance… Add 6.1.0 and FIRST_CHECK support Jun 30, 2026