add safe version of numpy.load#5500
Conversation
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
|
/claude review |
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
|
/ok to test 7ecc412 |
|
/claude review |
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
There was a problem hiding this comment.
Two additional items beyond the inline comment:
-
SafeUnpicklerallowlist missing numpy types —SafeUnpickler._SAFE_CLASSESwas built for FP8 checkpoint data and contains only builtins/torch/TE types. Ifpickle.loadis actually triggered duringsafe_numpy_load(e.g., an.npyfile with object-dtype arrays), numpy's reconstruction types (numpy.core.multiarray._reconstruct,numpy.ndarray,numpy.dtype) are not in the allowlist and will raiseUnpicklingError. For the current call sites (integer index arrays in.npyformat), the pickle path is likely never hit, so this won't break anything today — but it means the safety wrapper would reject legitimate numpy pickle data too. Worth confirming this is the intended behavior (block all pickle in numpy loads) and documenting it if so. -
No dedicated test for
safe_numpy_load— The test change intest_builder.pyswitches tosafe_numpy_loadbut doesn't verify the safety mechanism itself. Since this is a security hardening change, consider adding a unit test that confirmssafe_numpy_loadblocks a crafted malicious pickle payload while still correctly loading a normal numpy array.
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
|
/ok to test 6bbcce6 |
Signed-off-by: dimapihtar <dpykhtar@nvidia.com>
|
/ok to test e47712b |
What does this PR do ?
Security team request. Adds safe version of
numpy.loadwhich calls vulnerablepickle.load.Issue tracking
For PRs from open-source community contributors:
Linked issue:
Contribution process
Pre-checks
Code review
Feel free to message or comment @NVIDIA/mcore-oncall to help accelerate your merge into main. The less complex your PR is, the faster it will be approved and merged!
All PRs start as draft. If you open a non-draft PR, it will be automatically converted to draft.
Step 1: Mark PR as "Ready for Review"
.github/CODEOWNERS.Final Review might get declined if these requirements are not fulfilled.
Step 2: Final Review
For PRs that change
megatron/core, once all expert reviewers have approved, theFinal Reviewlabel is applied automatically and final reviewers are assigned.For PRs outside
megatron/core, this step is skipped.Step 3: Approved
Once all required reviewers have approved, the
Approvedlabel is applied automatically.Merge
Any member of mcore-engineers will be able to merge your PR.