SOF-7914 Update: use multiple MLFFs in Relax NB#327
Conversation
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds MLFF factory and install-profile selection, unified Pyodide patch registry and common torch patches, backend-specific Pyodide adapters/loaders for MACE/UMA/MatterSim/NequIP, package-install deduplication and selective reinstall, notebooks for MLFF relaxation and energy analysis, and LFS model/wheel pointers. ChangesMLFF Structure Relaxation Workflow
Sequence Diagram(s)sequenceDiagram
participant User as Notebook User
participant Notebook
participant Install as Install Handler
participant Patches
participant MLFF as MLFF Factory
participant ASE
participant Calc as MLFF Calculator
User->>Notebook: select MLFF_NAME, RELAXATION_PARAMETERS
Notebook->>MLFF: get_mlff_install_profiles(MLFF_NAME)
MLFF-->>Notebook: install profile string
Notebook->>Install: install_packages(profile)
Install-->>Notebook: packages installed
rect rgba(200, 150, 200, 0.5)
Notebook->>Patches: apply_all_patches(MLFF_NAME) [if Pyodide]
Patches->>Patches: apply_common_torch_patches()
Patches->>Patches: apply_mlff_patches(MLFF_NAME)
Patches-->>Notebook: patches applied
end
Notebook->>Notebook: load Material from FOLDER or Materials.get_by_name_first_match
Notebook->>Notebook: convert Material to ASE atoms
Notebook->>MLFF: create_mlff_calculator(MLFF_NAME, MLFF_SETTINGS)
MLFF-->>Calc: instantiated calculator
Notebook->>ASE: atoms.set_calculator(Calc)
Notebook->>ASE: BFGS relaxation with energy callback
ASE->>Calc: compute_energy(atoms)
Calc-->>ASE: energy value
ASE-->>Notebook: relaxed atoms
Notebook->>Notebook: compute interlayer distances and delta energies
Notebook->>User: display relaxed structure and metrics
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🧹 Nitpick comments (2)
src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py (2)
1118-1126: 💤 Low valueGuard check never matches —
_original_scriptattribute is never set.The check
hasattr(torch.jit, "_original_script")is intended to prevent double-patching, but nothing ever setstorch.jit._original_script. The_orig_scriptlocal variable is unused. If bothpatch_mattersim_deps()andpatch_nequip_deps()are called,torch.jit.scriptgets wrapped twice.♻️ Proposed fix to make guard effective
if not hasattr(torch.jit, "_original_script"): - _orig_script = torch.jit.script + torch.jit._original_script = torch.jit.script def _noop_script(obj=None, *a, **k): if obj is not None: return obj return lambda fn: fn torch.jit.script = _noop_script🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py` around lines 1118 - 1126, The guard currently checks hasattr(torch.jit, "_original_script") but never sets that attribute, so torch.jit.script can be wrapped multiple times (see _orig_script, torch.jit.script and the patch_mattersim_deps/patch_nequip_deps callers); fix by storing the original function onto torch.jit (e.g. assign torch.jit._original_script = _orig_script) the first time you patch and then replace torch.jit.script with the noop only if torch.jit._original_script is not present, and when unpatching or restoring use torch.jit._original_script to restore the original implementation.
1334-1451: ⚖️ Poor tradeoffSignificant code duplication with
patch_fairchem_deps().The
_hydra_instantiatefunction (lines 1347-1369), omegaconf stubs (lines 1386-1405), and tqdm stubs (lines 1443-1451) are nearly identical to implementations inpatch_fairchem_deps(). Consider extracting shared helpers to reduce maintenance burden.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py` around lines 1334 - 1451, Duplicate logic (hydra instantiate, omegaconf stubs, tqdm stubs) appears here and in patch_fairchem_deps(); extract shared helpers to centralize behavior: create reusable functions (e.g., make_hydra_instantiate(), make_omegaconf_stub(), make_tqdm_stub() or a generic make_stub_module wrapper that returns configured modules) and call them from both locations instead of reimplementing _hydra_instantiate, omegaconf_mod creation, and _tqdm_passthrough; reference existing symbols _hydra_instantiate, omegaconf_mod, tqdm_mod, _tqdm_passthrough, _make_stub_module and the function patch_fairchem_deps to locate where to replace duplicated blocks. Ensure the new helpers preserve current behavior (recursive instantiation, OmegaConf.create/to_container/register_new_resolver, tqdm passthrough) and update imports/assignments so sys.modules entries remain identical.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@other/experiments/jupyterlite/relax_structure_with_mlff.ipynb`:
- Line 294: The current calculation uses area =
ase_original_interface.get_volume() / ase_original_interface.cell[2, 2], which
assumes an orthogonal cell; replace it by computing the in-plane area from the
lattice vectors (use ase_original_interface.cell and take the cross product of
the first two cell vectors) so the area works for tilted/non-orthogonal cells
(i.e., compute norm(cross(cell[0], cell[1])) and assign that to area).
- Around line 281-301: Before computing energies and dividing by atom counts,
validate that filter_atoms_by_tag returned non-empty Atom objects for
SUBSTRATE_TAG and FILM_TAG: check substrate_relaxed and layer_relaxed (and
optionally substrate_original/layer_original) to ensure n_substrate and n_layer
(from get_global_number_of_atoms()) are > 0 and raise a clear exception if not;
update the block around filter_atoms_by_tag, calculate_energy, and the
effective_delta_relaxed calculation to perform these checks and fail fast with a
descriptive error mentioning the missing tag (SUBSTRATE_TAG or FILM_TAG) and the
interface variables ase_original_interface / ase_final_interface so callers can
locate the problem.
- Line 314: The displayed conversion uses the wrong factor (dividing by 0.16);
update the f-string that prints effective_delta_relaxed (the string containing
"Effective relaxed Delta per area") to convert eV/Ų to J/m² using the correct
factor 16.021766 (i.e., multiply effective_delta_relaxed by 16.021766 when
showing J/m²) so the message shows accurate units.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py`:
- Around line 20-22: resolved_model_path may be None when both model_path and
checkpoint are unset, causing
MatterSimCalculator.from_checkpoint(load_path=str(resolved_model_path), ...) to
receive the string "None"; add a guard after computing resolved_model_path
(checking model_path and checkpoint) and before calling
MatterSimCalculator.from_checkpoint: if resolved_model_path is falsy, raise a
clear ValueError (or provide a valid default) explaining that a model_path or
checkpoint is required (mentioning model_path/checkpoint and
MatterSimCalculator.from_checkpoint in the message) so the failure is explicit
and debuggable.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/nequip.py`:
- Around line 38-41: The code currently sets resolved_model_path = model_path or
checkpoint and then calls load_nequip_model(str(resolved_model_path)), which
ends up passing "None" to torch.load if both model_path and checkpoint are
unset; update the non-Pyodide path to validate inputs and raise a ValueError
when both model_path and checkpoint are falsy before calling load_nequip_model;
i.e., check resolved_model_path (or check model_path and checkpoint) and raise a
clear ValueError mentioning that a model path or checkpoint is required, then
proceed to call load_nequip_model and create_nequip_calculator_from_model as
before.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py`:
- Around line 584-585: The current numba_mod.njit implementation returns the
same wrapper in both branches, losing proper decorator semantics; change
numba_mod.njit so that if called directly as `@njit` (i.e., first positional arg
exists and is callable) it returns that function unchanged, otherwise it returns
a decorator that accepts a function and returns it; keep numba_mod.jit aliased
to numba_mod.njit. Use the symbols numba_mod.njit and numba_mod.jit to locate
and update the logic.
- Around line 1190-1208: The _scatter function currently calls index.max() when
dim_size is None which raises on empty tensors; update the dim_size
determination in _scatter to check for an empty index (e.g., index.numel() == 0)
and handle that case before calling index.max(): if index is empty set dim_size
to 0 (or another appropriate minimal size for your use-case) so subsequent
tensor allocations like out = _torch.zeros(...) and count logic don't call max
on an empty tensor; adjust the mean branch to handle dim_size == 0 (early return
of empty out) so functions _scatter, variable index, and count handling remain
safe for empty inputs.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py`:
- Around line 23-26: create_uma_calculator currently sets resolved_model_path =
model_path or checkpoint and immediately calls
load_predict_unit(str(resolved_model_path), ...), which will pass the string
"None" if both are missing; add an explicit guard in create_uma_calculator that,
when NOT running under Pyodide, checks that either model_path or checkpoint is
provided (i.e., resolved_model_path is not None) and raises a clear error (or
returns early) before calling load_predict_unit; reference the
resolved_model_path variable and the load_predict_unit and FAIRChemCalculator
creation so the check happens right before the load_predict_unit invocation.
---
Nitpick comments:
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py`:
- Around line 1118-1126: The guard currently checks hasattr(torch.jit,
"_original_script") but never sets that attribute, so torch.jit.script can be
wrapped multiple times (see _orig_script, torch.jit.script and the
patch_mattersim_deps/patch_nequip_deps callers); fix by storing the original
function onto torch.jit (e.g. assign torch.jit._original_script = _orig_script)
the first time you patch and then replace torch.jit.script with the noop only if
torch.jit._original_script is not present, and when unpatching or restoring use
torch.jit._original_script to restore the original implementation.
- Around line 1334-1451: Duplicate logic (hydra instantiate, omegaconf stubs,
tqdm stubs) appears here and in patch_fairchem_deps(); extract shared helpers to
centralize behavior: create reusable functions (e.g., make_hydra_instantiate(),
make_omegaconf_stub(), make_tqdm_stub() or a generic make_stub_module wrapper
that returns configured modules) and call them from both locations instead of
reimplementing _hydra_instantiate, omegaconf_mod creation, and
_tqdm_passthrough; reference existing symbols _hydra_instantiate, omegaconf_mod,
tqdm_mod, _tqdm_passthrough, _make_stub_module and the function
patch_fairchem_deps to locate where to replace duplicated blocks. Ensure the new
helpers preserve current behavior (recursive instantiation,
OmegaConf.create/to_container/register_new_resolver, tqdm passthrough) and
update imports/assignments so sys.modules entries remain identical.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c418c69d-d06c-401b-8022-3e0ede00086f
📒 Files selected for processing (7)
other/experiments/jupyterlite/relax_structure_with_mlff.ipynbsrc/py/mat3ra/notebooks_utils/mlff.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/mace.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/nequip.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/torch.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/uma.py
| "substrate_original = filter_atoms_by_tag(ase_original_interface, SUBSTRATE_TAG)\n", | ||
| "layer_original = filter_atoms_by_tag(ase_original_interface, FILM_TAG)\n", | ||
| "substrate_relaxed = filter_atoms_by_tag(ase_final_interface, SUBSTRATE_TAG)\n", | ||
| "layer_relaxed = filter_atoms_by_tag(ase_final_interface, FILM_TAG)\n", | ||
| "\n", | ||
| "original_substrate_energy = calculate_energy(substrate_original, calculator)\n", | ||
| "original_layer_energy = calculate_energy(layer_original, calculator)\n", | ||
| "relaxed_substrate_energy = calculate_energy(substrate_relaxed, calculator)\n", | ||
| "relaxed_layer_energy = calculate_energy(layer_relaxed, calculator)\n", | ||
| "\n", | ||
| "delta_original = calculate_delta_energy(original_energy, original_substrate_energy, original_layer_energy)\n", | ||
| "delta_relaxed = calculate_delta_energy(relaxed_energy, relaxed_substrate_energy, relaxed_layer_energy)\n", | ||
| "\n", | ||
| "area = ase_original_interface.get_volume() / ase_original_interface.cell[2, 2]\n", | ||
| "n_interface = ase_final_interface.get_global_number_of_atoms()\n", | ||
| "n_substrate = substrate_relaxed.get_global_number_of_atoms()\n", | ||
| "n_layer = layer_relaxed.get_global_number_of_atoms()\n", | ||
| "effective_delta_relaxed = (\n", | ||
| " relaxed_energy / n_interface\n", | ||
| " - (relaxed_substrate_energy / n_substrate + relaxed_layer_energy / n_layer)\n", | ||
| " ) / (2 * area)\n", |
There was a problem hiding this comment.
Add explicit tag validation before energy decomposition.
If the interface lacks one of the expected tags, n_substrate or n_layer can be zero, causing failures in Line 300 and misleading energy outputs. Fail fast with a clear error.
💡 Proposed fix
substrate_original = filter_atoms_by_tag(ase_original_interface, SUBSTRATE_TAG)
layer_original = filter_atoms_by_tag(ase_original_interface, FILM_TAG)
substrate_relaxed = filter_atoms_by_tag(ase_final_interface, SUBSTRATE_TAG)
layer_relaxed = filter_atoms_by_tag(ase_final_interface, FILM_TAG)
+
+if len(substrate_original) == 0 or len(layer_original) == 0:
+ raise ValueError(
+ "Missing substrate/film tags in the original interface. "
+ "Expected non-empty groups for SUBSTRATE_TAG and FILM_TAG."
+ )
+if len(substrate_relaxed) == 0 or len(layer_relaxed) == 0:
+ raise ValueError(
+ "Missing substrate/film tags in the relaxed interface. "
+ "Expected non-empty groups for SUBSTRATE_TAG and FILM_TAG."
+ )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@other/experiments/jupyterlite/relax_structure_with_mlff.ipynb` around lines
281 - 301, Before computing energies and dividing by atom counts, validate that
filter_atoms_by_tag returned non-empty Atom objects for SUBSTRATE_TAG and
FILM_TAG: check substrate_relaxed and layer_relaxed (and optionally
substrate_original/layer_original) to ensure n_substrate and n_layer (from
get_global_number_of_atoms()) are > 0 and raise a clear exception if not; update
the block around filter_atoms_by_tag, calculate_energy, and the
effective_delta_relaxed calculation to perform these checks and fail fast with a
descriptive error mentioning the missing tag (SUBSTRATE_TAG or FILM_TAG) and the
interface variables ase_original_interface / ase_final_interface so callers can
locate the problem.
| "delta_original = calculate_delta_energy(original_energy, original_substrate_energy, original_layer_energy)\n", | ||
| "delta_relaxed = calculate_delta_energy(relaxed_energy, relaxed_substrate_energy, relaxed_layer_energy)\n", | ||
| "\n", | ||
| "area = ase_original_interface.get_volume() / ase_original_interface.cell[2, 2]\n", |
There was a problem hiding this comment.
Use a general in-plane area formula, not volume / c_z.
Line 294 assumes an orthogonal cell and c-axis aligned with z. For tilted/non-orthogonal cells this gives the wrong interface area and corrupts all per-area metrics.
💡 Proposed fix
+import numpy as np
- area = ase_original_interface.get_volume() / ase_original_interface.cell[2, 2]
+cell = ase_original_interface.cell
+area = np.linalg.norm(np.cross(cell[0], cell[1]))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@other/experiments/jupyterlite/relax_structure_with_mlff.ipynb` at line 294,
The current calculation uses area = ase_original_interface.get_volume() /
ase_original_interface.cell[2, 2], which assumes an orthogonal cell; replace it
by computing the in-plane area from the lattice vectors (use
ase_original_interface.cell and take the cross product of the first two cell
vectors) so the area works for tilted/non-orthogonal cells (i.e., compute
norm(cross(cell[0], cell[1])) and assign that to area).
| "print(f\"Relaxed Delta per area: {delta_relaxed / area:.4f} eV/Ang^2\")\n", | ||
| "print(f\"Relaxed interface energy: {relaxed_energy:.4f} eV\")\n", | ||
| "print(\n", | ||
| " f\"Effective relaxed Delta per area: {effective_delta_relaxed:.4f} eV/Ang^2 ({effective_delta_relaxed / 0.16:.4f} J/m^2)\")" |
There was a problem hiding this comment.
Fix the eV/Ų → J/m² conversion factor.
Line 314 divides by 0.16, which is not the correct conversion.
1 eV/Ų = 16.021766 J/m².
💡 Proposed fix
+EV_PER_ANG2_TO_J_PER_M2 = 16.021766
print(
- f"Effective relaxed Delta per area: {effective_delta_relaxed:.4f} eV/Ang^2 ({effective_delta_relaxed / 0.16:.4f} J/m^2)")
+ f"Effective relaxed Delta per area: {effective_delta_relaxed:.4f} eV/Ang^2 "
+ f"({effective_delta_relaxed * EV_PER_ANG2_TO_J_PER_M2:.4f} J/m^2)"
+)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@other/experiments/jupyterlite/relax_structure_with_mlff.ipynb` at line 314,
The displayed conversion uses the wrong factor (dividing by 0.16); update the
f-string that prints effective_delta_relaxed (the string containing "Effective
relaxed Delta per area") to convert eV/Ų to J/m² using the correct factor
16.021766 (i.e., multiply effective_delta_relaxed by 16.021766 when showing
J/m²) so the message shows accurate units.
| resolved_model_path = model_path or checkpoint | ||
|
|
||
| return MatterSimCalculator.from_checkpoint(load_path=str(resolved_model_path), device=device, **kwargs) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Show the exact create_mattersim_calculator implementation around the reported lines
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py"
echo "=== $FILE ==="
nl -ba "$FILE" | sed -n '1,120p'
# 2) Find all usages of create_mattersim_calculator in the repo
echo
echo "=== rg usages of create_mattersim_calculator ==="
rg -n "create_mattersim_calculator\s*\(" -S .
# 3) Inspect the jupyterlite notebook settings for mattersim MLFF_SETTINGS (look for "mattersim")
NB="other/experiments/jupyterlite/relax_structure_with_mlff.ipynb"
if [ -f "$NB" ]; then
echo
echo "=== grep mattersim config in $NB ==="
# Show only small relevant sections to avoid huge output
rg -n "\"mattersim\"|mattersim" "$NB" || true
# Extract a likely JSON path containing MLFF_SETTINGS["mattersim"]
echo
echo "=== extract nearby JSON around \"mattersim\" occurrences ==="
# Print ~80 lines around first occurrences
python3 - <<'PY'
import json, re
from pathlib import Path
nb_path = Path("other/experiments/jupyterlite/relax_structure_with_mlff.ipynb")
data = json.loads(nb_path.read_text(encoding="utf-8"))
matches = []
for cell in data.get("cells", []):
src = "".join(cell.get("source", []))
if "MLFF_SETTINGS" in src or "mattersim" in src or "mattersim" in src.lower():
if "mattersim" in src:
matches.append(src)
# Fallback: if not found in cells, scan full text for a segment
if not matches:
text = nb_path.read_text(encoding="utf-8")
idx = text.lower().find("mattersim")
if idx != -1:
start = max(0, idx-500)
end = min(len(text), idx+2000)
print(text[start:end])
else:
# Print only short excerpts
for i, m in enumerate(matches[:6]):
m2 = m
if len(m2) > 2000: m2 = m2[:2000] + "\n... (truncated)"
print(f"\n--- cell excerpt {i+1} ---\n{m2}\n")
PY
else
echo "Notebook not found at $NB"
fiRepository: Exabyte-io/api-examples
Length of output: 196
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 0) Locate the notebook file (in case the path differs)
echo "=== fd relax_structure_with_mlff.ipynb ==="
fd "relax_structure_with_mlff\.ipynb$" . || true
# 1) Show create_mattersim_calculator implementation around the reported lines
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py"
echo
echo "=== $FILE (around lines 1-80) ==="
sed -n '1,80p' "$FILE"
echo
echo "=== $FILE (around lines 15-35) ==="
sed -n '15,35p' "$FILE"
# 2) Find all usages of create_mattersim_calculator in the repo
echo
echo "=== rg usages of create_mattersim_calculator ==="
rg -n "create_mattersim_calculator\s*\(" -S .
# 3) Find possible non-pyodide call paths for mattersim (where non-pyodide branch might be hit)
echo
echo "=== rg for get_mattersim_model_pyodide / MatterSimCalculator ==="
rg -n "get_mattersim_model_pyodide|MatterSimCalculator" -S .
# 4) Inspect MLFF_SETTINGS["mattersim"] inside the Jupyterlite notebook (if present)
NB_CAND="other/experiments/jupyterlite/relax_structure_with_mlff.ipynb"
if [ -f "$NB_CAND" ]; then
NB="$NB_CAND"
else
# try any matching notebook found via fd
NB="$(fd "relax_structure_with_mlff\.ipynb$" . -1 | head -n 1 || true)"
fi
echo
echo "=== notebook MLFF_SETTINGS extraction (if found) ==="
if [ -n "${NB:-}" ] && [ -f "$NB" ]; then
echo "Using notebook: $NB"
python3 - <<'PY'
import json, re
from pathlib import Path
nb_path = Path("other/experiments/jupyterlite/relax_structure_with_mlff.ipynb")
# if the above is not correct, the script will rewrite below:
# (we pass via env)
PY
else
echo "Notebook not found; skipping MLFF_SETTINGS extraction."
fiRepository: Exabyte-io/api-examples
Length of output: 4073
🏁 Script executed:
#!/bin/bash
set -euo pipefail
NB="other/experiments/jupyterlite/relax_structure_with_mlff.ipynb"
FILE="src/py/mat3ra/notebooks_utils/primitive/environment.py"
echo "=== inspect is_pyodide_environment() ==="
sed -n '1,200p' "$FILE" | nl -ba | sed -n '1,160p' || sed -n '1,160p' "$FILE"
echo
echo "=== rg for create_mattersim_calculator in the notebook ==="
rg -n "create_mattersim_calculator|MLFF_SETTINGS|mattersim" -S "$NB" || true
echo
echo "=== extract MLFF_SETTINGS[\"mattersim\"] and related bits ==="
python3 - <<'PY'
import json, re
from pathlib import Path
nb_path = Path("other/experiments/jupyterlite/relax_structure_with_mlff.ipynb")
nb = json.loads(nb_path.read_text(encoding="utf-8"))
def join_source(cell):
src = cell.get("source", [])
if isinstance(src, list):
return "".join(src)
return str(src)
hits = []
for i, cell in enumerate(nb.get("cells", [])):
if cell.get("cell_type") != "code":
continue
src = join_source(cell)
if "MLFF_SETTINGS" in src or "mattersim" in src.lower():
hits.append((i+1, src))
print(f"code cells with MLFF_SETTINGS or mattersim: {len(hits)}")
for (cell_no, src) in hits[:20]:
print("\n--- code cell", cell_no, "---")
# keep output short
if len(src) > 4000:
print(src[:4000] + "\n...<truncated>...")
else:
print(src)
# Try to locate a literal JSON/Python dict for MLFF_SETTINGS in the raw text
text = nb_path.read_text(encoding="utf-8")
m = re.search(r"MLFF_SETTINGS\s*=\s*({.*?})\s*[\r\n]", text, flags=re.S)
if m:
raw = m.group(1)
print("\n=== Found MLFF_SETTINGS assignment (best-effort excerpt) ===")
print(raw[:5000] + ("\n...<truncated>..." if len(raw) > 5000 else ""))
else:
print("\n=== Could not find a simple MLFF_SETTINGS = {...} literal; falling back to targeted scan ===")
# Target scan for mattersim block near "mattersim"
idx = text.lower().find('"mattersim"')
if idx == -1:
idx = text.lower().find("mattersim")
if idx != -1:
start = max(0, idx-1200)
end = min(len(text), idx+2400)
excerpt = text[start:end]
print("\n=== Text excerpt around mattersim ===")
print(excerpt[:8000] + ("\n...<truncated>..." if len(excerpt) > 8000 else ""))
else:
print("No mattersim string found in notebook text (unexpected).")
PYRepository: Exabyte-io/api-examples
Length of output: 11950
Guard missing checkpoint/path before loading MatterSim model (non-Pyodide).
When both model_path and checkpoint are unset, resolved_model_path becomes None, and load_path=str(resolved_model_path) passes the literal "None" into MatterSimCalculator.from_checkpoint(...), leading to an opaque downstream failure. This can occur for non-Pyodide runs because the Jupyterlite notebook config for mattersim provides only {"model": "1m", "device": "cpu"} (no model_path/checkpoint).
Suggested fix
def create_mattersim_calculator(model="1m", device="cpu", model_path=None, checkpoint=None, **kwargs):
if is_pyodide_environment():
return get_mattersim_model_pyodide(model=model, device=device, **kwargs)
resolved_model_path = model_path or checkpoint
+ if resolved_model_path is None:
+ raise ValueError("For non-Pyodide runs, provide `model_path` or `checkpoint`.")
return MatterSimCalculator.from_checkpoint(load_path=str(resolved_model_path), device=device, **kwargs)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| resolved_model_path = model_path or checkpoint | |
| return MatterSimCalculator.from_checkpoint(load_path=str(resolved_model_path), device=device, **kwargs) | |
| resolved_model_path = model_path or checkpoint | |
| if resolved_model_path is None: | |
| raise ValueError("For non-Pyodide runs, provide `model_path` or `checkpoint`.") | |
| return MatterSimCalculator.from_checkpoint(load_path=str(resolved_model_path), device=device, **kwargs) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py` around lines 20
- 22, resolved_model_path may be None when both model_path and checkpoint are
unset, causing
MatterSimCalculator.from_checkpoint(load_path=str(resolved_model_path), ...) to
receive the string "None"; add a guard after computing resolved_model_path
(checking model_path and checkpoint) and before calling
MatterSimCalculator.from_checkpoint: if resolved_model_path is falsy, raise a
clear ValueError (or provide a valid default) explaining that a model_path or
checkpoint is required (mentioning model_path/checkpoint and
MatterSimCalculator.from_checkpoint in the message) so the failure is explicit
and debuggable.
| resolved_model_path = model_path or checkpoint | ||
|
|
||
| nequip_model = load_nequip_model(str(resolved_model_path)) | ||
| return create_nequip_calculator_from_model(nequip_model, device=device) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate the target file
ls -la src/py/mat3ra/notebooks_utils/pyodide/packages/nequip.py
# Show the relevant function and surrounding context (small ranges)
python3 - <<'PY'
from pathlib import Path
p = Path("src/py/mat3ra/notebooks_utils/pyodide/packages/nequip.py")
text = p.read_text(encoding="utf-8").splitlines()
for i in range(1, len(text)+1):
if "def create_nequip_calculator" in text[i-1]:
start = max(1, i-20)
end = min(len(text), i+80)
for j in range(start, end+1):
print(f"{j}:{text[j-1]}")
break
else:
print("create_nequip_calculator not found")
PY
# Find and show load_nequip_model implementation/usages
rg -n "def load_nequip_model|load_nequip_model\(" src/py/mat3ra/notebooks_utils/pyodide/packages/nequip.py src/py/mat3ra/notebooks_utils/pyodide/packages -S
# If load_nequip_model is defined elsewhere, locate it
rg -n "def load_nequip_model" src/py/mat3ra/notebooks_utils/pyodide/packages -S || true
rg -n "load_nequip_model" src/py/mat3ra/notebooks_utils/pyodide/packages -S
# Check call sites that might pass None explicitly
rg -n "create_nequip_calculator\(" src/py/mat3ra -SRepository: Exabyte-io/api-examples
Length of output: 3026
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show load_nequip_model implementation (small, targeted range)
python3 - <<'PY'
from pathlib import Path
p = Path("src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py")
lines = p.read_text(encoding="utf-8").splitlines()
# find def load_nequip_model
for idx, line in enumerate(lines, start=1):
if line.startswith("def load_nequip_model"):
start = max(1, idx-20)
end = min(len(lines), idx+120)
for j in range(start, end+1):
print(f"{j}:{lines[j-1]}")
break
else:
print("load_nequip_model not found")
PY
# Quick look for how checkpoint_path is used in that function (context)
rg -n "load_nequip_model|checkpoint_path" src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py -nRepository: Exabyte-io/api-examples
Length of output: 3341
Add fail-fast validation for missing model_path/checkpoint in non-Pyodide path.
In src/py/mat3ra/notebooks_utils/pyodide/packages/nequip.py (lines 38-41), resolved_model_path = model_path or checkpoint can become None; the code then calls load_nequip_model(str(resolved_model_path)), which passes "None" to torch.load(...) and fails with an unclear error (attempting to load a file literally named "None"). Raise a ValueError when both are unset.
Suggested fix
def create_nequip_calculator(model="oam_s", device="cpu", model_path=None, checkpoint=None):
if is_pyodide_environment():
return get_nequip_model_pyodide(model=model, device=device)
resolved_model_path = model_path or checkpoint
+ if resolved_model_path is None:
+ raise ValueError("For non-Pyodide runs, provide `model_path` or `checkpoint`.")
nequip_model = load_nequip_model(str(resolved_model_path))
return create_nequip_calculator_from_model(nequip_model, device=device)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| resolved_model_path = model_path or checkpoint | |
| nequip_model = load_nequip_model(str(resolved_model_path)) | |
| return create_nequip_calculator_from_model(nequip_model, device=device) | |
| resolved_model_path = model_path or checkpoint | |
| if resolved_model_path is None: | |
| raise ValueError("For non-Pyodide runs, provide `model_path` or `checkpoint`.") | |
| nequip_model = load_nequip_model(str(resolved_model_path)) | |
| return create_nequip_calculator_from_model(nequip_model, device=device) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/nequip.py` around lines 38 -
41, The code currently sets resolved_model_path = model_path or checkpoint and
then calls load_nequip_model(str(resolved_model_path)), which ends up passing
"None" to torch.load if both model_path and checkpoint are unset; update the
non-Pyodide path to validate inputs and raise a ValueError when both model_path
and checkpoint are falsy before calling load_nequip_model; i.e., check
resolved_model_path (or check model_path and checkpoint) and raise a clear
ValueError mentioning that a model path or checkpoint is required, then proceed
to call load_nequip_model and create_nequip_calculator_from_model as before.
| resolved_model_path = model_path or checkpoint | ||
|
|
||
| predictor = load_predict_unit(str(resolved_model_path), device=device, **kwargs) | ||
| return FAIRChemCalculator(predictor, task_name=task_name) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and display the relevant function in the target file
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py"
if [ ! -f "$FILE" ]; then
echo "Missing file: $FILE" >&2
exit 1
fi
echo "=== File: $FILE (lines around 1-120) ==="
nl -ba "$FILE" | sed -n '1,140p'
echo
echo "=== Search for load_predict_unit definition/usages ==="
# Search in repo for load_predict_unit
rg -n "load_predict_unit" -S . || true
echo
echo "=== If load_predict_unit is imported, show import block ==="
rg -n "from .* import .*load_predict_unit|import .*load_predict_unit" -S "$FILE" || trueRepository: Exabyte-io/api-examples
Length of output: 218
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py"
if [ ! -f "$FILE" ]; then
echo "Missing file: $FILE" >&2
exit 1
fi
echo "=== File: $FILE (first ~200 lines with line numbers) ==="
cat -n "$FILE" | sed -n '1,200p'
echo
echo "=== Search for create_uma_calculator ==="
rg -n "def create_uma_calculator" -S "$FILE"
echo
echo "=== Show create_uma_calculator function body (approx lines around match) ==="
# Print a window around the definition
start=$(rg -n "def create_uma_calculator" "$FILE" | head -n1 | cut -d: -f1)
end=$((start+80))
cat -n "$FILE" | sed -n "${start},${end}p"
echo
echo "=== Search for load_predict_unit usage/definition ==="
rg -n "load_predict_unit" -S . || true
echo
echo "=== Search imports related to load_predict_unit in target file ==="
rg -n "load_predict_unit" -S "$FILE" || trueRepository: Exabyte-io/api-examples
Length of output: 2976
Validate non-Pyodide model_path/checkpoint before calling load_predict_unit.
create_uma_calculator sets resolved_model_path = model_path or checkpoint; if both are absent this becomes None, and load_predict_unit(str(resolved_model_path), ...) passes "None", leading to a confusing runtime failure. Add an explicit None guard (non-Pyodide only).
Suggested fix
def create_uma_calculator(model="f16", task_name="omat", device="cpu", model_path=None, checkpoint=None, **kwargs):
if is_pyodide_environment():
return get_uma_model_pyodide(model=model, task_name=task_name, device=device, **kwargs)
resolved_model_path = model_path or checkpoint
+ if resolved_model_path is None:
+ raise ValueError("For non-Pyodide runs, provide `model_path` or `checkpoint`.")
predictor = load_predict_unit(str(resolved_model_path), device=device, **kwargs)
return FAIRChemCalculator(predictor, task_name=task_name)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| resolved_model_path = model_path or checkpoint | |
| predictor = load_predict_unit(str(resolved_model_path), device=device, **kwargs) | |
| return FAIRChemCalculator(predictor, task_name=task_name) | |
| resolved_model_path = model_path or checkpoint | |
| if resolved_model_path is None: | |
| raise ValueError("For non-Pyodide runs, provide `model_path` or `checkpoint`.") | |
| predictor = load_predict_unit(str(resolved_model_path), device=device, **kwargs) | |
| return FAIRChemCalculator(predictor, task_name=task_name) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py` around lines 23 - 26,
create_uma_calculator currently sets resolved_model_path = model_path or
checkpoint and immediately calls load_predict_unit(str(resolved_model_path),
...), which will pass the string "None" if both are missing; add an explicit
guard in create_uma_calculator that, when NOT running under Pyodide, checks that
either model_path or checkpoint is provided (i.e., resolved_model_path is not
None) and raises a clear error (or returns early) before calling
load_predict_unit; reference the resolved_model_path variable and the
load_predict_unit and FAIRChemCalculator creation so the check happens right
before the load_predict_unit invocation.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@other/experiments/jupyterlite/relax_structure_with_mlff.ipynb`:
- Around line 85-91: The notebook imports symbols that don't exist in
mat3ra.notebooks_utils.mlff (ensure_mlff_switchable and mark_active_mlff),
causing an ImportError; either update the import line to use the actual exported
names from src/py/mat3ra/notebooks_utils/mlff.py (replace ensure_mlff_switchable
and/or mark_active_mlff with the correct function names), or add/rename wrapper
functions in mlff.py to export ensure_mlff_switchable and mark_active_mlff, and
keep get_mlff_install_profiles as-is; ensure the imported symbols referenced in
the notebook (ensure_mlff_switchable, get_mlff_install_profiles,
mark_active_mlff) match the actual function names in mlff.py so the subsequent
calls (ensure_mlff_switchable(MLFF_NAME), get_mlff_install_profiles(MLFF_NAME),
mark_active_mlff(MLFF_NAME)) run without ImportError.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4f94b63c-b0b9-428d-ad2f-c2a787047344
📒 Files selected for processing (11)
.gitattributesconfig.ymlother/experiments/jupyterlite/relax_structure_with_mlff.ipynbpackages/antlr4_python3_runtime-4.9.3-py3-none-any.whlpackages/mattersim-1.1.2-py3-none-any.whlpackages/models/mattersim-v1.0.0-1M.pthpackages/models/nequip-oam-s-config-sd.pthpackages/models/uma-s-1p1-f16.ptpackages/models/uma-s-1p1-int8.ptpackages/nequip-0.15.0-py3-none-any.whlsrc/py/mat3ra/notebooks_utils/mlff.py
✅ Files skipped from review due to trivial changes (5)
- .gitattributes
- packages/nequip-0.15.0-py3-none-any.whl
- packages/mattersim-1.1.2-py3-none-any.whl
- packages/models/uma-s-1p1-f16.pt
- packages/models/nequip-oam-s-config-sd.pth
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (2)
src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py (1)
24-26:⚠️ Potential issue | 🟠 Major
numba.njitstill mishandles bare@njitusage.When the first positional argument is already the decorated function, the stub must return that function directly. The current lambda still returns another wrapper, so
@njitreplaces the function withlambda f: f.In Numba, what should `njit` return when used as `@njit` versus `@njit()`?🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py` around lines 24 - 26, The numba stub's njit currently returns a wrapper even when used as bare `@njit`; update numba_mod.njit (and thus numba_mod.jit) so that if the first positional argument is a callable (the decorated function) it returns that function directly, otherwise it returns a decorator that accepts a function and returns it; keep using _make_stub_module to create numba_mod and assign numba_mod.jit = numba_mod.njit.src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py (1)
219-236:⚠️ Potential issue | 🟠 Major
index.max()still blows up on empty inputs.This is the same empty-index case raised earlier, now in the MatterSim-local
_scattershim: whendim_sizeis omitted andindexis empty, theindex.max()path raises before an empty output tensor can be returned.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py` around lines 219 - 236, In _scatter, guard against calling index.max() on empty tensors: when dim_size is None set dim_size = int(index.max()) + 1 only if index.numel() > 0, otherwise set dim_size = 0 so the function can construct and return an empty out tensor; update the dim_size assignment in the _scatter function to use this conditional check on index.numel() before calling index.max().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/install.py`:
- Around line 99-105: The current logic in package_has_version_specifier only
detects operator-pinned specs, so direct references (emfs:, http(s)://, file:,
git+, ssh+, wheel files like .whl, local paths or any URI containing "://", and
direct wheel artifacts) are not treated as needing reinstall when
profile_changed; update package_has_version_specifier (and thus
should_reinstall_package) to also return True for direct-reference schemes and
wheel artifacts by detecting common URI schemes (e.g., "emfs:", "http:",
"https:", "file:", "git+", "ssh+"), any spec containing "://", and
filenames/paths ending with ".whl" or containing direct refs (e.g., "@" or other
wheel artifact indicators) so that should_reinstall_package(profile_changed)
will force reinstall for those entries on profile switches.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/mace.py`:
- Around line 34-40: The try/except in patch_mace_tools currently swallows all
exceptions which can hide import or assignment failures for torch_geometric and
leave mace.tools.torch_geometric without a .data attribute; change the except to
catch only expected exceptions (e.g., ImportError, AttributeError) and after the
import/assignment assert that torch_geometric has the data attribute (e.g.,
verify hasattr(torch_geometric, "data") and that torch_geometric.data is
torch_geometric_data); if the check fails, raise a clear RuntimeError (or
re-raise the original exception) so callers (like mace.calculators) fail fast
with an explanatory message.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py`:
- Around line 40-42: _OmegaConf.create currently only wraps dict inputs with
_DictConfig and returns raw lists; update create to also wrap list inputs with
the ListConfig stub so behavior matches real OmegaConf. Change _OmegaConf.create
to check for list instances and return the ListConfig wrapper (e.g. return
_ListConfig(data) if isinstance(data, list), return _DictConfig(data) if
isinstance(data, dict), else return data), ensuring you reference the ListConfig
class exposed by _ensure_omegaconf_stub() and the existing _DictConfig symbol.
---
Duplicate comments:
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py`:
- Around line 219-236: In _scatter, guard against calling index.max() on empty
tensors: when dim_size is None set dim_size = int(index.max()) + 1 only if
index.numel() > 0, otherwise set dim_size = 0 so the function can construct and
return an empty out tensor; update the dim_size assignment in the _scatter
function to use this conditional check on index.numel() before calling
index.max().
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py`:
- Around line 24-26: The numba stub's njit currently returns a wrapper even when
used as bare `@njit`; update numba_mod.njit (and thus numba_mod.jit) so that if
the first positional argument is a callable (the decorated function) it returns
that function directly, otherwise it returns a decorator that accepts a function
and returns it; keep using _make_stub_module to create numba_mod and assign
numba_mod.jit = numba_mod.njit.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7a0505c6-237a-442a-9ea4-d5049be71dc8
📒 Files selected for processing (8)
other/experiments/jupyterlite/relax_structure_with_mlff.ipynbsrc/py/mat3ra/notebooks_utils/pyodide/packages/__init__.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/install.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/mace.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/nequip.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/torch.pysrc/py/mat3ra/notebooks_utils/pyodide/packages/uma.py
🚧 Files skipped from review as they are similar to previous changes (1)
- other/experiments/jupyterlite/relax_structure_with_mlff.ipynb
| try: | ||
| torch_geometric = import_module("mace.tools.torch_geometric") | ||
| torch_geometric_data = import_module("mace.tools.torch_geometric.data") | ||
| torch_geometric.data = torch_geometric_data | ||
| print("✓ MACE tools patches applied") | ||
| except Exception as exc: | ||
| print(f"⚠ MACE tools patches skipped: {exc}") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate and print the relevant file with line numbers around patch_mace_tools
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/mace.py"
if [ ! -f "$FILE" ]; then
echo "Missing file: $FILE" >&2
exit 1
fi
echo "=== $FILE (lines 1-220) ==="
nl -ba "$FILE" | sed -n '1,220p'
# Search for patch_mace_tools usage
echo
echo "=== rg: patch_mace_tools usage ==="
rg -n "patch_mace_tools\\b" -S src || true
# Search for torch_geometric data expectations in this module
echo
echo "=== rg: torch_geometric.data usage in repo ==="
rg -n "torch_geometric\\.data" -S src || true
# Find where MACE package version is specified for pyodide
echo
echo "=== Search for mace version in pyodide packaging ==="
rg -n "mace" src/py/mat3ra/notebooks_utils/pyodide -S || true
# Check for requirement / package list files in pyodide folder
echo
echo "=== List pyodide packages directory ==="
find src/py/mat3ra/notebooks_utils/pyodide -maxdepth 4 -type f | head -n 200Repository: Exabyte-io/api-examples
Length of output: 206
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/mace.py"
[ -f "$FILE" ] || { echo "Missing file: $FILE" >&2; exit 1; }
python3 - <<'PY'
import pathlib
p = pathlib.Path("src/py/mat3ra/notebooks_utils/pyodide/packages/mace.py")
text = p.read_text()
lines = text.splitlines()
start, end = 1, 220
for i in range(start, min(end, len(lines)) + 1):
print(f"{i:4d}\t{lines[i-1]}")
PY
echo
echo "=== rg: patch_mace_tools usage ==="
rg -n "patch_mace_tools\\b" -S src || true
echo
echo "=== rg: torch_geometric\\.data usage in repo ==="
rg -n "torch_geometric\\.data" -S src || true
echo
echo "=== Search for mace version / pin in pyodide ==="
rg -n "mace" src/py/mat3ra/notebooks_utils/pyodide -S || true
echo
echo "=== List pyodide packages dir (maxdepth 4) ==="
find src/py/mat3ra/notebooks_utils/pyodide -maxdepth 4 -type f | head -n 200Repository: Exabyte-io/api-examples
Length of output: 7453
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import pathlib
def show(path, start, end):
p = pathlib.Path(path)
if not p.exists():
print(f"Missing {path}")
return
lines = p.read_text().splitlines()
print(f"\n=== {path} (lines {start}-{end}) ===")
for i in range(start, min(end, len(lines))+1):
print(f"{i:4d}\t{lines[i-1]}")
show("src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py", 1, 120)
show("src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py", 220, 340)
show("src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py", 600, 720)
show("src/py/mat3ra/notebooks_utils/pyodide/packages/install.py", 1, 220)
show("src/py/mat3ra/notebooks_utils/pyodide/packages/__init__.py", 1, 120)
show("src/py/mat3ra/notebooks_utils/pyodide/packages/mattersim.py", 150, 240)
PY
echo
echo "=== rg: apply_patches() call sites ==="
rg -n "apply_patches\\(" -S src/py/mat3ra/notebooks_utils/pyodide || true
echo
echo "=== rg: apply_all_patches\\(" -S src/py/mat3ra/notebooks_utils/pyodide || true
rg -n "apply_all_patches\\(\" -S src/py/mat3ra/notebooks_utils/pyodide || true
echo
echo "=== rg: torch_geometric\\.data import / sys.modules usage ==="
rg -n "torch_geometric\\.data" -S src/py/mat3ra/notebooks_utils/pyodide || trueRepository: Exabyte-io/api-examples
Length of output: 23085
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== rg: create_mace_calculator / get_mace_model_pyodide usage ==="
rg -n "create_mace_calculator|get_mace_model_pyodide" -S src || true
echo
echo "=== rg: apply_all_patches\\(\"mace\"\\) call sites ==="
rg -n "apply_all_patches\\(\"mace\"\\)" -S src || true
echo
echo "=== rg: apply_all_patches\\(.*mace ==="
rg -n "apply_all_patches\\([^\\)]*mace" -S src || true
echo
echo "=== locate pyodide config / requirements files ==="
find src/py/mat3ra/notebooks_utils/pyodide -maxdepth 5 -type f \( -name "*config*.yml" -o -name "*requirements*.yml" -o -name "*.yaml" -o -name "*packages*.yml" \) || true
echo
echo "=== rg: pinned mace package versions (micropip / pip lists) ==="
rg -n "mace(==|>=|<=|~=|!=|>|<)|\"mace\"|\\bmace\\b" src/py/mat3ra/notebooks_utils/pyodide -S || true
echo
echo "=== rg: torch_geometric.data patching in torch.py/common ==="
rg -n "torch_geometric\\.data" -S src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py || trueRepository: Exabyte-io/api-examples
Length of output: 2985
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "apply_all_patches\\(" -S src/py/mat3ra/notebooks_utils/pyodide || true
rg -n "apply_mlff_patches\\(" -S src/py/mat3ra/notebooks_utils/pyodide || true
rg -n "apply_patches\\(" -S src/py/mat3ra/notebooks_utils/pyodide/packages || true
rg -n "apply_all_patches\\(" -S src/py/mat3ra || true
rg -n "apply_mlff_patches\\(" -S src/py/mat3ra || trueRepository: Exabyte-io/api-examples
Length of output: 1259
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import pathlib
p = pathlib.Path("src/py/mat3ra/notebooks_utils/mlff.py")
print("exists:", p.exists())
print("\n--- mlff.py (1-200) ---")
lines = p.read_text().splitlines()
for i in range(1, min(200, len(lines))+1):
print(f"{i:4d}\t{lines[i-1]}")
PY
echo
echo "=== rg: create_mace_calculator call sites ==="
rg -n "create_mace_calculator\\(" -S src || true
echo
echo "=== rg: get_mace_model_pyodide call sites ==="
rg -n "get_mace_model_pyodide\\(" -S src || true
echo
echo "=== rg: apply_all_patches\\(\"mace\"\\) call sites (non-docstring where possible) ==="
# best-effort: limit to python files and grep usage lines
rg -n "apply_all_patches\\(\"mace\"\\)" -S src || true
echo
echo "=== rg: apply_mlff_patches\\(\"mace\"\\) call sites ==="
rg -n "apply_mlff_patches\\(\"mace\"\\)" -S src || trueRepository: Exabyte-io/api-examples
Length of output: 2062
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "=== rg: 'All Pyodide patches applied successfully' ==="
rg -n "All Pyodide patches applied successfully" -S src || true
echo
echo "=== rg: apply_common_torch_patches call sites ==="
rg -n "apply_common_torch_patches\\(" -S src || true
echo
echo "=== rg: apply_all_patches call sites (not restricting to pyodide folder) ==="
rg -n "apply_all_patches\\(" -S src || true
echo
echo "=== rg: apply_mlff_patches call sites ==="
rg -n "apply_mlff_patches\\(" -S src || true
echo
echo "=== files under pyodide/runtime.py and api entrypoints (quick inspect) ==="
python3 - <<'PY'
import pathlib
for f in [
"src/py/mat3ra/notebooks_utils/pyodide/runtime.py",
"src/py/mat3ra/notebooks_utils/pyodide/api/__init__.py",
"src/py/mat3ra/notebooks_utils/pyodide/api/auth.py",
"src/py/mat3ra/notebooks_utils/pyodide/ui.py",
"src/py/mat3ra/notebooks_utils/pyodide/io.py",
]:
p = pathlib.Path(f)
print(f, "exists:", p.exists(), "lines:", len(p.read_text().splitlines()) if p.exists() else None)
PYRepository: Exabyte-io/api-examples
Length of output: 1424
🏁 Script executed:
#!/bin/bash
set -euo pipefail
python3 - <<'PY'
import pathlib
paths = [
"src/py/mat3ra/notebooks_utils/pyodide/runtime.py",
"src/py/mat3ra/notebooks_utils/pyodide/io.py",
"src/py/mat3ra/notebooks_utils/pyodide/ui.py",
"src/py/mat3ra/notebooks_utils/pyodide/api/auth.py",
]
for f in paths:
p = pathlib.Path(f)
print(f"\n=== {f} (first 220 lines or full, whichever shorter) ===")
lines = p.read_text().splitlines()
end = min(220, len(lines))
for i in range(1, end+1):
print(f"{i:4d}\t{lines[i-1]}")
if len(lines) > 220:
print(f"... ({len(lines)} total lines)")
PY
echo
echo "=== rg: apply_all_patches|apply_common_torch_patches in pyodide/ ==="
rg -n "apply_all_patches\\(|apply_common_torch_patches\\(" -S src/py/mat3ra/notebooks_utils/pyodide || true
echo
echo "=== rg: create_mlff_calculator call sites ==="
rg -n "create_mlff_calculator\\(" -S src || true
echo
echo "=== rg: mlff_name.*mace call sites ==="
rg -n "mace" src/py/mat3ra/notebooks_utils -S || trueRepository: Exabyte-io/api-examples
Length of output: 17964
Avoid swallowing all exceptions in patch_mace_tools() (fail with a clear error or enforce post-condition).
In src/py/mat3ra/notebooks_utils/pyodide/packages/mace.py (lines 34-40), except Exception as exc only prints a warning and continues, leaving mace.calculators to fail later if mace.tools.torch_geometric.data couldn’t be imported/assigned. Restrict the exception you suppress (e.g., ImportError/expected shape issues) and/or verify that mace.tools.torch_geometric ends up with the .data attribute before proceeding; otherwise re-raise.
🧰 Tools
🪛 Ruff (0.15.15)
[warning] 39-39: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/mace.py` around lines 34 - 40,
The try/except in patch_mace_tools currently swallows all exceptions which can
hide import or assignment failures for torch_geometric and leave
mace.tools.torch_geometric without a .data attribute; change the except to catch
only expected exceptions (e.g., ImportError, AttributeError) and after the
import/assignment assert that torch_geometric has the data attribute (e.g.,
verify hasattr(torch_geometric, "data") and that torch_geometric.data is
torch_geometric_data); if the check fails, raise a clear RuntimeError (or
re-raise the original exception) so callers (like mace.calculators) fail fast
with an explanatory message.
Source: Linters/SAST tools
| @staticmethod | ||
| def create(data): | ||
| return _DictConfig(data) if isinstance(data, dict) else data |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In OmegaConf, what does OmegaConf.create([1, 2])return, and shouldisinstance(OmegaConf.create([1, 2]), ListConfig) evaluate to true?
💡 Result:
The call to OmegaConf.create([1][2]) returns an instance of omegaconf.listconfig.ListConfig [3][4][5]. Yes, isinstance(OmegaConf.create([1][2]), ListConfig) should evaluate to True [6][7]. OmegaConf configuration objects (DictConfig and ListConfig) are designed to provide a rich API for handling configuration data while intentionally not inheriting from native Python primitive types like list or dict [8]. Because of this design choice, they will not evaluate to True for standard native type checks (e.g., isinstance(cfg, list) will return False) [8]. To check if an object is an OmegaConf ListConfig, you should use isinstance(obj, ListConfig) or the provided helper function OmegaConf.is_list(obj) [4][6][7]. If you need to convert the ListConfig back into a native Python list, you can use OmegaConf.to_container(obj) [8][9].
Citations:
- 1: https://github.com/omry/omegaconf/blob/master/tests/test_create.py
- 2: https://omegaconf.readthedocs.io/en/2.0%5Fbranch/usage.html
- 3: https://omegaconf.readthedocs.io/en/latest/_modules/omegaconf/omegaconf.html
- 4: https://omegaconf.readthedocs.io/en/latest/api%5Freference.html
- 5: https://github.com/omry/omegaconf/blob/master/omegaconf/omegaconf.py
- 6: https://omegaconf.readthedocs.io/en/2.2_branch/usage.html
- 7: https://omegaconf.readthedocs.io/en/2.3_branch/usage.html
- 8: how to assert omegaconf.listconfig.ListConfig is list omry/omegaconf#51
- 9: Possible way to convert
omegaconf.listconfig.ListConfiginto python list? omry/omegaconf#620
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py"
# Show the relevant section around the cited lines
nl -ba "$FILE" | sed -n '1,140p' | sed -n '30,90p'
# Search within the file for _ensure_omegaconf_stub and ListConfig definitions/exports
rg -n "_ensure_omegaconf_stub|ListConfig|_ListConfig|class _ListConfig|def create\(" "$FILE"
# Print the exact create() implementation and any class/typing stubs related to it
rg -n "class _OmegaConf|def create\(" "$FILE" -nRepository: Exabyte-io/api-examples
Length of output: 110
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py"
# Basic existence check
test -f "$FILE" && echo "FOUND $FILE" || (echo "MISSING $FILE" && exit 1)
# Show the surrounding lines for the cited range (approx; keep small)
echo "---- sed 1-120 ----"
sed -n '1,120p' "$FILE"
echo "---- sed 30-80 ----"
sed -n '30,80p' "$FILE"
# Search for relevant symbols
echo "---- rg for relevant symbols ----"
rg -n "def _ensure_omegaconf_stub|_ensure_omegaconf_stub|ListConfig|_ListConfig|class _OmegaConf|def create\(" "$FILE"Repository: Exabyte-io/api-examples
Length of output: 5419
Fix _OmegaConf.create() to return ListConfig for list inputs (align with real OmegaConf).
_ensure_omegaconf_stub() exposes ListConfig, but _OmegaConf.create() currently wraps only dict inputs and returns a plain list for list inputs—unlike real OmegaConf, where OmegaConf.create([1, 2]) yields a ListConfig, so isinstance(OmegaConf.create([1, 2]), ListConfig) should be True.
Suggested change
`@staticmethod`
def create(data):
- return _DictConfig(data) if isinstance(data, dict) else data
+ if isinstance(data, dict):
+ return _DictConfig(data)
+ if isinstance(data, list):
+ return _ListConfig(data)
+ return data📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @staticmethod | |
| def create(data): | |
| return _DictConfig(data) if isinstance(data, dict) else data | |
| `@staticmethod` | |
| def create(data): | |
| if isinstance(data, dict): | |
| return _DictConfig(data) | |
| if isinstance(data, list): | |
| return _ListConfig(data) | |
| return data |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/py/mat3ra/notebooks_utils/pyodide/packages/torch.py` around lines 40 -
42, _OmegaConf.create currently only wraps dict inputs with _DictConfig and
returns raw lists; update create to also wrap list inputs with the ListConfig
stub so behavior matches real OmegaConf. Change _OmegaConf.create to check for
list instances and return the ListConfig wrapper (e.g. return _ListConfig(data)
if isinstance(data, list), return _DictConfig(data) if isinstance(data, dict),
else return data), ensuring you reference the ListConfig class exposed by
_ensure_omegaconf_stub() and the existing _DictConfig symbol.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@other/materials_designer/workflows/local/relaxation_mlff_mace.ipynb`:
- Around line 47-49: The call to apply_all_patches("mace") is executed
unconditionally and stubs native modules in non‑Pyodide kernels; wrap the
invocation of
mat3ra.notebooks_utils.pyodide.packages.patches.apply_all_patches("mace") in a
runtime check so it only runs under Pyodide (for example test for Pyodide by
checking sys.platform == "emscripten", or the presence of "pyodide" in
sys.modules or a pyodide-specific object) and skip it otherwise; locate the
apply_all_patches("mace") call and replace it with a short guard that imports
sys (or tries importing pyodide) and calls apply_all_patches only when the guard
indicates a Pyodide runtime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: ebaaabae-b9d9-4aa5-a448-c145901899da
📒 Files selected for processing (5)
other/experiments/jupyterlite/relax_structure_with_mace.ipynbother/experiments/jupyterlite/relax_structure_with_mlff.ipynbother/materials_designer/workflows/local/relaxation_mlff_mace.ipynbpyproject.tomlsrc/py/mat3ra/notebooks_utils/pyodide/packages/uma.py
🚧 Files skipped from review as they are similar to previous changes (2)
- other/experiments/jupyterlite/relax_structure_with_mlff.ipynb
- src/py/mat3ra/notebooks_utils/pyodide/packages/uma.py
| "from mat3ra.notebooks_utils.pyodide.packages.patches import apply_all_patches\n", | ||
| "\n", | ||
| "apply_all_patches()" | ||
| "apply_all_patches(\"mace\")" |
There was a problem hiding this comment.
Gate Pyodide patch orchestration to Pyodide runtime only.
Line 49 currently runs MLFF Pyodide patches unconditionally in a local Python kernel. That path invokes MACE patch helpers that stub modules (e.g., lmdb, h5py), which can poison later imports in non-Pyodide sessions.
Suggested fix
from mat3ra.notebooks_utils.pyodide.packages.patches import apply_all_patches
+from mat3ra.notebooks_utils.primitive.environment import is_pyodide_environment
-apply_all_patches("mace")
+if is_pyodide_environment():
+ apply_all_patches("mace")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "from mat3ra.notebooks_utils.pyodide.packages.patches import apply_all_patches\n", | |
| "\n", | |
| "apply_all_patches()" | |
| "apply_all_patches(\"mace\")" | |
| from mat3ra.notebooks_utils.pyodide.packages.patches import apply_all_patches | |
| from mat3ra.notebooks_utils.primitive.environment import is_pyodide_environment | |
| if is_pyodide_environment(): | |
| apply_all_patches("mace") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@other/materials_designer/workflows/local/relaxation_mlff_mace.ipynb` around
lines 47 - 49, The call to apply_all_patches("mace") is executed unconditionally
and stubs native modules in non‑Pyodide kernels; wrap the invocation of
mat3ra.notebooks_utils.pyodide.packages.patches.apply_all_patches("mace") in a
runtime check so it only runs under Pyodide (for example test for Pyodide by
checking sys.platform == "emscripten", or the presence of "pyodide" in
sys.modules or a pyodide-specific object) and skip it otherwise; locate the
apply_all_patches("mace") call and replace it with a short guard that imports
sys (or tries importing pyodide) and calls apply_all_patches only when the guard
indicates a Pyodide runtime.
Summary by CodeRabbit
New Features
Chores