Check duplicate issues.
Description
[Windows] Macro df002_dataModel.C crashes on exit or exits silently due to potential RVec/Heap issues across DLL boundaries in Cling JIT
1. The Problem
When running the official tutorial macro df002_dataModel.C on Windows via root -b -q tutorials\analysis\dataframe\df002_dataModel.C, one of two unstable behaviors is intermittently observed:
- Symptom A (Crash on exit): The macro executes but crashes upon exiting, spilling a wall of JIT/ORC symbol materialization errors from Cling:
cling::DynamicLibraryManager::loadLibrary(): LoadLibrary: returned 126: j|
Error in <AutoloadLibraryMU>: Failed to load library C:\root_v6.39.99\bin\libROOTDataFrame.dll[runStaticInitializersOnce]:
Failed to materialize symbols: { (main, { ??$?0VRNodeBase@RDF@Detail@ROOT@@@?$shared_ptr@VRLoopManager@RDF@Detail@ROOT@@@std@@QAE@$$QAV?$shared_ptr@VRNodeBase@RDF@Detail@ROOT@@@1@PAVRLoopManager@RDF@Detail@ROOT@@@Z, ... }) }
...
[runStaticInitializersOnce]: Failed to materialize symbols: { (main, { ____orc_init_func.cling-module-8 }) }
cling JIT session error: Failed to materialize symbols: { (main, { ?df002_dataModel@@YAHXZ }) }
- Symptom B (Silent exit): In other, otherwise identical runs, the macro terminates abruptly and silently, without producing all of the expected output or cleaning up its resources.
2. Potential Root Cause Analysis (Windows Heap Management Context)
The symbol dump and runtime behavior suggest a potential heap state mismatch or instability inside the memory allocation/reallocation pipeline of ROOT::VecOps::RVec on Windows.
As highlighted in Microsoft's official documentation regarding CRT boundaries (Potential Errors Passing CRT Objects Across DLL Boundaries), managing and reallocating heap memory across different module contexts on Windows requires strict runtime environment consistency. In the case of df002_dataModel.C:
- The core data structures and initial storage of RVec are managed by pre-compiled code inside libROOTDataFrame.dll / libCore.dll.
- However, during the execution of RDataFrame::Define and Filter, Cling JIT inline-expands or dynamically compiles user expressions at runtime. This means memory reallocation (growth) requests might be issued or evaluated within a dynamically managed JIT execution context.
- When standard C-style realloc is invoked in such an environment involving overlapping boundaries (pre-compiled DLL vs. JIT runtime), it may lead to context conflicts within the Windows heap manager, potentially causing either a silent abort during processing or a JIT session crash when Cling attempts to tear down symbols and release resources upon exit.
3. Proposed Solution
To safely grow and reallocate RVec memory on Windows without violating DLL heap boundaries, I have been exploring a workaround that switches from C-style realloc to explicit C++ allocation hooks.
The targeted locations for this modification are:
math/vecops/src/RVec.cxx: SmallVectorBase::grow_pod
math/vecops/inc/ROOT/RVec.hxx: SmallVectorTemplateBase::grow and destructor-related deallocations
I have managed to stabilize this macro locally by tentatively replacing the realloc routine on Windows with a combination of global ::operator new(..., std::nothrow), memcpy, and ::operator delete.
(Note: Although Microsoft documentation warns that even global new can cause issues if /MD is mismatched, in a uniform project environment where ROOT strictly mandates the dynamic CRT, C++ global ::operator new seems to ensure a more unified heap context under the process heap, effectively bypassing the fragile module-specific boundary assertions triggered by standard C realloc under JIT workflows.)
However, I fully acknowledge that RVec is a critical core container built for high-performance physics computation. Keeping realloc on Linux/macOS is important because it can often grow a buffer in place and avoid copying the elements. Therefore, I suggest restricting this workaround strictly to Windows using platform macro guards:
#ifdef _WIN32
// Safe double-track mechanism for Windows:
// Allocate via ::operator new, memcpy elements, and ::operator delete the old heap
#else
// Original high-performance realloc path for Linux / macOS
#endif
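As a rough sketch of what the two branches of the guard above could look like: the helper names GrowPodBuffer and FreePodBuffer are hypothetical and are not ROOT's actual functions. The sketch assumes the old buffer is a heap block previously obtained from the same helper (never the inline small buffer of RVec) and that the elements are trivially copyable.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical helper, NOT ROOT's SmallVectorBase::grow_pod.
// Grows a heap buffer of trivially copyable bytes from oldBytes to newBytes.
// Returns the new buffer, or nullptr on failure (the old buffer stays valid).
inline void *GrowPodBuffer(void *oldBuf, std::size_t oldBytes, std::size_t newBytes)
{
#ifdef _WIN32
   // Windows path: allocate a fresh block, copy, then release the old block.
   void *newBuf = ::operator new(newBytes, std::nothrow);
   if (!newBuf)
      return nullptr;
   if (oldBuf) {
      std::memcpy(newBuf, oldBuf, oldBytes < newBytes ? oldBytes : newBytes);
      ::operator delete(oldBuf);
   }
   return newBuf;
#else
   // Linux / macOS path: keep realloc, which may extend the block in place.
   (void)oldBytes;
   return std::realloc(oldBuf, newBytes);
#endif
}

// Hypothetical matching deallocation: every buffer from GrowPodBuffer
// must be released through the same allocator family.
inline void FreePodBuffer(void *buf)
{
#ifdef _WIN32
   ::operator delete(buf);
#else
   std::free(buf);
#endif
}
```

The important design point is that allocation and deallocation must stay paired on each platform: if growth switches to ::operator new on Windows, every free of that buffer (including in the destructor) has to switch to ::operator delete as well.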
4. Discussion & Next Steps
I would love to discuss this with the maintainers to find the best architectural consensus for Windows JIT heap stability. If this approach or a similar minimal patch is acceptable, I am more than happy to submit a Pull Request (PR) to help resolve this issue.
Reproducer
root -b -q tutorials\analysis\dataframe\df002_dataModel.C
ROOT version
Observed on v6.38.04 and v6.39.99 (likely affects multiple recent versions on Windows)
Installation method
pre-built binary and build from source
Operating system
Windows 11
Additional context
No response