perf: 200-robot simulation optimization — parallel plugin loop + -O3/LTO by sabarish-prasannna · Pull Request #41 · rapyuta-robotics/flatland

sabarish-prasannna · 2026-06-25T03:38:25Z

Summary

Builds on feature/90robots-physics-skip. Adds two complementary optimizations to bring 200-robot simulation CPU from ~90% down to <20%:

Parallel plugin loop: PluginManager::BeforePhysicsStep and AfterPhysicsStep now dispatch each robot's plugin group (SootballNavigatorPlugin + SootballPlugin) as an independent std::async task, utilizing all available CPU cores. World plugins remain sequential after all model plugins complete.
-O3 + LTO: flatland_lib and flatland_server now compile with -O3 -march=native and link-time optimization (skipped when COVERAGE=ON).

Details

Parallel plugin dispatch (`plugin_manager.cpp`)

Groups model_plugins_ by Model* pointer — each robot owns exactly its N plugins.
Launches one std::future<void> per robot, waits on all before proceeding to world plugins.
Thread safety rests on three points: Box2D body reads outside the physics step are safe, each robot writes only to its own bodies, and ros::Publisher::publish() is thread-safe in roscpp.
PROFILER macros kept outside lambdas to avoid concurrent map access.
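The dispatch pattern described above could look roughly like the following. This is a minimal sketch, not the code in `plugin_manager.cpp`: `Model`, `ModelPlugin`, `PluginGroups`, `GroupByModel`, and `RunModelPluginsParallel` are hypothetical stand-ins, and the plugin body is reduced to a counter so the example stays self-contained.

```cpp
#include <cassert>
#include <future>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the flatland types; not the real class definitions.
struct Model {
  int id;
};

struct ModelPlugin {
  Model* model;        // owning robot
  int steps_seen = 0;  // state touched only by this plugin's own robot task
  explicit ModelPlugin(Model* m) : model(m) {}
  void BeforePhysicsStep() { ++steps_seen; }
};

using PluginGroups = std::unordered_map<Model*, std::vector<ModelPlugin*>>;

// Group plugins by owning Model*, preserving each robot's plugin order.
PluginGroups GroupByModel(
    const std::vector<std::unique_ptr<ModelPlugin>>& plugins) {
  PluginGroups groups;
  for (const auto& p : plugins) {
    assert(p && p->model);
    groups[p->model].push_back(p.get());
  }
  return groups;
}

// One task per robot; plugins of the same robot run sequentially in that task.
void RunModelPluginsParallel(const PluginGroups& groups) {
  std::vector<std::future<void>> tasks;
  tasks.reserve(groups.size());
  for (const auto& entry : groups) {
    const std::vector<ModelPlugin*>* group = &entry.second;
    tasks.push_back(std::async(std::launch::async, [group] {
      for (ModelPlugin* p : *group) p->BeforePhysicsStep();
    }));
  }
  // Barrier: every robot task must finish before world plugins would run.
  for (auto& t : tasks) t.get();
}
```

Any timing or profiling call would wrap `RunModelPluginsParallel` from the calling thread rather than sit inside the lambda, which matches the note about keeping PROFILER macros outside the tasks.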

Build flags (`CMakeLists.txt`)

if(NOT "${COVERAGE}" STREQUAL "ON")
    target_compile_options(flatland_lib PRIVATE -O3 -march=native)
    target_compile_options(flatland_server PRIVATE -O3 -march=native)
    if(CMAKE_VERSION VERSION_GREATER_EQUAL 3.9)
        set_property(TARGET flatland_lib PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
        set_property(TARGET flatland_server PROPERTY INTERPROCEDURAL_OPTIMIZATION TRUE)
    endif()
endif()

Expected Impact

| Change | Gain |
| --- | --- |
| Parallel plugin loop | 4–8× on 8–16 core machine |
| -O3 + LTO | 1.2–1.4× raw throughput |

Test plan

Build: catkin build flatland_server
Run 90-robot fast sim, confirm CPU drop vs feature/90robots-physics-skip baseline
Run 200-robot fast sim, confirm CPU < 20%
Confirm robots still navigate correctly (no stuck robots, task completion rate unchanged)
Check /tmp/flatland_profile_output.log — "Before Physics Step: model_plugins (parallel)" time should drop proportionally

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Group model plugins by owning Model* and dispatch each robot's plugin group as a std::async task. Robots run concurrently; plugins within the same robot run sequentially. World plugins stay sequential. Expected 4-8x speedup on multi-core machines for 200-robot simulations.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

std::async(std::launch::async) on Linux/libstdc++ creates a new OS thread on every call; there is no implicit pooling. With 90 robots at 50 effective steps/sec, and both BeforePhysicsStep and AfterPhysicsStep dispatching one task per robot, this works out to 90 × 50 × 2 = 9000 thread create+destroy operations per second, which explains the 72.9% CPU on the main flatland_server thread despite the work being distributed.

Fix: add ModelPluginThreadPool (created once in the PluginManager constructor, sized to hardware_concurrency) and reuse its workers every step. Also pre-compute plugin_groups_ on Load/Delete instead of rebuilding an unordered_map on every BeforePhysicsStep/AfterPhysicsStep call.

Expected result: main-thread CPU drops from ~70% to ~5-10% (synchronization and world-plugin cost only); total CPU scales with actual plugin work.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
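To illustrate the idea of reusing workers instead of spawning a thread per task, here is a minimal fixed-size pool sketch. It is an assumption about the general shape only: the class name `SimplePool` and its `Submit` method are hypothetical, and the actual ModelPluginThreadPool in this PR may be structured differently.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical fixed-size pool; the PR's ModelPluginThreadPool may differ.
class SimplePool {
 public:
  explicit SimplePool(std::size_t n_workers) {
    assert(n_workers > 0);
    for (std::size_t i = 0; i < n_workers; ++i) {
      workers_.emplace_back([this] { WorkerLoop(); });
    }
  }

  ~SimplePool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (auto& w : workers_) w.join();
  }

  SimplePool(const SimplePool&) = delete;
  SimplePool& operator=(const SimplePool&) = delete;

  // Enqueue a job; the returned future becomes ready once the job has run.
  std::future<void> Submit(std::function<void()> job) {
    auto task = std::make_shared<std::packaged_task<void()>>(std::move(job));
    std::future<void> done = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      jobs_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return done;
  }

 private:
  void WorkerLoop() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopping_ || !jobs_.empty(); });
        if (jobs_.empty()) return;  // only reachable when stopping_ is set
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();  // run outside the lock so other workers can dequeue
    }
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> jobs_;
  std::vector<std::thread> workers_;
  bool stopping_ = false;
};
```

A per-step caller would submit one job per robot group and then call `get()` on every returned future before moving on to world plugins. Note that `std::thread::hardware_concurrency()` is allowed to return 0, so a pool sized from it would need a fallback such as a minimum of one worker.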

deeparaj24 and others added 3 commits June 25, 2026 11:57

perf: enable -O3 and LTO for 200-robot optimization

e17dbf7

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

sabarish-prasannna wants to merge 3 commits into feature/90robots-physics-skip from kaoiwt001_simulation_200robots_optimization
