feat: add samply profiler support#338
Conversation
468b6c8 to
0e6e4ea
Compare
b55c9b2 to
9a2ab1b
Compare
GuillaumeLagrange
left a comment
There was a problem hiding this comment.
Switching to github mid review, have to publish 😬
| ) -> anyhow::Result<CommandBuilder>; | ||
|
|
||
| /// The benchmarked process signaled the start of a measured region. | ||
| async fn on_start_benchmark(&mut self) -> anyhow::Result<()> { |
There was a problem hiding this comment.
Could we make use of the fact that we are changing stuff here to clear up the name of the markers?
We've discussed this multiple time, and it's quite a huge clusterfuck
The current names with markers is this
/// It expects the data to be structured like this:
/// - SampleStart
/// - BenchmarkStart
/// - N samples
/// - BenchmarkEnd
/// - SampleEnd
But we also have the fifo commands StartBenchmark and StopBenchmark which actually push SampleStart/End markers
What I propose for this is just a rename that would make everything a bit clearer
The external marker, currently named SampleStart should be named ProfilerSamplingStart, or something that conveys the idea that we are starting and stopping the profiler's sampling functionality, and we should rename the FifoCommand::StartBenchmark to reflect that. Another naming possibility could be BenchmarkRegion if we want to be a bit farther from the implementation detail of starting/stopping the profiler
The internal marker would be clearer if it was something like RoundStart RoundEnd, as in practice it's what is being done.
WE SHOULD BE ABSOLUTELY EXTREMELY CAREFUL TO MAKE SURE IT DOES NOT IMPACT THE SERIALIZED IDs, it's only meant to be a naming clarification.
WDYT?
There was a problem hiding this comment.
Wdyt about this:
pub enum MarkerType {
SampleStart(u64),
SampleEnd(u64),
RoundStart(u64),
RoundEnd(u64),
}
and then rename the commands to StartProfiler/StopProfiler/PingProfiler
GuillaumeLagrange
left a comment
There was a problem hiding this comment.
first pass done, let me know if you want to sync on some points
|
|
||
| pub struct WallTimeExecutor { | ||
| perf: Option<PerfRunner>, | ||
| profiler: Option<Box<dyn Profiler>>, |
There was a problem hiding this comment.
Since it wont really change at runtime and is computed from static cfg asserts anyway, wouldnt it be better to have a type generic profiler rather than a Box<dyn> ?
There was a problem hiding this comment.
We could, but then we'd have to represent the case where we dont' have a profiler (e.g. on Windows). We could have a NoopProfiler that does nothing, but I don't feel like this properly fits what we want to represent.
And the overhead of Box shouldn't be an issue because we're not really calling this in a hot path where the performance matters. So I'd leave it as-is, lmk if you disagree
| let mut samply_builder = CommandBuilder::new("samply"); | ||
| // samply is bundled into this binary as the `samply` subcommand; | ||
| // re-exec ourselves so we don't depend on a system install. | ||
| let current_exe = std::env::current_exe() |
There was a problem hiding this comment.
If we define the subcommands that are invoked by the CLI, we could have this as a helper that would just take the enum value and build the command with args etc
There was a problem hiding this comment.
I'm not quite sure I get what you mean, because we need to pass it as a string to be able to run codspeed samply .... Could you give me an exaxmple?
a4336d4 to
021df91
Compare
Relocate src/executor/wall_time/perf/ to src/executor/wall_time/profiler/perf/ in preparation for additional profiler backends (samply).
Output filename also changes from perf.metadata to walltime.metadata. This is a wire-breaking change that requires a coordinated server rollout to read the new filename.
021df91 to
0fdb0de
Compare
PerfRunner becomes PerfProfiler implementing the Profiler trait. The RunnerFifo loop moves out of the perf module and into WallTimeExecutor, which now drives the trait callbacks (on_start_benchmark / on_stop_benchmark / on_ping / GetIntegrationMode) generically. Profiler::wrap stashes the perf control fifo and output path on the profiler instance; finalize harvests the perf.data artifacts and writes walltime.metadata. The OnceCell<BenchmarkData> bridge is gone — fifo data and timestamps are passed directly into finalize. Executor::run now takes &mut self to allow profilers to hold per-run state on the trait object. This cascades through run_executor, orchestrator, and the Memory/Valgrind executors (no behavioral change for those — they don't mutate self).
…BLED Introduce --enable-profiler / CODSPEED_PROFILER_ENABLED as the canonical flag now that walltime profiling is profiler-agnostic. The legacy --enable-perf / CODSPEED_PERF_ENABLED is kept as a hidden alias and emits a deprecation warning when used, so existing configurations keep working.
Tokio's OpenOptions::read_write is Linux-only, but the O_RDWR trick works on every Unix and is required on macOS to avoid open() deadlocks when both FIFO ends are opened before the peer process is spawned.
Mirror the clock used elsewhere in CodSpeed on macOS (mach_absolute_time with cached mach_timebase_info) so runner and benchmark process timestamps come from the exact same source. Adds mach2 as a macOS-only dependency.
Move the Linux profiling sysctl setup out of the perf profiler and into a shared walltime profiler helper. This lets both perf and samply prepare kernel symbol access and non-root profiling consistently while preserving the existing sudo-based behavior.
Use the samply crate directly instead of a standalone binary. Adds a `codspeed samply` subcommand that forwards args to samply's CLI and dispatches to `do_record_action`. The wall_time profiler re-execs the current binary, removing the need for a PATH-installed samply.
2140160 to
dd3a1e3
Compare
Enabling perf JIT support on macOS produces many unresolved addresses in stack traces. Force it off until we understand why.
Rename `StartBenchmark`/`StopBenchmark` to `StartProfiler`/`StopProfiler` and `BenchmarkStart`/`BenchmarkEnd` markers to `RoundStart`/`RoundEnd` to match the profiler-agnostic data format.
dd3a1e3 to
c7f0146
Compare
No description provided.