[core] Create event log files lazily to avoid stale/empty logs#64366
[core] Create event log files lazily to avoid stale/empty logs#64366nuocbiendang12 wants to merge 2 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces lazy initialization for the LogEventReporter's spdlog logger (log_sink_) to prevent empty log files from being created at startup. The review feedback highlights a critical thread-safety issue: because EnsureSinkInitialized() can be called concurrently from multiple threads, the unsynchronized check and initialization of log_sink_ can lead to data races and crashes. The reviewer recommends using std::call_once with a std::once_flag to ensure thread-safe initialization.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| // The underlying spdlog logger. nullptr until the first event is written. | ||
| std::shared_ptr<spdlog::logger> log_sink_; |
There was a problem hiding this comment.
To support thread-safe lazy initialization of log_sink_ using std::call_once, we need to declare a std::once_flag member variable.
// The underlying spdlog logger. nullptr until the first event is written.
std::shared_ptr<spdlog::logger> log_sink_;
// Ensure thread-safe lazy initialization of log_sink_.
std::once_flag sink_init_once_;| void LogEventReporter::EnsureSinkInitialized() { | ||
| if (log_sink_ != nullptr) { | ||
| return; | ||
| } | ||
| log_sink_ = spdlog::get(log_sink_key_); | ||
| // If the file size is over {rotate_max_file_size_} MB, this file would be renamed | ||
| // for example event_GCS.0.log, event_GCS.1.log, event_GCS.2.log ... | ||
| // We allow to rotate for {rotate_max_file_num_} times. | ||
| if (log_sink_ == nullptr) { | ||
| log_sink_ = spdlog::rotating_logger_mt(log_sink_key, | ||
| log_sink_ = spdlog::rotating_logger_mt(log_sink_key_, | ||
| log_dir_ + file_name_, | ||
| 1048576 * rotate_max_file_size_, | ||
| rotate_max_file_num_); | ||
| } | ||
| log_sink_->set_pattern("%v"); | ||
| } |
There was a problem hiding this comment.
In a multi-threaded environment, EventManager::Publish can be called concurrently from multiple threads because it only acquires a reader lock (absl::ReaderMutexLock). This means LogEventReporter::Report (and consequently EnsureSinkInitialized) can be executed concurrently by multiple threads.
Without synchronization, the check if (log_sink_ != nullptr) and the subsequent initialization of log_sink_ (which is a std::shared_ptr) is a data race, leading to undefined behavior. Furthermore, concurrent calls to spdlog::rotating_logger_mt with the same key will throw a spdlog_ex exception, causing the process to crash.
To resolve this, we should use std::call_once with a std::once_flag to ensure that log_sink_ is initialized safely and exactly once.
void LogEventReporter::EnsureSinkInitialized() {
std::call_once(sink_init_once_, [this]() {
log_sink_ = spdlog::get(log_sink_key_);
// If the file size is over {rotate_max_file_size_} MB, this file would be renamed
// for example event_GCS.0.log, event_GCS.1.log, event_GCS.2.log ...
// We allow to rotate for {rotate_max_file_num_} times.
if (log_sink_ == nullptr) {
log_sink_ = spdlog::rotating_logger_mt(log_sink_key_,
log_dir_ + file_name_,
1048576 * rotate_max_file_size_,
rotate_max_file_num_);
}
log_sink_->set_pattern("%v");
});
}There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using default effort and found 2 potential issues.
Reviewed by Cursor Bugbot for commit ea2243f. Configure here.
21370c9 to
7790675
Compare
Fixes ray-project#64153 Previously, LogEventReporter created event log files eagerly in its constructor, causing empty log files to appear on every Ray startup even when no events were ever written: - logs/events/event_CORE_WORKER_<pid>.log (always empty) - logs/events/event_GCS.log (usually empty) - logs/events/event_RAYLET.log (usually empty) - logs/export_events/*.log (empty when export framework enabled) This change defers log file creation to the first call to Report() or ReportExportEvent() via a new EnsureSinkInitialized() helper. If no events are written, no log file is created on disk. Changes: - Added EnsureSinkInitialized() private helper method that creates the spdlog rotating_logger_mt only on first use - Added log_sink_key_ member field to store the key for deferred init - Modified constructor to only compute and store the file name/path, without opening the file - Modified Flush() to safely handle an uninitialized (nullptr) sink - Modified Report() and ReportExportEvent() to call EnsureSinkInitialized() before writing All existing behavior is preserved when events are actually written. Signed-off-by: Lê Thanh Châu <19120463@student.hcmus.edu.vn>
Signed-off-by: Lê Thanh Châu <19120463@student.hcmus.edu.vn>
7790675 to
f680daf
Compare

Why are these changes needed?
Fixes #64153
Ray currently creates several event log files eagerly at process startup,
even when they are never written to:
logs/events/event_CORE_WORKER_<pid>.log— always emptylogs/events/event_GCS.log— usually emptylogs/events/event_RAYLET.log— usually emptylogs/export_events/*.log— empty when new export framework is enabledThis creates unnecessary file clutter and wastes I/O on every Ray startup.
What do these changes do?
Changes
LogEventReporterinsrc/ray/util/event.ccto use lazyinitialization for the underlying log file. The log file is now only
created/opened on the first call to
Report(), not in the constructor.Key changes:
EnsureSinkInitialized()private helper methodlog_sink_key_as a member variable for deferred initializationFlush()to safely handle uninitialized sinkRelated issues/PRs
Closes #64153
Checks
git commit -sscripts/format.shto lint the changes in Pythonclang-formaton C++ changes