Skip to content

fix: 支持日志级别运行时热切换#8146

Open
Sisyphbaous-DT-Project wants to merge 4 commits into
AstrBotDevs:masterfrom
Sisyphbaous-DT-Project:fix/runtime-log-hot-switch
Open

fix: 支持日志级别运行时热切换#8146
Sisyphbaous-DT-Project wants to merge 4 commits into
AstrBotDevs:masterfrom
Sisyphbaous-DT-Project:fix/runtime-log-hot-switch

Conversation

@Sisyphbaous-DT-Project
Copy link
Copy Markdown
Contributor

@Sisyphbaous-DT-Project Sisyphbaous-DT-Project commented May 11, 2026

Motivation / 动机

我平时跑 AstrBot 的时候,基本都把日志级别挂在 INFO,不然控制台和 WebUI 里全是调试信息,看着挺乱的。但偶尔排查问题又必须临时切到 DEBUG 看一眼详细日志,之前系统配置里改了 log_level 之后,正在跑的 logger 根本不认新配置,非得重启整个 AstrBot 才能生效。

这个体验对临时查问题来说挺不友好的——只是想看一眼 debug 日志,却要打断当前正在运行的服务。所以这个 PR 给日志配置加上了运行时热更新,让日志级别可以从 INFODEBUG 之间随意切,不用重启,改完配置马上生效。

Modifications / 改动点

核心思路就是在系统配置保存后,自动检查日志相关配置有没有发生变化,如果变了就立刻应用到运行中的 logger,并且保证这个过程是安全的。具体做了这么几件事:

  1. 配置变更检测
    系统配置保存成功后,会对比运行时日志相关配置项是否被修改,包括:

    • log_level
    • log_file_enable
    • log_file_path
    • log_file_max_mb
    • trace_log_enable
    • trace_log_path
    • trace_log_max_mb
      同时也兼容旧版的 log_file 配置结构,不会因为配置字段迁移导致检测失效。
  2. 即时应用新日志配置
    一旦发现上述配置有变动,就立刻把新的日志配置推到当前正在运行的 logger 里。

    • log_level 后,不管是文件日志还是 WebUI 实时日志,都不用重启就能看到效果。
    • INFO 切到 DEBUG,之后新打出来的 debug 日志会马上出现。
    • 再从 DEBUG 切回 INFO,之后新产生的 debug 日志会立刻被过滤掉。
  3. 给日志重配置加了锁
    LogManager 里加了一个 RLock,防止多个配置保存请求并发触发时,file sink 和 trace sink 交叉 remove / add,导致日志输出状态乱掉。

  4. 优化 sink 替换逻辑,保证不中断已有日志
    文件日志和 trace 日志的 sink 替换都遵循“先建新的,成功后再换旧的”原则。
    如果新的日志路径不可写或者 sink 创建失败,会保留正在用的旧 sink,不会因为一次配置错误就把原本好使的日志输出给掐了。

  5. 普通日志和 trace 日志的热更新互相独立
    普通日志热更新失败不会影响 trace 日志的热更新,反过来也一样。两边异常各自捕获、各自处理,不会互相拖累。

  6. save_config() 支持传入旧配置快照
    为了防止调用方在保存前已经修改了内存里的配置对象,导致差异检测拿不到真正的“旧值”,现在允许显式传入一份旧配置快照。系统配置页在保存前会先把旧配置截下来,后面日志配置变更判断就用这份快照,保证比对结果是准的。

Modified Files / 修改文件

  • astrbot/core/log.py

    • 加了日志重配置锁。
    • 新增 _replace_file_sink()_replace_trace_sink()
    • 调整了 configure_logger()configure_trace_logger() 的 sink 替换逻辑,改成先创建新 sink 再替换。
  • astrbot/dashboard/routes/config.py

    • 增加了运行时日志配置差异判断逻辑。
    • 系统配置保存成功后,按需触发日志配置的热应用。
    • 支持传入 old_config_snapshot,避免差异检测被提前修改的配置值影响。
  • tests/test_log_manager_runtime.py

    • 新增了运行时 sink 替换相关的测试用例。
  • 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

先跑了代码风格检查和语法校验:

uv run ruff format .
uv run ruff check .
python -S -m py_compile astrbot/core/log.py astrbot/dashboard/routes/config.py
pytest tests/test_log_manager_runtime.py -q

结果:
4 passed

然后在真实的 AstrBot 进程里做了一整套验证,流程是:

  1. 启动 AstrBot,登录 WebUI API。
  2. 通过 /api/config/astrbot/updatelog_level 设成 INFO
  3. 触发生成 logger.debug() 的场景,确认文件日志和 WebUI 里都没有 debug 日志。
  4. 不重启,直接把 log_level 切到 DEBUG
  5. 再次触发同样的 debug 场景,确认 debug 日志立刻写入了文件日志,并且在 WebUI 日志历史里也出现了。
  6. 再次不重启,切回 INFO
  7. 再触发一次 debug 场景,确认新 debug 日志不再出现。

最终得到的真实测试结果:

{
  login: ok,
  debug_present_while_info: false,
  debug_present_after_debug_switch_without_restart: true,
  debug_count_before_switch_back_info: 1,
  debug_count_after_switch_back_info: 1,
  webui_debug_visible_while_info: false,
  webui_debug_visible_after_debug_switch_without_restart: true,
  webui_debug_visible_after_switch_back_info: false
}

Notes / 说明

  • 日志级别切换完全不需要重启,保存配置就生效。
  • 文件日志和 WebUI 实时日志都能跟着 log_level 热切换,行为一致。
  • 切换只影响之后新产生的日志。之前在旧级别下被筛掉的 debug 日志不会补回来,这是符合预期的。
  • 这个 PR 没有去修改 trace_enabletrace_log_enable 之间原本的语义差异,只处理了系统配置页里日志级别和文件日志配置的热切换。

Checklist / 检查清单

  • 如果 PR 中有新加入的功能,已经通过 Issue / 说明等方式解释了动机和使用场景。
  • 我的更改经过了测试,并已在上方提供验证步骤和测试结果。
  • 我确认没有引入新依赖。
  • 我的更改没有引入恶意代码。
QQ_1778487377464

Summary by Sourcery

Enable runtime reconfiguration of logging and trace settings without restarting AstrBot.

New Features:

  • Allow core configuration saves to trigger live updates of log level and file-based logging settings when they change.
  • Allow trace logging configuration changes to be applied at runtime independently of normal logging.
  • Support passing an old configuration snapshot into save_config for accurate change detection.

Enhancements:

  • Make LogManager file and trace sink replacement resilient by creating new sinks before swapping and preserving existing sinks on failure.
  • Protect logger reconfiguration with a re-entrant lock to avoid race conditions during concurrent config updates.

Tests:

  • Add runtime LogManager tests to ensure sink replacement preserves existing sinks on failure and cleans up correctly when disabling sinks.

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend labels May 11, 2026
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • The legacy log_file mapping logic is now duplicated across configure_logger, configure_trace_logger, _runtime_log_config, and _runtime_trace_log_config; consider extracting a small helper to normalize the config into a single canonical structure before use to keep behavior consistent and reduce drift.
  • The lock-based synchronization is only applied in configure_logger/configure_trace_logger, while _replace_file_sink and _replace_trace_sink can be called directly (as in tests); if there is or might be external usage of these helpers, consider either making them truly internal (or guarded) or moving the lock inside them to avoid accidental unsynchronized sink updates.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The legacy `log_file` mapping logic is now duplicated across `configure_logger`, `configure_trace_logger`, `_runtime_log_config`, and `_runtime_trace_log_config`; consider extracting a small helper to normalize the config into a single canonical structure before use to keep behavior consistent and reduce drift.
- The lock-based synchronization is only applied in `configure_logger`/`configure_trace_logger`, while `_replace_file_sink` and `_replace_trace_sink` can be called directly (as in tests); if there is or might be external usage of these helpers, consider either making them truly internal (or guarded) or moving the lock inside them to avoid accidental unsynchronized sink updates.

## Individual Comments

### Comment 1
<location path="astrbot/dashboard/routes/config.py" line_range="42" />
<code_context>

 MAX_FILE_BYTES = 500 * 1024 * 1024

+_RUNTIME_LOG_KEYS = (
+    "log_level",
+    "log_file_enable",
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the new runtime logging configuration logic by normalizing log-related settings, avoiding deep copies, and consolidating snapshot handling around save_config.

You can reduce the added complexity without changing behaviour by:

1. **Flattening/normalizing the log config extraction**
2. **Removing unnecessary `copy.deepcopy`**
3. **Simplifying the change-detection / update flow**
4. **Avoiding double snapshot logic around `save_config`**

### 1. Normalize log-related config with a single helper

Instead of two separate helpers that both build dicts with nested `legacy_log_file` and use `deepcopy`, you can build a single normalized tuple/dict used for equality checks.

For example:

```python
_RUNTIME_LOG_KEYS = (
    "log_level",
    "log_file_enable",
    "log_file_path",
    "log_file_max_mb",
)

_RUNTIME_TRACE_LOG_KEYS = (
    "trace_log_enable",
    "trace_log_path",
    "trace_log_max_mb",
)


def _normalized_log_config(conf: dict) -> dict:
    # no deepcopy needed for scalars
    legacy = conf.get("log_file") or {}

    return {
        "log": {k: conf.get(k) for k in _RUNTIME_LOG_KEYS},
        "legacy_log": {
            "enable": legacy.get("enable"),
            "path": legacy.get("path"),
            "max_mb": legacy.get("max_mb"),
        },
        "trace": {k: conf.get(k) for k in _RUNTIME_TRACE_LOG_KEYS},
        "legacy_trace": {
            "enable": legacy.get("trace_enable"),
            "path": legacy.get("trace_path"),
            "max_mb": legacy.get("trace_max_mb"),
        },
    }
```

This replaces `_runtime_log_config` and `_runtime_trace_log_config`, removes `deepcopy`, and centralizes the legacy handling.

### 2. Simplify `_apply_runtime_log_config_if_changed`

With the normalized helper above, the function becomes shorter and more linear; you no longer need four dicts and an `updated` flag toggled from two branches:

```python
def _apply_runtime_log_config_if_changed(
    old_config: dict,
    new_config: dict,
) -> None:
    old_norm = _normalized_log_config(old_config)
    new_norm = _normalized_log_config(new_config)

    if old_norm == new_norm:
        return

    # detect which parts changed
    log_changed = (
        old_norm["log"] != new_norm["log"]
        or old_norm["legacy_log"] != new_norm["legacy_log"]
    )
    trace_changed = (
        old_norm["trace"] != new_norm["trace"]
        or old_norm["legacy_trace"] != new_norm["legacy_trace"]
    )

    any_updated = False

    if log_changed:
        try:
            LogManager.configure_logger(logger, new_config)
            any_updated = True
        except Exception:
            logger.error(
                "Failed to update runtime logger:\n%s",
                traceback.format_exc(),
            )

    if trace_changed:
        try:
            LogManager.configure_trace_logger(new_config)
            any_updated = True
        except Exception:
            logger.error(
                "Failed to update runtime trace logger:\n%s",
                traceback.format_exc(),
            )

    if any_updated:
        logger.info("Runtime log configuration updated.")
```

This keeps all behaviour (including separate error handling per logger) while making control flow more obvious.

If you prefer, you can also inline the change detection into small helpers (e.g. `_log_config_changed(old_norm, new_norm)`), but the key win is a single normalized structure instead of two different helpers.

### 3. Remove unnecessary `copy.deepcopy` in config snapshots

For both the runtime helpers and the core snapshot, you are mainly dealing with scalars and shallow dicts. The deep copy in `_runtime_log_config` / `_runtime_trace_log_config` is unnecessary once you normalize as above. For the main config snapshot, `copy.deepcopy` is probably only needed if `AstrBotConfig` contains nested mutable structures that _mutate in place_ after save.

If a shallow copy is enough for comparing runtime log-related keys, you can limit deep copying to where it’s really required:

```python
if is_core and old_config_snapshot is None:
    # if AstrBotConfig is dict-like and we only need to compare values,
    # a shallow dict copy is sufficient
    old_config_snapshot = dict(config)
```

If you *must* deep-copy for other reasons, keep it, but you can still drop deep-copying in the log comparison helpers as shown above.

### 4. Decouple `save_config` from external snapshots

Right now `_save_astrbot_configs` builds an `old_config_snapshot` and passes it to `save_config`, and `save_config` may also snapshot internally when `is_core` is `True`. This is extra branching and a second way to pass the same data.

Given your current usage, you can simplify by letting `save_config` always handle its own snapshot when `is_core` is `True`, and remove the external snapshotting from `_save_astrbot_configs`:

```python
async def _save_astrbot_configs(
    self, post_configs: dict, conf_id: str | None = None
) -> None:
    ...
    astrbot_config = self.acm.confs[conf_id]

    if "t2i_active_template" in astrbot_config:
        post_configs["t2i_active_template"] = astrbot_config[
            "t2i_active_template"
        ]

    # rely on save_config to snapshot when is_core=True
    save_config(post_configs, astrbot_config, is_core=True)
```

And keep `save_config`’s logic simple:

```python
def save_config(
    post_config: dict,
    config: AstrBotConfig,
    is_core: bool = False,
    old_config_snapshot: dict | None = None,
) -> None:
    errors = None

    if is_core:
        if old_config_snapshot is None:
            old_config_snapshot = dict(config)

        _log_computer_config_changes(dict(config), post_config)

    ...
    config.save_config(post_config)

    if is_core and old_config_snapshot is not None:
        _apply_runtime_log_config_if_changed(old_config_snapshot, dict(config))
```

If you still need the `old_config_snapshot` parameter for other call sites, this keeps compatibility while avoiding duplicate snapshotting in this path.

---

All of these changes keep the new runtime log-update feature, but reduce structural noise and make the intent (“if log-related config changed, reconfigure runtime loggers”) much clearer.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread astrbot/dashboard/routes/config.py
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements thread-safe runtime logging reconfiguration within the LogManager and integrates it into the dashboard configuration routes. Key changes include the addition of a reconfigure lock, refactoring sink replacement logic into internal helper methods, and implementing change detection during configuration saves to apply updates immediately. Review feedback suggests unifying logger usage for internal errors, optimizing configuration snapshots by removing unnecessary deepcopy calls for immutable scalars, and simplifying redundant change-detection logic in the dashboard routes.

Comment thread astrbot/core/log.py
Comment thread astrbot/dashboard/routes/config.py
Comment thread astrbot/dashboard/routes/config.py Outdated
@Sisyphbaous-DT-Project
Copy link
Copy Markdown
Contributor Author

没什么问题,Bot 的意见参考一下就行,不用当成硬性 block,代码没有影响实际功能的缺陷

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core The bug / feature is about astrbot's core, backend size:L This PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant