
refactor(ltm): redesign long-term memory with append-only incremental contexts #8144

Open
RC-CHN wants to merge 30 commits into AstrBotDevs:master from RC-CHN:refactor-ltm

Conversation


@RC-CHN RC-CHN commented May 11, 2026

Motivation

Fixes #8080
Rewrite the long-term memory (LTM) module from a ring buffer to an append-only architecture that keeps context prefixes stable across requests — enabling KV cache hits and the associated cost discounts (typically 1/10 of standard pricing across OpenAI, Anthropic, DeepSeek, and cloud providers).
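The cache effect the PR targets can be illustrated with a toy comparison (an illustrative sketch, not the PR's code): a ring buffer that evicts the oldest entry changes the context prefix every time it wraps, while an append-only list only grows, so the previous request's messages remain a byte-identical prefix of the next request.

```python
from collections import deque

def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Length of the common prefix between two message lists."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Ring buffer: capacity 3, evicting the oldest message on overflow.
ring = deque(maxlen=3)
for msg in ["m1", "m2", "m3"]:
    ring.append(msg)
before = list(ring)          # ["m1", "m2", "m3"]
ring.append("m4")            # evicts "m1", shifting every position
after = list(ring)           # ["m2", "m3", "m4"]
print(shared_prefix_len(before, after))  # 0 -> no KV cache reuse

# Append-only: old segments are never rebuilt.
log = ["m1", "m2", "m3"]
before = list(log)
log.append("m4")
print(shared_prefix_len(before, log))    # 3 -> full prefix reuse
```

Provider-side prompt caching keys on exact prefix bytes, which is why the append-only layout qualifies for the discounted cached-input pricing while the ring buffer never does.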

Modifications

Core: astrbot/builtin_stars/astrbot/long_term_memory.py

  • Replace max_cnt ring buffer with raw_records (deque) + _raw_cursor + contexts (append-only list). Old segments are never rebuilt.
  • _build_segments() converts raw chat lines into OpenAI-format context segments, handling tool calls, parallel tools, and multi-step chains.
  • <BOT/> markers replace [You/] to avoid nickname collisions.
  • on_agent_done records tool-call chains and now includes the @bot prompt in contexts so future rounds see the user's original message.
  • asyncio.Lock for concurrency safety; remove_session() for cleanup.
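The state described in these bullets can be sketched roughly as follows (a minimal illustrative model; the real module additionally handles tool calls, locking, and trimming):

```python
from collections import defaultdict, deque

class LTMStateSketch:
    """Minimal sketch of the append-only LTM state (illustrative only)."""

    def __init__(self) -> None:
        # Raw chat lines per session, oldest first.
        self.raw_records: dict[str, deque[str]] = defaultdict(deque)
        # Index of the first raw record NOT yet converted to a context segment.
        self._raw_cursor: dict[str, int] = defaultdict(int)
        # Append-only OpenAI-format context; existing segments are never rebuilt.
        self.contexts: dict[str, list[dict]] = defaultdict(list)

    def record(self, umo: str, line: str) -> None:
        self.raw_records[umo].append(line)

    def build_new_segments(self, umo: str) -> list[dict]:
        """Convert only the raw lines past the cursor; append, never rewrite."""
        dq = self.raw_records[umo]
        start = self._raw_cursor[umo]
        fresh = [{"role": "user", "content": line} for line in list(dq)[start:]]
        self.contexts[umo].extend(fresh)
        self._raw_cursor[umo] = len(dq)
        return self.contexts[umo]
```

Because `build_new_segments` only ever extends `contexts`, the message prefix sent to the provider stays stable across requests, which is the property the KV cache depends on.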

Hook wiring: astrbot/builtin_stars/astrbot/main.py

  • Swap @on_llm_response → @on_agent_done for accurate tool-chain recording.
  • Lazy toggle detection: false→true cleans stale state on next message.
  • group_icl_enable=true skips Conversation DB query (conversation=None).

Config: astrbot/builtin_stars/astrbot/default.py

  • Default context_limit_reached_strategy → "llm_compress".

Agent runner: astrbot/core/astr_main_agent.py

  • _get_compress_provider auto-falls back to the main chat provider when llm_compress_provider_id is unset, preventing silent truncation.
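A hedged sketch of that fallback logic (the function shape and the `get_provider_by_id` / `curr_provider` names are assumptions for illustration, not the actual AstrBot API):

```python
def get_compress_provider(config: dict, provider_manager):
    """Resolve the compression provider, falling back to the main chat
    provider instead of silently truncating.

    `provider_manager` is a hypothetical stand-in for whatever lookup the
    agent runner actually performs.
    """
    pid = config.get("llm_compress_provider_id") or ""
    if pid:
        provider = provider_manager.get_provider_by_id(pid)
        if provider is not None:
            return provider
    # Unset or unknown ID: reuse the provider already serving the chat.
    return provider_manager.curr_provider
```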

Tests: tests/unit/test_long_term_memory.py (new, 47 tests)

  • Pure functions: extract, parse, truncate, build_segments (31 tests).

  • Integration: round-trip lifecycle, multi-round accumulation, tool chains, persona preservation, concurrent safety (16 tests).

  • This is NOT a breaking change.

Screenshots or Test Results

Tested on personal self-hosted astrbot.


Checklist

  • 😊 If there are new features added in the PR, I have discussed them with the authors through issues/emails, etc.

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.

  • 🤓 I have ensured that no new dependencies are introduced, or, if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.

  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Refactor the long-term memory subsystem to use an append-only, incremental context architecture and integrate it with agent completion hooks, while improving default compression behavior and regression coverage.

Enhancements:

  • Redesign group chat long-term memory to store raw messages and derived contexts in an append-only structure with concurrency-safe trimming and incremental segment building for LLM requests.
  • Update main agent wiring to build LLM contexts at request time, record full agent tool-call chains after completion, and lazily reset long-term memory state when toggling group ICL.
  • Change the default context limit reached strategy to use LLM-based compression instead of truncating by turns.
  • Allow the compression pipeline to fall back to the primary chat provider when no dedicated compression provider is configured or available.

Tests:

  • Add an extensive test suite for the new long-term memory implementation, covering parsing helpers, segment construction, multi-round accumulation, tool-chain recording, extreme inputs, persona interaction, and concurrency behavior.

Summary by Sourcery

Refactor group long-term memory to an append-only, incrementally built context model integrated with agent completion hooks, while tightening context compression behavior and isolating request-time context guarding from persistent history management.

Enhancements:

  • Redesign long-term memory to store raw group messages and derived LLM contexts in an append-only structure with configurable truncation and optional LLM-based summarization, including tool-call chains and bot replies.
  • Introduce a request-scoped context guard in the agent runner so per-request truncation/compression no longer mutates persistent conversation history, and adjust provider payload construction accordingly.
  • Adjust group ICL wiring to build contexts at request time, record agent results via the agent-done hook, and lazily reset LTM state when toggling group memory on or off.
  • Expand provider LTM and main agent configuration to support richer compaction controls, history tool-result truncation, raw record size limits, and improved default context compression strategy (LLM-based by default).

Tests:

  • Add an extensive unit test suite for the new long-term memory implementation covering parsing helpers, segment construction, multi-round accumulation, tool-chain recording, extreme inputs, persona interaction, and concurrency behavior.

@auto-assign auto-assign Bot requested review from advent259141 and anka-afk May 11, 2026 01:38
@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. area:core The bug / feature is about astrbot's core, backend area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. labels May 11, 2026

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • The MAX_* limits (MAX_MSGS_PER_USER_SEGMENT, MAX_CHARS_PER_USER_SEGMENT, MAX_RAW_BYTES) are currently hard-coded; consider wiring these through configuration (e.g., provider_ltm_settings) so different deployments or groups can tune memory usage and retention behavior without code changes.
  • In _trim_raw_records, total is recomputed by summing len(s.encode()) on every call, which is O(n); if this runs frequently on busy groups, consider tracking a running byte-size counter per umo to avoid repeatedly traversing the deque.
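One way to implement the suggested running counter (a sketch with hypothetical names; the real class would track more state per umo):

```python
from collections import deque

class RawBuffer:
    """Deque of strings with an O(1) running UTF-8 byte total."""

    def __init__(self) -> None:
        self._dq: deque[str] = deque()
        self._bytes = 0  # invariant: sum(len(s.encode()) for s in self._dq)

    def append(self, s: str) -> None:
        self._dq.append(s)
        self._bytes += len(s.encode())

    def popleft(self) -> str:
        s = self._dq.popleft()
        self._bytes -= len(s.encode())
        return s

    @property
    def total_bytes(self) -> int:
        return self._bytes

    def trim_to(self, max_bytes: int) -> None:
        # O(number evicted) instead of O(n): no full deque traversal per call.
        while self._dq and self._bytes > max_bytes:
            self.popleft()
```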
## Individual Comments

### Comment 1
<location path="astrbot/builtin_stars/astrbot/long_term_memory.py" line_range="290-299" />
<code_context>
+    # Trimming
+    # =========================================================================
+
+    def _trim_raw_records(self, umo: str) -> None:
+        """Evict only entries before the cursor. Never touch entries after it (issue #2)."""
+        dq = self.raw_records[umo]
+        cursor = self._raw_cursor[umo]
+
+        # 1. Unconditionally drop entries before the cursor (already consumed)
+        while dq and cursor > 0:
+            dq.popleft()
+            cursor -= 1
+        self._raw_cursor[umo] = cursor
+
+        # 2. Keep evicting from the front by size (bounds total memory in extreme cases)
+        total = sum(len(s.encode()) for s in dq)
+        while total > MAX_RAW_BYTES and dq and cursor > 0:
+            removed = dq.popleft()
+            total -= len(removed.encode())
</code_context>
<issue_to_address>
**issue (bug_risk):** Size-based trimming branch is effectively dead due to cursor reset logic.

In `_trim_raw_records`, the first loop always decrements `cursor` to 0 and then writes it back to `self._raw_cursor[umo]`. As a result, in the size-based loop `while total > MAX_RAW_BYTES and dq and cursor > 0:`, `cursor` is always 0 and the loop never runs, so `MAX_RAW_BYTES` is never enforced.

To preserve the intended behavior (always drop fully-consumed entries, and then optionally drop additional consumed entries to satisfy `MAX_RAW_BYTES`), you’ll need to decouple the notion of “consumed index” from the deque length. For example, track how many entries are removed in the first loop and use that to derive which entries are safe to drop in the size-based phase, rather than relying on `cursor > 0` after the first loop.
</issue_to_address>
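One way to make the size cap live again (a sketch, not the PR's eventual fix): once all consumed entries are gone, a byte budget can only be enforced by evicting unconsumed entries, so this variant accepts bounded data loss under extreme memory pressure while keeping at least the newest record.

```python
from collections import deque

MAX_RAW_BYTES = 512 * 1024  # assumed limit, standing in for the reviewed constant

def trim_raw_records(dq: deque[str], cursor: int,
                     max_bytes: int = MAX_RAW_BYTES) -> int:
    """Drop consumed entries, then enforce the byte budget. Returns the new
    cursor (always 0 afterwards, since all consumed entries are removed)."""
    # 1. Unconditionally drop fully-consumed entries (indices < cursor).
    while dq and cursor > 0:
        dq.popleft()
        cursor -= 1
    # After this loop every remaining entry is unconsumed, so a size pass
    # gated on `cursor > 0` is dead code -- the reviewed bug. Enforcing the
    # budget therefore means evicting unconsumed entries explicitly, which
    # trades the "never touch after cursor" invariant for a hard memory bound.
    total = sum(len(s.encode()) for s in dq)
    while total > max_bytes and len(dq) > 1:  # keep at least the newest entry
        total -= len(dq.popleft().encode())
    return cursor
```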

### Comment 2
<location path="tests/unit/test_long_term_memory.py" line_range="207-216" />
<code_context>
+    def test_tool_call_then_result_then_bot(self):
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test for `_build_segments` when a tool result appears without a preceding tool call.

Current `_build_segments` tests only cover well-formed tool flows (`<T:CALL>``<T:RES>``<BOT>`). Please add a case where a `<T:RES>` appears without a prior `<T:CALL>`, e.g.:

```python
def test_tool_result_without_call_then_bot(self):
    lines = [
        "<T:RES id=orphan>data</T:RES>",
        "<BOT/14:30>: ok",
    ]
    result = _build_segments(lines)
    # assert behavior: either a valid tool segment or clean ignore, no exception,
    # and an intact assistant segment.
```

This helps ensure `_build_segments` behaves predictably with partial or inconsistent histories.
</issue_to_address>


Comment thread astrbot/builtin_stars/astrbot/long_term_memory.py Outdated
Comment thread tests/unit/test_long_term_memory.py

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces Long Term Memory (LTM) v2, which significantly improves chatroom memory management by implementing incremental context building, support for tool-call history, and memory-efficient message tracking using deques and cursors. The changes also include a fallback mechanism for LLM compression and extensive unit tests. Feedback focuses on several critical areas:

  • a potential memory leak in the contexts dictionary, which is currently append-only;
  • a logic error in the size-based trimming of raw records that renders some code unreachable;
  • the risk of KeyError crashes when parsing malformed tool-call records;
  • a discrepancy between the system prompt's description of bot message markers and the actual role-based formatting sent to the LLM.

Comment on lines +50 to +51
self.contexts: dict[str, list[dict]] = defaultdict(list)
"""Accumulated LLM contexts. Modified by the ContextManager and then retained."""


high

The self.contexts dictionary is append-only and never pruned. In long-running sessions or active group chats, this will lead to a memory leak as the list of segments grows indefinitely. While append-only contexts help with KV cache hits, you should still implement a maximum context length (e.g., based on the provider's window or a safe segment count) to prevent unbounded memory growth.
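A sketch of one bounding strategy (the segment-count cap and chunked pruning are assumptions; a token-based budget tied to the provider's window would work the same way). Pruning from the front sacrifices some prefix stability, so it should happen rarely and in large chunks:

```python
MAX_SEGMENTS = 400   # assumed cap on retained context segments
PRUNE_CHUNK = 100    # drop in big chunks so the prefix changes rarely

def prune_contexts(segments: list[dict]) -> list[dict]:
    """Bound an append-only context list. Dropping a large chunk at once
    invalidates the KV cache only about once per PRUNE_CHUNK appends,
    instead of on every request as a ring buffer would."""
    if len(segments) <= MAX_SEGMENTS:
        return segments
    # A real implementation would also avoid splitting an
    # assistant(tool_calls)/tool message pair at the cut point.
    return segments[PRUNE_CHUNK:]
```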

Comment on lines +156 to +161
async with self._lock:
    umo = event.unified_msg_origin

    # Record the pre-write index → on_req_llm can exclude it precisely (issue #1, #9)
    raw_idx = len(self.raw_records[umo])
    event.set_extra("_ltm_raw_idx", raw_idx)
The reason will be displayed to describe this comment to others. Learn more.

high

handle_message appends to raw_records but never triggers trimming. In groups that rarely interact with the bot, raw_records will grow indefinitely because _trim_raw_records is only called during an agent run. Trimming should be performed here (before calculating raw_idx) to ensure memory usage remains bounded. Note that since this logic is synchronous and does not contain 'await' calls, it is executed atomically in the asyncio event loop and does not require an explicit lock.

        umo = event.unified_msg_origin
        self._trim_raw_records(umo)

        # Record the pre-write index → on_req_llm can exclude it precisely (issue #1, #9)
        raw_idx = len(self.raw_records[umo])
        event.set_extra("_ltm_raw_idx", raw_idx)
References
  1. In a single-threaded asyncio event loop, synchronous functions (code blocks without 'await') are executed atomically and will not be interrupted by other coroutines. Therefore, they are safe from race conditions when modifying shared state within that block.

Comment thread astrbot/builtin_stars/astrbot/long_term_memory.py Outdated
Comment thread astrbot/builtin_stars/astrbot/long_term_memory.py
Comment thread astrbot/builtin_stars/astrbot/long_term_memory.py Outdated
@RC-CHN RC-CHN marked this pull request as draft May 11, 2026 01:58
@RC-CHN RC-CHN marked this pull request as ready for review May 12, 2026 08:58
@auto-assign auto-assign Bot requested review from Fridemn and Raven95676 May 12, 2026 08:58
@dosubot dosubot Bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels May 12, 2026

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 2 issues, and left some high level feedback:

  • LongTermMemory.on_agent_done holds the asyncio.Lock while awaiting LLM summary calls, which can block all LTM operations for that process during slow network I/O; consider releasing the lock before invoking the provider and then re-acquiring it only to apply the resulting summary/compaction updates.
  • LongTermMemory.remove_session mutates raw_records, contexts, cursors, and summary state without acquiring _lock, which can race with concurrent handle_message/on_req_llm/on_agent_done; consider wrapping the cleanup in the same lock to keep internal structures consistent.
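The suggested lock scoping can be sketched as follows (method and attribute names are illustrative, not the module's real interface): snapshot under the lock, await the provider with the lock released, re-acquire only to apply the result.

```python
import asyncio

class LTMLockSketch:
    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.summaries: dict[str, str] = {}

    async def summarize(self, umo: str, lines: list[str]) -> None:
        # 1. Snapshot shared state under the lock, quickly.
        async with self._lock:
            snapshot = list(lines)
        # 2. Slow network I/O runs with the lock released, so other LTM
        #    operations are not blocked while the provider responds.
        summary = await self._call_llm(snapshot)
        # 3. Re-acquire only to apply the result atomically.
        async with self._lock:
            self.summaries[umo] = summary

    async def _call_llm(self, lines: list[str]) -> str:
        await asyncio.sleep(0)  # stand-in for the real provider call
        return f"{len(lines)} lines summarized"
```

The tradeoff is that state may change between steps 1 and 3, so step 3 should tolerate (or re-check) concurrent appends rather than assume nothing moved.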
## Individual Comments

### Comment 1
<location path="astrbot/core/config/default.py" line_range="4155-4113" />
<code_context>
+                            "provider_ltm_settings.ltm_compaction_strategy": "llm_summary",
+                        },
+                    },
+                    "provider_ltm_settings.ltm_summary_provider_id": {
+                        "description": "LTM 摘要模型",
+                        "type": "string",
+                        "_special": "select_provider",
+                        "hint": "llm_summary 策略使用的模型,留空使用当前聊天模型。",
+                        "condition": {
+                            "provider_ltm_settings.ltm_compaction_strategy": "llm_summary",
+                        },
</code_context>
<issue_to_address>
**issue (bug_risk):** The "leave empty to use current chat model" hint for LTM summary provider does not match the implementation.

The hint suggests `ltm_summary_provider_id` can be left empty to use the current chat model, but `LongTermMemory.on_agent_done` only calls `_compact_with_llm_summary` when `provider_id` is truthy, and `_compact_with_llm_summary` requires a concrete provider ID. So an empty value actually disables LTM summarisation.

To fix this, either make an empty `ltm_summary_provider_id` resolve a provider (e.g. via `context.get_using_provider(umo=...)` inside `_compact_with_llm_summary`), or update the hint to clarify that an empty value turns off summarisation, so users aren’t misled about whether summaries are running.
</issue_to_address>

### Comment 2
<location path="tests/unit/test_long_term_memory.py" line_range="674-683" />
<code_context>
+    async def test_tool_call_recorded_in_raw(self, ltm, mock_event):
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test to cover `history_tool_result_truncate=False` so that long tool outputs are not unexpectedly truncated

`on_agent_done` uses `history_tool_result_truncate` and `history_tool_result_max_chars` to optionally truncate tool results before persisting them. We already test `_truncate_tool_result_for_history` and that tool chains are recorded into contexts, but there’s no test for the case where `history_tool_result_truncate` is explicitly `False`.

Please add a test that:
1. Configures an `ltm` whose `cfg` returns `history_tool_result_truncate=False` for the relevant group.
2. Sends a tool message with very long `content` through `on_agent_done`.
3. Asserts that the stored tool result in `raw_records` and the derived `contexts` contains the full content (no truncation marker).

This will ensure the config flag is respected and avoid silent data loss when truncation is disabled.

Suggested implementation:

```python
        return LongTermMemory(MagicMock(), ctx)

    @pytest.mark.asyncio
    async def test_tool_result_not_truncated_when_disabled(self, ltm, mock_event):
        """With history_tool_result_truncate=False, long tool results must not be truncated."""
        from unittest.mock import MagicMock
        from astrbot.api.provider import LLMResponse

        # 1. Configure the ltm's cfg so history_tool_result_truncate=False for the group,
        #    and give max_chars a tiny value so truncation would definitely trigger if enabled.
        def cfg_side_effect(group, key, default=None):
            if key == "history_tool_result_truncate":
                return False
            if key == "history_tool_result_max_chars":
                return 10
            return default

        # If a cfg method already exists, override it with side_effect; otherwise adjust as needed
        if hasattr(ltm, "cfg"):
            original_cfg = ltm.cfg
            ltm.cfg = MagicMock(side_effect=cfg_side_effect)
        else:
            ltm.cfg = MagicMock(side_effect=cfg_side_effect)
            original_cfg = None

        try:
            # 2. Build a very long tool output and push it through the full storage path via on_agent_done
            long_content = "TOOL-RESULT-" + ("X" * 5000)

            tool_msg = MagicMock()
            tool_msg.role = "tool"
            tool_msg.tool_call_id = "tool-1"
            tool_msg.name = "dummy_tool"
            tool_msg.content = long_content

            response = LLMResponse(
                id="test_tool_result_not_truncated_when_disabled",
                model="test-model",
                choices=[
                    LLMResponse.Choice(
                        index=0,
                        message=tool_msg,
                        finish_reason="stop",
                    )
                ],
            )

            # Trigger the persistence logic via on_agent_done
            await ltm.on_agent_done(mock_event, response)

            # 3. Assert raw_records and contexts both contain the full content, no truncation marker.
            # This assumes ltm exposes raw_records/contexts or similar structures;
            # adjust to the actual structure if the names differ.
            raw_repr = repr(getattr(ltm, "raw_records", getattr(ltm, "raw_store", "")))
            ctx_repr = repr(getattr(ltm, "contexts", getattr(ltm, "context_store", "")))

            assert long_content in raw_repr
            assert long_content in ctx_repr

            # Common truncation markers (adjust per implementation): e.g. "…", "...(truncated)", "[TRUNCATED]"
            assert "TRUNCATED" not in raw_repr.upper()
            assert "TRUNCATED" not in ctx_repr.upper()
        finally:
            # Restore the original cfg (if any) so other tests are unaffected
            if original_cfg is not None:
                ltm.cfg = original_cfg

    @pytest.mark.asyncio
    async def test_tool_call_recorded_in_raw(self, ltm, mock_event):
        """Tool calls are recorded into contexts as assistant(tool_calls) + tool messages."""
        from astrbot.api.provider import LLMResponse
        from astrbot.api.message_components import Plain

        mock_event.get_messages.return_value = [Plain(text="@bot weather")]
        await ltm.handle_message(mock_event)

        from unittest.mock import MagicMock
        tc_msg = MagicMock()
        tc_msg.role = "assistant"

```

1. Adjust the test's configuration and storage access to match `LongTermMemory`'s real interface:
   - If the config interface is not `ltm.cfg(group, key, default=None)`, change the signature and call style in `cfg_side_effect` to the config API actually in use (e.g. `ltm.cfg("history_tool_result_truncate", group)` or `ltm.cfg.get(...)`).
   - Change how `raw_repr`/`ctx_repr` are obtained to match the real persistence structures, for example:
     - If there is an `ltm.raw_records` where each record looks like `{"tool_result": {"content": ...}}`, use:
       ```python
       raw_records = ltm.raw_records
       assert any(r["tool_result"]["content"] == long_content for r in raw_records)
       ```
     - If the contexts live in `ltm.contexts` or `ltm.context_store`, extract the content by the actual field names and assert it contains `long_content` with no truncation marker.
2. If `on_agent_done` has a different signature (e.g. `await ltm.on_agent_done(response, mock_event)` or extra parameters), adjust the call order and arguments to match.
3. If the truncation marker differs from `"TRUNCATED"` (e.g. `"…(truncated)"`, `"...(truncated)"`, `"[TRUNCATED]"`), replace the marker checked in the assertions with the one actually used.
</issue_to_address>


Comment thread astrbot/core/config/default.py
Comment thread tests/unit/test_long_term_memory.py (lines +674 to +683): duplicate of the Sourcery suggestion above (test covering history_tool_result_truncate=False).


Labels

area:core The bug / feature is about astrbot's core, backend area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. size:XL This PR changes 500-999 lines, ignoring generated files.


Development

Successfully merging this pull request may close these issues.

[Feature] About the "poison points" in AstrBot's settings that make LLM API costs skyrocket

1 participant