feat(kb-open): P0-B 9 open API endpoints #445

Open: ncw1992120 wants to merge 4 commits into mateaix:dev from ncw1992120:feat/kb-open-api-p0b

@ncw1992120

Contributor

Closes #442 · Part of #440 · Builds on #441 (P0-A)

Changes

Implements the 9 knowledge base open API endpoints on top of the P0-A auth skeleton.

The 9 endpoints

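| # | Method | Path | Scope | Description |
|---|--------|------|-------|-------------|
| 1 | GET | /pages/{slug} | kb:read | Entity card (mode=summary/full/section) |
| 2 | POST | /search | kb:search | Hybrid retrieval (granularity=entity/chunk) |
| 3 | POST | /search/chunks | kb:search | Fine-grained chunk-level retrieval |
| 4 | POST | /pages/{slug}/traverse | kb:read | Entity relation traversal (depth ≤ 2) |
| 5 | GET | /pages/{slug}/trace | kb:read | Provenance trace (page → chunk → raw) |
| 6 | GET | /taxonomy | kb:list | Type enumeration map |
| 7 | GET | /whats-new | kb:meta | Recency query |
| 8 | GET | /stats | kb:meta | Knowledge base statistics |
| 9 | GET | /pages | kb:list | List pages |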

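Architecture constraints

  • A5: responses are always explicit DTOs (records); entities are never serialized (hard constraint against IDOR and field leakage)
  • A6: the service layer returns pure DTOs and is not coupled to HTTP types (HttpServletRequest/R<T> must not leak into the service)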
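To make A5/A6 concrete, here is a minimal sketch of mapping an entity to an explicit record DTO in a plain assembler with no HTTP types. `WikiPageEntity`, `PageCardDto`, `PageCardAssembler`, and every field name are hypothetical stand-ins for illustration, not the classes in this PR.

```java
// Hypothetical persistence entity: carries internal fields that must never reach API clients.
final class WikiPageEntity {
    final long id;
    final long workspaceId;      // internal tenancy field
    final String slug;
    final String title;
    final String content;
    final String internalNotes;  // internal-only field

    WikiPageEntity(long id, long workspaceId, String slug, String title,
                   String content, String internalNotes) {
        this.id = id;
        this.workspaceId = workspaceId;
        this.slug = slug;
        this.title = title;
        this.content = content;
        this.internalNotes = internalNotes;
    }
}

// Explicit response DTO (A5): only the fields the API intends to expose.
record PageCardDto(String slug, String title, String summary) {}

// Assembly step with no HTTP types in its signature (A6).
final class PageCardAssembler {
    static PageCardDto toCard(WikiPageEntity e, int maxSummaryChars) {
        String content = e.content == null ? "" : e.content;
        String summary = content.length() <= maxSummaryChars
                ? content
                : content.substring(0, maxSummaryChars);
        return new PageCardDto(e.slug, e.title, summary);
    }
}
```

Because the record lists its components explicitly, a field added to the entity later is not exposed until someone deliberately adds it to the DTO.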

traverse (pragmatic version)

  • depth ≤ 2 plus explosion guard (visited set + limit × 3 cap)
  • predicate matched fuzzily with LIKE
  • slug → pageId → mention → primaryEntity (salience-highest)
  • neighbor nodes echo their slug (R11)
  • edge provenance: evidenceChunkId → sourceHandle
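The depth cap and explosion guard can be sketched as a bounded breadth-first search. This is an in-memory illustration only: `TraverseSketch`, `Edge`, and `Hop` are hypothetical names, a substring check stands in for the SQL LIKE match, and treating "limit × 3" as a budget on scanned edges is an assumption, since the PR description does not say exactly what the cap counts.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TraverseSketch {
    // Hypothetical edge shape: subject, predicate, object.
    record Edge(String from, String predicate, String to) {}

    // Queue entry: a node plus its hop distance from the start node.
    private record Hop(String node, int depth) {}

    static final int MAX_DEPTH = 2;

    static List<Edge> traverse(Map<String, List<Edge>> outgoing, String start,
                               int depth, int limit, String predicateFilter) {
        int maxDepth = Math.max(1, Math.min(depth, MAX_DEPTH)); // clamp to 1..2
        int scanBudget = limit * 3;                             // assumed meaning of "limit × 3"
        Set<String> visited = new HashSet<>(Set.of(start));     // prevents revisits and cycles
        Deque<Hop> queue = new ArrayDeque<>(List.of(new Hop(start, 0)));
        List<Edge> result = new ArrayList<>();
        int scanned = 0;
        while (!queue.isEmpty()) {
            Hop cur = queue.poll();
            if (cur.depth() >= maxDepth) continue;              // do not expand past the depth cap
            for (Edge e : outgoing.getOrDefault(cur.node(), List.of())) {
                if (result.size() >= limit || ++scanned > scanBudget) return result;
                boolean matches = predicateFilter == null
                        || e.predicate().contains(predicateFilter); // stand-in for LIKE '%x%'
                if (!matches || !visited.add(e.to())) continue;
                result.add(e);
                queue.add(new Hop(e.to(), cur.depth() + 1));
            }
        }
        return result;
    }
}
```

In this sketch only matching edges are followed; a real implementation might instead follow every edge and filter only what it returns.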

Tests

Tests run: 21, Failures: 0, Errors: 0 (P0-A 17 + P0-B 4)
BUILD SUCCESS
  • KbOpenApiControllerTest (4): 404 on missing page/slug, KB not found, service delegation

Dependencies

This PR includes the cherry-picked P0-A commit (auth skeleton). If #444 (P0-A) is merged first, after a rebase this PR is left with only the 4 P0-B files.

@mateaix

mateaix commented Jun 28, 2026

Owner

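Thanks for implementing these 9 open endpoints 🙏 The authorization matrix is clean: all 9 endpoints carry @RequireKbScope(...), KbScopeInterceptor validates scope + kbId ∈ boundKbIds before the request reaches the controller, and the underlying queries all filter by the path kbId, so cross-tenant reads are blocked at both the interceptor and the query layer. The admin-side CRUD also has @RequireWorkspaceRole("admin") plus a workspace-ownership self-check; I found no IDOR on the admin side.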
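For readers unfamiliar with the check being praised here, the two-part rule (required scope present, and path kbId within the token's bound KBs) could be written as a pure function along these lines. `KbScopeCheckSketch`, `TokenGrant`, and `isAllowed` are hypothetical names; the real KbScopeInterceptor is a Spring interceptor whose code is not shown in this thread.

```java
import java.util.Set;

final class KbScopeCheckSketch {
    // Hypothetical view of an API token: granted scopes and the KBs it is bound to.
    record TokenGrant(Set<String> scopes, Set<Long> boundKbIds) {}

    // Both conditions must hold: the endpoint's scope is granted AND the
    // kbId from the request path is one of the token's bound KBs.
    static boolean isAllowed(TokenGrant grant, String requiredScope, long kbId) {
        return grant.scopes().contains(requiredScope)
                && grant.boundKbIds().contains(kbId);
    }
}
```

Filtering the queries by the same path kbId, as the review notes, means a request that somehow slipped past this check would still read only rows from the KB named in the path.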

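A few blockers before merging

1. stats.pagesWithLinks is always 0 (correctness bug).
KbOpenApiController.stats() fetches pages via pageService.listByKbId(kbId) and then counts with p.getContent().contains("[["), but WikiPageService.listByKbId() runs pages.forEach(p -> p.setContent(null)) and wipes the content, so the null guard makes this count always 0. Please switch to a query that keeps content (e.g. listByKbIdWithContent) or count links some other way.

2. Inline fully qualified names (violates the CLAUDE.md rule "no inline FQNs").
SecurityConfig.java and WebMvcConfig.java declare private final vip.mate.kbopen.auth.KbScopeInterceptor …, and the tests use java.util.List.of(...) (tests are also synced to the open-source repo); please change all of these to a top-of-file import plus the simple name.

3. kb-open-api-design.md in the repository root would leak internal references to the open-source mirror.
The file contains RFC-090/R12 and the internal issue workflow; under the repository's open-source sync rules, anything in the root directory not covered by PRIVATE_ITEMS.md gets pushed to the public repository. Please move it to rfcs/ or add it to PRIVATE_ITEMS (a feature PR generally doesn't need to put its design doc in the release tree anyway).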
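Blocker 1 is easy to reproduce in isolation. The sketch below uses a hypothetical `Page` record and a `withoutContent` helper that imitates a list query blanking the content field; neither is the actual WikiPageService code.

```java
import java.util.List;

final class PagesWithLinksSketch {
    // Hypothetical stand-in for the wiki page entity; only the fields relevant here.
    record Page(String slug, String content) {}

    // Null-guarded count of pages whose content contains a [[wiki link]].
    static long pagesWithLinks(List<Page> pages) {
        return pages.stream()
                .filter(p -> p.content() != null && p.content().contains("[["))
                .count();
    }

    // Imitates a lightweight list query that nulls out content before returning.
    static List<Page> withoutContent(List<Page> pages) {
        return pages.stream().map(p -> new Page(p.slug(), null)).toList();
    }
}
```

With one linked page in the input, `pagesWithLinks` returns 1 on the full list and 0 on the `withoutContent` copy: the null guard prevents a NullPointerException but also hides the fact that the data needed for the count was never loaded.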

Non-blocking suggestions:

P0-B just needs to be rebased on top of the P0-A changes; once the 3 blockers above are fixed it can be merged 🙏

ncw1992120 added a commit to ncw1992120/mateclaw that referenced this pull request Jun 28, 2026
BLOCKERS:
- stats.pagesWithLinks always returned 0 because listByKbId() nulls out
  content. Switch to listByKbIdWithContent() so [[wiki link]] detection works.
- Test file: replace inline java.util.List.of() FQN with import + simple name
  (sync-opensource would expose the unidiomatic style).

NITS (inherited from P0-A rebase):
- V162→V164, prefix VARCHAR(12), FQN imports, parseScopes trim, ?token=
  fallback removal, design doc moved to rfcs/ — all now in ancestor commit
  6fd6244.

EXTRA:
- whatsNew staleReason: hardcoded Chinese "上游 fact 页面变更" → English
  "Upstream fact page changed" (external-facing API response).
@ncw1992120 ncw1992120 force-pushed the feat/kb-open-api-p0b branch from 3b78663 to d3679ea Compare June 28, 2026 06:21
@ncw1992120

Contributor Author

Fixed. Rebased onto P0-A (inherits V164 + FQN + parseScopes + design doc move) and addressed all 3 blockers + nits in commit d3679ea:

Blockers:

  1. stats.pagesWithLinks: listByKbId() nulls out content, so the contains("[[") check never matched and the count was always 0. Switched to listByKbIdWithContent(), which keeps content intact.
  2. Test inline FQN java.util.List.of(...) → added import java.util.List + List.of(...).
  3. Design doc moved to rfcs/ — inherited from P0-A rebase (6fd6244).

Nits (all inherited from P0-A rebase):

  • V162→V164, prefix VARCHAR(12), SecurityConfig/WebMvcConfig FQN imports, parseScopes .map(String::trim), ?token= fallback removed — all now in ancestor commit.

Extra:

  • whatsNew staleReason: hardcoded Chinese "上游 fact 页面变更" → "Upstream fact page changed" (external-facing API response).

21 tests pass (13 service + 4 rate limiter + 4 controller). Ready for re-review 🙏

@mateaix

mateaix commented Jun 28, 2026

Owner

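This revision of P0-B has passed review ✅ All three blockers are fixed (stats.pagesWithLinks now uses listByKbIdWithContent, inline FQNs replaced with imports, design doc moved to rfcs/).

#444 (P0-A) has already been merged into dev, so the bottom of this stack has now been "squashed" into a single new commit on dev. Because this PR's branch still carries the two original P0-A commits, GitHub now reports it as conflicting (mergeable=false). This is common for a stacked PR after the one below it is squash-merged; a rebase fixes it, and the code itself is fine.

Please rebase the branch onto the latest dev, dropping the two already-merged P0-A commits and keeping only P0-B: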

# If you don't have the upstream remote yet:
git remote add upstream https://github.com/mateaix/mateclaw.git

git fetch upstream dev
git checkout feat/kb-open-api-p0b

# Move the two P0-B commits that come after 6fd62440 (the P0-A fix commit,
# already merged into dev) onto the latest dev:
git rebase --onto upstream/dev 6fd62440d

git push --force-with-lease

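Notes:

  • --onto upstream/dev 6fd62440d keeps the commits after 6fd62440 (a2c0d224 P0-B + d3679eab fix #445) and replays them onto the latest dev; the P0-A part is already in dev, so it is not carried over again.
  • I tried this rebase locally and it is clean (touches only the 5 P0-B files, zero conflicts); after the force push the PR should become mergeable again right away.
  • A plain git rebase upstream/dev also works, but the P0-A part may require resolving a conflict by hand (the patch-id changed after the squash), so I recommend the --onto form above.

Once it's pushed and green I'll merge it 🙏 For the follow-up #446 (Deep Research), I'd also suggest waiting until this one is merged and then rebasing onto it.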

Implements the 9 read-only KB Open API endpoints on top of the P0-A
auth skeleton (mateaix#441). Each returns an explicit DTO (A5: never raw
entities) and delegates assembly to service-layer methods that return
pure DTOs (A6: no HTTP coupling, MCP-ready).

Endpoints:
- GET  /pages/{slug}        entity card (mode=summary/full/section:{heading})
- POST /search              hybrid retrieval (granularity=entity/chunk)
- POST /search/chunks       chunk-level semantic search
- POST /pages/{slug}/traverse  entity relation graph (depth ≤ 2)
- GET  /pages/{slug}/trace  provenance (page → chunk → raw)
- GET  /taxonomy            pageType/entityType/relationType enumeration
- GET  /whats-new           recent changes + stale pages
- GET  /stats               KB statistics
- GET  /pages               lightweight page list

Components:
- KbOpenApiController: 9 endpoints, each @RequireKbScope annotated
- KbOpenApiService: assembly layer (card, traverse BFS, metadata parsing)
- KbOpenApiDtos: all response DTOs as records (PageCard, TraceResult,
  TaxonomyResult, KbStats, WhatsNewResult, TraverseResult, PageList)

Traverse (pragmatic version):
- depth ≤ 2 with explosion guard, predicate LIKE matching
- slug → pageId → mention → primaryEntity (salience-highest)
- neighbor nodes echo slug when available (R11)
- edge sourceHandle via evidenceChunkId → citing page

Tests (4 new, all green):
- KbOpenApiControllerTest: 404 on missing page/slug, delegation to service

Closes mateaix#442
@ncw1992120 ncw1992120 force-pushed the feat/kb-open-api-p0b branch from d3679ea to 9d13fcc Compare June 28, 2026 19:27
@ncw1992120

Contributor Author

Rebased using your --onto form. The two old P0-A commits (22ef9c4b, 6fd62440) have been dropped, and the branch now carries only the two P0-B commits, sitting cleanly on top of the latest dev (83660893):

9d13fcc6 fix(kb-open): address review feedback on #445
53d22346 feat(kb-open): P0-B 9 open API endpoints
83660893 fix(chat): restrict generated-file link regex ... (dev tip)

The rebase had zero conflicts and the build compiles. The PR status is back to mergeable=MERGEABLE / CLEAN, so it can be merged 🙏

…x#449 nit)

Per mateaix#449 review (4825113234): the internal RFC-012 reference should not
appear in code. progressPhase/progressTotal/progressDone Javadocs still
carried the "RFC-012 M2 v2 UI:" prefix after mateaix#449's English translation
pass — drop it now that these lines are touched.

Zero behavior change.
Per mateaix#444 review (4825157096): parseScopes used
`.collect(java.util.stream.Collectors.toUnmodifiableSet())` while
`Collectors` is already imported at the top of the file. Use the simple
name. Zero behavior change.
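
As an illustration of the two parseScopes nits raised in this thread (trimming each entry, and using the already-imported `Collectors` simple name instead of the inline FQN), a minimal sketch might look like the following. It assumes scopes arrive as a comma-separated string, which the thread does not confirm; `ParseScopesSketch` is a hypothetical class name.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

final class ParseScopesSketch {
    // Parses e.g. " kb:read, kb:search " into an unmodifiable set of trimmed scope names.
    static Set<String> parseScopes(String raw) {
        if (raw == null || raw.isBlank()) {
            return Set.of();
        }
        return Arrays.stream(raw.split(","))
                .map(String::trim)                        // tolerate spaces around commas
                .filter(s -> !s.isEmpty())                // drop empty entries such as ",,"
                .collect(Collectors.toUnmodifiableSet()); // simple name via the import above
    }
}
```

Unlike `Set.of(...)`, `Collectors.toUnmodifiableSet()` accepts duplicate elements and silently deduplicates them, so a repeated scope in the input does not throw.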