fix: Faiss read/write failure on Windows with non-ASCII user paths#8185
fix: Faiss read/write failure on Windows with non-ASCII user paths#8185irmia2026 wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Hey - I've found 4 issues, and left some high level feedback:
- In
_write_index,os.path.dirname(path)will return an empty string for bare filenames, causingos.makedirs('')to fail; consider guarding with a check and only creating the directory when it is non-empty. - Both
_read_indexand_write_indexuse PID-based temp filenames which can collide on concurrent calls within the same process; usingtempfile.mkstempor including a random/UUID suffix would make these temp paths safer. _safe_temp_dircurrently hardcodes aC:\Windows\Tempfallback even on non-Windows platforms; adding anos.name == 'nt'branch (and a POSIX-appropriate fallback) would avoid creating Windows-style paths on Unix-like systems.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_write_index`, `os.path.dirname(path)` will return an empty string for bare filenames, causing `os.makedirs('')` to fail; consider guarding with a check and only creating the directory when it is non-empty.
- Both `_read_index` and `_write_index` use PID-based temp filenames which can collide on concurrent calls within the same process; using `tempfile.mkstemp` or including a random/UUID suffix would make these temp paths safer.
- `_safe_temp_dir` currently hardcodes a `C:\Windows\Temp` fallback even on non-Windows platforms; adding an `os.name == 'nt'` branch (and a POSIX-appropriate fallback) would avoid creating Windows-style paths on Unix-like systems.
## Individual Comments
### Comment 1
<location path="app/astrbot/core/db/vec_db/faiss_impl/embedding_storage.py" line_range="173-177" />
<code_context>
+ self.index.remove_ids(id_array)
+ await self.save_index()
+
+ async def save_index(self) -> None:
+ """保存索引(兼容含非 ASCII 字符的 Windows 路径)"""
+ if self.index is None:
+ return
+ self._write_index(self.index, self.path)
</code_context>
<issue_to_address>
**issue (bug_risk):** Guard against `self.path` being `None` before calling `_write_index`.
Since the constructor allows `path: str | None = None`, `save_index` can be called when only the in‑memory index is used and `self.path` is still `None`. In that case `_write_index` receives `None` and `os.path.dirname(path)` raises a `TypeError`. Please early‑return when `self.path` is falsy, e.g.:
```python
async def save_index(self) -> None:
if self.index is None or not self.path:
return
self._write_index(self.index, self.path)
```
</issue_to_address>
### Comment 2
<location path="app/astrbot/core/db/vec_db/faiss_impl/embedding_storage.py" line_range="97" />
<code_context>
+
+ 写入前先确保目标目录存在,防止 shutil.move 时目录缺失。
+ """
+ os.makedirs(os.path.dirname(path), exist_ok=True)
+
+ tmp = os.path.join(_safe_temp_dir(), f"_faiss_write_{os.getpid()}.faiss")
</code_context>
<issue_to_address>
**issue:** Handle the case where `path` has no directory component to avoid `os.makedirs('')`.
`os.path.dirname(path)` is `''` when `path` is just a filename (e.g. `'index.faiss'`), and `os.makedirs('')` raises `FileNotFoundError`. Add a guard so you only create the directory when it exists:
```python
dirname = os.path.dirname(path)
if dirname:
os.makedirs(dirname, exist_ok=True)
```
</issue_to_address>
### Comment 3
<location path="app/astrbot/core/db/vec_db/faiss_impl/embedding_storage.py" line_range="77-86" />
<code_context>
+ except RuntimeError:
+ pass # 不吞其他异常类型
+
+ tmp = os.path.join(_safe_temp_dir(), f"_faiss_read_{os.getpid()}.faiss")
+ try:
+ shutil.copy2(path, tmp)
+ return faiss.read_index(tmp)
+ finally:
+ try:
+ os.remove(tmp)
+ except OSError:
+ pass
+
+ @staticmethod
+ def _write_index(index: "faiss.Index", path: str) -> None:
+ """保存 Faiss 索引,兼容含非 ASCII 字符的 Windows 路径。
+
+ 先写入纯 ASCII 临时文件,再用 Python shutil.move 移动到位。
+ Python 文件操作使用 Windows wide-char API (CreateFileW),
+ 正确支持 Unicode 路径。
+
+ 写入前先确保目标目录存在,防止 shutil.move 时目录缺失。
+ """
+ os.makedirs(os.path.dirname(path), exist_ok=True)
+
+ tmp = os.path.join(_safe_temp_dir(), f"_faiss_write_{os.getpid()}.faiss")
+ try:
+ faiss.write_index(index, tmp)
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Use per-call unique temp filenames to avoid collisions in concurrent usage.
Both `_read_index` and `_write_index` derive the temp filename only from `os.getpid()`, so concurrent calls in the same process (threads/async tasks) will race on the same file and can corrupt the index. Please generate a unique temporary file per call (e.g. `tempfile.NamedTemporaryFile(delete=False)`, `tempfile.mkstemp()`, or by adding a random/UUID suffix).
Suggested implementation:
```python
import os
import shutil
import tempfile
```
```python
fd, tmp = tempfile.mkstemp(prefix="_faiss_read_", suffix=".faiss", dir=_safe_temp_dir())
os.close(fd)
try:
shutil.copy2(path, tmp)
return faiss.read_index(tmp)
finally:
try:
os.remove(tmp)
except OSError:
pass
```
```python
fd, tmp = tempfile.mkstemp(prefix="_faiss_write_", suffix=".faiss", dir=_safe_temp_dir())
os.close(fd)
try:
faiss.write_index(index, tmp)
# Windows 同盘 move 是原子 rename,跨盘则走 copy+delete
shutil.move(tmp, path)
finally:
try:
os.remove(tmp)
except OSError:
pass
```
If `import os` / `import shutil` are structured differently (e.g. split across multiple lines or grouped with other imports), adjust the first SEARCH/REPLACE block to match the actual import section and just add `import tempfile` alongside the existing imports.
</issue_to_address>
### Comment 4
<location path="app/astrbot/core/db/vec_db/faiss_impl/embedding_storage.py" line_range="139-144" />
<code_context>
+ raise ValueError(
+ f"向量维度不匹配, 期望: {self.dimension}, 实际: {vectors.shape[1]}",
+ )
+ self.index.add_with_ids(vectors, np.array(ids))
+ await self.save_index()
+
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Specify an explicit integer dtype for the ID array passed to Faiss.
Faiss requires ID arrays to be 64‑bit integers. To avoid platform‑dependent dtypes, cast explicitly:
```python
self.index.add_with_ids(vectors, np.array(ids, dtype=np.int64))
```
```suggestion
if vectors.shape[1] != self.dimension:
raise ValueError(
f"向量维度不匹配, 期望: {self.dimension}, 实际: {vectors.shape[1]}",
)
+ self.index.add_with_ids(vectors, np.array(ids, dtype=np.int64))
+ await self.save_index()
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request introduces a Faiss-based vector storage implementation with a specific workaround for Windows path encoding issues using an ASCII temporary directory bridge. The reviewer identified several critical improvements: ensuring consistent vector normalization between insertion and search, handling input dimensions in the search method, and addressing logic flaws in the temporary directory fallback. Additionally, feedback was provided regarding thread safety in temporary file creation, the need for unit tests, and the importance of offloading blocking I/O operations in asynchronous methods to avoid event loop stalls.
|
|
||
| """ | ||
| assert self.index is not None, "FAISS index is not initialized." | ||
| faiss.normalize_L2(vector) |
| """ | ||
| assert self.index is not None, "FAISS index is not initialized." | ||
| faiss.normalize_L2(vector) | ||
| distances, indices = self.index.search(vector, k) |
| @@ -0,0 +1,177 @@ | |||
| try: | |||
| for d in candidates: | ||
| if d.isascii() and os.path.isdir(d) and os.access(d, os.W_OK): | ||
| return d | ||
|
|
||
| # 终极兜底:创建新目录 | ||
| fallback = r"C:\Windows\Temp" | ||
| os.makedirs(fallback, exist_ok=True) | ||
| return fallback |
| except RuntimeError: | ||
| pass # 不吞其他异常类型 | ||
|
|
||
| tmp = os.path.join(_safe_temp_dir(), f"_faiss_read_{os.getpid()}.faiss") |
There was a problem hiding this comment.
使用 os.getpid() 构造临时文件名在多线程环境下不安全。建议使用 tempfile.mkstemp 生成唯一的临时文件名。另外,注意到 _read_index 和 _write_index 中都有类似的临时文件创建逻辑,建议根据 Rule 2 将其重构为共享的辅助函数以减少代码重复。
fd, tmp = tempfile.mkstemp(dir=_safe_temp_dir(), prefix="_faiss_read_", suffix=".faiss")
os.close(fd)References
- When implementing similar functionality for different cases, refactor the logic into a shared helper function to avoid code duplication.
| tmp = os.path.join(_safe_temp_dir(), f"_faiss_write_{os.getpid()}.faiss") | ||
| try: | ||
| faiss.write_index(index, tmp) | ||
| # Windows 同盘 move 是原子 rename,跨盘则走 copy+delete | ||
| shutil.move(tmp, path) | ||
| finally: | ||
| try: | ||
| os.remove(tmp) | ||
| except OSError: | ||
| pass |
There was a problem hiding this comment.
这里的临时文件名同样存在多线程冲突风险。此外,shutil.move 在成功移动文件后通常会删除源文件,导致 finally 块中的 os.remove(tmp) 抛出 FileNotFoundError。建议优化逻辑。另外,此处的临时文件创建与 _read_index 中的逻辑相似,建议根据 Rule 2 将其重构为共享的辅助函数以避免代码重复。
| tmp = os.path.join(_safe_temp_dir(), f"_faiss_write_{os.getpid()}.faiss") | |
| try: | |
| faiss.write_index(index, tmp) | |
| # Windows 同盘 move 是原子 rename,跨盘则走 copy+delete | |
| shutil.move(tmp, path) | |
| finally: | |
| try: | |
| os.remove(tmp) | |
| except OSError: | |
| pass | |
| fd, tmp = tempfile.mkstemp(dir=_safe_temp_dir(), prefix="_faiss_write_", suffix=".faiss") | |
| os.close(fd) | |
| try: | |
| faiss.write_index(index, tmp) | |
| shutil.move(tmp, path) | |
| finally: | |
| if os.path.exists(tmp): | |
| try: | |
| os.remove(tmp) | |
| except OSError: | |
| pass |
References
- When implementing similar functionality for different cases, refactor the logic into a shared helper function to avoid code duplication.
| async def save_index(self) -> None: | ||
| """保存索引(兼容含非 ASCII 字符的 Windows 路径)""" | ||
| if self.index is None: | ||
| return | ||
| self._write_index(self.index, self.path) |
There was a problem hiding this comment.
save_index 是一个异步方法,但它调用的 _write_index 包含同步的磁盘 I/O,这会阻塞 asyncio 事件循环。建议使用 loop.run_in_executor 执行。需要注意的是,根据 Rule 1,虽然同步函数在主循环中是原子的,但一旦移动到 executor 执行,就失去了这种原子性,因此必须对 self.index 的并发访问进行保护。
References
- In a single-threaded asyncio event loop, synchronous functions are executed atomically and will not be interrupted by other coroutines. Therefore, they are safe from race conditions when modifying shared state within that block.
c4af721 to
f9a13d8
Compare
|
All issues raised by gemini-code-assist and sourcery-ai have been addressed in f9a13d8. Ready for human review. 🙏 |
|
Your code path is wrong.. embedding_storage.py is at |
f9a13d8 to
374dd8d
Compare
|
Ah, good catch — that was a leftover Fixed in the latest force-push: the file is now at |
Summary
Fix Faiss index read/write failure on Windows when the path contains non-ASCII characters (Chinese, Japanese, Korean, Cyrillic, etc.).
Root Cause
Faiss C++ (
io.cpp) usesfopen(const char*)which expects ANSI codepage on Windows, but Python passes paths as UTF-8 bytes through the C extension bridge. Multi-byte characters get garbled →No such file or directory.Impact
All Windows users with non-ASCII usernames are unable to use the knowledge base feature. Upload fails with:
While chunks are written to
doc.db(SQLite via Python), the Faiss vector index andkb_documentsmetadata are both lost.Fix
Temp-file bridge: Faiss reads/writes a pure-ASCII temp file; Python
shutil(using Windows wide-char APIs natively) handles the real path.Additional safeguards:
_safe_temp_dir()— priority cascade ensuring ASCII temp directory even whenTEMPenv points to a non-ASCII path (falls back to%SystemRoot%\Temp)os.makedirsbeforeshutil.movefor first-time writes in nested directoriesFiles Changed
One file:
app/astrbot/core/db/vec_db/faiss_impl/embedding_storage.py_safe_temp_dir(),_read_index(),_write_index()__init__(),save_index()try:direct read succeeds and the fallback is never triggered.Verification
7 tests passed on Windows with Chinese username path: write/read round-trip, concurrent access, first-time write, nested directories, polluted TEMP fallback.
Summary by Sourcery
Add a Faiss embedding storage wrapper that safely reads and writes indices, including on Windows paths with non-ASCII characters.
Bug Fixes:
Enhancements: