Expose split_by_character parameter in HTTP API /documents/text endpoint #2942

## Problem

The Python API `LightRAG.ainsert()` supports `split_by_character` and `split_by_character_only` parameters for custom chunk splitting. However, the HTTP API endpoints `/documents/text` and `/documents/texts` in `lightrag/api/document_routes.py` do not expose these parameters — they are not included in the Pydantic request models and not passed through to `ainsert()`.

This forces HTTP API users to rely solely on the built-in token-based chunker, even when they have pre-chunked content with a known separator.

## Use Case

We pre-chunk documents with a semantic chunker (heading-aware, with breadcrumbs and atomic blocks) before sending to LightRAG. We join chunks with a unique separator and want LightRAG to split on it, preserving our chunk boundaries as-is.
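As a rough illustration of the joining step, the sketch below builds one text from pre-chunked content and checks that the separator never occurs inside a chunk. The helper name `join_chunks` and the separator format are hypothetical and not part of LightRAG.

```python
import uuid


def join_chunks(chunks, separator=None):
    """Join pre-chunked text with a separator that occurs in none of the chunks."""
    if separator is None:
        # A random token makes an accidental collision with document text unlikely.
        separator = f"\n<<<CHUNK-{uuid.uuid4().hex}>>>\n"
    for index, chunk in enumerate(chunks):
        if separator in chunk:
            raise ValueError(f"separator occurs inside chunk {index}")
    return separator.join(chunks), separator


chunks = ["# Intro\nFirst block.", "# Intro > Setup\nSecond block."]
text, separator = join_chunks(chunks)
# Splitting on the separator recovers the original chunk boundaries.
assert text.split(separator) == chunks
```

The whole file stays a single document, so it keeps one `doc_id`, while the separator carries the chunk boundaries.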

Without `split_by_character` in the HTTP API, the only options are:
1. Send each chunk as a separate document (`/documents/texts` with N items). This creates N `doc_id`s per file and breaks deletion, deduplication, and `doc_status` tracking.
2. Use the Python API directly, which is not possible when LightRAG runs as a separate service.

## Proposed Change

Add `split_by_character` and `split_by_character_only` fields to `InsertTextRequest` and `InsertTextsRequest` in `document_routes.py`, and pass them through to `rag.ainsert()`.

### InsertTextRequest

```python
from typing import Optional

from pydantic import BaseModel, Field


class InsertTextRequest(BaseModel):
    text: str
    # ... existing fields ...
    # New optional fields; the defaults keep the current token-based chunking.
    split_by_character: Optional[str] = Field(
        default=None,
        description="Character(s) to split the text on instead of token-based chunking",
    )
    split_by_character_only: bool = Field(
        default=False,
        description="If True, split only on split_by_character without token-based fallback",
    )
```

### Route handler

```python
# Forward the new request fields unchanged; None / False preserve the current behavior.
await rag.ainsert(
    request.text,
    split_by_character=request.split_by_character,
    split_by_character_only=request.split_by_character_only,
)
```

The same two fields and the same pass-through apply to `InsertTextsRequest` and the `/documents/texts` handler.
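From the client side, a request using the proposed fields might look like the sketch below. It only builds the request and does not send it. The base URL, the separator string, and the helper name `build_insert_request` are assumptions for illustration; the field names are the ones proposed above.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9621"  # assumption: address of the LightRAG server
SEPARATOR = "\n<<<CHUNK-BOUNDARY>>>\n"  # assumption: any string absent from the chunks


def build_insert_request(chunks, base_url=BASE_URL, separator=SEPARATOR):
    """Build a POST to /documents/text carrying the proposed split fields."""
    payload = {
        "text": separator.join(chunks),
        "split_by_character": separator,
        "split_by_character_only": True,
    }
    return urllib.request.Request(
        f"{base_url}/documents/text",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_insert_request(["first chunk", "second chunk"])
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Clients that omit both fields would send the same payload as today, so existing integrations would not need to change.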

## Notes

- Fully backward compatible — both fields are optional with defaults matching current behavior.
- We have a working patch in production (LightRAG v1.4.14) and can submit a PR.
