Middleware for Microsoft.Extensions.AI that emulates tool calling for chat clients/models that only return plain text.
It works by injecting tool declarations + a strict tool-call format into normal prompt text, then parsing the model’s response to detect a tool-call payload.
This approach is obviously MUCH less reliable than native tool calling.
- Add tool declarations + a required JSON tool-call format to the outgoing prompt text (no system-message dependency).
- Model returns plain text.
- Middleware detects an attempted tool call and extracts JSON.
- If JSON is invalid (or the tool fails), middleware asks the model to re-emit a corrected tool-call JSON (bounded retries).
- On success, the pipeline executes the tool and appends the tool result to the conversation, then continues.
flowchart TD
A["① Prepare request<br/>• prompt text injection (non-system)<br/>• tool manifest + format rules"] --> B["② Send messages to LLM"]
B --> C["③ Receive plain-text response"]
C --> D{"Detected tool call attempt?"}
D -- No --> Z["⑥ Return final answer (text)"]
D -- Yes --> E{"Valid tool-call JSON extracted?"}
E -- Yes --> F["④ Execute tool externally"]
F --> J{"Tool execution succeeded?"}
J -- Yes --> G["⑤ Append tool result to conversation"]
G --> B
E -- No --> R["④ Prepare retry prompt<br/>• include parse/tool error<br/>• restate required format<br/>• demand JSON only"] --> H["④ Ask LLM to fix tool-call JSON"]
J -- No --> R
H --> I{"Retries left?"}
I -- Yes --> B
I -- No --> Z