Summary
Reasoning-capable chat models (Qwen3 series, DeepSeek-R1, etc.) default to emitting a long internal reasoning chain. Wiki generation is mostly "read code + write markdown description"; it doesn't benefit from a chain-of-thought, but it pays the time cost (often 60-80% of completion tokens go to `reasoning_content` rather than output).
repowise has no surface to disable reasoning per-provider. The user has to either accept the slowdown or fork the LLM provider code.
Reproduction
- Use any reasoning model via `OPENAI_BASE_URL` pointing to a runtime that serves it (e.g. Ollama with a recent Qwen3 build, or vLLM/SGLang).
- Run `repowise init` and observe `completion_tokens` vs the actual output text length in the generation phase: most tokens are reasoning, not the wiki content (a rough way to measure this is sketched below).
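A quick, informal way to see the split is a sketch like the following, assuming an OpenAI-compatible server (vLLM/SGLang style) that returns the thinking chain in a `reasoning_content` field; the base URL, model name, and whether that field is exposed at all are backend-specific assumptions, not repowise code:

```python
# Measurement sketch -- base_url, model name, and the reasoning_content field
# are assumptions about the local serving setup, not part of repowise.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

resp = client.chat.completions.create(
    model="qwen3-8b",  # whatever reasoning model the runtime serves
    messages=[{"role": "user", "content": "Summarize this module as markdown: ..."}],
)

msg = resp.choices[0].message
answer = msg.content or ""
# vLLM/SGLang-style servers attach the thinking chain as an extra field;
# the official SDK keeps unknown fields, so getattr works when it is present.
reasoning = getattr(msg, "reasoning_content", None) or ""
print(f"completion_tokens={resp.usage.completion_tokens}")
print(f"answer chars={len(answer)}, reasoning chars={len(reasoning)}")
```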
Suggested fix
Add an opt-in flag to `GenerationConfig` (or per-provider config) such as:
```python
from dataclasses import dataclass

@dataclass
class GenerationConfig:
    ...
    disable_reasoning: bool = False
    # When True, providers should pass backend-specific kwargs to disable
    # the model's reasoning chain (e.g. extra_body / chat_template_kwargs /
    # reasoning_effort=minimal, depending on the backend).
```
Then each LLM provider translates the flag to its own backend syntax (a sketch of such a mapping follows the list). For OpenAI-compatible chat completions, the common patterns are:

- vLLM / SGLang serving Qwen3: `extra_body={"chat_template_kwargs": {"enable_thinking": False}}`
- OpenAI Responses API o-series: `reasoning={"effort": "minimal"}`
- Some proxies expose vendor-specific envelopes for the same toggle.
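As a concrete illustration, here is a minimal sketch of the chat-completions mapping. The `OpenAIProvider`/`complete` names and constructor are hypothetical stand-ins, not repowise's actual provider interface; `extra_body` is the official `openai` Python SDK's passthrough for non-standard request fields:

```python
# Hypothetical provider sketch: class and method names are illustrative only,
# not repowise's real interface.
from openai import OpenAI

class OpenAIProvider:
    def __init__(self, config, base_url=None, api_key=None):
        self.config = config
        self.client = OpenAI(base_url=base_url, api_key=api_key)

    def complete(self, messages, model):
        extra_body = {}
        if getattr(self.config, "disable_reasoning", False):
            # vLLM / SGLang serving Qwen3: switch the thinking chain off via
            # the chat template; other backends would need their own mapping.
            extra_body["chat_template_kwargs"] = {"enable_thinking": False}
        return self.client.chat.completions.create(
            model=model,
            messages=messages,
            extra_body=extra_body or None,
        )
```

For backends driven through the Responses API, the same flag would map to the `reasoning={"effort": ...}` field instead of `extra_body`.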
Locally, we observed a 3.9× speedup translating ~3800 wiki pages (a related batch task using the same chat completions endpoint) with reasoning disabled, with no measurable quality regression on the output markdown structure or technical content.
Happy to send a PR adding the flag and a default `OpenAIProvider` mapping; backend-specific mappings can be added incrementally.