Custom STT bridge ignores per-segment is_user/person_id and collapses speakers to Speaker 1 #7982

## Summary
With `custom_stt=enabled`, a self-hosted custom STT service can return per-segment speaker data that already matches `TranscriptSegment`, including `speaker`, `speaker_id`, `is_user`, and `person_id`. However, the app-side bridge does not currently preserve that information end-to-end.

## Current behavior
- The backend correctly bypasses speech-profile speaker identification in custom STT mode.
- However, the app-side custom STT bridge appears to:
  - force `is_user: false` for forwarded segments,
  - ignore upstream `person_id`,
  - derive speaker identity only from the configured schema speaker field,
  - and effectively collapse all segments to `speaker_id = 0` / `Speaker 1` when the response schema does not explicitly map the speaker field.

## Expected behavior
For custom STT, if the provider already returns segment objects compatible with `TranscriptSegment`, the bridge should preserve and forward:
- `speaker`
- `speaker_id`
- `is_user`
- `person_id`
- `start`
- `end`
- `text`
- `translations` (if present)

Additionally, the response schema should support fields such as:
- `segments_is_user_field`
- `segments_person_id_field`
- ideally also `segments_speaker_id_field`, for providers that return both a speaker label and a numeric speaker ID
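
To make the request concrete, here is a minimal sketch of a schema-driven segment mapper. It is written in Python for illustration only and is not the app's actual bridge code. The three schema keys proposed above are used as-is; the other keys (`segments_text_field`, `segments_start_field`, `segments_end_field`, `segments_speaker_field`), the function names, and the fallback of parsing a numeric ID from a label such as `SPEAKER_01` are all assumptions.

```python
import re
from typing import Any, Optional

# Hypothetical response schema: each value names the provider's JSON key.
DEFAULT_SCHEMA = {
    "segments_text_field": "text",
    "segments_start_field": "start",
    "segments_end_field": "end",
    "segments_speaker_field": "speaker",
    "segments_speaker_id_field": "speaker_id",
    "segments_is_user_field": "is_user",
    "segments_person_id_field": "person_id",
}


def speaker_id_from_label(label: Optional[str]) -> int:
    """Parse trailing digits from a label like 'SPEAKER_01'; fall back to 0."""
    match = re.search(r"(\d+)$", label or "")
    return int(match.group(1)) if match else 0


def map_segment(raw: dict, schema: dict = DEFAULT_SCHEMA) -> dict:
    """Map one provider segment to a TranscriptSegment-like dict without
    discarding speaker data the provider already supplied."""

    def get(schema_key: str, default: Any = None) -> Any:
        field = schema.get(schema_key)
        return default if field is None else raw.get(field, default)

    label = get("segments_speaker_field")
    speaker_id = get("segments_speaker_id_field")
    # Only trust a real integer; bool is a subclass of int, so exclude it.
    if not isinstance(speaker_id, int) or isinstance(speaker_id, bool):
        speaker_id = speaker_id_from_label(label)

    return {
        "text": get("segments_text_field", ""),
        "start": get("segments_start_field", 0.0),
        "end": get("segments_end_field", 0.0),
        "speaker": label,
        "speaker_id": speaker_id,
        # Forward the provider's value instead of forcing False.
        "is_user": bool(get("segments_is_user_field", False)),
        "person_id": get("segments_person_id_field"),
        "translations": raw.get("translations", []),
    }
```

With a mapper along these lines, a segment carrying `"speaker": "SPEAKER_01"` and `"is_user": true` would keep both values, and a provider that omits the numeric ID would still get distinct speakers from the label.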

## Example provider response
```json
{
  "segments": [
    {
      "id": "0",
      "text": "...",
      "speaker": "SPEAKER_00",
      "speaker_id": 0,
      "is_user": true,
      "person_id": null,
      "start": 0.0,
      "end": 8.0,
      "translations": []
    },
    {
      "id": "1",
      "text": "...",
      "speaker": "SPEAKER_01",
      "speaker_id": 1,
      "is_user": false,
      "person_id": null,
      "start": 8.0,
      "end": 14.0,
      "translations": []
    }
  ]
}
```
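
For anyone checking their own provider output against this shape, the following is a small validation sketch. The expected types are inferred from the example above and are an assumption, not the real `TranscriptSegment` definition. `person_id` is not type-checked because the example only shows `null`. The names `EXPECTED_TYPES` and `validate_segment` are hypothetical.

```python
from typing import Any

# Field types inferred from the example response (an assumption).
EXPECTED_TYPES = {
    "text": str,
    "speaker": str,
    "speaker_id": int,
    "is_user": bool,
    "start": (int, float),
    "end": (int, float),
}


def validate_segment(segment: dict[str, Any]) -> list[str]:
    """Return a list of human-readable problems; empty means the segment looks fine."""
    problems = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in segment:
            problems.append(f"{field}: missing")
            continue
        value = segment[field]
        # bool is a subclass of int, so reject it explicitly for numeric fields.
        if isinstance(value, bool) and expected is not bool:
            problems.append(f"{field}: unexpected type bool")
        elif not isinstance(value, expected):
            problems.append(f"{field}: unexpected type {type(value).__name__}")
    return problems
```

A provider that sends `"is_user": "true"` as a string, for example, would be flagged here rather than silently treated as a valid boolean.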

## Repro
1. Enable custom STT (`custom_stt=enabled`).
2. Use a self-hosted transcription service that returns distinct per-segment speakers and a correct boolean `is_user` value.
3. Start a conversation with at least two speakers.
4. Observe that the app shows every segment as `Speaker 1`, and the device owner is not rendered as the user.
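
To confirm the repro without relying on the UI alone, the provider's segments can be compared with what the app ends up holding. The helper below is a hypothetical sketch: how the forwarded segments are captured (logs, debug output, an export) is left open, and the function name is an assumption.

```python
def find_dropped_speaker_data(provider_segments, forwarded_segments,
                              fields=("speaker_id", "is_user", "person_id")):
    """Compare segments pairwise by position and return one
    (index, field, provider_value, forwarded_value) tuple per mismatch."""
    mismatches = []
    for index, (sent, shown) in enumerate(zip(provider_segments, forwarded_segments)):
        for field in fields:
            if sent.get(field) != shown.get(field):
                mismatches.append((index, field, sent.get(field), shown.get(field)))
    return mismatches


# What the provider returned (trimmed to the speaker-related fields).
provider = [
    {"speaker_id": 0, "is_user": True, "person_id": None},
    {"speaker_id": 1, "is_user": False, "person_id": None},
]
# What the behavior described in this issue would produce.
forwarded = [
    {"speaker_id": 0, "is_user": False, "person_id": None},
    {"speaker_id": 0, "is_user": False, "person_id": None},
]

for mismatch in find_dropped_speaker_data(provider, forwarded):
    print(mismatch)
```

An empty result would mean the speaker data survived the bridge; any tuple points at the segment and field that was overwritten.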

## Notes
The reporter is happy to test a fix and can also send a PR.

