[Feature]: Make AI coach multimodal

### What problem does this solve?

Currently the AI coach has text only input. Most models and providers supported allow multimodal inputs (image, audio). It would be a good feature to have the image input for chat. The Google health app has this and is really useful for calorie intake tracking. It might also be good to have speech-to-text so the users can speak to the coach instead of typing. 

### Proposed solution

Most models natively support image inputs using the API. Minor changes to the UI are required to have a camera icon in the input bar and then have option to click or upload image from gallery. 

For speech-to-text there are couple of frameworks that can be used for swift. Voice to voice is known to be worse in accuracy and quality of answers, so might better to have S2T.

### Area

AI Coach (tools, prompts, on-device LLM)

### Alternatives considered

_No response_

### Would you be willing to work on this?

None

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature]: Make AI coach multimodal #27

What problem does this solve?

Proposed solution

Area

Alternatives considered

Would you be willing to work on this?

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

[Feature]: Make AI coach multimodal #27

Description

What problem does this solve?

Proposed solution

Area

Alternatives considered

Would you be willing to work on this?

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions