Add LLMInferenceService support by wseaton · Pull Request #5 · wseaton/gpu-pruner

wseaton · 2026-04-01T01:56:10Z

Adds serving.kserve.io/v1alpha1 LLMInferenceService as a scalable resource kind (l flag).
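As a rough illustration of how a flag string such as the default `drsinl` could map onto resource kinds, here is a minimal std-only sketch. Only the `l` to LLMInferenceService mapping is taken from this PR. The `EnabledResources` type, its constants, and the meanings given to the other letters are assumptions for illustration, not the crate's actual definitions (the commit message suggests the real code uses bitflags).

```rust
/// Bitmask of resource kinds the pruner is allowed to scale down.
/// Hypothetical type: only 'l' -> LLMInferenceService comes from this PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EnabledResources(u8);

impl EnabledResources {
    pub const DEPLOYMENT: u8 = 1 << 0; // 'd' (assumed meaning)
    pub const REPLICA_SET: u8 = 1 << 1; // 'r' (assumed meaning)
    pub const STATEFUL_SET: u8 = 1 << 2; // 's' (assumed meaning)
    pub const INFERENCE_SERVICE: u8 = 1 << 3; // 'i' (assumed meaning)
    pub const NOTEBOOK: u8 = 1 << 4; // 'n' (assumed meaning)
    pub const LLM_INFERENCE_SERVICE: u8 = 1 << 5; // 'l' (from this PR)

    /// Parses a flag string, one letter per resource kind.
    /// Unknown letters are rejected rather than silently ignored.
    pub fn parse(flags: &str) -> Result<Self, String> {
        let mut bits = 0u8;
        for c in flags.chars() {
            bits |= match c {
                'd' => Self::DEPLOYMENT,
                'r' => Self::REPLICA_SET,
                's' => Self::STATEFUL_SET,
                'i' => Self::INFERENCE_SERVICE,
                'n' => Self::NOTEBOOK,
                'l' => Self::LLM_INFERENCE_SERVICE,
                other => return Err(format!("unknown resource flag: {:?}", other)),
            };
        }
        Ok(Self(bits))
    }

    /// Returns true if every bit in `bit` is enabled.
    pub fn contains(self, bit: u8) -> bool {
        self.0 & bit == bit
    }
}
```

With this sketch, `EnabledResources::parse("drsinl")` enables the LLMInferenceService bit, while `parse("drsin")` leaves it off.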

- New resources::llminferenceservice CRD definition
- Owner ref chain walking, scale-to-zero via minReplicas: 0
- Full test coverage (equality, hashing, Meta trait, event generation, resource flags)
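The owner ref chain walking can be pictured with a small in-memory model. This is a sketch under simplifying assumptions, not the PR's implementation: `ObjRef`, `find_owner`, and the `owners` map are hypothetical names, each object is assumed to have a single controlling owner, and no Kubernetes API calls are made.

```rust
use std::collections::HashMap;

/// (kind, name) identity of an object within one namespace (hypothetical).
type ObjRef = (String, String);

/// Follows owner references upward from `start` until an object of
/// `target_kind` is reached. `owners` maps each object to its single
/// controlling owner, a simplification of Kubernetes ownerReferences.
pub fn find_owner(
    owners: &HashMap<ObjRef, ObjRef>,
    start: &ObjRef,
    target_kind: &str,
) -> Option<ObjRef> {
    let mut current = start.clone();
    // Bound the walk so a malformed reference cycle cannot loop forever.
    for _ in 0..=owners.len() {
        if current.0 == target_kind {
            return Some(current);
        }
        // Stop with None when the current object has no recorded owner.
        current = owners.get(&current)?.clone();
    }
    None
}
```

Given entries for Pod -> ReplicaSet -> Deployment -> LLMInferenceService, calling `find_owner` on the Pod with `"LLMInferenceService"` returns the top-level resource; a Pod with no recorded owner yields `None`.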

- minimal hand-written CRD type with spec.replicas and serde flatten
- label fast-path: app.kubernetes.io/part-of=llminferenceservice
- owner-ref chain: Pod -> RS -> Deployment -> LLMInferenceService
- scale patches both spec.replicas and spec.prefill.replicas to zero
- new 'l' flag in enabled-resources (default "drsinl")
- 10 unit tests covering bitflags, conversion, equality, hashing, meta
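For the scale step, the commit message describes patching both spec.replicas and spec.prefill.replicas. A minimal sketch of what such a JSON merge-patch body could look like follows; `scale_patch` is a hypothetical helper, and how the patch is actually built and sent in the PR is not shown in the source. The PR description also mentions minReplicas: 0, which this sketch does not model.

```rust
/// Builds a JSON merge-patch body that sets both replica fields named
/// in the commit message to `replicas`. Hypothetical helper; sending the
/// patch to the API server is out of scope for this sketch.
pub fn scale_patch(replicas: u32) -> String {
    format!(
        r#"{{"spec":{{"replicas":{n},"prefill":{{"replicas":{n}}}}}}}"#,
        n = replicas
    )
}
```

Calling `scale_patch(0)` yields `{"spec":{"replicas":0,"prefill":{"replicas":0}}}`.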

wseaton added 2 commits March 31, 2026 21:52

fix duplicate series: scope node_dmi_info or-fallback, dedup pods before processing

c2ba4f3

wseaton wants to merge 2 commits into main from feat/llminferenceservice-support
