Add model to serve command #50
Merged
InftyAI-Agent merged 2 commits into InftyAI:main on May 6, 2026
Signed-off-by: kerthcet <kerthcet@gmail.com>
/lgtm
Pull request overview
This PR updates the serve CLI workflow to require an explicit model name and performs a preflight registry check before starting the API server. It also changes the /health response payload and updates README usage examples accordingly.
Changes:
- Require `puma serve <model>` and pass the model name into the serve execution path.
- Add a model-exists preflight check in the CLI before starting the server.
- Remove `version` from the `/health` endpoint response and update documentation to match.
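
To illustrate the first change, here is a minimal, std-only sketch of a required positional model argument. It is not the PR's parser (the diff only shows that the parsed value is reachable as `args.model`); `ServeArgs` as defined here, `parse_serve_args`, and the usage message are hypothetical names chosen for the example.

```rust
/// Hypothetical argument struct; only the `model` field is suggested by the PR.
#[derive(Debug, PartialEq)]
pub struct ServeArgs {
    pub model: String,
}

/// Parses the arguments that follow `puma serve`.
/// The first argument is treated as the required model name;
/// a missing or blank value yields a usage error instead of a default.
pub fn parse_serve_args(args: &[&str]) -> Result<ServeArgs, String> {
    match args.first() {
        Some(&model) if !model.trim().is_empty() => Ok(ServeArgs {
            model: model.to_string(),
        }),
        _ => Err("usage: puma serve <model>".to_string()),
    }
}
```

Returning a `Result` rather than exiting inside the parser keeps the "model is required" rule easy to unit test.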
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| `src/cli/serve.rs` | Extends `execute()` to accept a model name and logs it on startup. |
| `src/cli/commands.rs` | Adds a required positional model arg to `serve`, checks registry presence, and adds related tests. |
| `src/api/routes.rs` | Updates `/health` response schema by removing `version`. |
| `README.md` | Updates `serve` usage and health output; adds note about `/v1/models`. |
Comment on lines 59 to 69

```diff
 /// Health check response
 #[derive(Serialize)]
 struct HealthResponse {
     status: String,
-    version: String,
 }

 /// Health check endpoint
 async fn health_check() -> Json<HealthResponse> {
     Json(HealthResponse {
         status: "ok".to_string(),
-        version: env!("CARGO_PKG_VERSION").to_string(),
     })
```
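
Because dropping `version` changes the `/health` payload, clients that inspect the body may want to depend on `status` alone. The following is a rough, std-only sketch of such a check. `is_healthy` is a hypothetical helper, not part of this PR, and it uses naive string matching where a real client would use a JSON parser.

```rust
/// Returns true when a `/health` body reports `"status": "ok"`.
/// Every other field is ignored, so both the old payload (with `version`)
/// and the new payload (without it) are accepted.
/// Assumption: naive substring matching stands in for real JSON parsing.
pub fn is_healthy(body: &str) -> bool {
    // Strip whitespace so `"status": "ok"` and `"status":"ok"` compare equal.
    let compact: String = body.chars().filter(|c| !c.is_whitespace()).collect();
    compact.contains("\"status\":\"ok\"")
}
```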
Comment on lines 10 to 33

```diff
@@ -23,7 +27,7 @@ pub async fn execute(host: &str, port: u16) -> Result<(), Box<dyn std::error::Er
         .bright_blue()
         .bold()
     );
-    info!("Starting PUMA inference server");
+    info!("Starting PUMA to serve model: {}", model_name);

     // Initialize backend (MockEngine for now, replace with MLX later)
     let engine = Arc::new(MockEngine::new());
```
Comment on lines +227 to +243

```rust
// Verify model exists
let registry = ModelRegistry::new(None);
match registry.get_model(&args.model) {
    Ok(Some(_)) => {
        // Model exists, proceed
    }
    Ok(None) => {
        eprintln!("❌ Error: Model '{}' not found in registry", args.model);
        eprintln!("Run 'puma pull {}' to download it first", args.model);
        std::process::exit(1);
    }
    Err(e) => {
        eprintln!("❌ Error checking model: {}", e);
        std::process::exit(1);
    }
}
```
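
One way to make this preflight check directly testable is to move the decision into a function that returns a `Result`, leaving printing and `std::process::exit` to the caller. The sketch below is an illustration under assumptions, not the PR's code: `InMemoryRegistry` is a hypothetical stand-in that mirrors only the `get_model` shape visible in the diff (`Result<Option<_>, _>`), and `PreflightError` and `preflight` are invented names.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical stand-in for the project's `ModelRegistry`.
pub struct InMemoryRegistry {
    models: HashMap<String, String>, // model name -> revision (illustrative)
}

impl InMemoryRegistry {
    pub fn new() -> Self {
        Self { models: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, revision: &str) {
        self.models.insert(name.to_string(), revision.to_string());
    }

    /// Same outcome shape as the diff: lookup error, found, or not found.
    pub fn get_model(&self, name: &str) -> Result<Option<&String>, String> {
        Ok(self.models.get(name))
    }
}

/// Why the server should not start.
#[derive(Debug, PartialEq)]
pub enum PreflightError {
    NotFound(String),
    Registry(String),
}

impl fmt::Display for PreflightError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PreflightError::NotFound(name) => write!(
                f,
                "Model '{}' not found in registry. Run 'puma pull {}' to download it first",
                name, name
            ),
            PreflightError::Registry(msg) => write!(f, "Error checking model: {}", msg),
        }
    }
}

/// Maps the three lookup outcomes to a single `Result`; no printing, no exit.
pub fn preflight(registry: &InMemoryRegistry, model: &str) -> Result<(), PreflightError> {
    match registry.get_model(model) {
        Ok(Some(_)) => Ok(()),
        Ok(None) => Err(PreflightError::NotFound(model.to_string())),
        Err(e) => Err(PreflightError::Registry(e)),
    }
}
```

With this shape, the command handler would print the error and exit with status 1 on `Err`, while tests can assert on the returned variant.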
Comment on lines +416 to +439

```rust
#[test]
fn test_serve_with_existing_model() {
    let temp_dir = TempDir::new().unwrap();
    let registry = ModelRegistry::new(Some(temp_dir.path().to_path_buf()));

    let model = create_test_model("test/serve-model", "abc123");
    registry.register_model(model).unwrap();

    // Verify model exists (this is what serve command checks)
    let result = registry.get_model("test/serve-model");
    assert!(result.is_ok());
    assert!(result.unwrap().is_some());
}

#[test]
fn test_serve_with_nonexistent_model() {
    let temp_dir = TempDir::new().unwrap();
    let registry = ModelRegistry::new(Some(temp_dir.path().to_path_buf()));

    // Verify model doesn't exist
    let result = registry.get_model("nonexistent/model");
    assert!(result.is_ok());
    assert!(result.unwrap().is_none());
}
```
README.md (excerpt, truncated)

````markdown
#### List Models
```bash
# Returns the currently loaded model
````
What this PR does / why we need it
Which issue(s) this PR fixes
Fixes #
Special notes for your reviewer
Does this PR introduce a user-facing change?