fix: repair AWS private manager handoff permissions#67
Conversation
Greptile SummaryThis PR fixes a broken permission handoff for AWS private manager scenarios by splitting the
Confidence Score: 4/5The changes are safe to merge. The core permission fixes are correctly structured and well-tested; the only open question is whether the stack-wide ingress grant on container/provision is intentional for all container types or should be limited to Horizon-backed ones. All three permission changes correctly address the stated root causes: the management security-group statement now matches the actual tag value on the setup-owned boundary group, provision gets the RunInstances grants that Auto Scaling dry-run requires, and container provision gets the narrowly scoped ingress permission with no delete. The test suite covers the new statements directly. The one architectural note — every container provision role receives the compute security-group ingress permission regardless of Horizon dependency — may be intentional but is not explicitly confirmed in the PR description or code comments. container/provision.jsonc warrants a second look on the question of whether the setup compute security group ingress permission should be conditional on container type or left as a stack-wide grant.
|
| Filename | Overview |
|---|---|
| crates/alien-permissions/permission-sets/compute-cluster/management.jsonc | Correctly splits the combined launch-template + security-group RunInstances statement into two separate, properly conditioned statements — fixing the root bug where the setup-owned boundary security group was being checked against a managed-by:runtime condition it could never satisfy. |
| crates/alien-permissions/permission-sets/compute-cluster/provision.jsonc | Adds the previously missing ec2:RunInstances blocks (runtime launch templates + security groups, instance creation, companion resources) and ec2:CreateTags for ASG dry-run validation — all properly conditioned on managed-by:runtime. |
| crates/alien-permissions/permission-sets/container/provision.jsonc | Adds AuthorizeSecurityGroupIngress/RevokeSecurityGroupIngress on the setup-owned compute security group for ALB-to-host connectivity. The resource binding does not further narrow by ${resourceName}, so every container resource in the stack receives this permission regardless of whether it is Horizon-backed. |
| crates/alien-permissions/tests/aws_runtime.rs | Adds well-targeted tests verifying the new provision RunInstances permissions and container ingress grants, plus the split management security-group statement; condition_equals helper is a clean addition used throughout the file. |
| crates/alien-permissions/tests/aws_cloudformation.rs | Adds CloudFormation-generator counterparts for the same new permissions; condition access is done inline rather than using a shared helper (unlike aws_runtime.rs), a minor style inconsistency but not a bug. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
subgraph Setup["Setup Phase (setup-owned resources)"]
SG["ComputeCluster Security Group\nmanaged-by: setup, resource: compute"]
end
subgraph Provision["Provision Phase (provision role)"]
LT["Runtime Launch Template\nmanaged-by: runtime"]
RSG["Runtime Security Group\nmanaged-by: runtime"]
ASG_P["Create ASG\nautoscaling:CreateAutoScalingGroup"]
DRY_P["Dry-run RunInstances\nlaunch-template/* + security-group/*\n(managed-by: runtime)"]
end
subgraph Management["Management Phase (management role)"]
ASG_M["Update/Refresh ASG\nautoscaling:UpdateAutoScalingGroup"]
DRY_M_LT["RunInstances on launch-template/*\n(managed-by: runtime)"]
DRY_M_SG["RunInstances on security-group/*\n(managed-by: setup, resource: compute) NEW"]
end
subgraph Container["Container Provision (container role)"]
ALB["ALB Security Group\nmanaged-by: runtime"]
INGRESS["Authorize/Revoke Ingress\non compute SG NEW"]
end
LT --> DRY_P
RSG --> DRY_P
DRY_P --> ASG_P
LT --> DRY_M_LT
SG --> DRY_M_SG
DRY_M_LT --> ASG_M
DRY_M_SG --> ASG_M
ALB --> INGRESS
INGRESS --> SG
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 2
crates/alien-permissions/permission-sets/container/provision.jsonc:245-278
**Ingress permission applies to all containers, not just Horizon-backed ones**
The `resource` binding for this permission does not add an `aws:ResourceTag/${resourceTag}": "${resourceName}"` condition (unlike other resource-level grants in this file), so every container provision role in the stack receives `ec2:AuthorizeSecurityGroupIngress` and `ec2:RevokeSecurityGroupIngress` on the setup-owned compute security group — regardless of whether the container is actually Horizon-backed. A non-Horizon container's provision role could therefore add or remove ingress rules on the shared boundary security group, and there is no per-rule ownership condition available at the IAM level to prevent one container from revoking another container's ingress rules. This appears intentional given the comment, but worth an explicit acknowledgment that the permission is stack-wide rather than per-container.
### Issue 2 of 2
crates/alien-permissions/tests/aws_cloudformation.rs:157-271
**Condition lookup duplicated inline instead of using a shared helper**
The two new CloudFormation tests access `StringEquals` condition values via inline `and_then` chains. `aws_runtime.rs` introduces a `condition_equals` helper for exactly this purpose at the bottom of that file, but `aws_cloudformation.rs` does not share it. The duplication is minor now, but any future change to how conditions are represented would require updates in multiple places. Consider either moving `condition_equals` to `common` or duplicating it in this test module.
Reviews (1): Last reviewed commit: "fix: repair AWS private manager handoff ..." | Re-trigger Greptile
| // Containers own their ALB security group, but public Horizon-backed | ||
| // containers also need to add/remove one ingress rule on the setup-owned | ||
| // ComputeCluster security group so the ALB can reach host ports. | ||
| { | ||
| "label": "Manage setup compute security group ingress", | ||
| "grant": { | ||
| "actions": [ | ||
| "ec2:AuthorizeSecurityGroupIngress", | ||
| "ec2:RevokeSecurityGroupIngress" | ||
| ] | ||
| }, | ||
| "binding": { | ||
| "stack": { | ||
| "resources": ["arn:aws:ec2:${awsRegion}:${awsAccountId}:security-group/*"], | ||
| "condition": { | ||
| "StringEquals": { | ||
| "aws:ResourceTag/${stackTag}": "${stackPrefix}", | ||
| "aws:ResourceTag/${resourceTag}": "compute", | ||
| "aws:ResourceTag/${managedByTag}": "setup" | ||
| } | ||
| } | ||
| }, | ||
| "resource": { | ||
| "resources": ["arn:aws:ec2:${awsRegion}:${awsAccountId}:security-group/*"], | ||
| "condition": { | ||
| "StringEquals": { | ||
| "aws:ResourceTag/${stackTag}": "${stackPrefix}", | ||
| "aws:ResourceTag/${resourceTag}": "compute", | ||
| "aws:ResourceTag/${managedByTag}": "setup" | ||
| } | ||
| } | ||
| } | ||
| } | ||
| }, |
There was a problem hiding this comment.
Ingress permission applies to all containers, not just Horizon-backed ones
The resource binding for this permission does not add an aws:ResourceTag/${resourceTag}": "${resourceName}" condition (unlike other resource-level grants in this file), so every container provision role in the stack receives ec2:AuthorizeSecurityGroupIngress and ec2:RevokeSecurityGroupIngress on the setup-owned compute security group — regardless of whether the container is actually Horizon-backed. A non-Horizon container's provision role could therefore add or remove ingress rules on the shared boundary security group, and there is no per-rule ownership condition available at the IAM level to prevent one container from revoking another container's ingress rules. This appears intentional given the comment, but worth an explicit acknowledgment that the permission is stack-wide rather than per-container.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/alien-permissions/permission-sets/container/provision.jsonc
Line: 245-278
Comment:
**Ingress permission applies to all containers, not just Horizon-backed ones**
The `resource` binding for this permission does not add an `aws:ResourceTag/${resourceTag}": "${resourceName}"` condition (unlike other resource-level grants in this file), so every container provision role in the stack receives `ec2:AuthorizeSecurityGroupIngress` and `ec2:RevokeSecurityGroupIngress` on the setup-owned compute security group — regardless of whether the container is actually Horizon-backed. A non-Horizon container's provision role could therefore add or remove ingress rules on the shared boundary security group, and there is no per-rule ownership condition available at the IAM level to prevent one container from revoking another container's ingress rules. This appears intentional given the comment, but worth an explicit acknowledgment that the permission is stack-wide rather than per-container.
How can I resolve this? If you propose a fix, please make it concise.| ); | ||
| } | ||
|
|
||
| #[test] | ||
| fn test_aws_cloudformation_compute_management_can_use_setup_security_group() { | ||
| let generator = AwsCloudFormationPermissionsGenerator::new(); | ||
| let permission_set = | ||
| get_permission_set("compute-cluster/management").expect("permission set exists"); | ||
| let context = PermissionContext::new() | ||
| .with_stack_prefix("") | ||
| .with_aws_region("${AWS::Region}") | ||
| .with_aws_account_id("${AWS::AccountId}"); | ||
|
|
||
| let result = generator | ||
| .generate_policy(permission_set, BindingTarget::Stack, &context) | ||
| .expect("Should generate AWS CloudFormation policy successfully"); | ||
|
|
||
| let setup_security_group_statement = result | ||
| .statement | ||
| .iter() | ||
| .find(|statement| { | ||
| statement | ||
| .resource | ||
| .iter() | ||
| .any(|resource| resource == &json!({"Fn::Sub": "arn:${AWS::Partition}:ec2:${AWS::Region}:${AWS::AccountId}:security-group/*"})) | ||
| && statement.action.contains(&json!("ec2:RunInstances")) | ||
| }) | ||
| .expect("compute-cluster management should allow setup compute security group use"); | ||
|
|
||
| let string_equals = setup_security_group_statement | ||
| .condition | ||
| .as_ref() | ||
| .and_then(|condition| condition.get("StringEquals")) | ||
| .expect("setup security group permission should be tag-conditioned"); | ||
|
|
||
| assert_eq!( | ||
| string_equals.get("aws:ResourceTag/deployment"), | ||
| Some(&json!({"Fn::Sub": "${AWS::StackName}"})) | ||
| ); | ||
| assert_eq!( | ||
| string_equals.get("aws:ResourceTag/managed-by"), | ||
| Some(&json!("setup")) | ||
| ); | ||
| assert_eq!( | ||
| string_equals.get("aws:ResourceTag/resource"), | ||
| Some(&json!("compute")) | ||
| ); | ||
| } | ||
|
|
||
| #[test] | ||
| fn test_aws_cloudformation_container_provision_can_manage_setup_compute_security_group_ingress() { | ||
| let generator = AwsCloudFormationPermissionsGenerator::new(); | ||
| let permission_set = get_permission_set("container/provision").expect("permission set exists"); | ||
| let context = PermissionContext::new() | ||
| .with_stack_prefix("") | ||
| .with_resource_name("alien-manager") | ||
| .with_aws_region("${AWS::Region}") | ||
| .with_aws_account_id("${AWS::AccountId}"); | ||
|
|
||
| let result = generator | ||
| .generate_policy(permission_set, BindingTarget::Resource, &context) | ||
| .expect("Should generate AWS CloudFormation policy successfully"); | ||
|
|
||
| let setup_compute_ingress_statement = result | ||
| .statement | ||
| .iter() | ||
| .find(|statement| { | ||
| if !statement | ||
| .action | ||
| .contains(&json!("ec2:AuthorizeSecurityGroupIngress")) | ||
| || !statement | ||
| .action | ||
| .contains(&json!("ec2:RevokeSecurityGroupIngress")) | ||
| { | ||
| return false; | ||
| } | ||
|
|
||
| statement | ||
| .condition | ||
| .as_ref() | ||
| .and_then(|condition| condition.get("StringEquals")) | ||
| .is_some_and(|string_equals| { | ||
| string_equals.get("aws:ResourceTag/managed-by") == Some(&json!("setup")) | ||
| && string_equals.get("aws:ResourceTag/resource") == Some(&json!("compute")) | ||
| }) | ||
| }) | ||
| .expect("container provision should manage ingress on setup compute security groups"); | ||
|
|
||
| assert!(!setup_compute_ingress_statement | ||
| .action | ||
| .contains(&json!("ec2:DeleteSecurityGroup"))); | ||
|
|
||
| let string_equals = setup_compute_ingress_statement | ||
| .condition | ||
| .as_ref() | ||
| .and_then(|condition| condition.get("StringEquals")) | ||
| .expect("setup compute security group ingress permission should be tag-conditioned"); | ||
|
|
||
| assert_eq!( | ||
| string_equals.get("aws:ResourceTag/deployment"), | ||
| Some(&json!({"Fn::Sub": "${AWS::StackName}"})) | ||
| ); | ||
| assert_eq!( | ||
| string_equals.get("aws:ResourceTag/managed-by"), | ||
| Some(&json!("setup")) | ||
| ); | ||
| assert_eq!( | ||
| string_equals.get("aws:ResourceTag/resource"), | ||
| Some(&json!("compute")) | ||
| ); | ||
| } | ||
|
|
||
| #[test] | ||
| fn test_aws_cloudformation_missing_actions_error() { | ||
| let generator = AwsCloudFormationPermissionsGenerator::new(); |
There was a problem hiding this comment.
Condition lookup duplicated inline instead of using a shared helper
The two new CloudFormation tests access StringEquals condition values via inline and_then chains. aws_runtime.rs introduces a condition_equals helper for exactly this purpose at the bottom of that file, but aws_cloudformation.rs does not share it. The duplication is minor now, but any future change to how conditions are represented would require updates in multiple places. Consider either moving condition_equals to common or duplicating it in this test module.
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/alien-permissions/tests/aws_cloudformation.rs
Line: 157-271
Comment:
**Condition lookup duplicated inline instead of using a shared helper**
The two new CloudFormation tests access `StringEquals` condition values via inline `and_then` chains. `aws_runtime.rs` introduces a `condition_equals` helper for exactly this purpose at the bottom of that file, but `aws_cloudformation.rs` does not share it. The duplication is minor now, but any future change to how conditions are represented would require updates in multiple places. Consider either moving `condition_equals` to `common` or duplicating it in this test module.
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
No description provided.