AGENTS.md: init #536668
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/lets-create-skills-for-nixpkgs-development/78627/20 |
|
Still applies verbatim. |
nh2
left a comment
I think this is still too generated and verbose.
There's no point telling an agent
Assist a human contributor.
or
Follow the checklist
or
- Did you understand the issue being asked?
(all of these sound like genericisms to me that do not help a machine do its job) or
The user is responsible for the changes you do. You are just a tool to do it faster.
For this last one in particular I do not see what concretely it will achieve.
I also don't think there's a point in
**Ask first**
- Commit changes.
- Push changes.
- Open/update PRs.
- Comments.
- Destructive Git operations: force-push, rebase, history rewrite.
- Set up a worktree for a branch.
because that surely depends on the individual user's preference on how to use their tools.
In alignment with what I wrote here, I suggest writing this file fully by hand and keeping it as minimal as possible.
Be able to justify every line. For example:
- Ask your agent to package some software. Does it declare success on something suboptimal, such as failing to run the formatter? Then add 1 line to AGENTS.md, do the same again, and see if it helps. Finally, remove the line again, and observe it fail again. Then report here that this helped and why.
The guiding principle should be:
- For contributors that choose to use an LLM to work on nixpkgs, what are concrete points that will uncontroversially help their work?
|
I was actually following some informal standards on how to write an effective AGENTS. It's a v1. I may not be the best person to do that, but I am trying. The main thing is that the repo has rules and the model is supposed to follow all of them; issues would essentially be gaps in the docs, because LLMs start from scratch on every contribution, like a new contributor. I wanted, for example, to give it a build failure issue so that it could confirm the problem is there, research whether it was already addressed, and, if not, work on the solution. |
|
This shouldn't be necessary and just increases burden to maintain yet another copy of the standards.... when the regular contributor docs can be improved instead. -1 |
2efd4de to 1e37357
|
I do not see the point in an |
wolfgangwalther
left a comment
Still applies verbatim.
Precisely.
|
If taking this approach instead of something minimal like #534657, at a minimum, I think we need to know who will maintain this. I fear that this file will become an append-only log of things people felt they needed to add before their otherworldly patron did what they wanted. When would anyone ever have the confidence to remove something from this file? In the absence of an objective way to determine that a line is no longer load-bearing for the current generation of LLMs, such calls would have to be made by someone or some team (or they never get made and this becomes an ever-growing pile of bloat). Who is stepping up to do that? (This comment is not an endorsement, in general, of wizards multiclassing into warlocks. I'm approaching this as harm reduction — it may be globally optimal to offer clean needles to the public, even if people shooting up is net harmful to society, because without that option people may make worse decisions.) |
|
I would suggest to only use it as an entrypoint. If a guideline changes we change the guideline file instead. |
That's an example of an entrypoint? Seems like a thing you or someone at some point came up with as a reaction to an agent not doing that thing and screwing up as a consequence. What's to stop there from being more such things? Where would they go if not here? |
I agree (see also above). The more minimal this is, the better (both given controversy of using LLMs among some nixpkgs contributors, human review effort, and, in my experience, for the LLM). I didn't know that existed. I think that's a much better starting point than the contents currently proposed in this PR. |
dea0c90 to 516d4db
Signed-off-by: lucasew <lucas59356@gmail.com>
516d4db to b82822b
|
(Just noting that the Nixpkgs core team is looking at this and related PRs per #534657 (comment) and #534657 (comment).) |
|
(Watching bullet points like this being appended to this PR is direct evidence for my argument here.) |
Are you dogfooding the AGENTS too? |
|
Sorry, I don't understand your question. I don't use LLMs for anything, if that's what you're asking. |
Well, the only use of this file is as an LLM entrypoint. There is nothing to maintain. If it messes up, that is either a documentation gap or a guideline gap, so the idea here is to keep the LLM from reward-hacking its way through by telling it that LLM autonomy is not welcome and that the repo has well-defined rules. Another static check we could introduce to avoid automated slop is a PR check that auto-closes PRs that do not have the template at the end of the PR body. That way it stops being a burden on reviewers, since a PR that does not follow the template would be low-hanging-fruit slop. Trivial changes, tbh, are pretty hard for an LLM to get wrong; as it is, it is performing great in my tests. It was even able to grok hooks and stuff, even though the result could be better. I myself was essentially guiding it toward the solution I expected it to reach. I even got some first-turn merges. BTW, that unofficial __structuredAttrs guideline should be formalised somewhere; the model was consistently "forgetting" it. Models can be useful for benchmarking gaps in the documentation: if a model does not follow a guideline, either the user forgot to ask it to redo the checklist, there is a documentation gap, or the model cut corners reading the documentation, which may be a checklist issue. |
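The template check suggested above could be sketched roughly as follows. This is a minimal sketch and not an existing nixpkgs CI job: the function name `ends_with_template`, the `min_items` threshold, and the rule "the heading appears and task-list items follow its last occurrence" are assumptions made for illustration. Only the "Things done" heading is taken from this PR's description.

```python
import re

# Assumption: the template's final section starts with this heading, as in
# the "Things done" section of this PR's description.
TEMPLATE_HEADING = "Things done"

# A Markdown task-list item such as "- [ ] ..." or "- [x] ...".
CHECKBOX = re.compile(r"^\s*[-*] \[[ xX]\] ", re.MULTILINE)


def ends_with_template(body: str, min_items: int = 1) -> bool:
    """Return True if the PR body appears to end with the template section.

    Hypothetical rule: the heading must appear, and at least `min_items`
    task-list items must follow its last occurrence.
    """
    # rpartition splits on the LAST occurrence; `found` is empty if absent.
    _, found, tail = body.rpartition(TEMPLATE_HEADING)
    if not found:
        return False
    return len(CHECKBOX.findall(tail)) >= min_items


if __name__ == "__main__":
    with_template = "Fix build.\n\n## Things done\n\n- [x] Built\n- [ ] Tested\n"
    without_template = "Fix build."
    print(ends_with_template(with_template))
    print(ends_with_template(without_template))
```

A CI job would still have to fetch the PR body and decide what to do on a `False` result (comment, label, or close); that part is deliberately left out here.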
Simpler approach for an AI coding agent entrypoint.
No need for AI disclosure as I wrote it all by hand.
Things done
passthru.tests
nixpkgs-review on this PR. See nixpkgs-review usage.
./result/bin/.