Skip to content

WIP: implement a secret vars store in nixpkgs#370444

Draft
Lassulus wants to merge 2 commits into
NixOS:masterfrom
Lassulus:nixos-vars
Draft

WIP: implement a secret vars store in nixpkgs#370444
Lassulus wants to merge 2 commits into
NixOS:masterfrom
Lassulus:nixos-vars

Conversation

@Lassulus

@Lassulus Lassulus commented Jan 3, 2025

Copy link
Copy Markdown
Member

This allows to create secrets (and public files) outside or inside the nix store in a more delclarative way. This is shipped with an example (working) implementation of an on-machine storage. The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.

I wanted to upstream this, since we use code similiar to this in https://clan.lol now for a while. Happy to discuss it a bit more. I'm not very fond of the name vars anymore, so maybe a first change could be to come up with a better name :)

I'm not sure if this code is working yet, I will try to develop it a bit more in the next coming days, but I wanted to get this out earlier so other people can throw in some feedback already :)

@Lassulus Lassulus requested a review from oddlama January 3, 2025 01:49
@github-actions github-actions Bot added 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (update) This PR changes an existing module in `nixos/` labels Jan 3, 2025
@lucasew

lucasew commented Jan 3, 2025

Copy link
Copy Markdown
Contributor

How this compares to sops-nix?

@Lassulus

Lassulus commented Jan 4, 2025

Copy link
Copy Markdown
Member Author

How this compares to sops-nix?

This enhances sops-nix, It is more of an upstream evolution of agenix-rekey.

The gist is: This supports declaring how to generate the secrets you store in sops-nix in nixpkgs in the service module.

I will add a declaration to a service soon, so it is easier to understand :)

@Lassulus

Lassulus commented Jan 7, 2025

Copy link
Copy Markdown
Member Author

alright, removed some features to make the first implementation easier.
Some things are still missing in the on-machine backend like dependencies and prompts. but the rough shape should be more clear now :)

@github-actions github-actions Bot added 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux. labels Jan 7, 2025
@Lassulus Lassulus force-pushed the nixos-vars branch 3 times, most recently from 3f0acbe to 339d2bc Compare January 7, 2025 23:43
@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2025-01-08-clan-weekly-changelog/58501/1

@oddlama

oddlama commented Jan 8, 2025

Copy link
Copy Markdown
Member

This looks awesome, I'll have to play around with it a bit more to understand all the details.

Have you already thought about how this would work when someone wants to use vars/secrets in the initrd in stage1? I imagine there could be some kind of hen-egg problem since the initrd will be generated pretty early on when a new generation is activated. The configured vars/secrets may probably not exist at that time, so this might require activating twice (?)

@lucasew

lucasew commented Jan 8, 2025

Copy link
Copy Markdown
Contributor

I am curious about how it would work in stage 1 because AFAIK / is not mounted yet in this phase, so how it will read the host key to decrypt the secrets.

@Lassulus

Lassulus commented Jan 9, 2025

Copy link
Copy Markdown
Member Author

Have you already thought about how this would work when someone wants to use vars/secrets in the initrd in stage1? I imagine there could be some kind of hen-egg problem since the initrd will be generated pretty early on when a new generation is activated.

The current code here doesn't run during activation but creates a script that can be run before activation. in clan.lol we solved that issue differently by having a requiredBy option which can take "activation". We would create the files necessary for activation beforehand in that case, but I think this adds too much complexity to this initial PR.
But this can be something we can add later on top of it.
For now just having all the files created in one phase would be easier and since the creation script can be run at any time (even before installation) they should be available for all phases.

Another problem is also the prompts which some vars can ask for. Having the generation run in the activation would make prompting for input very hard. So I think a better way would be to prompt in nixos-install/nixos-rebuild, so before running activation. But this is also something I will add in a later iteration :)

@Lassulus

Lassulus commented Jan 9, 2025

Copy link
Copy Markdown
Member Author

I am curious about how it would work in stage 1 because AFAIK / is not mounted yet in this phase, so how it will read the host key to decrypt the secrets.

This is not really in scope for this PR. The current on-machine implementation I added here, doesn't encrypt the secrets (yet?) so there should be no issue with stage-1. secrets can still be added to initrd if they are required to early boot and they should be available at that time.

@Lassulus Lassulus force-pushed the nixos-vars branch 3 times, most recently from 46b881c to b6b0bed Compare January 12, 2025 00:50
Comment thread nixos/modules/system/vars/options.nix Outdated
Comment thread nixos/modules/system/vars/options.nix Outdated
Comment on lines 123 to 119

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a value option or something similar to each prompt? I would like to be able to pre-define the value of the prompt to ensure that a user can always opt-in to a purely declarative configuration.

Btw, I currently don't understand when an author of a nixos module would decide to utilize a prompt instead of a generated random value. If I am writing a generator for my personal configuration, I can understand that I might want to use prompt for certain things - but when would I want to use a prompt in an upstream module and force interactivity on every user?

I'm just a little worried right now that introducing prompts without a way to declaratively set their value will sooner or later result in some generators being added to nixpkgs which depend on some form of interactivity. Therefore I'd like to have a way to specify the value of a prompt ahead of time, so the generation script can then just copy the specified file to prompts/$name instead of asking the user a question.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we mainly use prompts for API keys, which we usually don't want to leak into the config. I guess we can think about pre declaring things, this would also make the testing code a bit easier which I worked on last time I touched that code. But this increases the surface every backend implementation has to take care of.

Comment thread nixos/modules/system/vars/options.nix Outdated
Comment on lines 106 to 108

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are currently not used by the on machine generator. Please ignore if that is still in the works :)

@Lassulus Lassulus Jan 12, 2025

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of removing them again, since the chmod/etc needs to happen after activation anyway and the user/group has to exist for that to work. But we use those definitions in clan.lol generators every now and then.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove these, everything will be owned by root I presume? That might make it hard to provision secrets for services which don't use LoadCredential

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ideally we want to fix that in the preStart or with tmpfiles.d but there could be certain cases where it would be necessary to do it beforehand. But maybe it would be better to have a separate activation-script/service for that, that takes a file somewhere on the filesystem and links/copies it to the target? but tmpfiles also does that already, so not sure

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also not sure what would be the best action here, I guess I'm mostly just used to being able to specify it on the secret itself

Comment thread nixos/modules/system/vars/options.nix Outdated
This allows to create secrets (and public files) outside or inside the
nix store in a more delclarative way. This is shipped with an example
(working) implementation of an on-machine storage. The vars options can
easily be used to implement custom backends and extend the behaviour to
integrate with already existing solutions, like sops-nix or agenix.
Comment thread nixos/modules/system/vars/on-machine-backend.nix Outdated

@jfly jfly left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm excited about this!

Sorry for the pretty myopic/nitpicky review, hopefully it's more useful than noisy. Here are some more high level thoughts:

  1. The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.

    I'd be interested to hear more about how this would work. Is the idea that a tool like sops-nix would provide a "backend" (similar to the on-machine backend you've included here?). It seems like the bulk of the logic in this PR is in the generate-vars script in the on-machine backend. Are all backends going to have a similar large chunk of logic (generating inputs, prompting, invoking generators with tempdirs, moving files)?

  2. I think it would be nice to lightly document just what a "backend" is (how to create one). For example, are they all supposed to provide a globally installed generate-vars command?

  3. I personally find the generate-vars shell script to be surprisingly readable, but hard to confirm is robust. I think I saw a few missing shell escapes (I left comments about them). I'd feel more sure about things if we were instead generating a JSON config file for something written in python (or something) to deal with. (Similar to how sops-nix works under the hood if you are familiar with that.)

  4. I'm not very fond of the name vars anymore, so maybe a first change could be to come up with a better name

    I personally am not bothered by the word "vars". I assume this isn't called "secrets" because these things sometimes have a secret and a public part (as in your syncthing example)?
    I did find the use of the word "var" to be a little confusing while reading the code. Specifically, the relationship between a "var", a "generator" and a "file". IIUC, a "var" has multiple "files" in it, and each file may be secret or public. A "generator" can generate these files. But what's the difference between a var and a generator? They sort of seem to be the same thing. I also saw the word "fact" floating around. I assume that's an old name, or something from clan?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd feel better about this if we somehow asserted that the result of this is exactly 1 ID. I imagine things would get weird if this started returning multiple IDs or no ID at all.

Or, perhaps we could ask upstream syncthing if they'd be open to a change to syncthing generate to persist the id somewhere?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
< "$out"/config.xml grep -oP '(?<=<device id=")[^"]+' | uniq > "$out"/syncthing.pub
< "$out"/config.xml grep -m 1 -oP '(?<=<device id=")[^"]+' > "$out"/syncthing.pub

should guarantee that it can yield no more than 1 id (uniq could still yield multiple, different ids). Due to the regex match not applying to empty strings, grep will already return rc 1 if no id is found, so the set -e in the reference backend should fetch it.

Might be a good idea to explicitly define that scripts are supposed to be run with set -e. :)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
# if not, they are in an incosistent state and we bail out
# if not, they are in an inconsistent state and we bail out

Comment thread nixos/modules/system/vars/options.nix Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit:

Suggested change
Takes the file as an arument and returns maybe an attrset with should at least contain the `path` attribute.
Takes the file as an argument and returns maybe an attrset with should at least contain the `path` attribute.

Comment thread nixos/modules/system/vars/options.nix Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"fact" feels like a new piece of vocabulary here. Is this perhaps a term from clan? Should this be "generated file" (or "var")?

Related: what does "public" mean here? Are there also private facts?

Comment thread nixos/modules/system/vars/options.nix Outdated

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This admonition seems a little bit at odds with the fact that this defaults to true. I personally would expect an option that's documented as "enable this only if ..." to default to false.

idea: Would it be better if we could get rid of this option and instead deduce it based on if the generated file is used as an input to other generators? Right now, I think the API here doesn't allow us to know that (generators depend on other generators). Perhaps it would be better for generators to depend on specific files from other generators? In addition to allowing us to remove this option, it might also move some issues "further left" (such as if generator B depends on generator A and assumes that generator A will create a certain file, but never actually creates the file. We could detect that error when running A, rather than waiting for B to (hopefully) fail.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be disable. We cannot deduce this automatically since a file can be referenced in another generator and still be needed to be deployed. We also cannot depend on single files by other generators, because files are stored as a unit. For the on-machine backend this option doesn't really make sense though, since we have no other storage location than where the files should be anyway.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot deduce this automatically since a file can be referenced in another generator and still be needed to be deployed.

Sorry, I'm not quite following this sentence: if a file is referenced in another generator, then doesn't that mean that it does need to be deployed?

I think I'm unclear on where the generators are supposed to run. Is it on the target machine, or on some other machine doing the build?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well that depends on the backend that runs the generators. The declarations of the generators inside nixpkgs should be independent of where they run.
What I wanted to say, is that a generator can be a dependency of another generator but the secret file that is depended on can still be required by the system to be present during runtime. for example ssh-ca, where we would generate a certificate and a private key in a generator and later have a ssh key be generated that needs the private key to sign the public key, so it gets trusted automatically. In that case we would need to deploy the certificate from the first generator to the machine and the public key, private key and certificate from the second generator.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I think that makes sense.

I think the part I'm missing is how these generated files actually get "wired up" to running services. Is that present in the syncthing example in this PR? (I don't think it is).

If we have some way of expressing "this file created by this generator is needed by this service" (which feels like something we'd have to do anyway), then wouldn't that let us deduce if the file should be deployed to the machine?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that makes sense, thanks! For the record, the reason I was asking this question was because I was trying to understand what mechanism would ensure that services would get restarted when secrets change. In other words, is the vars interface missing something like sops-nix's restartUnits functionality to annotate this relationship, or perhaps some other mechanism to ensure restarts (such as requiring that vars backends always generate "input addressed secret paths").

You, me, @a-kenji, @Kranzes (and others) spent some time discussing this last week. My summary of the conversation:

  1. Generators that store secrets completely "out of band" (such as the on machine backend in this PR) cannot possibly implement something like "input addressed secret paths".
  2. We don't need to expose an additional api similar to sops-nix's restartUnits functionality, because we can already achieve this with systemd functionality.
    (set up a unit to watch the file, trigger a restart of our service based on that)
    • We could propose a patch to systemd to have it automatically watch the files that that LoadCredential points at and restart them.
      • @Kranzes pointed out that this wouldn't be possible for systemd to do if LoadCredential points at a socket. Still, this does feel like something that should be supported in systemd... somehow.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generators that store secrets completely "out of band" (such as the on machine backend in this PR) cannot possibly implement something like "input addressed secret paths".

Upon further reflection, I think it's a little more nuanced than this. It's hard to be concrete about this because the precise context in which generators are invoked is not yet specified. But, you could imagine that invocation of generators is done in a way such that we store a "generation id" and are able to inject that id into the string interpolation that produces secret paths. This implies a couple things:

  1. There's some mechanism for storing and fetching the secret "generation id" that is readable at nix evaluation time. Either files in a repo, or module args injected "impurely". This would be a change to the vars api as in this PR.
  2. It would not be possible to rotate secrets completely out of band. You'd have to deploy an updated nixos configuration. Lame, but still better than the "reboot your machine to pull in updated secrets" that we have with the current state of things + this PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need shell escaping? (could file.path have spaces in it?)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need shell escaping? Could gen.name contain a double quote? I think so.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not double quotes as currently all names have the type lib.types.strMatching "[a-zA-Z0-9:_\\.-]*", but ending a name with \ would break this line. Quoting using ' would alleviate this, but for robustness sake, escaping most injected values seems preferable to me.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This '${file.name}' feels like a "poor man's escaping". I'm pretty sure that file.name could itself contain a ', so I don't think this works. Can we use lib.escapeShellArg here instead?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This is unnecessary because of the trap 'rm -rf $out' EXIT up above, right?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, the trap only runs on exit, it's better to clean up the files if they are no longer needed earlier. the trap is just in case the script exists unexpectedly

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not understanding the glue between this generator and the syncthing service. How do the files created by this generator actually get wired up into the running service?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hope I am not misreading your question, but the pem files are seemingly simply used to set conventional nixos options such as on line 639-641 in this example.

The device id in syncthing.pub does not seem to be used in the example, but only stored. Presumably for outside use just as scripted setups of other devices to sync with.

@Lassulus

Lassulus commented Feb 1, 2025

Copy link
Copy Markdown
Member Author
1. > The vars options can easily be used to implement custom backends and extend the behaviour to integrate with already existing solutions, like sops-nix or agenix.
   
   
   I'd be interested to hear more about how this would work. Is the idea that a tool like sops-nix would provide a "backend" (similar to the on-machine backend you've included here?). It seems like the bulk of the logic in this PR is in the `generate-vars` script in the on-machine backend. Are all backends going to have a similar large chunk of logic (generating inputs, prompting, invoking generators with tempdirs, moving files)?

yes basically every implementation has to take care of executing the scripts and storing the generated files in the location, also deployment of those files needs to be taken care of by a backend.

2. I think it would be nice to lightly document just what a "backend" is (how to create one). For example, are they all supposed to provide a globally installed `generate-vars` command?

no, the generate-vars is mostly only useful in this usecase, since generating and deploying of files happens at the same time. Otherwise I would do deployment in activation. But we cannot prompt for inputs there. So this was moved into a script that can be manually run at any time.

3. I personally find the `generate-vars` shell script to be surprisingly readable, but hard to confirm is robust. I think I saw a few missing shell escapes (I left comments about them). I'd feel more sure about things if we were instead generating a JSON config file for something written in python (or something) to deal with. (Similar to how sops-nix works under the hood if you are familiar with that.)

sure, you can just dump the vars config from nix with json. But that increases the complexity of the on-machine implementation, I want a simple one that people can read and understand how to adapt. sops-nix probably would interact more directly with the vars instead of shell scripts.

4. > I'm not very fond of the name vars anymore, so maybe a first change could be to come up with a better name
   
   
   I personally am not bothered by the word "vars". I assume this isn't called "secrets" because these things sometimes have a secret and a public part (as in your syncthing example)?
   I did find the use of the word "var" to be a little confusing while reading the code. Specifically, the relationship between a "var", a "generator" and a "file". IIUC, a "var" has multiple "files" in it, and each file may be secret or public. A "generator" can generate these files. But what's the difference between a var and a generator? They sort of seem to be the same thing. I also saw the word "fact" floating around. I assume that's an old name, or something from clan?

yeah, fact is the old name for vars. we can't really call them generators, because that is already used by other parts of nixos. but in the end vars are generators that generate secret or public files :D

@wegank wegank added the 2.status: merge conflict This PR has merge conflicts with the target branch label Apr 2, 2025
@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/vars-a-framework-for-managing-secrets-and-computed-values/62411/6

@Lassulus

Lassulus commented Apr 5, 2025

Copy link
Copy Markdown
Member Author

there is also a flake now to test this out (or PR against, instead of doing code suggestions here) https://github.com/Lassulus/vars

I will mainly develop this PR there and at a later point bring over the changes made there into this PR again

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/vars-a-framework-for-managing-secrets-and-computed-values/62411/14

prompt_value=$(cat)
'';
};
generate-vars = pkgs.writeShellApplication {

@roberth roberth Apr 8, 2025

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this script may be run on the deployer host, which can be a different platform.

Suggested change
generate-vars = pkgs.writeShellApplication {
generate-vars = deployerPkgs: deployerPkgs.writeShellApplication {

It is then up to the caller to pass a suitable pkgs instance that works for the deployer host.

EDIT: this is of course not the complete solution because the script contents still refer to pkgs. Use a separate file and callPackage to avoid a footgun.

Suggested change
generate-vars = pkgs.writeShellApplication {
generate-vars = { deployerPkgs }: deployerPkgs.callPackage ./generate-vars.nix { inherit cfg; };

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a test case, you could add aarch64-darwin to this and run the checks on darwin.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually we overwrite pkgs of the system where the vars are defined if the caller architecture is different from the target systems architecture, since this also supports string context references to pkgs and the vars definitons want to live next to the service definitions.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recommend against that. It blurs the line too much.

  • You'll be evaluating a NixOS configuration that "runs on" macOS, which is unexpected. It may work out for you anyway, thanks to laziness, but if vars calls someone's code that reads something from the rest of the NixOS config (accidentally or otherwise), they'll get strange errors like "util-linux is not supported on darwin" or worse.
  • Overriding is a code smell.
  • Overriding is in general less efficient than evaluating the right thing in one go, especially if it's a NixOS config.

Explicit is good. By making this a proper parameter, readers and module authors have at least an idea of what's going on, and you don't rely on the overriding magic to never hit them in the face.

Comment on lines +60 to +71
lib.types.submodule (file: {
imports = [
config.vars.settings.fileModule
];
options = {
name = lib.mkOption {
type = lib.types.strMatching "[a-zA-Z0-9:_\\.-]*";
description = ''
name of the generated file
'';
readOnly = true;
default = file.config._module.args.name;

@fricklerhandwerk fricklerhandwerk May 6, 2025

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
lib.types.submodule (file: {
imports = [
config.vars.settings.fileModule
];
options = {
name = lib.mkOption {
type = lib.types.strMatching "[a-zA-Z0-9:_\\.-]*";
description = ''
name of the generated file
'';
readOnly = true;
default = file.config._module.args.name;
lib.types.submodule ({ name, ... }: {
imports = [
config.vars.settings.fileModule
];
options = {
name = lib.mkOption {
type = lib.types.strMatching "[a-zA-Z0-9:_\\.-]*";
description = ''
name of the generated file
'';
readOnly = true;
default = name;

This seems to be a much more idiomatic way of writing it. That piece of code had confused the hell out of me, but maybe I'm just used to different things.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only changing it to default = file.name would work to the same effect, but I still find it unusual.

This name will be used to refer to the generator in other generators.
'';
readOnly = true;
default = generator.config._module.args.name;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default = generator.config._module.args.name;
default = generator.name;

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/zurich-25-05-zhf-hackathon-report/64882/1

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/is-it-possible-to-use-nix-in-a-project-but-without-putting-all-the-code-in-nix-store/68873/8

@KiaraGrouwstra

Copy link
Copy Markdown
Contributor

if i understood correctly (?), clan's motivation for vars involved cross-node communication as facilitated by e.g. exports, which wasn't viable for implementations within the scope of nixpkgs. as such, this PR may not see updates any time soon.

at the same time, the contracts PR (#432529) offers tooling around handling of secrets agnostic to storage back-ends as well, raising questions on their potential integration.

@KiaraGrouwstra

Copy link
Copy Markdown
Contributor

one outstanding challenge i see with this, having used the flake repo for a while now, is that configurations using this are not reproducible: they depend on state from the generated secrets, meaning that switching to such a configuration in practice involves first switching to a configuration generating the credentials without yet depending on them.

whether that is something that can be adequately addressed is something I've yet to find a satisfying answer to - but if we want this handled in nixpkgs, I think that would be the main challenge here.

@hsjobeki

hsjobeki commented Apr 11, 2026

Copy link
Copy Markdown
Contributor

As one of the main contributors to Clan, where this system originated, I want to share some concerns that I think are worth resolving before upstreaming.

I agree with Kiara's flags. One of the unresolved core issues is that the system introduces an impure evaluation feedback loop: evaluate → generate → filesystem state → evaluate. This creates subtle correctness problems that we haven't formally solved in Clan yet. The current Clan implementation works because of strict (unbespoke) conventions, not because the underlying model is provably correct.

We neither formally specified under which constraints the model is correct nor solved the base model. Lots of open architectural questions in the Clan issue tracker at git.clan.lol that are directly relevant here.

I'm currently working on a larger refactor of this system in Clan to address some of them. I'd consider Clan's current state an early-stage exploration of the concept. Valuable for surfacing the design space, but not yet mature enough for I would recommend upstreaming into nixpkgs.

In my role as nixpkgs maintainer i would not recommend doing so now because it risks calcifying an architecture we know has unresolved problems, which would make it significantly harder to fix later.

If you're interested in collaboration, best to reach me via Matrix.

@SomeoneSerge

SomeoneSerge commented Apr 11, 2026 via email

Copy link
Copy Markdown
Contributor

@KiaraGrouwstra

Copy link
Copy Markdown
Contributor

@SomeoneSerge I wanna still further test Openbao with these, that ecosystem is maybe closer to having been checked by security people

@CertainLach

CertainLach commented Apr 23, 2026

Copy link
Copy Markdown
Member

Happy to find someone else has a solution for automatically generating secrets instead of manual management of them :D
I would like if something like this is exposed in nixpkgs, it is not too primitive for its purpose, for example it should have basic support for asymmetric keys, where the public part is also available for use in nix, for example it should be possible to generate wireguard keypair on one machine, and use its public key on the other machine

I'll share my own solution here, maybe it will inspire something:

I agree with Kiara's flags. One of the unresolved core issues is that the system introduces an impure evaluation feedback loop: evaluate → generate → filesystem state → evaluate. This creates subtle correctness problems that we haven't formally solved in Clan yet. The current Clan implementation works because of strict (unbespoke) conventions, not because the underlying model is provably correct.

In https://github.com/deltarocks/fleet, I have implemented this system by using nix plugin for custom __fleetEnsureHostSecret builtin https://github.com/deltarocks/fleet/blob/70a291cdfed0afb1b39a8cd7f1f1203f07747d10/crates/fleet-base/src/primops.rs#L233 https://github.com/deltarocks/fleet/blob/70a291cdfed0afb1b39a8cd7f1f1203f07747d10/modules/nixos/secrets.nix#L78-L128

For the secret like this:

secrets.gitlab-initial-root = {
  generator = fleetLib.mkPassword { };
};
services.gitlab = {
  enable = true;
  initialRootPasswordFile = config.secrets.gitlab-initial-root.secret.path;
};

During the secret attribute evaluation (when nix code accesses config.secrets.gitlab-initial-root), it first checks if the secret is stored in state file (github version only supports disk storage of secrets, but I also have version that stores state in any terraform-compatible store), it is not expired, and some other checks, if it is - then returns it directly, otherwise it generates it and stores in the state file

https://github.com/deltarocks/fleet/blob/70a291cdfed0afb1b39a8cd7f1f1203f07747d10/modules/secrets.nix#L101-L145

let
    /**
      Generate a random secret password, 32 ascii characters by default

      Options:
        size: generated password length in ascii characters (bytes).
        noSymbols: by default, character set includes various special characters ($ , ! + * : ~), and might
                   not be accepted in some contexts, this option switches charset to just [A-Za-z0-9].

      Output:
        Resulting secret has only part: secret, which contains encrypted password.
    */
    mkPassword =
      {
        size ? 32,
      }:
      (
        {
          coreutils,
          mkSecretGenerator,
        }:
        mkSecretGenerator {
          script = ''
            mkdir $out
            gh generate password -o $out/secret --size ${toString size}
          '';
          parts.secret.encrypted = true;
        }
      );
    mkAskPass =
      {
        prompt ? "Secret value",
        part ? "secret",
      }:
      (
        {
          kdePackages,
          mkImpureSecretGenerator,
        }:
        mkImpureSecretGenerator {
          script = ''
            mkdir $out
            ${kdePackages.kdialog}/bin/kdialog --inputbox "${prompt}" | gh private -o $out/${part}
          '';

          parts.${part}.encrypted = true;
        }
      );
    /**
      Generate a random RSA keypair

      Options:
        size: RSA key size, 4096 by default

      Output:
        Resulting secret has two parts: public and secret, where the secret part is encrypted.
        Both parts are PEM encoded.
    */
    mkRsa =
      {
        size ? 4096,
      }:
      (
        {
          openssl,
          mkSecretGenerator,
        }:
        mkSecretGenerator {
          script = ''
            mkdir $out

            ${openssl}/bin/openssl genrsa -out rsa_private.key ${toString size}
            ${openssl}/bin/openssl rsa -in rsa_private.key -pubout -out rsa_public.key

            cat rsa_private.key | gh private -o $out/secret
            cat rsa_public.key | gh public -o $out/public
          '';

          # encrypted secrets are only available by target host path (`config.secrets.name.secret.path`) at runtime
          parts.secret.encrypted = true;
          # not encrypted secrets can be used in nix code directly (`config.secrets.name.public.data`), and stored in nix store (`config.secrets.secret.public.file`)
          parts.public.encrypted = false;
        }
      );
in ""

I believe it is most sane way to avoid this feedback loop, despite one of the most complicated.

Do not look at the documentation on website for now, the website is only up for 2 days, and I haven't written it here yet, website only contains couple of partially llm-generated stubs, I was just testing how well I can integrate asciidoc with tree-sitter for now

Instead I have written full wireguard mesh example here:
https://gist.github.com/CertainLach/3046b7af58735c035c86de36441c634a

@KiaraGrouwstra

Copy link
Copy Markdown
Contributor

@CertainLach use of kdialog would seem to presume at least an existing graphical environment, would that not make it less portable to settings such as isolated installs from non-graphical installer?

@CertainLach

CertainLach commented Apr 23, 2026

Copy link
Copy Markdown
Member

@CertainLach use of kdialog would seem to presume at least an existing graphical environment, would that not make it less portable to settings such as isolated installs from non-graphical installer?

impure secretGenerators are running on the localhost or on the any other host specified with impureOn attribute, without any sandboxing, e.g I have such secret generator for my homelab:

  mkKubernetesCert =
    {
      subj,
      sans ? [ ],
      # Host with deployed openssl HSM
      impureOn,
    }:
    {
      mkImpureSecretGenerator,
    }:
    mkImpureSecretGenerator {
      inherit impureOn;
      script = ''
        cacert=~/ca/kubernetes-intermediateCA.crt

        params=$(mktemp)
        csr=$(mktemp)
        cert=$(mktemp)
        openssl ecparam -genkey -name secp384r1 -out $params
        openssl req -new -key $params \
          -subj "${
            lib.strings.concatStringsSep "" (lib.attrsets.mapAttrsToList (k: v: "/${k}=${v}") subj)
          }" \
          ${
            lib.optionalString (
              sans != [ ]
            ) "-addext \"subjectAltName = ${lib.strings.concatStringsSep "," sans}\""
          } \
          -out $csr
        hsms x509 -req \
            -days 365 -in $csr \
            -CA $cacert \
            -CAkey "pkcs11:object=Homelab Kubernetes Intermediate CA;type=private" -CAkeyform=engine \
            -CAcreateserial \
            -copy_extensions copy -out $cert

        expires_at=$(sudo openssl x509 -in $cert -noout -enddate | cut -d= -f2 | xargs -I{} date -u -d {} +"%Y-%m-%dT%H:%M:%S.%NZ")

        mkdir $out
        echo -n $expires_at > $out/expires_at
        cat $params | gh private --output $out/secret
        cat $cert | gh public --output $out/public
      '';
      parts.secret.encrypted = true;
      parts.public.encrypted = false;
    };

askPass secret generator is only intended for basic remote deployments running from gui workstations, yes (at least for the initial deployment, as the subsequent deployments will just read the secret stored in the state file)

But you can write the secret generator in any way you want. When many hosts are being deployed, it is hard to have proper shell prompts interleaved with the deployment logs/progress bars/etc, this is why I used gui prompts for mkAskPass.

@Lassulus

Lassulus commented Apr 24, 2026

Copy link
Copy Markdown
Member Author

one outstanding challenge i see with this, having used the flake repo for a while now, is that configurations using this are not reproducible: they depend on state from the generated secrets, meaning that switching to such a configuration in practice involves first switching to a configuration generating the credentials without yet depending on them.

whether that is something that can be adequately addressed is something I've yet to find a satisfying answer to - but if we want this handled in nixpkgs, I think that would be the main challenge here.

This is mostly a backend specific limitation. My idea is to have the declaration how secrets can be generated upstream in nixpkgs (mainly the options), and users can easily implement their own backend.

The current implementation of the on-machine would not be deployed first and then run, usually you would do something like nix run $myFlake#nixosConfigurations.mymachine.config.system.build.generate-vars on the target machine. Maybe this would be more obvious if we don't expose the vars generation script in environment.systemPackages.

Maybe we should work on a more simple example backend in nixpkgs than the on-machine one, since this one has a few challenges with remote machines and public vars back propagation?

@KiaraGrouwstra

Copy link
Copy Markdown
Contributor

Maybe we should work on a more simple example backend in nixpkgs than the on-machine one, since this one has a few challenges with remote machines and public vars back propagation?

i'd be curious to see this, having reproducible options simpler than e.g. vault would seem nice

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/gsoc-contributors-2026-announced/77328/4

@nixos-discourse

Copy link
Copy Markdown

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/nixos-artifacts-one-interface-for-secrets-across-agenix-sops-nix-and-more/77535/1

@Mutasem-mk4

Copy link
Copy Markdown

Hi @Lassulus and @oddlama! Thank you for the great work on vars and exploring how we can make secret management more declarative. I've been following this PR closely.

I wanted to share a complementary effort I've been working on called nixos-artifacts (currently proposed in PR #519619 and RFC #201).

While vars proposes a broad framework for both public and secret "facts" with generators, nixos-artifacts takes a narrower, highly opinionated approach focused specifically on the secret provisioning lifecycle. It provides a simple translation layer allowing service modules to declare secrets abstractly while operators choose a backend (sops-nix, agenix, systemd-creds, or dummy).

I believe our efforts are highly aligned. vars could potentially use nixos-artifacts as its secure provisioning backend, or nixos-artifacts could adopt vars generators natively in the future.

I'd love to help test vars or discuss how we might merge our abstractions to give NixOS users the best possible experience without fragmenting the ecosystem. Let me know what you think!

@Lassulus

Copy link
Copy Markdown
Member Author

heyhey, I have to take a deeper look at that. we are also currently through GSoC trying to finish this work and continue on upstreaming vars. we have a matrix room dedicated to that effort: https://matrix.to/#/#vars:nixos.org we wanted to have another meeting today at 16:30 CEST

@Mutasem-mk4

Copy link
Copy Markdown

Hey @Lassulus, I've been working on a similar effort (RFC 0201: security.artifacts) which focuses specifically on the secret provisioning lifecycle (permissions, systemd ordering, backend-agnostic path injection).

I think our efforts are very complementary:
ixos-artifacts\ could serve as the low-level provisioner that \�ars\ targets for secret 'facts'. I'm looking forward to the meeting today at 16:30 CEST to discuss how we can align these!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

2.status: merge conflict This PR has merge conflicts with the target branch 6.topic: nixos Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS 8.has: module (new) This PR adds a module in `nixos/` 8.has: module (update) This PR changes an existing module in `nixos/` 8.has: tests This PR has tests 10.rebuild-darwin: 0 This PR does not cause any packages to rebuild on Darwin. 10.rebuild-linux: 1-10 This PR causes between 1 and 10 packages to rebuild on Linux.

Projects

None yet

Development

Successfully merging this pull request may close these issues.