Skip to content

Add cardano-config package#671

Open
jasagredo wants to merge 41 commits into
masterfrom
js/cardano-config
Open

Add cardano-config package#671
jasagredo wants to merge 41 commits into
masterfrom
js/cardano-config

Conversation

@jasagredo

@jasagredo jasagredo commented Jun 19, 2026

Copy link
Copy Markdown
Contributor

Introduce a new cardano-config library that provides a single entry-point for parsing the cardano-node configuration.

It defines:

  • A CLI options parser based on optparse-applicative (parseCliArgs).
  • JSON/YAML parsing of the configuration files, derived from autodocodec codecs (parseConfigurationFiles).
  • resolveConfiguration, combining both into a NodeConfiguration.

The configuration keys are grouped into components (Storage, Consensus, Mempool, Network, Protocol, Testing, Tracing). A bundled cardano-config-schema executable dumps the JSON Schema derived from the same codecs as the authoritative key listing.

Register the package in cabal.project.

Note that the lines changed in Haskell files are fewer than the ones reported by the PR

❯ git diff master...HEAD --stat -- '*.hs'
 cardano-config/app/Main.hs                                 | 155 +++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration.hs                | 249 +++++++++++++++++++++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/Basic.hs          |  23 +++++
 cardano-config/src/Cardano/Configuration/CliArgs.hs        | 435 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/Common.hs         | 117 +++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File.hs           | 372 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Consensus.hs | 112 ++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Mempool.hs   |  80 ++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Network.hs   | 256 ++++++++++++++++++++++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Protocol.hs  | 142 ++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Storage.hs   | 187 +++++++++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Testing.hs   | 111 ++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/File/Tracing.hs   |  41 ++++++++
 cardano-config/src/Cardano/Configuration/Render.hs         | 188 +++++++++++++++++++++++++++++++++++++
 cardano-config/src/Cardano/Configuration/Schema.hs         | 458 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 cardano-config/test/Main.hs                                | 172 ++++++++++++++++++++++++++++++++++
 16 files changed, 3098 insertions(+)

Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File/Network.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File/Consensus.hs Outdated
Comment thread cardano-config/README.md Outdated
@crocodile-dentist

Copy link
Copy Markdown

Some other general comments:

  1. Maybe this could be moved to its own dedicated repo eventually, so as not to force cardano-base releases for configuration-related changes.
  2. Some fields are marked with optionalFieldWithDefault (visible in schema) and others are Maybe. Would it be more principled to have the schema's job to be to just enumerate keys and their types, and the default values provided at the end?

Comment thread cardano-config/src/Cardano/Configuration/CliArgs.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
@jasagredo

Copy link
Copy Markdown
Contributor Author

Maybe this could be moved to its own dedicated repo eventually, so as not to force cardano-base releases for configuration-related changes.

Let's piggyback on cardano-base until we have a satisfactory implementation, then I can move it to its own repo in one go, instead of merging it here 😄

Some fields are marked with optionalFieldWithDefault (visible in schema) and others are Maybe. Would it be more principled to have the schema's job to be to just enumerate keys and their types, and the default values provided at the end?

Maybe are defaulted in the library, whereas optionalFieldWithDefault is defaulted in the node. Teams want the configs to live in their repos, while the whole spirit of this library necessitates of the defaults to live here. It seems we came to an agreement (maybe? still needs discussion) on having a verbatim copy of a file from network here, and a CI check that sees that they are real verbatim copies. That way network can temporarily modify their defaults in a branch, but when they commit a default, they will need to update it here too.

@carbolymer carbolymer left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The library is inconsistent about where defaults and validation live:

  • Some fields bake in defaults at parse time via optionalFieldWithDefault (e.g. DiffusionMode defaults to "InitiatorAndResponder", EnableCSJ defaults to True).
  • Some fields stay Maybe because the downstream consumer owns the default (e.g. ProtocolIdleTimeout, MaxConcurrencyBulkSync).
  • Some types use the f parameter to distinguish partial from resolved (StorageConfiguration f, ConsensusConfiguration f, ProtocolConfiguration f), but others use plain Maybe permanently (NetworkConfiguration, LocalConnectionsConfig, MempoolConfiguration, TestingConfiguration).

It would help to have a stated design principle in the README or module documentation:

  • When should a default live in cardano-config vs be deferred to the consumer?
  • Where should cross-field validation happen (e.g. "RPC enabled but no socket path")? At parse time? At resolution time? In the consumer?
  • Should all sub-configs use the f parameter uniformly, so that NodeConfiguration carries Identity-wrapped fields guaranteeing completeness?

Without this, each new field or sub-config will make an ad-hoc choice and the inconsistency will grow.

Comment thread cardano-config/test/Main.hs
Comment thread cardano-config/src/Cardano/Configuration/Basics.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File/Storage.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/Schema.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/CliArgs.hs
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File/Network.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File/Network.hs Outdated
jasagredo added 16 commits June 23, 2026 02:18
Introduce a new `cardano-config` library that provides a single
entry-point for parsing the `cardano-node` configuration.

It defines:

- A CLI options parser based on `optparse-applicative` (`parseCliArgs`).
- JSON/YAML parsing of the configuration files, derived from
  `autodocodec` codecs (`parseConfigurationFiles`).
- `resolveConfiguration`, combining both into a `NodeConfiguration`.

The configuration keys are grouped into components (Storage, Consensus,
Mempool, Network, Protocol, Testing, Tracing). A bundled
`cardano-config-schema` executable dumps the JSON Schema derived from the
same codecs as the authoritative key listing.

Register the package in `cabal.project`.
Use singular module names consistently.
The pnc prefix was a leftover from the node's PartialNodeConfiguration
and carries no meaning here.
- Represent DiffusionMode and RequiresNetworkMagic as enumerated sum
  types via shownBoundedEnumCodec, so the schema lists the valid values
  and typos fail at parse time.
- Constrain the ConsensusMode discriminator to its valid values and
  reject LowLevelGenesisOptions outside GenesisMode instead of silently
  dropping it.
- Dispatch SnapshotPolicy and NodeDatabasePaths on JSON shape via
  matchChoiceCodec so a malformed object surfaces its own validation
  error rather than the irrelevant other-branch failure.
- Replace the last-colon split in parseHostPort with URI-authority
  parsing so bracketed IPv6 (e.g. [2001:db8::1]:3001) is handled and a
  bare IPv6 no longer silently mis-parses into host and port.
- Export the individual optparse-applicative parsers and argument
  readers so other tools can reuse the node's flags.
…on envelope

- Classify each section as inline (top-level keys), inline object, or a
  referenced sub-file in an explicit pass, then read and run the codec,
  rather than interleaving Parser with IO. A referenced file that does
  not exist is now an explicit error instead of a silent fall back to
  inline parsing.
- Carry structured context in ConfigurationParsingError (file, section,
  JSON path, message) so failures point at the offending file and
  location.
- Accept an optional { ConfigurationVersion, Config } envelope so the
  format can evolve, falling back to the legacy flat layout as version
  1. Render the unsupported-version error without a Just constructor.
- Document the parsing helpers' parameters.
Compare the top-level keys against those the parsers recognise (derived
from the same codecs as the schema). By default emit a warning so a
typo such as DiffusionMoed is noticed; parseConfigurationFilesWith
RejectUnknownKeys turns it into a hard error.
Reference RFC 3986 (URI authority, bracketed IPv6) as the rationale and
RFC 5952 for the address text form, and note that the RFC 5321 SMTP
address-literal form is intentionally not accepted.
Add a test asserting the committed config.schema.json equals the schema
derived from the codecs, so the documented schema cannot drift from the
parsers (CI runs cabal test all). Regenerate the schema to pick up the
enumerated DiffusionMode, RequiresNetworkMagic and ConsensusMode values.
resolveConfiguration now returns Either ConfigResolutionError, with
cross-field consistency checks (spanning CLI and file values) performed
after merging. The first such check rejects enabling the gRPC endpoint
when there is no socket path to derive the gRPC socket from. Structural
validation of individual values stays in the codecs.
Express each cross-field check as a ConfigCheck (a description plus a
predicate over the resolved NodeConfiguration) and run a list of them,
reporting every violation. defaultConfigChecks holds the built-ins and
resolveConfigurationWith lets consumers add their own. Add the Mithril
snapshot-policy check (requires the V2LSM backend with an LSMExportPath,
or the V2InMemory backend) alongside the gRPC socket check.
cardano-config is the origin of one defaults file per component, applied
as the base layer during resolution. Each component is ultimately owned
by its implementing layer, which will adopt these and keep them aligned
via CI. Values confirmed from ouroboros-network are included; the rest
are left unset for the owning layer to fill (see defaults/README.md).
Genesis files are intentionally not defaulted (network-specific).
Split the defaults into role/network-agnostic base files plus variants
layered on top:
- Protocol.{mainnet,preview,preprod}.json carry the network-specific
  genesis files/hashes, RequiresNetworkMagic and LastKnownBlockVersion
  (from the published environment configs).
- Network.{relay,blockproducer}.json carry the deadline peer targets and
  PeerSharing, which differ by node role.
Bucket-(A) node-deferred network fields get best-effort placeholder
values (documented in defaults/README.md) so they can resolve fully.
LedgerDB.Snapshots defaults to Mithril and LedgerDB.QueryBatchSize to
100000 (from ouroboros-consensus), so they are resolved rather than
optional. The mempool timeouts have no node default (the node applies no
timeout), so they stay optional and are intentionally absent here.
- A section value may now be a list of paths/objects, deep-merged in
  order so a later entry overrides an earlier one (e.g.
  ["Network.variants/Network.relay.json"]); nested objects merge
  recursively.
- The package's base default defaults/<Component>.json is always read as
  the bottom layer and the user's section layer is merged on top, so a
  config can name only a variant and still inherit the base defaults.
- Move the opt-in variant files into defaults/<Component>.variants/.
- Expose the data dir via Paths_cardano_config; add an example and a test
  asserting later-overrides-earlier.
Update the README to cover the version envelope, the deep-merge defaults
layering and variants, the warn/strict unknown-key handling, the error
and validation model, and the design principles (where defaults and
validation live). Spell out that only the four established-era genesis
files and LastKnownBlockVersion-Major/-Minor are mandatory; everything
else is optional. Refresh the CHANGELOG.
@jasagredo jasagredo force-pushed the js/cardano-config branch 2 times, most recently from 2999618 to 8f4a079 Compare June 23, 2026 00:20
Confirmed against cardano-node: its parser (Cardano.Node.Configuration.POM)
reads only LastKnownBlockVersion-Major/-Minor/-Alt and never
MaxKnownMajorProtocolVersion (which appears in no node Haskell source). It
is a dead key the node ignores, so cardano-config intentionally does not
parse it.
mainnet keeps the Mithril snapshot policy (base default); the test
networks override LedgerDB.Snapshots with an explicit options object: the
per-network SnapshotInterval (preview 864, preprod 4320) plus the full set
of snapshot options (SlotOffset 0, RateLimit 600, MinDelay 300, MaxDelay
600, NumOfDiskSnapshots 2). Closes the last parsed network-divergence gap.
@jasagredo jasagredo force-pushed the js/cardano-config branch from 8f4a079 to 5bec466 Compare June 23, 2026 00:23
The Custom key takes a whole-configuration object — inline or a path to a
JSON/YAML file — and deep-merges it on top of every other file layer, so a
user can keep a shared base configuration and override just a few specific
options. CLI arguments still take precedence.
Types reachable from a resolved NodeConfiguration (and the file form) were
defined and exported by their component modules but not re-exported from
Cardano.Configuration, so consumers could not pattern-match on them:
ConsensusMode, SnapshotPolicy, SnapshotOptions, LedgerDbBackendSelector,
AcceptedConnectionsLimit and TracingConfiguration.
A small tool that loads a configuration (per-component defaults, the file and
its optional Custom override, and the CLI flags), resolves it exactly as the
node does, and prints the complete resolved NodeConfiguration. It accepts the
same flags as the node, with --config selecting the file.
Render the resolved NodeConfiguration with pretty-show's ppShow (indented,
one field per line) instead of a single-line Show, which was unreadable.
Add Cardano.Configuration.Render.nodeConfigurationToJSON, which renders a
resolved NodeConfiguration as a JSON Value using the documented configuration
keys (reusing each component's Maybe-form ToJSON via Identity -> Maybe lifts),
with the CLI-only operational arguments grouped under a Runtime key. The
executable now emits this as YAML instead of the Haskell-record Show output,
which used field/constructor names unfamiliar to non-Haskell users.

Drops the pretty-show dependency in favour of yaml (already a library dep).
@jasagredo jasagredo force-pushed the js/cardano-config branch from 11d0421 to dc9be6b Compare June 23, 2026 08:51
Replace the separate cardano-config-schema and cardano-config-resolve
executables with a single cardano-config executable exposing two
subcommands:

- cardano-config resolve [node flags...] resolves a configuration
  (defaults + file + Custom override + CLI flags) and prints the result
  as YAML.
- cardano-config schema [COMPONENT] [--list] dumps the JSON Schema for
  the whole configuration or a single component.

The schema subcommand is now an optparse-applicative parser rather than
raw getArgs handling. Update scripts/gen-schemas.sh, the README and the
CHANGELOG to the new invocations.
Make the autodocodec-derived schemas fully describe the configuration and
render well in tooling.

Declare a type for every field:
- The Blake2b-256 hashes and the Byron PBFT threshold use typed
  string/number codecs instead of the untyped codecViaAeson.
- String enumerations collapse to { type: string, enum: [...] }, and bare
  consts gain their implied type.

Annotate and name things:
- File-path fields carry "format": "path" (via a sentinel codec marker
  lifted in post-processing) so a path is distinguishable from an
  arbitrary string.
- Every schema and property gains a title, and every document an $id, so
  generators such as jsonschema2md render names rather than
  Untitled/undefined. Both keywords are standard draft-07.
- The whole-configuration schema documents the split-file form (each
  component reachable under its section key as a sub-file path, an inline
  object, or a list) alongside the version envelope and Custom.

Source defaults solely from the defaults/ files:
- Remove the codec-level defaults for Backend, RequiresNetworkMagic and
  LastKnownBlockVersion-Alt; those fields become Maybe and are sourced
  from the files (RequiresNetworkMagic added to defaults/Protocol.json).
  The GenesisMode-only LowLevelGenesisOptions toggles keep their codec
  defaults, as they cannot live in the base files (default mode is
  PraosMode).
- Fill the schema's default keywords from those same files, so the
  documented default matches the applied one.

Tighten and guard parsing:
- The LedgerDB Snapshots named policy accepts only "Mithril" (a const),
  rejecting any other string at parse time.
- Detect shadowed top-level keys: when a component is given as its own
  section, a sibling top-level key belonging to that component is ignored,
  so warn by default and reject under RejectUnknownKeys.

Regenerate the committed schemas and update the README and CHANGELOG.
The Custom whole-configuration override layer is no longer needed: the
per-component sections (each given inline, as a sub-file, or as a list of
sources) plus the defaults layering already cover the use case it served.

Drop it everywhere: the loadCustomLayer pass and the custom parameter
threaded through parseSection/parseConfigurationVersion1, the Custom key in
the schema envelope and recognised keys, the example files, the test cases
and the documentation.
Add a per-component if/then/not constraint to the whole-configuration
schema: when a component's section key is present (e.g. Consensus), none of
that component's top-level keys may be (e.g. ConsensusMode), so a component
is given either as a section or as its flat keys, not both. This is the
static-validation counterpart to the runtime shadowed-key check, so a
generic JSON Schema validator flags the combination too.

Also stop titling structural union branches (e.g. a bare { required: [..] }
constraint) that have no const or type.
Tracing was treated like any other component, so the whole-config schema
documented it three confusing ways: a Tracing section that could be a path,
an inline TracingConfiguration object, or a list — on top of the flat
HermodTracing key.

Tracing is not a section: it is only the single top-level HermodTracing key,
a path to a file the node's tracing system reads (and which cardano-config
neither parses nor describes). Drop Tracing from the section components, so
the whole-config schema documents just the HermodTracing path key; its
schema is still taken from the TracingConfiguration codec. Remove the now
unused per-component Tracing schema and tidy the README.
The Default instances for the resolver's last-resort fallbacks hard-coded
values that already live in defaults/*.json, against the rule that defaults
come solely from the files:

- Default LedgerDbBackendSelector (V2InMemory) was unused since Backend
  became Maybe; remove it.
- Default ConsensusMode (PraosMode) and Default NodeDatabasePaths
  ("mainnet/db") duplicated defaults/Consensus.json and defaults/Storage.json.
  Resolve DatabasePath, ConsensusMode and StartAsNonProducingNode from the
  CLI or the file (whose base-default layer supplies the value) and report a
  ConfigResolutionError if none provides it — matching the requireField
  pattern the other components already use — rather than falling back to a
  hard-coded value.

The remaining Default instances carry no file-able values: LedgerDbConfiguration
(an all-unset placeholder when LedgerDB is absent) and GenesisConfigFlags (the
GenesisMode-only toggles, which cannot be base-file defaults).

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduce a new cardano-config package that centralizes cardano-node configuration handling: CLI parsing (optparse-applicative), JSON/YAML configuration parsing with layering, resolution into a complete NodeConfiguration, and JSON Schema generation derived from the same autodocodec codecs (with committed schemas + drift tests).

Changes:

  • Add cardano-config library modules for CLI parsing, config file parsing/layering, resolution, rendering, and schema generation.
  • Add committed defaults, variants, examples, and JSON Schemas, plus a script to regenerate schemas.
  • Add a test-suite that validates example configs parse and committed schemas match derived schemas; register package in cabal.project.

Reviewed changes

Copilot reviewed 61 out of 62 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
cardano-config/app/Main.hs Implements cardano-config CLI (resolve + schema) over the new library APIs.
cardano-config/cardano-config.cabal Defines the new package (library/exe/tests) and data-file packaging.
cardano-config/CHANGELOG.md Initial changelog entry describing the new package and features.
cardano-config/defaults/Consensus.json Base defaults for Consensus component.
cardano-config/defaults/Consensus.variants/Consensus.preprod.json Preprod consensus overlay defaults.
cardano-config/defaults/Consensus.variants/Consensus.preview.json Preview consensus overlay defaults.
cardano-config/defaults/LocalConnections.json Base defaults for LocalConnections component.
cardano-config/defaults/Mempool.json Base defaults for Mempool component.
cardano-config/defaults/Network.json Base defaults for Network component.
cardano-config/defaults/Network.variants/Network.blockproducer.json Block producer overlay defaults for Network.
cardano-config/defaults/Network.variants/Network.relay.json Relay overlay defaults for Network.
cardano-config/defaults/Protocol.json Base defaults for Protocol component (non-network-specific).
cardano-config/defaults/Protocol.variants/Protocol.mainnet.json Mainnet protocol overlay defaults.
cardano-config/defaults/Protocol.variants/Protocol.preprod.json Preprod protocol overlay defaults.
cardano-config/defaults/Protocol.variants/Protocol.preview.json Preview protocol overlay defaults.
cardano-config/defaults/README.md Documents defaults layering, variant naming, and ownership/intent.
cardano-config/defaults/Storage.json Base defaults for Storage component.
cardano-config/defaults/Storage.variants/Storage.preprod.json Preprod storage overlay defaults.
cardano-config/defaults/Storage.variants/Storage.preview.json Preview storage overlay defaults.
cardano-config/defaults/Testing.json Base defaults for Testing component.
cardano-config/defaults/Testing.variants/Testing.preview.json Preview testing overlay defaults.
cardano-config/examples/consensus.json Example Consensus config input.
cardano-config/examples/fullconfig.json Example “single-file” full config input.
cardano-config/examples/localconnections.json Example LocalConnections config input.
cardano-config/examples/mempool.json Example Mempool config input.
cardano-config/examples/network-a.json Example Network config used for list-merge test (base).
cardano-config/examples/network-b.json Example Network config used for list-merge test (override).
cardano-config/examples/network.json Example Network config input.
cardano-config/examples/protocol.json Example Protocol config input.
cardano-config/examples/shadow.json Example config demonstrating shadowed top-level keys vs section.
cardano-config/examples/split-all.json Example fully split-by-component config.
cardano-config/examples/split-list.json Example split config with list-of-sources deep merge.
cardano-config/examples/split.json Example split config.
cardano-config/examples/storage.json Example Storage config input.
cardano-config/examples/testing.json Example Testing config input.
cardano-config/LICENSE Adds Apache-2.0 license file for the new package.
cardano-config/README.md Package documentation and cookbook for config/schema/resolve usage.
cardano-config/schemas/config.schema.json Committed whole-config schema derived from codecs.
cardano-config/schemas/Consensus.schema.json Committed Consensus schema derived from codecs.
cardano-config/schemas/LocalConnections.schema.json Committed LocalConnections schema derived from codecs.
cardano-config/schemas/Mempool.schema.json Committed Mempool schema derived from codecs.
cardano-config/schemas/Network.schema.json Committed Network schema derived from codecs.
cardano-config/schemas/Protocol.schema.json Committed Protocol schema derived from codecs.
cardano-config/schemas/Storage.schema.json Committed Storage schema derived from codecs.
cardano-config/schemas/Testing.schema.json Committed Testing schema derived from codecs.
cardano-config/scripts/gen-schemas.sh Script to regenerate committed schemas from codecs via the executable.
cardano-config/src/Cardano/Configuration.hs Public library entry-point: re-exports + resolveConfiguration* + consistency checks.
cardano-config/src/Cardano/Configuration/Basic.hs Shared basic codecs/utilities (DiffTime codec, required-field resolution helper).
cardano-config/src/Cardano/Configuration/CliArgs.hs CLI argument types and optparse-applicative parsers for node flags.
cardano-config/src/Cardano/Configuration/Common.hs Shared types/codecs (e.g., file paths + DB paths) for CLI + file parsing.
cardano-config/src/Cardano/Configuration/File.hs YAML/JSON config parsing pipeline (envelope, layering, sub-files, key checks).
cardano-config/src/Cardano/Configuration/File/Consensus.hs Consensus config codecs and constraints (mode + genesis options).
cardano-config/src/Cardano/Configuration/File/Mempool.hs Mempool config codecs and resolution.
cardano-config/src/Cardano/Configuration/File/Network.hs Network and LocalConnections config codecs and resolution.
cardano-config/src/Cardano/Configuration/File/Protocol.hs Protocol config codecs (hashed files, byron genesis, etc.).
cardano-config/src/Cardano/Configuration/File/Storage.hs Storage config codecs + snapshot policy validation + backend selection.
cardano-config/src/Cardano/Configuration/File/Testing.hs Testing config codecs and resolution.
cardano-config/src/Cardano/Configuration/File/Tracing.hs Opaque HermodTracing placeholder type/codecs surfaced in schema.
cardano-config/src/Cardano/Configuration/Render.hs Renders resolved configuration back to documented-key JSON/YAML for display.
cardano-config/src/Cardano/Configuration/Schema.hs Generates and post-processes JSON Schemas from codecs; applies defaults overlays.
cardano-config/test/Main.hs Golden-ish parsing tests + schema drift tests against committed schemas.
cabal.project Registers the new cardano-config package in the workspace.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread cardano-config/src/Cardano/Configuration.hs
Comment thread cardano-config/src/Cardano/Configuration.hs
Comment thread cardano-config/src/Cardano/Configuration/CliArgs.hs
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/src/Cardano/Configuration/File.hs Outdated
Comment thread cardano-config/README.md Outdated
Comment thread cardano-config/src/Cardano/Configuration/File/Tracing.hs
Comment thread cardano-config/cardano-config.cabal
Comment thread cardano-config/test/Main.hs
Bring back the configuration-validation guidance (lost in the executable
collapse) as a footer on the `cardano-config schema` command, using ajv-cli:
dump the schema, then `ajv validate --strict=false -s ... -d ...`.
--strict=false lets ajv ignore the informational "path" format.
The per-component section keys (used in the split-file form and the
whole-config schema) reused the bare component names, which collided with
the node's top-level `Protocol` scalar: a stock config with `Protocol:
"Cardano"` was misread as a `Protocol` section referencing a sub-file named
"Cardano", and the section/top-level exclusivity rule misfired on it.

Rename the section keys to `StorageConfig`, `ConsensusConfig`,
`ProtocolConfig`, `NetworkConfig`, `LocalConnectionsConfig`, `MempoolConfig`
and `TestingConfig` (and the matching `defaults/` files, per-component
schema files, and the rendered output keys), freeing the bare `Protocol`
key.

Do not parse the node's `Protocol` scalar: it is optional, defaults to
CardanoProtocol and only ever takes the value "Cardano", so it is a
vestigial selector. Document it (alongside MaxKnownMajorProtocolVersion) as
a key the node parses but this library intentionally ignores; a `Protocol`
key now surfaces as an unrecognised-key warning rather than a broken section
reference.
- splitEnvelope now rejects a non-integer or out-of-range Version (e.g. 1.4
  or a huge literal) with a ConfigurationParsingError, instead of silently
  rounding it; the schema declares Version as an integer.
- Package schemas/*.json as data-files so the schema-drift test works from a
  source distribution, not only an in-tree checkout.
- Fix the --config/--topology defaults in the README to .json (matching the
  CliArgs defaults), and a stray space in a Tracing haddock.
The test-suite read examples/ and schemas/ by relative path, which only
works when the current directory is the package root. Under Nix (and a
source distribution) the schema check failed with
"schemas/config.schema.json: withBinaryFile: does not exist".

Resolve every fixture through Paths_cardano_config.getDataFileName (the
files are packaged as data-files), so the tests no longer depend on the
working directory.

@lehins lehins left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've mention in a private chat, I believe it will be much better to have this package be in a standalone repo and make it also depend on Ledger, so it is capable of parsing Ledger Genesis files and verify their hash and potentially provide schema for their content as well.
Some minor suggestions are also included int this review

Comment on lines +33 to +46
Cardano.Configuration
Cardano.Configuration.Basic
Cardano.Configuration.CliArgs
Cardano.Configuration.Common
Cardano.Configuration.File
Cardano.Configuration.File.Consensus
Cardano.Configuration.File.Mempool
Cardano.Configuration.File.Network
Cardano.Configuration.File.Protocol
Cardano.Configuration.File.Storage
Cardano.Configuration.File.Testing
Cardano.Configuration.File.Tracing
Cardano.Configuration.Render
Cardano.Configuration.Schema

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance we can hide module that do not need to be public? If there are in fact such modules, of course


-- | Configuration for the protocol
data ProtocolConfiguration f = ProtocolConfiguration
{ byronGenesis :: ByronGenesisConfiguration

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it lazy on purpose? If so, it would be nice to have a comment about why it is. Otherwise it needs a strictness annotation. Considering that ByronGenesisConfiguration has its all fields strict, I suspect it is the latter

Suggested change
{ byronGenesis :: ByronGenesisConfiguration
{ byronGenesis :: !ByronGenesisConfiguration

-- | A maybe hashed entity, possibly a file.
data Hashed a = Hashed
{ hashed :: a
, hash :: Maybe (Hash Blake2b_256 ByteString)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the Hash optional?
Am I reading this correctly that the hash of genesis files is not required?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is indeed the case today in the node

Comment on lines +316 to +318
let unknown = [k | k <- map K.toText (KM.keys o), k `notElem` recognisedKeys]
unless (null unknown) $ do
let msg = "unrecognised configuration key(s): " <> intercalate ", " (map T.unpack unknown)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slight simplification

Suggested change
let unknown = [k | k <- map K.toText (KM.keys o), k `notElem` recognisedKeys]
unless (null unknown) $ do
let msg = "unrecognised configuration key(s): " <> intercalate ", " (map T.unpack unknown)
let unknown = [K.toString k | k <- KM.keys o, K.toText k `notElem` recognisedKeys]
unless (null unknown) $ do
let msg = "unrecognised configuration key(s): " <> intercalate ", " unknown

Comment on lines +155 to +156
portNumber p = toJSON (fromIntegral p :: Integer)
fdNumber fd = toJSON (fromIntegral fd :: Integer)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

toInteger is always safer, since it is not susceptible to overflows, as such it also does not require any type annotations

Suggested change
portNumber p = toJSON (fromIntegral p :: Integer)
fdNumber fd = toJSON (fromIntegral fd :: Integer)
portNumber p = toJSON (toInteger p)
fdNumber fd = toJSON (toInteger fd)

methodValue = \case
CLI.TracerConnectViaPipe p -> object ["Pipe" .= p]
CLI.TracerConnectViaRemote host p ->
object ["Host" .= host, "Port" .= (fromIntegral p :: Integer)]

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
object ["Host" .= host, "Port" .= (fromIntegral p :: Integer)]
object ["Host" .= host, "Port" .= toInteger p]

Comment on lines +368 to +372
branchTitle b = case KM.lookup "const" b of
Just (String s) -> Just s
_ -> case KM.lookup "type" b of
Just (String t) -> Just (T.toTitle t)
_ -> Nothing

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cleaner way to write the same thing

Suggested change
branchTitle b = case KM.lookup "const" b of
Just (String s) -> Just s
_ -> case KM.lookup "type" b of
Just (String t) -> Just (T.toTitle t)
_ -> Nothing
branchTitle b = do
String s <- KM.lookup "const" b
String t <- KM.lookup "type" b
Just $ T.toTitle t

-- @defaults\/<Component>.json@ (when one is supplied).
configurationSchemasWithDefaults :: [(Text, Value)] -> [(Text, Value)]
configurationSchemasWithDefaults defs =
[ (name, maybe s (\d -> withDefaults d s) (lookup name defs))

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty bad. lookup is O(n), which makes this whole function O(n^2), furthermore there is also a lookup call on the result.
At the very least converting defs to a Map:

Suggested change
[ (name, maybe s (\d -> withDefaults d s) (lookup name defs))
let defsMap = Map.fromList defs in
[ (name, maybe s (`withDefaults` s) (Map.lookup name defsMap))

Ideally switching to an OMap in all such places where lookup is used whenver order matters, or regular Map if order is irrelevant


`cardano-config` is currently the *origin* of these files, but each component is
ultimately **owned by the layer that implements it** (networking owns
`Network*.json`, consensus owns `Consensus.json`, and so on). The intended flow:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure README is the best place to put future development plans. I think REAMDE should describe current state of things, not what the code might do later.

- the three mempool timeouts (the node's default is no timeout),
- the Testing `Test<Era>HardForkAt*` / `DijkstraGenesis*` knobs.

## Provenance / TODO

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this also feels like it does not belong to a README

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a lot of boilerplate with rewriting a Identity -> a Maybe. Part of it comes from the fact that not all fields are f a where f is a configuration type parameter - it then could be generalised with barbies.

Tbh I don't have any good arguments against wrapping every configuration field in f. It just adds more type-level boilerplate everywhere - but you have that in most places already.

see also https://chrispenner.ca/posts/hkd-options

{-# LANGUAGE GADTs #-}

-- | The representation of the configuration file
module Cardano.Configuration.File (

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File.hs currently handles error types, JSON layering, key linting, and orchestration in a single module (~250 lines). It's manageable today but likely to grow as more validation or versioning logic
lands.

A natural split:

Module Responsibility
File.Error ConfigurationParsingError and its instances
File.Merge JSON plumbing: decodeValueFile, mergeValues, loadSectionSource, loadBaseDefault, sectionUserLayer, parseSection, splitEnvelope, runCodec
File.Lint UnknownKeyPolicy, checkUnknownKeys, checkShadowedKeys - currently the only part that depends on Schema.recognisedKeys/componentPropertyNames
File Orchestration (parseConfigurationFiles, parseConfigurationVersion1, NodeConfigurationFromFileF) and re-exports

The main benefit is that linting (which pulls in the schema) separates from the layering engine (which doesn't), making both independently testable.

Not blocking - the file is short enough today.

Comment thread cardano-config/README.md
...
```

### ... see the schema for the whole config (deprecated, please use sub-files)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if this was not released already, can this pargraph be removed althogether?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't parse the whole schema, we would break the configuration of every user.

Comment thread cardano-config/README.md
Comment on lines +360 to +362
- **Errors.** File/JSON failures are reported as `ConfigurationParsingError`,
which records the offending file, section and JSON path. Resolution failures
are reported as `ConfigResolutionError`, listing the violated checks.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much value in describing the code in the README.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants