refactor: Improve constructors for testplan elements #216

Merged
rswarbrick merged 1 commit into lowRISC:master from rswarbrick:element-constructors
Jun 2, 2026

Conversation

@rswarbrick
Contributor

This doesn't change the behaviour (except that it might be slightly more careful about setting arbitrary field names).

I've rather simplified the flow though. It now looks like this:

  • The derived class constructor can set some placeholder values (to help tooling see their expected types).

  • It then calls super().__init__

  • The Element constructor checks it can find strings for the "name" and "desc" fields.

  • The "tags" field gets initialised to [] (so the element will have an empty list of tags if none are specified).

  • The Element constructor then sets all the requested fields, but only allows them to be strings or lists of strings. (No need to allow more complicated types: these are the only types we expect anyway).

  • It finally checks that "tags" is still a list of strings.

  • When we get back to the derived class constructor, we check that any other expected fields have been supplied, given the right type, and (for "stage") have a known value.

  • The Testpoint class also has some fields that it doesn't expect to load from the dictionary. So we check that none were supplied, then set them appropriately.

Phew!
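The steps above can be sketched roughly as follows. This is only an illustration of the described flow, not the code from this PR: the helper `_is_str_list`, the parameter name `src`, the error messages, and the two internal field names (`test_results`, `not_mapped`) are assumptions.

```python
from typing import Any


def _is_str_list(value: object) -> bool:
    """Return True if value is a list whose items are all strings."""
    return isinstance(value, list) and all(isinstance(item, str) for item in value)


class Element:
    def __init__(self, src: dict[str, Any]) -> None:
        # "name" and "desc" must be present and must be strings.
        for key in ("name", "desc"):
            if not isinstance(src.get(key), str):
                raise ValueError(f"Element needs a string {key!r} field.")

        # Default, so an element without tags ends up with an empty list.
        self.tags: list[str] = []

        # Set the requested fields, allowing only str or list-of-str values.
        for key, value in src.items():
            if not (isinstance(value, str) or _is_str_list(value)):
                raise ValueError(f"Field {key!r} must be a string or a list of strings.")
            setattr(self, key, value)

        # The loop above may have replaced "tags" with a plain string.
        if not _is_str_list(self.tags):
            raise ValueError("The 'tags' field must be a list of strings.")


class Testpoint(Element):
    stages = ("N.A.", "V1", "V2", "V2S", "V3")

    def __init__(self, src: dict[str, Any]) -> None:
        # Placeholder values, so tooling can see the expected types.
        self.stage: str = "N.A."
        self.tests: list[str] = []

        super().__init__(src)

        # Expected fields must be supplied, well typed and (for stage) known.
        for key in ("stage", "tests"):
            if key not in src:
                raise ValueError(f"Testpoint {src['name']} has no {key!r} field.")
        if self.stage not in self.stages:
            raise ValueError(f"Testpoint {src['name']} has unknown stage {self.stage!r}.")
        if not _is_str_list(self.tests):
            raise ValueError(f"Testpoint {src['name']}: 'tests' must be a list of strings.")

        # Internal fields (assumed names) must not be loaded from the dictionary.
        for key in ("test_results", "not_mapped"):
            if key in src:
                raise ValueError(f"Testpoint {src['name']} must not set {key!r}.")
        self.test_results: list[str] = []
        self.not_mapped = False


testpoint = Testpoint(
    {"name": "smoke", "desc": "Basic smoke test.", "stage": "V1", "tests": ["smoke_test"]}
)
```

Because the base class only ever stores strings and lists of strings, the derived class checks reduce to "was it supplied" and "is it the right one of those two shapes".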

@rswarbrick rswarbrick requested a review from AlexJones0 May 21, 2026 12:45
Contributor

@AlexJones0 AlexJones0 left a comment

Thanks @rswarbrick. In terms of error reporting, this is definitely an improvement.

As a high-level comment, this is mostly cleaning up the JSON/dict validation logic that is already there and enforcing typing better. This is great, but contemporary best practice would be to not do this ourselves, and instead rely on some Python data model. This is what we've already done for other parts of DVSim and want to end up doing for Deploy, Testplan, FlowCfg etc.

As an example (very untested, probably some errors), I'd imagine this flow would end up looking something like this:

from pydantic import BaseModel, ConfigDict, model_validator


class Element(BaseModel):
    model_config = ConfigDict(extra="ignore")

    name: str
    desc: str
    tags: list[str]
    # ... whatever other fields we might expect to exist in any `Element`.
    # If we have optional fields that we need to access, we can either add them here like e.g.
    # my_optional_field: str | None = None
    # Or we can use `extra="allow"` in our config and query `.model_extra["my_optional_field"]`


class Covergroup(Element):
    # If using `extra="allow"`...
    model_config = ConfigDict(extra="allow")

    # This validator is nice. Another approach is to completely separate the data
    # (e.g. `CovergroupData`) from the thing consuming that data (e.g. `Covergroup`), so
    # that we don't have to worry about attribute overlaps from magic dict merging!
    # In terms of the process of refactoring, it's also considered good practice to have
    # strict models at data boundaries, as it splits the system up into well-typed
    # parts that can be more easily reasoned about.
    #
    # Also note that if the extra attributes are just derived from this data, then e.g. properties
    # or protected attrs inside the class might mean better encapsulation etc. But if we still
    # want to allow extra attributes (using `extra="allow"`) and disallow certain fields, a
    # validator is the way to go.
    @model_validator(mode="after")
    def check_allowed_extra_fields(self) -> "Covergroup":
        disallowed = {"test_results", "not_mapped"}
        # `model_extra` holds the fields that were supplied but not declared on the model.
        found = disallowed & set(self.model_extra or {})
        if found:
            raise ValueError(f"Covergroup fields disallowed for use: {sorted(found)}")
        return self


class Testpoint(Element):
    tests: list[str]
    stage: str


# Then just do e.g.
Testpoint.model_validate(raw_dict)
# or even:
Covergroup.model_validate_json(raw_json_str)

Comment thread src/dvsim/testplan.py Outdated
  stages = ("N.A.", "V1", "V2", "V2S", "V3")

- def __init__(self, raw_dict) -> None:
+ def __init__(self, raw_dict: dict) -> None:
Contributor

@AlexJones0 AlexJones0 May 21, 2026

Nit (in a couple of places) - prefer dict[str, Any] to dict for typing?
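For illustration: type checkers treat a bare `dict` annotation as `dict[Any, Any]`, so `dict[str, Any]` at least records that the keys are strings. A minimal sketch, in which the helper `get_str_field` is hypothetical and not part of this PR:

```python
from typing import Any


def get_str_field(src: dict[str, Any], key: str) -> str:
    """Return src[key], insisting that it exists and is a string."""
    value = src.get(key)
    if not isinstance(value, str):
        raise ValueError(f"Expected a string for field {key!r}.")
    return value


name = get_str_field({"name": "smoke", "tags": ["a", "b"]}, "name")
```

The values stay `Any` because the parsed file can hold strings, lists, or nested dictionaries; the `isinstance` check is what narrows the type for the caller.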

Contributor Author

Hmm, I think you might be right. When I wrote this, I was thinking about the hjson bindings, which I don't think declare meaningful types for the result of load.

But maybe it makes more sense to move that question closer to the source. :-)

Comment thread src/dvsim/testplan.py Outdated
# Reindent the multiline desc with 4 spaces.
desc = "\n".join([" " + line.lstrip() for line in self.desc.split("\n")])
return f" {self.kind.capitalize()}: {self.name}\n Description:\n{desc}\n"
raw_dict is the dictionary parsed from the HJSon file.
Contributor

Suggested change
- raw_dict is the dictionary parsed from the HJSon file.
+ raw_dict is the dictionary parsed from the Hjson file.

(or HJSON).

Comment thread src/dvsim/testplan.py Outdated
def __init__(self, raw_dict) -> None:
"""Initialize the testplan element.
Args:
d: The dictionary being read.
Contributor

Nit: avoid single-letter variable names outside of e.g. symbols in mathematical formulae, as they tend to propagate and make the code harder to read.

Contributor Author

Aww, but I like to pretend I'm writing Haskell! :-)

It's a bit tricky to name here: the variable is a dictionary. And I don't know much more about it! (And "dict" is no use as an abbreviation because it would shadow the built-in type.) I'll go with "src" for now.

Comment thread src/dvsim/testplan.py Outdated
if not isinstance(raw, str):
name_comment = f" with name {elt_name}" if elt_name is not None else ""
msg = (
f"Testplan element {name_comment}has a {field_name} field but this is not a string."
Collaborator

Looks like a missing space in the f-string? unless it's just the diff formatting?

Contributor

@AlexJones0 AlexJones0 May 21, 2026

I think the space should come after {name_comment} (i.e. {name_comment} ), not before, since the space is at the start of the name_comment string if elt_name is not None?

Contributor Author

Oops! Amusingly, I took a minute trying to figure out what had caused the missing space before I read Alex's comment. At that point, I realised that the missing space was intentional. And in the wrong place (oops). Thanks guys!
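To make the spacing issue concrete, here is a small sketch of the message with the space moved after `{name_comment}`. The wrapper function `not_a_string_msg` is hypothetical; only the two f-strings come from the snippet under review.

```python
from typing import Optional


def not_a_string_msg(field_name: str, elt_name: Optional[str] = None) -> str:
    # name_comment carries its own leading space, so it is either
    # " with name foo" or the empty string.
    name_comment = f" with name {elt_name}" if elt_name is not None else ""
    # No space before {name_comment}; one space after it.
    return f"Testplan element{name_comment} has a {field_name} field but this is not a string."


with_name = not_a_string_msg("desc", "foo")
without_name = not_a_string_msg("desc")
```

With this layout both variants read correctly: "Testplan element with name foo has a desc field ..." and "Testplan element has a desc field ...".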

Collaborator

@machshev machshev left a comment

I'd agree with @AlexJones0, the concept is good! But this should probably be a Pydantic model instead, as that gives us the full schema check with type checking. It also gives us serialisation and deserialisation using syntax that looks like little more than a dataclass.

Given we already have pydantic as a dependency and several models already, it seems like the most logical long-term solution. Though we could perhaps merge this as a short-term workaround if it meets your immediate needs?

@rswarbrick
Contributor Author

rswarbrick commented May 26, 2026

@machshev: Are you saying that this change shouldn't land in its current form? I'm slightly concerned about the usual "best being the enemy of the good" problem. But you're the lead for the project: we can always close this and put the work on the back burner if you think that would be better.

@rswarbrick rswarbrick force-pushed the element-constructors branch from b859154 to 3eae671 Compare May 26, 2026 08:39
@rswarbrick rswarbrick requested a review from machshev May 26, 2026 08:39
Collaborator

@machshev machshev left a comment

Thanks @rswarbrick!

"Project lead" is news to me ;)
Personally I think that ultimately we should aim to move all the data models in DVSim to Pydantic, as it supports full schema and type checking and also provides serialisation and deserialisation from dict/JSON out of the box. Pydantic models can contain other Pydantic models, and the outer model will then recursively serialise/deserialise, but this doesn't work if we have a hybrid with custom data classes.

However we haven't yet got to refactoring this area, and your changes could make things better in the short term as a stop gap. So if you want to merge them then feel free.

Normally for this kind of change I'd have added a set of unit tests using pytest.mark.parametrize, as it's deceptively difficult to do this kind of thing well without bugs. I've been bitten by simple changes: they tend to work in most cases, but then you find some outlier that causes issues. But we can deal with that later if there are issues.
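As a rough idea of what such parametrized tests could look like, here is a sketch for the stage check alone. The function `check_stage` is a hypothetical stand-in for the constructor's validation, and the list of bad inputs is just a guess at useful outliers; only the `stages` tuple is taken from the diff.

```python
import pytest

STAGES = ("N.A.", "V1", "V2", "V2S", "V3")


def check_stage(stage: object) -> str:
    """Hypothetical stand-in for the stage validation in the constructor."""
    if not isinstance(stage, str) or stage not in STAGES:
        raise ValueError(f"Unknown stage: {stage!r}")
    return stage


@pytest.mark.parametrize("stage", STAGES)
def test_known_stage_is_accepted(stage: str) -> None:
    assert check_stage(stage) == stage


# Outliers: empty string, wrong case, unknown value, wrong types.
@pytest.mark.parametrize("stage", ["", "v1", "V4", None, ["V1"]])
def test_bad_stage_is_rejected(stage: object) -> None:
    with pytest.raises(ValueError):
        check_stage(stage)
```

Each parameter becomes its own test case in the pytest report, so a single failing outlier is easy to spot.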

This doesn't change the behaviour (except that it might be slightly
more careful about setting arbitrary field names).

I've rather simplified the flow though. It now looks like this:

 - The derived class constructor can set some placeholder values (to
   help tooling see their expected types).

 - It then calls super().__init__

 - The Element constructor checks it can find strings for the "name"
   and "desc" fields.

 - The "tags" field gets initialised to [] (so the element will have
   an empty list of tags if none are specified).

 - The Element constructor then sets all the requested fields, but
   only allows them to be strings or lists of strings. (No need to
   allow more complicated types: these are the only types we expect
   anyway).

 - It finally checks that "tags" is still a list of strings.

 - When we get back to the derived class constructor, we check that
   any other expected fields have been supplied, given the right
   type, and (for "stage") have a known value.

 - The Testpoint class also has some fields that it doesn't expect to
   load from the dictionary. So we check that none were supplied, then
   set them appropriately.

Phew!

Signed-off-by: Rupert Swarbrick <rswarbrick@lowrisc.org>
@rswarbrick rswarbrick force-pushed the element-constructors branch from 3eae671 to b6e5f7e Compare June 2, 2026 13:07
@rswarbrick
Contributor Author

Cool, I think that makes sense. I'm reasonably confident that this change is safe to land. I think we can do this as a stop-gap before moving to a proper Pydantic back-end.

@rswarbrick rswarbrick added this pull request to the merge queue Jun 2, 2026
Merged via the queue into lowRISC:master with commit 15cc201 Jun 2, 2026
6 checks passed
@rswarbrick rswarbrick deleted the element-constructors branch June 2, 2026 13:13
3 participants