Skip to content

fix(validation): reject THIS.<member> without dereference#1729

Open
ghaith wants to merge 6 commits into
masterfrom
fix/this-without-deref-1728
Open

fix(validation): reject THIS.<member> without dereference#1729
ghaith wants to merge 6 commits into
masterfrom
fix/this-without-deref-1728

Conversation

@ghaith
Copy link
Copy Markdown
Collaborator

@ghaith ghaith commented May 6, 2026

Summary

  • Validator now emits E120 when a member access uses THIS as a base without an explicit ^ deref (e.g. THIS.field instead of THIS^.field). Mirrors the existing SUPER-deref rule.
  • Without this guard, THIS.<field> slipped through both plc check and plc build and crashed codegen with Builder error: GEP pointee is not a struct (pointee: "ptr", index: 1).
  • E120's markdown body already documented this rule; only the validator site was missing.
  • Also dedupes exact-duplicate diagnostics in Validator::push_diagnostic (boyscout). Lowering occasionally clones a receiver AST into multiple positions in the post-lowered tree (operator chain + synthesised self argument), which caused this new THIS check and the existing SUPER-deref check to render twice for the same span. Dedup is keyed on Diagnostic's existing PartialEq (code + message + locations + sub-diagnostics).

Fixes #1728

Test plan

  • tests/lit/single/oop/this_member_access_without_deref_errors.st — runs plc -c on the minimal repro, asserts E120 fires, asserts no binary is produced.
  • this_member_access_without_deref_is_an_error unit test in src/validation/tests/this_keyword_validation_tests.rs with inline snapshot.
  • cargo test --workspace clean (no snapshot churn from the dedup).
  • cargo xtask lit clean (377 tests; 359 pass, 14 XFAIL, 4 unsupported).
  • cargo fmt --all and cargo clippy --workspace -- -Dwarnings clean.

🤖 Generated with Claude Code

Fixes #1728. THIS is POINTER TO <enclosing FB>, so member access on it
requires an explicit `^` deref. Without this fix, THIS.<field> slipped
through the validator -- `--check` was silent, `build` was silent --
and reached codegen, which panicked with
"Builder error: GEP pointee is not a struct (pointee: \"ptr\", index: 1)"
when attempting to GEP through the unindirected pointer operand.

Mirror the existing SUPER pattern in `validate_reference_expression`:
when the base is THIS (and not THIS^), emit E120 -- which already
documents this rule in its markdown but had no validator site emitting
it.

Also dedupes exact-duplicate diagnostics in `Validator::push_diagnostic`.
Lowering occasionally clones a receiver AST into multiple positions in
the post-lowered tree (operator chain + synthesised `self` argument),
which caused both the new THIS check and the existing SUPER check to
render twice for the same source location. The dedup is keyed on
`Diagnostic`'s existing PartialEq (code + message + locations +
sub-diagnostics), so distinct diagnostics at the same span are
unaffected.

Adds:

- `tests/lit/single/oop/this_member_access_without_deref_errors.st`
  -- runs `plc -c` on the minimal repro, asserts E120 fires and no
  binary is produced.
- `this_member_access_without_deref_is_an_error` unit test in
  `src/validation/tests/this_keyword_validation_tests.rs` with an
  inline snapshot pinning the diagnostic shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Build Artifacts

🐧 Linux

Artifact Link Size
deb-aarch64 Download 31.0 MB
plc-aarch64 Download 43.6 MB
deb-x86_64 Download 38.5 MB
schema Download 0.0 MB
stdlib Download 32.4 MB
plc-x86_64 Download 43.7 MB

From workflow run

🪟 Windows

Artifact Link Size
stdlib.lib Download 4.3 MB
stdlib.dll Download 0.1 MB
plc.exe Download 38.5 MB

From workflow run

Copy link
Copy Markdown
Collaborator

@riederm riederm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hey, just checking in once in a while to see how the AI task-force is doing ;-)
and left a question here 😆

Comment thread src/validation/statement.rs Outdated
.with_location(m.get_location())
.with_error_code("E119"),
);
} else if base.is_this() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if listing all these special cases here is the correct way to go?

Aren't (almost all of) these cases caught by one generic type-check? like see if the member could be resolved under this base? Or is the resolver actually able to find anything for Member in these cases?

What I mean is, should these two lines maybe give the same error-message?

VAR_GLOBAL
    p_fb : POINTER TO FB ...;
END_VAR

FUNCTION_BLOCK fb
...
METHOD m : DINT
    THIS.a;    // cannot access a on type pointer to fb
    p_fb.a;     // cannot access a on type pointer to fb
END_METHOD

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed the logic for both this and super (and normal pointers) to get a clearer message, now we report that you should dereference the pointer before accessing a member.

ghaith and others added 2 commits May 11, 2026 10:04
Per maintainer review on #1729: THIS, SUPER, and user-declared
`POINTER TO ...` bases all share the same shape -- the value of the
base is a pointer, not the pointee -- so member access on any of them
without dereference should produce the same diagnostic. Today
`THIS.a` emits E120, `SUPER.a` emits E119, and `p_fb.a` emits a
misleading E048 ("Could not resolve reference to a" -- but `a` is a
valid field on `fb`, just not on `POINTER TO fb`).

Replace the SUPER- and THIS-specific deref branches in
`validate_reference_expression` with a single type-driven check: if
the base resolves to a non-auto-deref `Pointer` data type, emit a
new E137 -- "Cannot access `<m>` on `POINTER TO <type>`; dereference
with `^` first" -- and return early to suppress the follow-on E048.
The member name comes from the source slice so the diagnostic shows
the user's text (`prop`) even when the resolver has rewritten the AST
to a synthetic name (`__get_prop`).

E119 and E120 stay -- they still cover the THIS/SUPER-as-member case
(`x.THIS`, `x.SUPER`) and the invalid-context checks (SUPER outside
an extending POU, THIS outside an FB). Only the deref-required
sub-rule moves to E137.

Also add `return` after the `m.is_super()` branch when the enclosing
POU actually extends something, so users don't get "SUPER not allowed
in member-access position" followed by a redundant deref-required
cascade. The gate keeps the "SUPER outside extending POU" diagnostic
when the qualifier doesn't extend -- that's orthogonal signal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@riederm
Copy link
Copy Markdown
Collaborator

riederm commented May 12, 2026

not sure if my question made it through 😆 . I was asking if the root of the problem is the type-resolver which forces you to such differentiated checks. (the problem as in, why are such specific checks necessary, while I expected the type-resolver to see that problem clearly).

i tested this quickly:

    #[test]
    fn super_and_this() {
        let src = 
            "
        FUNCTION_BLOCK parent
        VAR
            y : INT := 20;
        END_VAR
        END_FUNCTION_BLOCK

        FUNCTION_BLOCK child EXTENDS parent
        VAR
            x  : INT := 10;
            px : POINTER TO child;
        END_VAR

            {{px^}.x};
            {{px}.x};

            {{this^}.x} := 50;
            {{this}.x} := 50;

            {{super}.x};
            {super^.x};  // super^ seams to be special?! the expression collector fails to collect super^
            
        END_FUNCTION_BLOCK
        ";

        let resolves = test_resolve(src).unwrap();
        assert_snapshot!(resolves, @r###"
             EXPR | TYPE         | HINT 
        ----------+--------------+------
            px^.x | INT          | -          - OK
              px^ | child        | -          - OK
             px.x | -            | -          - OK
               px | __child_px   | -          - OK
          this^.x | INT          | -          - OK
            this^ | child        | -          - OK
           this.x | INT          | -          - surprising!! 
             this | child.__THIS | -          - OK
          super.x | INT          | -          - surprising!!
            super | -            | -          - surprisnig!! (would have expected similar behavior like this)
         super^.x | INT          | -          - OK
        "###);
    }

it is also strange that the expression-collector seams to fail to collect super^ as an expression and type-resolve it. adding {super^} fails while {px^} and {this^} work fine - very strange.

❓ so what I wanted to discuss was, is this differentiation of a px, this and super intended and actually helpful?

@ghaith
Copy link
Copy Markdown
Collaborator Author

ghaith commented May 12, 2026

not sure if my question made it through 😆 . I was asking if the root of the problem is the type-resolver which forces you to such differentiated checks. (the problem as in, why are such specific checks necessary, while I expected the type-resolver to see that problem clearly).

i tested this quickly:

    #[test]
    fn super_and_this() {
        let src = 
            "
        FUNCTION_BLOCK parent
        VAR
            y : INT := 20;
        END_VAR
        END_FUNCTION_BLOCK

        FUNCTION_BLOCK child EXTENDS parent
        VAR
            x  : INT := 10;
            px : POINTER TO child;
        END_VAR

            {{px^}.x};
            {{px}.x};

            {{this^}.x} := 50;
            {{this}.x} := 50;

            {{super}.x};
            {super^.x};  // super^ seams to be special?! the expression collector fails to collect super^
            
        END_FUNCTION_BLOCK
        ";

        let resolves = test_resolve(src).unwrap();
        assert_snapshot!(resolves, @r###"
             EXPR | TYPE         | HINT 
        ----------+--------------+------
            px^.x | INT          | -          - OK
              px^ | child        | -          - OK
             px.x | -            | -          - OK
               px | __child_px   | -          - OK
          this^.x | INT          | -          - OK
            this^ | child        | -          - OK
           this.x | INT          | -          - surprising!! 
             this | child.__THIS | -          - OK
          super.x | INT          | -          - surprising!!
            super | -            | -          - surprisnig!! (would have expected similar behavior like this)
         super^.x | INT          | -          - OK
        "###);
    }

it is also strange that the expression-collector seams to fail to collect super^ as an expression and type-resolve it. adding {super^} fails while {px^} and {this^} work fine - very strange.

❓ so what I wanted to discuss was, is this differentiation of a px, this and super intended and actually helpful?

Fair i'll check why they do differ, i think when super and this were implemented they were special cases but i need to check again how exactly they behave. I misunderstood the initial request, I thought you wanted a clearer message there (which I think is also cool)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

THIS.<field> (without ^ deref) panics in codegen with "GEP pointee is not a struct"

2 participants