Change syntax for constructing a record with a single field #4295

Open
hazefully wants to merge 7 commits into FoundationDB:main from hazefully:revisit-flatten-single-item-record

@hazefully

Note to reviewer: it may be easier to review ebfd935 on its own first. It is isolated from the main change in this PR, but it is required to make mixed-version testing possible.


This PR changes the syntax for constructing an anonymous struct/record with a single field from (..) to STRUCT (..). The goal is to remove the automated flattening of function arguments that are single-item records, which used to happen in some contexts while parsing the query. That flattening would cause problems as we add more user-defined table/scalar functions that accept user-defined struct types.

After this PR, an anonymous record with multiple fields is constructed with (field1, field2, ...) or the equivalent STRUCT (field1, field2, ...). A record with a single field is constructed with STRUCT (field1). The existing syntax (<expr>) is now interpreted as a parenthesized expression, which is evaluated with higher precedence than the surrounding expressions.
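
The rule above can be summarized as a small decision function. This is only an illustrative sketch: the class, enum, and method names are hypothetical and do not come from the parser in this PR, and rejecting an empty list is an assumption of the sketch rather than a statement about the grammar.

```java
import java.util.List;

/** Hypothetical illustration of the rule above; not the parser's actual code. */
public final class RecordSyntaxSketch {

    enum Interpretation { RECORD, PARENTHESIZED_EXPRESSION }

    /**
     * Classifies a parenthesized, comma-separated list of expressions.
     *
     * @param hasStructKeyword true if the list is prefixed with STRUCT
     * @param items the expressions between the parentheses
     */
    static Interpretation interpret(boolean hasStructKeyword, List<String> items) {
        if (items.isEmpty()) {
            // Assumption: empty lists are outside the scope of this sketch.
            throw new IllegalArgumentException("expected at least one expression");
        }
        // STRUCT (...) always builds a record; a bare (...) only does so
        // when it holds two or more fields.
        if (hasStructKeyword || items.size() > 1) {
            return Interpretation.RECORD;
        }
        return Interpretation.PARENTHESIZED_EXPRESSION;
    }

    public static void main(String[] args) {
        System.out.println(interpret(false, List.of("a", "b"))); // (a, b)
        System.out.println(interpret(true, List.of("a", "b")));  // STRUCT (a, b)
        System.out.println(interpret(true, List.of("a")));       // STRUCT (a)
        System.out.println(interpret(false, List.of("a")));      // (a)
    }
}
```

Only the last call, the bare single-item form, yields a parenthesized expression; the other three yield a record.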


To make the breaking change in this PR testable in YAML tests, commit ebfd935 extends the multi-variant syntax that #4155 introduced for schema_template blocks so that a single step in a setup block can also have multiple variants. This makes it possible to execute different queries depending on the version that actually runs the setup step. The syntax for the multi-variant command block changes accordingly: the YAML test framework should choose which variant to run based on the current version that runs the command, not the initial version. The syntax for a multi-variant schema template is now:

schema_template:
    - currentVersionAtLeast: "<version_number>"
      definition: |
          create table t1(id bigint, primary key(id))
    - currentVersionLessThan: "<version_number>"
      definition: |
          create table t1(id bigint, primary key(id))

Similarly, a step in a setup block can have multiple variants:

setup:
  steps:
    - query: INSERT INTO table_T1 VALUES (5, 9)
    -
      - currentVersionLessThan: "3.0.18.0"
        definition:
          - query: INSERT INTO table_T1 VALUES (10, 20)
      - currentVersionAtLeast: "3.0.18.0"
        definition:
          - query: INSERT INTO table_T1 VALUES (30, 40)
    - query: INSERT INTO table_T1 VALUES (50, 60)
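
As a rough sketch of the selection behavior described above, the following picks the first variant whose bound matches the version currently executing the step. All names here are hypothetical and are not taken from the YAML test framework; the sketch also assumes purely numeric dotted versions such as "3.0.18.0" and first-match-wins ordering.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of version-based variant selection; names are illustrative. */
public final class VariantSelectionSketch {

    /** One variant: a version bound, its direction, and the query to run. */
    static final class Variant {
        final boolean atLeast; // true: currentVersionAtLeast, false: currentVersionLessThan
        final String bound;
        final String query;

        Variant(boolean atLeast, String bound, String query) {
            this.atLeast = atLeast;
            this.bound = bound;
            this.query = query;
        }
    }

    /** Compares dotted numeric versions component by component (assumed format). */
    static int compareVersions(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components are treated as 0.
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Returns the query of the first variant matching the current version, if any. */
    static Optional<String> select(String currentVersion, List<Variant> variants) {
        for (Variant v : variants) {
            int cmp = compareVersions(currentVersion, v.bound);
            boolean matches = v.atLeast ? cmp >= 0 : cmp < 0;
            if (matches) {
                return Optional.of(v.query);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Variant> variants = List.of(
                new Variant(false, "3.0.18.0", "INSERT INTO table_T1 VALUES (10, 20)"),
                new Variant(true, "3.0.18.0", "INSERT INTO table_T1 VALUES (30, 40)"));
        System.out.println(select("3.0.17.5", variants).orElse("<none>"));
        System.out.println(select("3.0.18.0", variants).orElse("<none>"));
    }
}
```

With the two variants from the setup example, a step run by version 3.0.17.5 gets the (10, 20) insert, while a step run by 3.0.18.0 gets the (30, 40) insert.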

This commit changes the syntax for constructing an anonymous record with a single
field from `(..)` to `STRUCT (..)`. The reason for this change is to remove the need
for the automated flattening of function arguments with single-item records, which happens
in some contexts.
@hazefully added the "breaking change" and "Run mixed-mode" labels on Jun 24, 2026
@hazefully requested a review from hatyo on June 24, 2026, 14:06
@github-actions

📊 Metrics Diff Analysis Report

Summary

  • New queries: 1
  • Dropped queries: 4
  • Plan changed + metrics changed: 1
  • Plan unchanged + metrics changed: 0
ℹ️ About this analysis

This automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:

  • New queries: Queries added in this PR
  • Dropped queries: Queries removed in this PR. These should be reviewed to ensure we are not losing coverage.
  • Plan changed + metrics changed: The query plan has changed along with planner metrics.
  • Plan unchanged + metrics changed: Same plan but different metrics

The last category in particular may indicate planner regressions that should be investigated.

New Queries

Count of new queries by file:

  • yaml-tests/src/test/resources/arrays-unnesting.metrics.yaml: 1

Dropped Queries

The following queries with metrics were removed:

The reviewer should double check that these queries were removed intentionally to avoid a loss of coverage.

Plan and Metrics Changed

These queries experienced both plan and metrics changes. This generally indicates a planner change
that may make the planning for this query substantially different. Some change in query plan metrics is expected,
but the reviewer should still validate that these changes are not excessive.

Total: 1 query

Statistical Summary (Plan and Metrics Changed)

task_count:

  • Average change: +211.0
  • Average regression: +211.0
  • Median change: +211
  • Median regression: +211
  • Standard deviation: 0.0
  • Standard deviation of regressions: 0.0
  • Range: +211 to +211
  • Range of regressions: +211 to +211
  • Queries changed: 1
  • Queries regressed: 1

transform_count:

  • Average change: +62.0
  • Average regression: +62.0
  • Median change: +62
  • Median regression: +62
  • Standard deviation: 0.0
  • Standard deviation of regressions: 0.0
  • Range: +62 to +62
  • Range of regressions: +62 to +62
  • Queries changed: 1
  • Queries regressed: 1

transform_yield_count:

  • Average change: +13.0
  • Average regression: +13.0
  • Median change: +13
  • Median regression: +13
  • Standard deviation: 0.0
  • Standard deviation of regressions: 0.0
  • Range: +13 to +13
  • Range of regressions: +13 to +13
  • Queries changed: 1
  • Queries regressed: 1

insert_new_count:

  • Average change: +19.0
  • Average regression: +19.0
  • Median change: +19
  • Median regression: +19
  • Standard deviation: 0.0
  • Standard deviation of regressions: 0.0
  • Range: +19 to +19
  • Range of regressions: +19 to +19
  • Queries changed: 1
  • Queries regressed: 1

insert_reused_count:

  • Average change: -3.0
  • Median change: -3
  • Standard deviation: 0.0
  • Range: -3 to -3
  • Queries changed: 1
  • No regressions! 🎉

Significant Regressions (Plan and Metrics Changed)

There was 1 outlier detected. Outlier queries have a significant regression in at least one field. Statistically, this represents either an increase of more than two standard deviations above the mean or a large absolute increase (e.g., 100).

  • yaml-tests/src/test/resources/valid-identifiers.metrics.yaml:383: EXPLAIN select struct "x$$" ("foo.tableA".*) from "foo.tableA"
    • old explain: ISCAN(foo.tableA.idx3 <,>) | MAP (_ AS foo.tableA)
    • new explain: SCAN([IS foo__2tableA]) | MAP (_ AS foo.tableA)
    • task_count: 288 -> 499 (+211)
    • transform_count: 74 -> 136 (+62)
    • transform_yield_count: 37 -> 50 (+13)
    • insert_new_count: 32 -> 51 (+19)
    • insert_reused_count: 6 -> 3 (-3)
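
The outlier rule stated above can be sketched as follows. This is an illustration only, not the analysis tool's code: the class and method names are hypothetical, and treating an increase of exactly 100 as "large" is an assumption.

```java
/** Hypothetical sketch of the outlier rule described in the report. */
public final class OutlierRuleSketch {

    /**
     * A change is an outlier if it exceeds the mean by more than two standard
     * deviations, or if it is a large absolute increase.
     */
    static boolean isOutlier(double change, double mean, double stdDev, double absoluteThreshold) {
        return change > mean + 2 * stdDev || change >= absoluteThreshold;
    }

    public static void main(String[] args) {
        // task_count: +211, mean 211.0, standard deviation 0.0
        System.out.println(isOutlier(211, 211.0, 0.0, 100)); // true
        // transform_count: +62, mean 62.0, standard deviation 0.0
        System.out.println(isOutlier(62, 62.0, 0.0, 100));   // false
    }
}
```

With a single changed query the standard deviation is 0 and every change equals its mean, so under this sketch only the absolute threshold can fire. That matches task_count (+211) being the field that marks this query as an outlier.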

import java.util.Set;
import java.util.function.BiFunction;

public final class VariantCommand extends Command {

Can you document this please?

I am not sure I am in favor of introducing another YAML command, currentVersionAtLeast. It looks very similar to initialVersionAtLeast, and the differences between the two seem purely technical. Can we unify them? If the differences relate to handling continuations across different server versions, that seems like an internal testing-framework detail that shouldn't be exposed at this level.


@Nonnull
private final YamlReference reference;
final YamlReference reference;

It is probably a better idea to make this protected instead of using package-level visibility.

# contain a flat record with the scalar value.
- query: SELECT "b".* FROM (SELECT "a" FROM VALUES ([1, 2, 3, 4]) AS T("a")) AS "sq", (SELECT ("x") AS wrapped FROM "sq"."a" AS "x") AS "b"
- initialVersionLessThan: !current_version
- result: [{wrapped: {1}}, {wrapped: {2}}, {wrapped: {3}}, {wrapped: {4}}]

This breaks continuations, so we have to be careful with the rollout.

Comment on lines +149 to 150
# Double-wrapped (*) nesting is unaffected for older versions
- query: select ((*)) from t1 where pk = 1

  # Double-wrapped (*) nesting is unaffected for older versions

... for versions older than !current_version, I think.

import static org.junit.jupiter.api.Assertions.assertThrows;

/**
* Tests that tests with setup blocks based on the current version are executed correctly.

Tests that test*

: '(' expressionWithOptionalName (',' expressionWithOptionalName)* ')'
;

expressionWithPrecedence

Can you rename this to expressionWithParenthesis? Defining "precedence" at this high level is a bit too strong of a rule.


expressionAtom
: constant #constantExpressionAtom // done
: expressionWithPrecedence #expressionWithPrecedenceAtom // done

Also here, please rename the rule branch to expressionWithParenthesisAtom.

Struct Type Declaration
=======================

You can define a *struct type* (often interchangeably referred to as a *nested type*). A struct is a tuple of columns that allow the same types as a table does, but does _not_ have a primary key. Struct types are "nested" within another owning type, and are stored in the same location as their owning record. For example, a table :sql:`foo` can have the following layout (using the :doc:`DDL <sql_commands/DDL>` syntax):

You can define a struct type (often interchangeably referred to as a nested type). A struct is a tuple of columns that allow the same types as a table does, but does not have a primary key. Struct types are "nested" within another owning type, and are stored in the same location as their owning record. For example, a table :sql:foo can have the following layout (using the :doc:DDL <sql_commands/DDL> syntax):

While you're at it, here is a suggested more concise rewording:

A struct is defined almost like a table, consisting of a tuple of attributes but lacking a primary key. Structs can be nested, as demonstrated in the following example definition:

Struct Literals
===============

A *struct literal* constructs a struct value inline, directly within a query, instead of reading it from a table. Struct literals are accepted anywhere a value is expected, most commonly in :sql:`INSERT` statements, :sql:`SELECT` projections, and :sql:`WHERE` comparisons.

A struct literal constructs a struct value inline, directly within a query, instead of reading it from a table

suggestion:

a struct literal is an expression, it instantiates a struct expression in place. It is commonly used in ....

This returns a single column whose value is a :sql:`STRUCT` containing one field, which is itself the row :sql:`STRUCT`.

.. note::
   The :sql:`STRUCT` keyword is required here since we are constructing a struct literal with a single field which is the row struct, see :ref:`struct_types` for more details.

Nice callout.

Labels

breaking change: Changes that are not backwards compatible
Run mixed-mode: Label to add to Pull Requests to have it run mixed mode tests

2 participants