Skip to content

Add $push accumulator tests#257

Open
alinaliBQ wants to merge 11 commits into
documentdb:mainfrom
alinaliBQ:push
Open

Add $push accumulator tests#257
alinaliBQ wants to merge 11 commits into
documentdb:mainfrom
alinaliBQ:push

Conversation

@alinaliBQ
Copy link
Copy Markdown
Contributor

This change adds tests for the $push accumulator operator.

Add accumulator operator tests for $push. Tests database $push behavior, output collection, syntax, and expected errors.
Integration tests are in documentdb_tests/compatibility/tests/core/operator/accumulators/test_accumulators_ push_integration.py

Comment on lines +85 to +101
AccumulatorTestCase(
"expr_error_divide_by_zero",
docs=[{"v": 1}],
pipeline=[{"$group": {"_id": None, "result": {"$push": {"$divide": ["$v", 0]}}}}],
error_code=DIVIDE_BY_ZERO_V2_ERROR,
msg="$push should propagate $divide by zero error",
),
AccumulatorTestCase(
"expr_error_divide_by_zero_field_path",
docs=[{"_id": 0, "v": 0}],
pipeline=[
{"$sort": {"_id": 1}},
{"$group": {"_id": None, "result": {"$push": {"$divide": [1, "$v"]}}}},
],
error_code=DIVIDE_BY_ZERO_V2_ERROR,
msg="$push should propagate $divide by zero when divisor comes from field path",
),
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added divide tests

@alinaliBQ alinaliBQ marked this pull request as ready for review June 1, 2026 19:15
@alinaliBQ alinaliBQ requested a review from a team as a code owner June 1, 2026 19:15
@documentdb-triage-tool documentdb-triage-tool Bot added compatibility test Compatibility test related enhancement New feature or request labels Jun 1, 2026
@documentdb-triage-tool
Copy link
Copy Markdown

🤖 Auto-triaged by documentdb-triage-tool.

Applied: compatibility test, enhancement
Project fields suggested: Component test-coverage · Priority P2 · Effort L · Status Needs Review
Confidence: 0.92 (mixed)

Reasoning

component from path globs (test-coverage); effort from diff stats (1658+0 LOC, 8 files); LLM: Adds new integration test coverage for the $push accumulator operator, touching the compatibility tests directory with multiple test scenarios.

If a label is wrong, remove it manually and ping @patty-chow so the rules can be tuned. The bot will not re-label items that already have component labels.

],
msg="$push should produce single-element arrays for single-document groups",
),
AccumulatorTestCase(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have a test case, where we do acutal grouping on large number of rows and not just group by _id: None as that is bound to give one row

expected=[{"_id": None, "result": [["a", "b"]]}],
msg="$push should collect array traversal result from field path through array of objects",
),
AccumulatorTestCase(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also add a similar test on actual array (e.g., {"a": [{"b": 1}, {"b": 2}]} and do a $push on $a.0.b

],
expected=[{"_id": "A", "result": [1, 1, 2]}],
msg="$push should respect compound sort order within each group",
),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also add cases for:

  • 3-key sort with mixed asc/desc e.g., {"$sort": {"priority": 1, "status": -1, "timestamp": 1}}.
  • Sort by nested field paths Eg: "user.dept": 1

),
]

# Property [Grouping Behavior]: each group produces an independent array, and
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add compound _id grouping (group by multiple fields).
Eg: {"_id": {"region": "$region", "status": "$status"}

Lets alsoo add:
grouping by a nested field path as _id (e.g., "_id": "$user.dept")

"group_large",
docs=[{"v": 1} for _ in range(10_000)],
pipeline=[
{"$group": {"_id": None, "result": {"$push": "$v"}}},
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type of test will only validate count of documents due to project, we should also verify the context of documents and not just count, I suggest we write a loop in expected results too and check actual content and not just count

# Property [Field Path Resolution]: $push correctly resolves simple, nested,
# and non-existent field paths.
PUSH_FIELD_PATH_TESTS: list[AccumulatorTestCase] = [
AccumulatorTestCase(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lets also add sort on nested field {"$sort": {"item.name": 1}},

expected=[{"_id": None, "result": [DECIMAL128_NAN]}],
msg="$push should preserve Decimal128 NaN in output array",
),
AccumulatorTestCase(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also add special float values mixed with non-numeric types.

# the referenced field is missing, producing a shorter array or an empty array
# if all documents lack the field.
PUSH_MISSING_TESTS: list[AccumulatorTestCase] = [
AccumulatorTestCase(
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Multiple $push accumulators can be in a single group query:

Like $group: [$push, $push] - We should add this test too.

Also lets add multiple $group in a single aggregation pipeline.
like pipeline = [$group,$group]

Lets make sure actual grouping is happening and we just dont do $group by _id: null

),
]

# Property [Nested Array and Document Handling]: $push collects array and
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also add deeply nested arrays-of-objects. ex:
{"data": {"users": [{"profile": {"name": "Alice", "scores": [85, 90]}}]}, "s": 1}

@@ -0,0 +1,337 @@
"""Tests for $push accumulator: BSON type preservation and type verification."""

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment for overall CR:

  • Most of the tests here are only on _id: null which leads to single group, I think we should add more tests that do the actual grouping.
  • Add more large group test , need to check content inside too and not just the count as there's no test with 1000+ docs split across multiple groups (ex: 5000 docs across 5 groups of varying sizes). This would catch bugs in group-boundary accumulator resets. (Add with both $sort before $push and after)
  • Multiple $push accumulator in a single pipeline . We should test this too: $group:[$push, $push ......]

alinaliBQ added 11 commits June 1, 2026 14:52
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
- Extract 3 NaN tests into separate PUSH_SPECIAL_NAN_TESTS list
- Add test_push_core_nan function using assertSuccessNaN
- Replace pytest.approx(math.nan, nan_ok=True) with float("nan")
- Remove unused math import

Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Signed-off-by: Alina (Xi) Li <Alina.Li@improving.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

compatibility test Compatibility test related enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants