Add AdvanceWallClockTime support for testing PROCESSING_TIME punctuations#460
Conversation
…_TIME punctuations This change adds support for advancing wall-clock time in TopologyTestDriver, enabling proper unit testing of processors that use PROCESSING_TIME punctuations. Previously, there was no way to control wall-clock time in tests, making it impossible to reliably test buffering, windowing, and other time-sensitive operations that use PROCESSING_TIME. Changes: - Add IWallClockTimeProvider interface for time abstraction - Add MockWallClockTimeProvider for controllable time in tests - Add AdvanceWallClockTime method to TopologyTestDriver - Modify StreamTask to use injected time provider when available - Fix PROCESSING_TIME scheduling to correctly fire after interval (was using startTime = now + interval, changed to startTime = now) This mirrors the Java Kafka Streams TopologyTestDriver.advanceWallClockTime API.
… topology support Building on the AdvanceWallClockTime feature, this change: 1. Exposes WallClockTime as a property on ProcessorContext, allowing processors and transformers to access mock-aware wall clock time. This enables custom processors that need wall clock time to work correctly with TopologyTestDriver. 2. Improves AdvanceWallClockTime to properly handle multi-stage topologies with repartitions. The implementation now iterates until no more progress is made, ensuring records flow through all stages including downstream buffers that may have their own PROCESSING_TIME punctuations. Changes: - Add GetWallClockTime() abstract method to AbstractTask - Add WallClockTime property to ProcessorContext and ProcessorContext<K,V> - Implement GetWallClockTime in StreamTask (uses injected provider) and ExternalStreamTask (uses real time) - Enhance TaskSynchronousTopologyDriver.AdvanceWallClockTime to handle internal communication topics and iterate until stable
LGouellec
left a comment
There was a problem hiding this comment.
Some comments, but LGTM as soon as the comments is fixed
| _currentTimeMs = DateTime.Now.GetMilliseconds(); | ||
| } | ||
|
|
||
| public long GetWallClockTime() => _currentTimeMs; |
| forwardedThisPass = false; | ||
| foreach (var topic in internalTopics) | ||
| { | ||
| if (ForwardTopicRecordsIfAny(topic)) |
There was a problem hiding this comment.
I don't understand why it's necessary. When you Input a new message, by default the TopologyTestDriver will forward the message downstream in the repartition topic. Why you need to do again when you advance the time ?
There was a problem hiding this comment.
I think this is required in the case where records may be produced to internal topics during punctuation (not just during the initial PipeInput)
Consider this scenario with a buffering transformer (which was what motivated this feature):
- PipeInput("key", "value") --> record flows through topology --> hits a BufferByTimeTransformer --> buffered, not forwarded downstream yet
- AdvanceWallClockTime(500ms) --> PROCESSING_TIME punctuation fires --> buffered records are now forwarded to a repartition topic
- These records weren't in the repartition topic when the original Flushed handlers ran during step 1
Without internal topic forwarding in AdvanceWallClockTime, records produced to repartition topics during punctuation would never reach downstream tasks. The existing Flushed event handlers only trigger when PipeInput completes - they don't know about records produced later by punctuations.
If there is a cleaner way to hook into the existing forwarding mechanism, I'd be happy to refactor but the core requirement is to have any records produced to internal topics during AdvanceWallClockTime flow downstream.
Does that make sense?
…rovider implementation
LGouellec
left a comment
There was a problem hiding this comment.
LGTM, just a minor change about the property getter/setter and I'll approve it sooner
| /// Gets the current wall-clock time in milliseconds since epoch. | ||
| /// </summary> | ||
| long GetWallClockTime(); | ||
| long WallClockTime { get; } |
There was a problem hiding this comment.
Should be
long WallClockTime { get; private set; }
There was a problem hiding this comment.
It is an interface, you can't put private members on it... by definition its the public contract.
There was a problem hiding this comment.
Yeap you are right, I didn't point the correct file.
long WallClockTime { get; private set; }
should be in MockWallClockTimeProvider
There was a problem hiding this comment.
Sorry I don't think I understand what you're asking. In MockWallClockTimeProvider, WallClockTime is
public long WallClockTime => _currentTimeMs;
You prefered it to be a gettable property over the GetWallClockTime() that it was originally. This is implicitly get only, since the implementation just returns _currentTimeMs.
Would you prefer that it to be more verbose like this?
public long WallClockTime
{
get { return _currentTimeMs; }
private set { }
}
There was a problem hiding this comment.
public long WallClockTime
{
get;
private set;
}
There was a problem hiding this comment.
Have removed the member variable and converted the expression body syntax to make it private settable
|
All good to go then? Looks like you resolved any merge conflict with the pull? |
This PR adds support for advancing wall-clock time in TopologyTestDriver, enabling proper unit testing of processors that use PROCESSING_TIME punctuations. This mirrors the Java Kafka Streams TopologyTestDriver.advanceWallClockTime API.
Previously, there was no way to control wall-clock time in tests, making it impossible to reliably test:
Changes
Commit 1: Add AdvanceWallClockTime to TopologyTestDriver
Commit 2: Expose WallClockTime through ProcessorContext
Usage
Testing
Added tests in TopologyBuildOrderBugTests.cs verifying records flow correctly through Aggregate > ToStream > Repartition topologies when using AdvanceWallClockTime.