Allow --shard-by=file#2655
Conversation
PR HealthChangelog Entry ✔️
Changes to files need to be accounted for in their respective changelogs. This check can be disabled by tagging the PR with |
| --total-shards The total number of invocations of the test runner being run. | ||
| --shard-index The index of this test runner invocation (of --total-shards). | ||
| --shard-by How to distribute tests across shards. | ||
| [test (default), file] |
There was a problem hiding this comment.
Any expectations of other granularities?
(What would that even be? "Directory", but most packages have all the tests in one directory anyway.)
Is there any reason to not always use --shard=by=file?
(If you have only one big file with many tests, I guess?)
| as evenly as possible across all shards. | ||
| * When sharding by `file`, the files are partitioned *before* they are loaded | ||
| and filtered. If a filter only matches tests in a few files, some shards | ||
| might run no tests because those files were allocated to other shards. |
There was a problem hiding this comment.
Why not shard after filtering?
I'm assuming there is a reason, but it would be nice to know what it is.
I guess each shard will individually do the same "find all tests" computation and sharding, but if it can do that normally, it should be able to do it sharding by file as well. As long as all the shards agree on the test-set, they should agree after filtering too.
There was a problem hiding this comment.
Some filters are runtime only, but we should be able to respect the top level @TestOn annotation filters without loading the test.
|
|
||
| source = StreamGroup.merge( | ||
| shardFiles.map((path) { | ||
| final key = _config.testSelections.keys.firstWhere( |
There was a problem hiding this comment.
(Can use _config.testSelections.entries.firstWhere((e) => path == e.key || p.isWithin(e.key, path), ..., then you don't need to do a ...[key]! afterwards)
| final Stream<LoadSuite> source; | ||
| if (_config.totalShards != null && _config.shardBy == 'file') { | ||
| final allFiles = <String>[]; | ||
| for (var pathEntry in _config.testSelections.entries) { |
There was a problem hiding this comment.
(Can use patter assignment:
for (var MapEntry(key: testPath, value: ...)) ...but I'm not sure the value is actually used, so it could just iterate keys?)
|
Meta comment, but we call test "files" test "suites". So it might be best to make this I am curious what @natebosch thinks though. |
| @@ -172,6 +172,147 @@ void main() { | |||
| await test.shouldExit(79); | |||
There was a problem hiding this comment.
Should add some edge cases, 0 test suites, less test suites than shards, more test suites than shards, etc.
| @@ -172,6 +172,147 @@ void main() { | |||
| await test.shouldExit(79); | |||
There was a problem hiding this comment.
I am also curious how this interacts with the @TestOn annotation as well as tags.
| int? concurrency, | ||
| int? shardIndex, | ||
| int? totalShards, | ||
| String? shardBy, |
| Stream<LoadSuite> _loadSuites() { | ||
| return StreamGroup.merge( | ||
| _config.testSelections.entries.map((pathEntry) { | ||
| final Stream<LoadSuite> source; |
There was a problem hiding this comment.
I would break this out into separate functions instead of inlining it all.
Or instead of making shardBy an enum make it a sealed class where the subclasses contain implementations so this is just _config.shardBy.loadSuites(...) or something.
| .map((f) => f.path) | ||
| .where((path) => _config.filename.matches(p.basename(path))), | ||
| ); | ||
| } else if (File(testPath).existsSync()) { |
There was a problem hiding this comment.
This test does nothig, you add the testPath to allFiles whether it's true or false.
Parsing all files, and running all setupAll hooks is expensive when what we are trying to do is to minimize the time spent for each shard.
This allows choosing a less fine-grained strategy of sharding by individual test files.
Fixes #2614