Update #6
Open
rRajivramachandran wants to merge 258 commits into
Changes:
- Reduce log level of some coordinator stats, which only denote normal coordinator operation. These stats are still emitted and can be logged by setting debugDimensions in the coordinator dynamic config.
- Initialize SegmentLoadingConfig only for historical management duties. This config is not needed in other duties and initializing it creates logs which are misleading.
* fix nvl in table * add query parameter dialog * pre-wrap in the tables * fix typo
Changes:
- Make ServiceMetricEvent.Builder extend ServiceEventBuilder<ServiceMetricEvent> and thus convert it to a plain builder rather than a builder of builder.
- Add methods setCreatedTime and setMetricAndValue to the builder
Changes:
[A] Remove config `decommissioningMaxPercentOfMaxSegmentsToMove`
- It is a complicated config 😅
- It is always desirable to prioritize move from decommissioning servers so that they can be terminated quickly, so this should always be 100%
- It is already handled by `smartSegmentLoading` (enabled by default)
[B] Remove config `maxNonPrimaryReplicantsToLoad`
This was added in apache#11135 to address two requirements:
- Prevent coordinator runs from getting stuck assigning too many segments to historicals
- Prevent load of replicas from competing with load of unavailable segments
Both of these requirements are now already met thanks to:
- Round-robin segment assignment
- Prioritization in the new coordinator
- Modifications to `replicationThrottleLimit`
- `smartSegmentLoading` (enabled by default)
Check that a checkpoint is non-empty before adding it to the checkpoint sequence in a SeekableStreamSupervisor
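The guard described above could look roughly like the following. This is a minimal sketch, not the supervisor's actual code: the class name, the `addCheckpointIfNonEmpty` helper, and the use of a `TreeMap` keyed by sequence id with partition-to-offset maps as values are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class CheckpointSequenceSketch {
    /**
     * Hypothetical helper: adds a checkpoint (partition -> offset) to the
     * sequence only when it actually contains offsets.
     * Returns true if the checkpoint was added.
     */
    static boolean addCheckpointIfNonEmpty(
            TreeMap<Integer, Map<String, Long>> sequence,
            int sequenceId,
            Map<String, Long> checkpoint) {
        if (checkpoint == null || checkpoint.isEmpty()) {
            // Skip empty checkpoints so the sequence never holds an entry without offsets.
            return false;
        }
        sequence.put(sequenceId, checkpoint);
        return true;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Map<String, Long>> sequence = new TreeMap<>();

        Map<String, Long> offsets = new HashMap<>();
        offsets.put("partition-0", 100L);

        System.out.println(addCheckpointIfNonEmpty(sequence, 0, offsets));         // true
        System.out.println(addCheckpointIfNonEmpty(sequence, 1, new HashMap<>())); // false
        System.out.println(sequence.size());                                       // 1
    }
}
```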
Changes:
- Simplify static `create` methods for `NoopTask`
- Remove `FirehoseFactory`, `IsReadyResult`, `readyTime` from `NoopTask` as these fields were not being used anywhere
- Update tests
…he#14408) * Vectorizing earliest for numeric * Vectorizing earliest string aggregator * checkstyle fix * Removing unnecessary exceptions * Ignoring tests in MSQ as earliest is not supported for numeric there * Fixing benchmarks * Updating tests as MSQ does not support earliest for some cases * Addressing review comments by adding the following: 1. Checking capabilities first before creating selectors 2. Removing mockito in tests for numeric first aggs 3. Removing unnecessary tests * Addressing issues for dictionary encoded single string columns where we can use the dictionary ids instead of the entire string * Adding a flag for multi value dimension selector * Addressing comments * 1 more change * Handling review comments part 1 * Handling review comments and correctness fix for latest_by when the time expression need not be in sorted order * Updating numeric first vector agg * Revert "Updating numeric first vector agg" This reverts commit 4291709. * Updating code for correctness issues * fixing an issue with latest agg * Adding more comments and removing an unnecessary check * Addressing null checks for tie selector and only vectorize false for quantile sketches
Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
…pache#14322) Currently, after an MSQ query, the web console is responsible for waiting for the segments to load. It does so by checking whether any segments are still loading into the datasource that was ingested into. This can cause issues: the wait never ends if the segments are never loaded, and it can end up waiting on segments from other ingestions as well. This PR shifts this responsibility to the controller, which has the list of segments created.
Introduces a new monitor, SubqueryCountStatsMonitor, which emits metrics about subqueries and their execution. Users can also now use the auto mode to automatically set the number of bytes available per query for inlining its subquery results.
The issue due to which the custom rule was added has been fixed as a part of https://issues.apache.org/jira/browse/CALCITE-3763 and accommodated during Calcite upgrade
* Not pushing down not filters * New test case * Updating tests * Removing a stale comment
In smartSegmentLoading mode, use computed value of balancerComputeThreads rather than configured value.
* Automate adding labels. * Add metrics/event emitting label * ingestion and segment format
Changes:
- Add new metric `kill/pendingSegments/count` with dimension `dataSource`
- Add tests for `KillStalePendingSegments`
- Reduce no-op logs that were emitted for each datasource even when no pending segments had been deleted. These can get particularly noisy at low values of `indexingPeriod`.
- Refactor the code in `KillStalePendingSegments` for readability and add javadocs
* update test * update test * format * test * fix0 * Revert "fix0" This reverts commit 44992cb. * ok resultset * add plan * update test * before rewind * test * fix toString/compare/test * move test * add timeColumn to hashCode
…ringFirstAggregatorFactory.factorizeVector (apache#14957)
Currently, Druid uses version 1.26.0 of the Google APIs client, and google-oauth-client-1.26.0.jar in particular brings in the following CVEs: CVE-2020-7692 and CVE-2021-22573. Although these CVEs are false positives, they cause red security scans on the Druid distribution. Hence, this change updates the dependency to the latest version, which includes fixes for these CVEs.
* add note about transparent_reconnection * Update docs/api-reference/sql-jdbc.md
* save work * Working * Fix runner constructor * Working runner * extra log lines * try using lifecycle for everything * clean up configs * cleanup /workers call * Use a single config * Allow selecting runner * debug changes * Work on composite task runner * Unit tests running * Add documentation * Add some javadocs * Fix spelling * Use standard libraries * code review * fix * fix * use taskRunner as string * checkstyle --------- Co-authored-by: Suneet Saldanha <suneet@apache.org>
* Set task location as k8sPodName for mm-less ingestion * tests
Lately, Query IT has been failing because the historical server runs out of memory (OOM). We are investigating the historical heap dump from the test. Until the issue is resolved, we are increasing the heap size of the historical server.
Currently, the redis-cache extension uses Jedis 2.9.0, which was released over seven years ago and is no longer listed in the official support matrix. This patch upgrades it to ensure compatibility with recent versions of Redis and to make future upgrades easier. It includes the following changes:
- Upgrade Jedis to v5.0.2, the latest version at this writing, and address the API changes and dependency version mismatch.
- Replace mock-jedis with jedis-mock, since the former is no longer actively maintained and is not compatible with recent versions of Jedis.
…pache#15133)" (apache#15346) This reverts commit dc0b163.
…e#15334) Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
…#15347) * fix segment/count metric in Statsd-emitter * update doc * Update docs/development/extensions-contrib/prometheus.md Co-authored-by: Suneet Saldanha <suneet@apache.org> * Update docs/development/extensions-contrib/statsd.md Co-authored-by: Suneet Saldanha <suneet@apache.org> --------- Co-authored-by: Suneet Saldanha <suneet@apache.org>
* Bump commons-codec:commons-codec from 1.13 to 1.16.0 Bumps [commons-codec:commons-codec](https://github.com/apache/commons-codec) from 1.13 to 1.16.0. - [Changelog](https://github.com/apache/commons-codec/blob/master/RELEASE-NOTES.txt) - [Commits](apache/commons-codec@commons-codec-1.13...rel/commons-codec-1.16.0) --- updated-dependencies: - dependency-name: commons-codec:commons-codec dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * update licenses.yaml * update licences.yaml --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xavier Léauté <xvrl@apache.org>
…datasources (apache#15355) In pull request apache#14985, a bug was introduced where periodic refresh would skip rebuilding a datasource's schema after encountering a non-existent datasource. This resulted in remaining datasources having stale schema information. This change addresses the bug and adds a unit test to validate the refresh mechanism's behaviour when a datasource is removed, and other datasources have schema changes.
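The intended refresh behaviour can be illustrated with a small loop. This is only a sketch under stated assumptions, not the code from the change: the class, the `refresh` method, and the representation of schemas as plain strings are hypothetical, and the source does not say exactly how the original loop skipped the remaining datasources. The point shown is that a missing datasource is dropped from the cache and the loop keeps going, so later datasources still get rebuilt.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SchemaRefreshSketch {
    /**
     * Hypothetical refresh: rebuilds the cached schema of every datasource in
     * toRefresh. A datasource with no current schema is treated as removed.
     */
    static Map<String, String> refresh(
            List<String> toRefresh,
            Map<String, String> latestSchemas,
            Map<String, String> cache) {
        for (String dataSource : toRefresh) {
            String schema = latestSchemas.get(dataSource);
            if (schema == null) {
                // Datasource no longer exists: drop it and continue with the rest.
                // Stopping here instead would leave later datasources stale.
                cache.remove(dataSource);
                continue;
            }
            cache.put(dataSource, schema);
        }
        return cache;
    }

    public static void main(String[] args) {
        Map<String, String> latest = new HashMap<>();
        latest.put("a", "v2");
        latest.put("c", "v2"); // "b" has been removed

        Map<String, String> cache = new HashMap<>();
        cache.put("a", "v1");
        cache.put("b", "v1");
        cache.put("c", "v1");

        refresh(Arrays.asList("a", "b", "c"), latest, cache);
        System.out.println(cache.get("c"));         // v2
        System.out.println(cache.containsKey("b")); // false
    }
}
```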
* fix time shifting
…ationWithDefaults` (apache#15317) * + Fix for Flaky Test * + Replacing TreeMap with LinkedHashMap * + Changing data structure from LinkedHashMap to HashMap * Fixed flaky test in S3DataSegmentPusherConfigTest.testSerializationValidatingMaxListingLength * Minor Changes
…` query. (apache#15243) * MSQ generates tombstones honoring the query's granularity. This change tweaks the logic to only account for the infinite-interval tombstones. For finite-interval tombstones, the MSQ query granularity will be used, which is consistent with how MSQ works. * more tests and some cleanup. * checkstyle * comment edits * Throw TooManyBuckets fault based on review; add more tests. * Add javadocs for both methods on reconciling the methods. * review: Move testReplaceTombstonesWithTooManyBucketsThrowsException to MsqFaultsTest * remove unused imports. * Move TooManyBucketsException to indexing package for shared exception handling. * lower max bucket for tests and fixup count * Advance and count the iterator. * checkstyle
Fixes a bug where the MSQ controller task would continue to hold the task slot even after a cancel was issued. This was due to a deadlock created on work launch: the main thread was waiting for tasks to spawn, while the cancel thread was waiting for tasks to finish. The fix is to instruct the MSQWorkerTaskLauncher thread to stop creating new tasks, which enables the main thread to unblock and release the slot. Also short-circuits the taskRetriable condition: the check now runs in the MSQWorkerTaskLauncher thread rather than in the main event loop thread, which results in faster task failure when the task is deemed non-retriable.
* Document segment metadata cache behaviour * Fix typo * Minor update * Minor change
…n` by changing string to key:value pair (apache#15207) * Fix capacity response in mm-less ingestion (apache#14888) Changes: - Fix capacity response in mm-less ingestion. - Add field usedClusterCapacity to the GET /totalWorkerCapacity response. This API should be used to get the total ingestion capacity on the overlord. - Remove method `isK8sTaskRunner` from interface `TaskRunner` * Using Map to perform comparison * Minor Change --------- Co-authored-by: George Shiqi Wu <george.wu@imply.io>
There is a problem with Quantiles sketches and KLL Quantiles sketches.
Queries using the histogram post-aggregator fail if:
- the sketch contains at least one value, and
- the values in the sketch are all equal, and
- the splitPoints argument is not passed to the post-aggregator, and
- the numBins argument is greater than 2 (or not specified, which
leads to the default of 10 being used)
In that case, the query fails and returns this error:
{
  "error": "Unknown exception",
  "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
  "host": null,
  "errorCode": "legacyQueryException",
  "persona": "OPERATOR",
  "category": "RUNTIME_FAILURE",
  "errorMessage": "Values must be unique, monotonically increasing and not NaN.",
  "context": {
    "host": null,
    "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
    "legacyErrorCode": "Unknown exception"
  }
}
This behaviour is undesirable, since the caller doesn't necessarily
know in advance whether the sketch has values that are diverse
enough. With this change, the post-aggregators return [N, 0, 0...]
instead of crashing, where N is the number of values in the sketch,
and the length of the list is equal to numBins. That is what they
already returned for numBins = 2.
Here is an example of a query that would fail:
{
  "queryType": "timeseries",
  "dataSource": {
    "type": "inline",
    "columnNames": ["foo", "bar"],
    "rows": [
      ["abc", 42.0],
      ["def", 42.0]
    ]
  },
  "intervals": ["0000/3000"],
  "granularity": "all",
  "aggregations": [
    {"name": "the_sketch", "fieldName": "bar", "type": "quantilesDoublesSketch"}
  ],
  "postAggregations": [
    {
      "name": "the_histogram",
      "type": "quantilesDoublesSketchToHistogram",
      "field": {"type": "fieldAccess", "fieldName": "the_sketch"},
      "numBins": 3
    }
  ]
}
I believe this also fixes issue apache#10585.
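The following is a minimal sketch of the fallback described above, not the post-aggregator's actual implementation. It assumes that, when splitPoints is not passed, the split points are spaced equally between the sketch's minimum and maximum; the class and both helper methods are hypothetical. Under that assumption, equal min and max produce repeated split points for numBins greater than 2, which matches the "Values must be unique" error, while numBins = 2 yields a single split point and so never trips the check.

```java
import java.util.Arrays;

public class EqualValuesHistogramSketch {
    /** Assumed derivation: numBins - 1 split points spaced equally between min and max. */
    static double[] equallySpacedSplitPoints(double min, double max, int numBins) {
        double[] splits = new double[numBins - 1];
        double step = (max - min) / numBins;
        for (int i = 0; i < splits.length; i++) {
            splits[i] = min + step * (i + 1);
        }
        return splits;
    }

    /** Fallback when all n values in the sketch are equal: [n, 0, 0, ...] of length numBins. */
    static double[] histogramForEqualValues(long n, int numBins) {
        if (numBins < 1) {
            throw new IllegalArgumentException("numBins must be at least 1");
        }
        double[] histogram = new double[numBins]; // zero-initialized
        histogram[0] = n;
        return histogram;
    }

    public static void main(String[] args) {
        // Two rows, both 42.0, as in the failing query above.
        double min = 42.0;
        double max = 42.0;
        long n = 2;
        int numBins = 3;

        // Split points collapse to the same value, so they are not unique.
        System.out.println(Arrays.toString(equallySpacedSplitPoints(min, max, numBins))); // [42.0, 42.0]

        // With the fallback, the whole count lands in the first bin.
        if (min == max) {
            System.out.println(Arrays.toString(histogramForEqualValues(n, numBins))); // [2.0, 0.0, 0.0]
        }
    }
}
```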
Fixing outdated query from deep storage docs.
…pache#14995) * Prevent a race that may cause multiple attempts to publish segments for the same sequence
Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>
ffb77be to
b8c57d5
Combine fixes