Update by rRajivramachandran · Pull Request #6 · CaseyPan/druid

rRajivramachandran · 2023-11-17T00:33:46Z

Combine fixes

Changes:
- Reduce log level of some coordinator stats, which only denote normal coordinator operation. These stats are still emitted and can be logged by setting debugDimensions in the coordinator dynamic config.
- Initialize SegmentLoadingConfig only for historical management duties. This config is not needed in other duties, and initializing it creates misleading logs.

Changes:

[A] Remove config `decommissioningMaxPercentOfMaxSegmentsToMove`
- It is a complicated config 😅
- It is always desirable to prioritize moves from decommissioning servers so that they can be terminated quickly, so this should always be 100%
- It is already handled by `smartSegmentLoading` (enabled by default)

[B] Remove config `maxNonPrimaryReplicantsToLoad`

This was added in apache#11135 to address two requirements:
- Prevent coordinator runs from getting stuck assigning too many segments to historicals
- Prevent load of replicas from competing with load of unavailable segments

Both of these requirements are now already met thanks to:
- Round-robin segment assignment
- Prioritization in the new coordinator
- Modifications to `replicationThrottleLimit`
- `smartSegmentLoading` (enabled by default)

…he#14408) * Vectorizing earliest for numeric * Vectorizing earliest string aggregator * checkstyle fix * Removing unnecessary exceptions * Ignoring tests in MSQ as earliest is not supported for numeric there * Fixing benchmarks * Updating tests as MSQ does not support earliest for some cases * Addressing review comments by adding the following: 1. Checking capabilities first before creating selectors 2. Removing mockito in tests for numeric first aggs 3. Removing unnecessary tests * Addressing issues for dictionary encoded single string columns where we can use the dictionary ids instead of the entire string * Adding a flag for multi value dimension selector * Addressing comments * 1 more change * Handling review comments part 1 * Handling review comments and correctness fix for latest_by when the time expression need not be in sorted order * Updating numeric first vector agg * Revert "Updating numeric first vector agg" This reverts commit 4291709. * Updating code for correctness issues * fixing an issue with latest agg * Adding more comments and removing an unnecessary check * Addressing null checks for tie selector and only vectorize false for quantile sketches

…pache#14322) Currently, after an MSQ query, the web console is responsible for waiting for the segments to load. It does so by checking whether any segments are loading into the datasource that was ingested into. This can cause problems, for example when the segments would never be loaded, or when the console ends up waiting for other ingests as well. This PR shifts the responsibility to the controller, which has the list of segments created.

This change introduces a new monitor, SubqueryCountStatsMonitor, which emits metrics for subqueries and their execution. Users can now also use the auto mode to automatically set the number of bytes available per query for inlining its subquery results.

Changes:
- Add new metric `kill/pendingSegments/count` with dimension `dataSource`
- Add tests for `KillStalePendingSegments`
- Reduce no-op logs that are emitted for each datasource even when no pending segments have been deleted. These can get particularly noisy at low values of `indexingPeriod`.
- Refactor the code in `KillStalePendingSegments` for readability and add javadocs

* update test * update test * format * test * fix0 * Revert "fix0" This reverts commit 44992cb. * ok resultset * add plan * update test * before rewind * test * fix toString/compare/test * move test * add timeColumn to hashCode

Druid currently uses version 1.26.0 of the Google APIs client; google-oauth-client-1.26.0.jar in particular brings in CVE-2020-7692 and CVE-2021-22573. Although these CVEs are false positives, they cause red security scans on the Druid distribution. This change therefore updates to the latest version, which includes the fixes for these CVEs.

* save work
* Working
* Fix runner constructor
* Working runner
* extra log lines
* try using lifecycle for everything
* clean up configs
* cleanup /workers call
* Use a single config
* Allow selecting runner
* debug changes
* Work on composite task runner
* Unit tests running
* Add documentation
* Add some javadocs
* Fix spelling
* Use standard libraries
* code review
* fix
* fix
* use taskRunner as string
* checkstyle

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>

Lately, Query IT has been failing because the historical server runs out of memory (OOM). We are investigating the historical heap dump from the test. Until the issue is resolved, we are increasing the heap size of the historical server.

Currently, the redis-cache extension uses Jedis 2.9.0, which was released over seven years ago and is no longer listed in the official support matrix. This patch upgrades it to ensure compatibility with recent versions of Redis and to make future upgrades easier. It includes:
- Upgrade Jedis to v5.0.2, the latest version at the time of writing, and address the API changes and dependency version mismatch.
- Replace mock-jedis with jedis-mock, since the former is no longer actively maintained and is not compatible with recent versions of Jedis.

…#15347) * fix segment/count metric in Statsd-emitter * update doc * Update docs/development/extensions-contrib/prometheus.md Co-authored-by: Suneet Saldanha <suneet@apache.org> * Update docs/development/extensions-contrib/statsd.md Co-authored-by: Suneet Saldanha <suneet@apache.org> --------- Co-authored-by: Suneet Saldanha <suneet@apache.org>

* Bump commons-codec:commons-codec from 1.13 to 1.16.0 Bumps [commons-codec:commons-codec](https://github.com/apache/commons-codec) from 1.13 to 1.16.0. - [Changelog](https://github.com/apache/commons-codec/blob/master/RELEASE-NOTES.txt) - [Commits](apache/commons-codec@commons-codec-1.13...rel/commons-codec-1.16.0) --- updated-dependencies: - dependency-name: commons-codec:commons-codec dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * update licenses.yaml * update licences.yaml --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Xavier Léauté <xvrl@apache.org>

…datasources (apache#15355) In pull request apache#14985, a bug was introduced where periodic refresh would skip rebuilding a datasource's schema after encountering a non-existent datasource. This resulted in remaining datasources having stale schema information. This change addresses the bug and adds a unit test to validate the refresh mechanism's behaviour when a datasource is removed, and other datasources have schema changes.
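
The refresh bug described above can be illustrated with a minimal sketch. This is not Druid's code: `refresh_schemas` and `build_schema` are hypothetical names, and the sketch assumes that a removed datasource is signalled by `build_schema` returning `None`. The point is only that a missing datasource should be skipped on its own, without abandoning the rest of the refresh.

```python
def refresh_schemas(datasources, build_schema):
    """Rebuild the schema of every datasource that still exists.

    Hypothetical sketch: build_schema(name) is assumed to return None
    when the datasource no longer exists.
    """
    refreshed = {}
    for name in datasources:
        schema = build_schema(name)
        if schema is None:
            # Skip only the missing datasource. Stopping the loop here
            # would leave the remaining datasources with stale schemas.
            continue
        refreshed[name] = schema
    return refreshed


# Usage: "b" has been removed, "a" and "c" are still rebuilt.
known = {"a": ["x"], "c": ["y", "z"]}
result = refresh_schemas(["a", "b", "c"], known.get)
```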

…ationWithDefaults` (apache#15317) * + Fix for Flaky Test * + Replacing TreeMap with LinkedHashMap * + Changing data structure from LinkedHashMap to HashMap * Fixed flaky test in S3DataSegmentPusherConfigTest.testSerializationValidatingMaxListingLength * Minor Changes

…` query. (apache#15243)
* MSQ generates tombstones honoring the query's granularity. This change tweaks to only account for the infinite-interval tombstones. For finite-interval tombstones, the MSQ query granularity will be used, which is consistent with how MSQ works.
* more tests and some cleanup.
* checkstyle
* comment edits
* Throw TooManyBuckets fault based on review; add more tests.
* Add javadocs for both methods on reconciling the methods.
* review: Move testReplaceTombstonesWithTooManyBucketsThrowsException to MsqFaultsTest
* remove unused imports.
* Move TooManyBucketsException to indexing package for shared exception handling.
* lower max bucket for tests and fixup count
* Advance and count the iterator.
* checkstyle

There was a bug where the MSQ controller task would continue to hold the task slot even after a cancel was issued. This was due to a deadlock created on work launch: the main thread was waiting for tasks to spawn while the cancel thread was waiting for tasks to finish. The fix instructs the MSQWorkerTaskLauncher thread to stop creating new tasks, which lets the main thread unblock and release the slot.

The taskRetriable condition is also short-circuited. The check now runs in the MSQWorkerTaskLauncher thread rather than in the main event thread loop, which results in faster task failure when the task is deemed non-retriable.
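
The shape of this fix can be sketched in a few lines. The class below is a hypothetical stand-in, not Druid's MSQWorkerTaskLauncher, and all of its method names are assumptions made for illustration. It shows a waiter that blocks until enough workers are launched, and a `stop()` call that both refuses further launches and wakes the waiter so it can release its resources.

```python
import threading


class WorkerLauncher:
    """Hypothetical launcher; illustrates the unblocking idea only."""

    def __init__(self, target_workers):
        self._target = target_workers
        self._launched = 0
        self._stopped = False
        self._cond = threading.Condition()

    def launch_one(self):
        """Launch one worker unless the launcher has been stopped."""
        with self._cond:
            if self._stopped:
                return False
            self._launched += 1
            self._cond.notify_all()
            return True

    def stop(self):
        """Stop creating workers and wake up anyone waiting on launches."""
        with self._cond:
            self._stopped = True
            self._cond.notify_all()

    def wait_for_workers(self, timeout=None):
        """Return True if all workers launched, False if stopped or timed out first."""
        with self._cond:
            self._cond.wait_for(
                lambda: self._stopped or self._launched >= self._target,
                timeout,
            )
            return self._launched >= self._target
```

Without the `_stopped` term in the wait predicate, a cancel issued before all workers were launched would leave the waiting thread blocked, which is the situation the commit message describes.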

…n` by changing string to key:value pair (apache#15207) * Fix capacity response in mm-less ingestion (apache#14888) Changes: - Fix capacity response in mm-less ingestion. - Add field usedClusterCapacity to the GET /totalWorkerCapacity response. This API should be used to get the total ingestion capacity on the overlord. - Remove method `isK8sTaskRunner` from interface `TaskRunner` * Using Map to perform comparison * Minor Change --------- Co-authored-by: George Shiqi Wu <george.wu@imply.io>
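
The "Using Map to perform comparison" item points at a common cause of flaky serialization tests: two JSON strings can encode the same object with keys in a different order. A minimal sketch of the idea follows. It is written in Python rather than the Java of the actual tests, and `same_json` is a hypothetical helper, not something taken from the Druid code base.

```python
import json


def same_json(left: str, right: str) -> bool:
    """Compare two JSON documents by parsed content, not by raw text."""
    return json.loads(left) == json.loads(right)


# Usage: key order differs, so the strings differ but the content does not.
a = '{"capacity": 3, "used": 1}'
b = '{"used": 1, "capacity": 3}'
strings_equal = a == b
content_equal = same_json(a, b)
```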

There is a problem with Quantiles sketches and KLL Quantiles sketches. Queries using the histogram post-aggregator fail if:
- the sketch contains at least one value, and
- the values in the sketch are all equal, and
- the splitPoints argument is not passed to the post-aggregator, and
- the numBins argument is greater than 2 (or not specified, which leads to the default of 10 being used)

In that case, the query fails and returns this error:

    {
      "error": "Unknown exception",
      "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
      "host": null,
      "errorCode": "legacyQueryException",
      "persona": "OPERATOR",
      "category": "RUNTIME_FAILURE",
      "errorMessage": "Values must be unique, monotonically increasing and not NaN.",
      "context": {
        "host": null,
        "errorClass": "org.apache.datasketches.common.SketchesArgumentException",
        "legacyErrorCode": "Unknown exception"
      }
    }

This behaviour is undesirable, since the caller doesn't necessarily know in advance whether the sketch has values that are diverse enough. With this change, the post-aggregators return [N, 0, 0...] instead of crashing, where N is the number of values in the sketch, and the length of the list is equal to numBins. That is what they already returned for numBins = 2.

Here is an example of a query that would fail:

    {
      "queryType": "timeseries",
      "dataSource": {
        "type": "inline",
        "columnNames": ["foo", "bar"],
        "rows": [ ["abc", 42.0], ["def", 42.0] ]
      },
      "intervals": ["0000/3000"],
      "granularity": "all",
      "aggregations": [
        {"name": "the_sketch", "fieldName": "bar", "type": "quantilesDoublesSketch"}
      ],
      "postAggregations": [
        {"name": "the_histogram", "type": "quantilesDoublesSketchToHistogram", "field": {"type": "fieldAccess", "fieldName": "the_sketch"}, "numBins": 3}
      ]
    }

I believe this also fixes issue apache#10585.
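
The degenerate case can be sketched with exact counts over a plain list instead of a sketch. This is an illustration under stated assumptions, not the post-aggregator's implementation: `histogram` is a hypothetical function, it uses equal-width bins between the minimum and maximum, and it treats an empty input as out of scope.

```python
def histogram(values, num_bins=10):
    """Count values into num_bins equal-width bins between min and max."""
    if not values:
        # Empty input is out of scope for this sketch.
        raise ValueError("no values")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal, so evenly spaced split points would coincide.
        # Put everything in the first bin: [N, 0, 0, ...].
        return [len(values)] + [0] * (num_bins - 1)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # Clamp so that the maximum value falls into the last bin.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    return counts


# Usage: mirrors the two rows with value 42.0 and numBins = 3.
bins = histogram([42.0, 42.0], num_bins=3)
```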

kfaraz and others added 30 commits August 30, 2023 11:30

Web console: dynamic query parameters UI (apache#14921)

d295b91

* fix nvl in table * add query parameter dialog * pre-wrap in the tables * fix typo

Added brush to time-chart (apache#14929)

42cfb99

line chart fix others not mapping correctly (apache#14931)

04a1153

show execution dialog in task view (apache#14930)

680669f

cleaning DruidProcessingConfig bindings (apache#14927)

dea9d4f

Simplify ServiceMetricEvent.Builder (apache#14933)

7f26b80

Changes: - Make ServiceMetricEvent.Builder extend ServiceEventBuilder<ServiceMetricEvent> and thus convert it to a plain builder rather than a builder of builder. - Add methods setCreatedTime , setMetricAndValue to the builder

Add checking for new checkpoint (apache#14353)

d4e972e

Check that a checkpoint is non-empty before adding it to the checkpoint sequence in a SeekableStreamSupervisor
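
A minimal sketch of the guard described here, assuming a checkpoint is a mapping from partition to offset and the checkpoint sequence is a dictionary keyed by sequence id. Both the data shapes and the name `add_checkpoint` are hypothetical and are not taken from SeekableStreamSupervisor.

```python
def add_checkpoint(sequence, sequence_id, checkpoint):
    """Add a checkpoint to the sequence only if it has at least one entry.

    Returns True when the checkpoint was added, False when it was skipped.
    """
    if not checkpoint:
        # An empty (or missing) checkpoint carries no offsets; skip it.
        return False
    sequence[sequence_id] = dict(checkpoint)
    return True


# Usage: the empty checkpoint is ignored, the non-empty one is recorded.
checkpoints = {}
add_checkpoint(checkpoints, 0, {})
add_checkpoint(checkpoints, 1, {"partition-0": 100})
```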

Refactor: Cleanup NoopTask (apache#14938)

289ee1e

Changes: - Simplify static `create` methods for `NoopTask` - Remove `FirehoseFactory`, `IsReadyResult`, `readyTime` from `NoopTask` as these fields were not being used anywhere - Update tests

Verify statsd mock client interaction in unit test (apache#14939)

9d6ca61

Query tips doc (apache#14922)

425ebaa

Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>

fixup array and mvd sql docs (apache#14928)

706b57c

Remove DruidAggregateCaseToFilterRule (apache#14940)

23308c0

The issue due to which the custom rule was added has been fixed as a part of https://issues.apache.org/jira/browse/CALCITE-3763 and accommodated during Calcite upgrade

Unnest dont push down not (apache#14942)

a8fa979

* Not pushing down not filters * New test case * Updating tests * Removing a stale comment

Fix bug in computed value of balancerComputeThreads (apache#14947)

88f3c9b

In smartSegmentLoading mode, use computed value of balancerComputeThreads rather than configured value.

Updated documentation for OshiSysMonitor (apache#14912)

e100b18

Extend GHA autolabeler to other areas (apache#14903)

f9cf500

* Automate adding labels. * Add metrics/event emitting label * ingestion and segment format

docs: update docusaurus 2 stuff (apache#14864)

09f7dfe

use VectorValueSelector instead of BaseLongVectorValueSelector for StringFirstAggregatorFactory.factorizeVector (apache#14957)

2b7f2c5

Fix bug in KillStalePendingSegments (apache#14961)

7871e63

docs: add note about transparent_reconnection (apache#14953)

3a453f7

* add note about transparent_reconnection * Update docs/api-reference/sql-jdbc.md

Set task location as k8sPodName for mm-less ingestion (apache#14959)

757603a

* Set task location as k8sPodName for mm-less ingestion * tests

17px and others added 27 commits November 7, 2023 11:01

fix: Creating span label not closed (apache#15323)

54fa342

Refactor lookups behavior while loading/dropping the containers (apache#14806)

e2fde8c

Revert "Separate task lifecycle from kubernetes/location lifecycle (apache#15133)" (apache#15346)

130bfbf

This reverts commit dc0b163.

fix ingest datasource detection falling over on paren (apache#15339)

d12f557

use is not distinct from (apache#15349)

fa48d4e

Optimize mark segments as unused (apache#15352)

895e535

docs: suggest metadata store with instant ADD COLUMN semantics (apache#15334)

e7d0429

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

Change default inSubQueryThreshold (apache#15336)

a134cc3

+ Switching Comparison from String to JSON (apache#15364)

5edeac2

Web console: fix time shifting (apache#15359)

2cb7443

* fix time shifting

Fixing 1 flaky test in testAPIs() (apache#15375)

bedf246

Fixed 2 Flaky Tests (apache#15376)

4ca5acd

Document Nuances in SegmentMetadataCache Behaviour (apache#15367)

03a092f

* Document segment metadata cache behaviour * Fix typo * Minor update * Minor change

Query from deep storage doc fixes. (apache#15382)

857b8de

Fixing outdated query from deep storage docs.

Prevent multiple attempts to publish segments for the same sequence (apache#14995)

cdc192d

* Prevent a race that may cause multiple attempts to publish segments for the same sequence

fix redirect for api docs and misc array-related typos (apache#15387)

6a5da5a

Co-authored-by: 317brian <53799971+317brian@users.noreply.github.com>

Fix flaky tests in CompatParquetReaderTest and FlattenSpecParquetReaderTest

f3629d2

CaseyPan force-pushed the fix-CompatParquetReaderTest branch 3 times, most recently from ffb77be to b8c57d5 Compare November 18, 2023 00:39

rRajivramachandran wants to merge 258 commits into CaseyPan:fix-CompatParquetReaderTest from rRajivramachandran:bring-to-date

20 participants