Skip to content

fix cluster metrics when AggregatorReg isn't created on workers#464

Open
zbjornson wants to merge 1 commit into
prometheus:mainfrom
zbjornson:zb/fix-cluster
Open

fix cluster metrics when AggregatorReg isn't created on workers#464
zbjornson wants to merge 1 commit into
prometheus:mainfrom
zbjornson:zb/fix-cluster

Conversation

@zbjornson

Copy link
Copy Markdown
Contributor

When using the cluster module, you must now call promclient.setupClusterWorker() from each cluster worker.

v13.2.0 introduced a change from v13.1.0 (#449) that broke cluster metrics if an AggregatorRegistry was not instantiated on each cluster worker. The example in examples/cluster.js shows instantiation of an AggregatorRegistry in the workers, so users following that example were not affected by this change. However, the example was not written as intended: new AggregatorRegistry() should only be called on the cluster master. examples/cluster.js has been updated to show the full, correct usage.


This also unblocks a new approach to #182.

When using the cluster module, you must now call `promclient.setupClusterWorker()` from each cluster worker.

v13.2.0 introduced a change from v13.1.0 that broke cluster metrics if an `AggregatorRegistry` was not instantiated on each cluster worker. The example in `examples/cluster.js` shows instantiation of an `AggregatorRegistry` in the workers, so users following that example were not affected by this change. However, the example was not written as intended: `new AggregatorRegistry()` should only be called on the cluster master. `examples/cluster.js` has been updated to show the full, correct usage.
@zbjornson zbjornson requested review from SimenB and siimon September 19, 2021 17:26
@lucianodltec

lucianodltec commented Oct 2, 2021

Copy link
Copy Markdown

Desperated need this fix! I'm getting some errors running metrics on nodejs cluster. The metrics are not being aggregated, it´s simply does not work on cluster. Is this a fix for it?

Any plans on release this fix?

Comment thread CHANGELOG.md
- When using the cluster module, you must now call
`promclient.setupClusterWorker()` from each cluster worker.

Long explanation: v13.2.0 introduced a change from v13.1.0 that broke cluster

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds like we should revert that change, release a patch, then a new major with that and this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, I can release a v13.2.1 with it reverted.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like v13.2.1 was never released so would it be possible to get this MR merged? The cleaner API is nice.

@lucianodltec

Copy link
Copy Markdown

I figured out master cluster process metrics are not being aggregated with worker process metrics on clusterMetrics() call.

Should it be? Or I'm guessing wrong?

I did a change pushing master cluster metrics and it works... please check below.

function addListeners() {
	if (listenersAdded) return;
	listenersAdded = true;

	if (cluster().isMaster) {
		// Listen for worker responses to requests for local metrics
		cluster().on('message', (worker, message) => {
			if (message.type === GET_METRICS_RES) {
				const request = requests.get(message.requestId);

				if (message.error) {
					request.done(new Error(message.error));
					return;
				}

				message.metrics.forEach(registry => request.responses.push(registry));
				request.pending--;

				if (request.pending === 0) {
					// finalize
					requests.delete(message.requestId);
					clearTimeout(request.errorTimeout);


// ------------------------------------
// CHECK HERE BEGIN
// ------------------------------------

					Promise.all(registries.map(r => r.getMetricsAsJSON()))
						.then(metrics => {
							request.responses.push(...metrics)
							const registry = AggregatorRegistry.aggregate(request.responses);
							const promString = registry.metrics();
							request.done(null, promString);
						})
						.catch(error => {
							console.error(error)
						});
// ------------------------------------
// CHECK HERE END
// ------------------------------------

				}
			}
		});
	}

	if (cluster().isWorker) {
		// Respond to master's requests for worker's local metrics.
		process.on('message', message => {
			if (message.type === GET_METRICS_REQ) {
				Promise.all(registries.map(r => r.getMetricsAsJSON()))
					.then(metrics => {
						process.send({
							type: GET_METRICS_RES,
							requestId: message.requestId,
							metrics,
						});
					})
					.catch(error => {
						process.send({
							type: GET_METRICS_RES,
							requestId: message.requestId,
							error: error.message,
						});
					});
			}
		});
	}
}

@zbjornson

Copy link
Copy Markdown
Contributor Author

@lucianodltec check out #280. That PR just needs tests. If you're willing to work on them, that would be awesome!

@theoilie

Copy link
Copy Markdown

Wondering if there are plans to merge this PR? It looks super helpful!

@JosemyDuarte

Copy link
Copy Markdown

Hi @zbjornson! Thanks for making this fix. Do you know if there is any plan to merge it?

@lebrak

lebrak commented Jan 12, 2023

Copy link
Copy Markdown

Hi thank you !!
I need this PR too, when this PR should be merge plz ?

@SimenB

SimenB commented Jan 17, 2023

Copy link
Copy Markdown
Collaborator

@zbjornson could you target next with this?

@zbjornson

Copy link
Copy Markdown
Contributor Author

@SimenB will do shortly. I want to revisit this PR to see if I can make both patterns work so it's not a breaking change in either direction.

@ethan-burrell

Copy link
Copy Markdown

@zbjornson

This would be great to unblock some work, looking forward to this being merged!

@MichaelSitter

Copy link
Copy Markdown

👋 👋 Hey team!! Any movement on this issue? We are in need of this fix, hoping it's landing soon 🤞

@SimenB

SimenB commented Mar 9, 2023

Copy link
Copy Markdown
Collaborator

@zbjornson I landed next on master and did a prerelease. So any breaking changes should be fine to land

@jasongornall-dd

Copy link
Copy Markdown

Bumping this! Would love to have this merged.

@morgaan

morgaan commented Jun 16, 2023

Copy link
Copy Markdown

@zbjornson Any update on this one? We are stuck with 13.1.0 and would be happy to migrate to latest. Thank you in advance 🤞

@SimenB

SimenB commented Oct 2, 2023

Copy link
Copy Markdown
Collaborator

@zbjornson I plan to release a stable v15 this week. Will you be able to work on the two issues in https://github.com/siimon/prom-client/issues?q=is:open+label:semver-major? If yes, I'll hold off - if not I'll just roll it out 🙂 Been more than half a year since exemplars landed without a release 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.