TXM: Rpc liveness #562
Deploying happychain with
| Latest commit: | c884815 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://2608acea.happychain.pages.dev |
| Branch Preview URL: | https://gabriel-txm-rpc-liveness.happychain.pages.dev |
HAPPY-366 TXM: Add RPC liveness monitor
Goal: avoid creating an unbounded number of attempts when the RPC is down, which puts a lot of load on the service or on the RPC. At the same time, we want to retry once the RPC comes back up. This service can receive pings from other components (e.g. block monitor or tx submitter) to determine whether the RPC is alive. This policy could be customized by the user.
    return
}

this.txmgr.rpcLivenessMonitor.onSuccess()
onSuccess/onFailure sound like callback listeners to me:
onSuccess(args => console.log("success", args))
Maybe trackSuccess() or something?
Yes, much better!
 * @default 2000 (2 seconds)
 * @unit milliseconds
 */
livenessPingInterval?: number
I think this was supposed to be used in conjunction with livenessSuccessCount, but I don't see it used anywhere.
Yup, good catch. It was renamed to livenessCheckInterval, and I forgot to remove livenessPingInterval.
    occurredAt: new Date(),
    success: true,
})
this.checkIfDown()
I'm wondering if that doesn't add a lot of overhead? If we're making 1000 RPC calls per second (which is not insane, since each tx might require a few of these calls), then we're calling this 1000 times per second, and there are ~10k events in the event window. So 1000 times per second we're filtering through a list of 10,000 events.
Maybe we could maintain a rolling list of counters? For example, for a 10s period: 10 counters for success events and 10 counters for error events, plus a single timestamp corresponding to the second-granularity timestamp of the oldest counters.
I think you're right. I refactored the code, and I believe this approach is much better
Nice! If we ever need to optimize this more, we can replace the object with an array, and we could also update the count dynamically instead of recomputing it in ratioOfSuccess. But this works nicely, merging this.
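For readers following this thread, the rolling-counter idea could be sketched roughly as below. This is not the code merged in this PR: the class name RollingCounters, the explicit nowMs parameters, and the choice to track the newest bucket's timestamp (rather than the oldest, as suggested above) are all assumptions made for illustration. Only the names trackSuccess and ratioOfSuccess come from the discussion.

```typescript
// Hypothetical sketch of per-second success/failure buckets over a fixed window.
// Each tracked event costs O(1); ratioOfSuccess costs O(windowSeconds).
class RollingCounters {
    private readonly windowSeconds: number
    private readonly successes: number[]
    private readonly failures: number[]
    // Second-granularity timestamp of the newest bucket (assumption: the
    // review suggests tracking the oldest one, which is equivalent).
    private newestSecond: number

    constructor(windowSeconds: number, nowMs: number) {
        this.windowSeconds = windowSeconds
        this.successes = new Array<number>(windowSeconds).fill(0)
        this.failures = new Array<number>(windowSeconds).fill(0)
        this.newestSecond = Math.floor(nowMs / 1000)
    }

    trackSuccess(nowMs: number): void {
        this.advance(nowMs)
        this.successes[this.newestSecond % this.windowSeconds] += 1
    }

    trackFailure(nowMs: number): void {
        this.advance(nowMs)
        this.failures[this.newestSecond % this.windowSeconds] += 1
    }

    // Returns undefined when no events fall inside the window.
    ratioOfSuccess(nowMs: number): number | undefined {
        this.advance(nowMs)
        const ok = this.successes.reduce((a, b) => a + b, 0)
        const bad = this.failures.reduce((a, b) => a + b, 0)
        const total = ok + bad
        return total === 0 ? undefined : ok / total
    }

    // Zero out every bucket that has rotated out of the window since the last call.
    private advance(nowMs: number): void {
        const second = Math.floor(nowMs / 1000)
        const elapsed = Math.min(second - this.newestSecond, this.windowSeconds)
        for (let i = 1; i <= elapsed; i++) {
            const idx = (this.newestSecond + i) % this.windowSeconds
            this.successes[idx] = 0
            this.failures[idx] = 0
        }
        if (second > this.newestSecond) this.newestSecond = second
    }
}
```

A caller could compare ratioOfSuccess(Date.now()) against a threshold of its choosing to decide whether the RPC should be treated as down. Keeping a running total instead of summing the buckets on each call would be the further optimization mentioned above.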

Linked Issues
Description
Added a liveness monitor to the Transaction Manager