SvApp: Add UnhideRewardCouponV2Trigger and ExpireRewardCouponV2Trigger#5975
SvApp: Add UnhideRewardCouponV2Trigger and ExpireRewardCouponV2Trigger#5975dfordivam wants to merge 14 commits into
Conversation
meiersi-da
left a comment
There was a problem hiding this comment.
Thanks for the good work. Not exactly a trivial change. I would value @rautenrieth-da review on this as well.
| alter table dso_acs_store | ||
| add column reward_beneficiary_is_observer boolean; | ||
|
|
||
| create index dso_acs_store_sid_mid_pn_tid_rbio |
There was a problem hiding this comment.
@rautenrieth-da : this index creation will result in downtime for the SV app won't it? One option to limit the downtime would be to make the partial index more partial and also constrain it on the template-id, so the corresponding index can be used during its creation.
Or would you rather recommend to use the delayed index creation infrastructure?
There was a problem hiding this comment.
note @dfordivam : downtime for the SV app is a problem as the SVs will may miss a reward coupon if they are down for too long.
There was a problem hiding this comment.
I have done investigations and confirmed that there is no way to make the index creation faster.
The index creation wont make use of other indices and will go through all the rows
I did some experiments locally and confirmed that all rows are in fact read when creating the index.
Even a delayed index creation will also scan all the rows, and it appears to be a heavy change for our needs, given that we know that we don't have any existing rows in DB to index.
So I looked into alternate ways to make this work. The simplest approach looks like reuse of existing indexes for our query.
Specifically we have the following index already, and the sv_onboarding_token is a very specific column.
"dso_acs_store_sid_pn_tid_sot" btree (store_id, migration_id, package_name, template_id_qualified_name, sv_onboarding_token) WHERE sv_onboarding_token IS NOT NULL
We could make this column and index generic by rename to aux_lookup_key, and add the provider to this when providerIsBeneficiary is false. This would work just fine for our query as the shape of dso_acs_store_sid_pn_tid_sot is identical to our reward_party one.
The usage of sv_onboarding_token is minimal/straightforward in the code, onlylookupSvOnboardingRequestByTokenWithOffset .
There was a problem hiding this comment.
Good proposal. I like it. @rautenrieth-da wdyt?
There was a problem hiding this comment.
I just discussed the case with @rautenrieth-da , and we concluded that it seems better to avoid the confusion, and instead use the infrastructure that exists for lazy index creation in:
It was purpose-built for this problem and should work fine for this use-case.
| PackageIdResolver.Package.SpliceAmulet, | ||
| payload => (payload.dso +: observerParties(payload)).map(PartyId.tryFromProtoPrimitive(_)), | ||
| ) | ||
| with SvTaskBasedTrigger[Task] { |
There was a problem hiding this comment.
Comparing to this recent addition of a trigger https://github.com/canton-network/splice/pull/5964/changes#diff-415ea762d792a54415fcb004b7ec7ebe6b3355217b7a089edde6e22b5bf2900eR48, you are not use the IgnoredAmuletVersionGuard, probably by intention.
Any particular reason why that trigger's tooling is insufficient for this usecase?
There was a problem hiding this comment.
I missed the IgnoredAmuletVersionGuard and it looks necessary even for our trigger. Though it is somewhat orthogonal to the supportsTrafficBasedAppRewards based filtering, I think I can make integrate both together.
There was a problem hiding this comment.
Hmm now on re-review I think the vetting based filtering is sufficient in our case right?
We need not do ignore for the parties specified in IgnoredPartiesStore as these appear to be for contracts with party as signatory, and are probably offline?
Or do we need to have an option to do ignore via the unresponsive parties list?
There was a problem hiding this comment.
Hmm now on re-review I think the vetting based filtering is sufficient in our case right?
We need not do ignore for the parties specified in IgnoredPartiesStore as these appear to be for contracts with party as signatory, and are probably offline?
Or do we need to have an option to do ignore via the unresponsive parties list?
You are probably right that we don't need it. Assuming we got everything right. I'm OK with making the assumption and adding the more complex code after the fact.
| s"Skipping ${skippedCoupons.size} contracts whose observers are not correctly vetted." | ||
| ) | ||
| vetted.flatMap(_._1) | ||
| } |
There was a problem hiding this comment.
@moritzkiefer-da @julientinguely-da : you've recently worked on improved support for unresponsive parties or parties with the wrong vetting state. Is there support code that @dfordivam could reuse here to implement this? Any concern with this code?
There was a problem hiding this comment.
This code block is now gone, please take a look at refactored/simplified trigger.
| result <- loop(totalUnhidden + coupons.size) | ||
| } yield result | ||
| } yield result | ||
| loop(0).map(count => TaskSuccess(s"unhid $count reward coupons for ${task.providerParty}")) |
There was a problem hiding this comment.
Do we need this loop here? Can't we just rely on the top-level retrieve-tasks to take care of the looping?
I'm a bit worried that this extra loop will make interpreting the trigger's activity unnecessarily confusing.
I suspect changing the Task definition to Task(provider: Party, coupons: Seq[CouponCid]) would be required to make the staleness check after completion work out. That doesn't seem like a problem. It actually feels like a simplification in terms of reasoning about the trigger's behavior.
There was a problem hiding this comment.
I interpreted the design-doc description as a single task execution handling all coupons. But it could be a separate task.
task completion
- unhide all the beneficiaries RewardCouponV2 contracts in batches by calling DsoRules_UnhideRewardCouponsV2
- complete once no new batch remains
Also In case of multiple SVs trying the unhide, there would be clash, so the staleness check will need to handle that. This probably needs some more thinking
There was a problem hiding this comment.
On further consideration, I think it makes sense to get rid of this loop because of the following reasoning.
The V2 daml was introduced in the 0.6.7,and the TBA would be enabled on 0.6.11 onwards, so most of validators would have vetted by time. The cause of the wrong vetting state would be mostly operator errors.
And I think that the average size of coupons to unhide will be lower than 100, which is around 16 hours(16*6=96)
as the 16 hours time window of wrong wetting state for the app providers appears to be on the high end.
If we assume that the average count is around/below 100, and the max is 36*6=216.
I think that we should just increase the default batch size to 220, such that we almost have a guarantee that all the coupons should be handled in a single trigger run.
This would avoid the complications around SVs race.
There was a problem hiding this comment.
Removed the loop and bumped the default size to 220
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
Signed-off-by: Divam <dfordivam@gmail.com>
26ead2e to
3a30f85
Compare
| damlBatch, | ||
| // TODO (#5715) determine 'providersWithWrongVettingState' | ||
| new DamlSet(java.util.Collections.emptyMap()), | ||
| providersWithWrongVettingState, |
There was a problem hiding this comment.
Please record the size of this set in a metric.
There was a problem hiding this comment.
And create an issue for @rautenrieth-da to add it to the dashboard, and alerts.
| result <- | ||
| if (coupons.isEmpty) | ||
| Future.successful( | ||
| TaskSuccess(s"no reward coupons to unhide for ${task.providerParty}") |
There was a problem hiding this comment.
| TaskSuccess(s"no reward coupons to unhide for ${task.providerParty}") | |
| TaskSkipped(s"no reward coupons to unhide for ${task.providerParty}") |
| alter table dso_acs_store | ||
| add column reward_beneficiary_is_observer boolean; | ||
|
|
||
| create index dso_acs_store_sid_mid_pn_tid_rbio |
There was a problem hiding this comment.
I just discussed the case with @rautenrieth-da , and we concluded that it seems better to avoid the confusion, and instead use the infrastructure that exists for lazy index creation in:
It was purpose-built for this problem and should work fine for this use-case.
Fixes #5715
Pull Request Checklist
Cluster Testing
/cluster_teston this PR to request it, and ping someone with access to the DA-internal system to approve it./upgrade_teston this PR to request it, and ping someone with access to the DA-internal system to approve it./hdm_teston this PR to request it, and ping someone with access to the DA-internal system to approve it./lsu_teston this PR to request it, and ping someone with access to the DA-internal system to approve it.PR Guidelines
Fixes #n, and mention issues worked on using#nMerge Guidelines