
Upgrade ACA-Py Version #142

Draft
WadeBarnes wants to merge 1 commit into bcgov:main from WadeBarnes:main

@WadeBarnes

Member
  • Upgrade to ACA-Py 0.12.1
  • Ensure the agent includes the full webhook payload.

Signed-off-by: Wade Barnes <wade@neoterictech.ca>
WadeBarnes requested review from esune and ianco on May 24, 2024, 20:58
@WadeBarnes

Member Author

Marked as a draft because I haven't applied these changes yet. I'm waiting to resolve the Sovrin TestNet issues before doing this.

@WadeBarnes

Member Author

Deploy this upgrade before bcgov/von-bc-registries-agent-configurations#79. We want to test the compatibility with the older ACA-Py version on the BC Reg side first.

@WadeBarnes

Member Author

I've applied the AGENT_DEBUG_WEBHOOKS=true environment variable to all of the environments so it's in place when the updated image is deployed.

@WadeBarnes

WadeBarnes commented May 28, 2024

Member Author

Deploying the image requires a bit more coordination: secure storage (wallet) upgrades run once the new agent starts, and the OrgBook wallets are rather large.

Steps for each environment:

  • Update the agent's HPA, set it to min 1, max 1.
  • Disable health checks on the agent. We can't have the agent restarting in the middle of an upgrade.
  • Back up the wallet.
  • Deploy the new image.
  • Scale the agent to 1.
  • Wait for all of the upgrades to complete.
  • Test
  • Restore health checks
  • Restore HPA
  • Celebrate

Use the upgrade process in dev and test to get a sense of how long the process will take for the prod environment. Refine the process as needed.
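
To collect per-step timings while rehearsing in dev and test, the checklist could be driven by a small runner like the sketch below. This is only an illustration: `run_runbook`, `noop`, and `CHECKLIST` are hypothetical names, and each placeholder action would need to be replaced with whatever command actually performs that step in the environment.

```python
import time
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]

# Step names mirror the checklist above; the actions are placeholders (assumption).
CHECKLIST = [
    "Set the agent's HPA to min 1, max 1",
    "Disable health checks on the agent",
    "Back up the wallet",
    "Deploy the new image",
    "Scale the agent to 1",
    "Wait for all of the upgrades to complete",
    "Test",
    "Restore health checks",
    "Restore HPA",
]


def noop() -> None:
    """Placeholder action; swap in the real command for each step."""


def run_runbook(steps: List[Step]) -> Tuple[List[Tuple[str, float]], Optional[str]]:
    """Run steps in order, timing each one, and stop at the first failure.

    Returns the (name, seconds) timings for every step attempted and a
    failure message, or None if every step succeeded.
    """
    timings: List[Tuple[str, float]] = []
    for name, action in steps:
        start = time.monotonic()
        try:
            action()
        except Exception as exc:
            timings.append((name, time.monotonic() - start))
            return timings, f"{name}: {exc}"
        timings.append((name, time.monotonic() - start))
    return timings, None


if __name__ == "__main__":
    timings, failure = run_runbook([(name, noop) for name in CHECKLIST])
    for name, seconds in timings:
        print(f"{seconds:8.1f}s  {name}")
    print("failure:", failure)
```

Stopping at the first failure keeps the later "restore" steps from running automatically, so a person decides whether to roll back or retry.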

Notes:

  • Upgrading:
    • From: artifacts.developer.gov.bc.ca/docker-remote/bcgovimages/aries-cloudagent:py36-1.16-1_0.7.1
    • To: artifacts.developer.gov.bc.ca/github-docker-remote/hyperledger/aries-cloudagent-python:py3.9-indy-1.16.0-0.12.1

@WadeBarnes

WadeBarnes commented Nov 26, 2024

Member Author

Related Indy to Askar secure storage migration test results:

DEV:

  • 12.46 GiB before migration
  • 5.29 GiB after migration
  • 81705 Credentials
  • Backup Restore: 39s
  • Migration: 47m 3.5s

TEST:

  • 86.87 GiB
  • 3658802 Credentials
  • Backup Restore: 31m 59.1s
  • Migration: Started yesterday afternoon, has not completed yet.

PROD:

  • 163.7 GiB
  • 5994241 Credentials
  • Backup Restore: 66m 25.6s
  • Migration: Failed after 524m 7.3s (8.7 hours) - ran out of disk space.
    • 2,524,200 items migrated before failure.
    • PVC Size 200GiB
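
A rough way to sanity-check PVC sizing from these numbers is sketched below. It rests on two assumptions that the results above do not confirm: that the original Indy data and the new Askar data coexist on disk until the migration finishes, and that the DEV shrink ratio (12.46 GiB to 5.29 GiB) carries over to the larger wallets. Treat the result as a lower bound at best; `peak_disk_estimate_gib` is a hypothetical helper.

```python
def peak_disk_estimate_gib(source_gib: float, shrink_ratio: float) -> float:
    """Estimate peak usage if old and new copies coexist (assumption)."""
    return source_gib * (1.0 + shrink_ratio)


# Shrink ratio observed in DEV: size after migration / size before migration.
dev_ratio = 5.29 / 12.46

# Wallet sizes reported above, in GiB.
test_peak = peak_disk_estimate_gib(86.87, dev_ratio)
prod_peak = peak_disk_estimate_gib(163.7, dev_ratio)

pvc_gib = 200.0
print(f"TEST estimate: {test_peak:.1f} GiB, fits in PVC: {test_peak < pvc_gib}")
print(f"PROD estimate: {prod_peak:.1f} GiB, fits in PVC: {prod_peak < pvc_gib}")
```

Under these assumptions the PROD estimate comes out at roughly 233 GiB, which is at least consistent with the 200 GiB PVC filling up before the migration finished.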

@WadeBarnes

WadeBarnes commented Nov 27, 2024

Member Author

Progress on OrgBook secure storage migration tests - since yesterday:

prod:

  • Migrated ~8.3 million item records of 10,172,915. The migration is still in progress; it has not switched to the update (post-migration) steps yet.
  • PVC usage - 260.9 GiB of 300 GiB

test:

  • Updated (post migration) ~2.9 million credentials of 3,659,022.
  • PVC usage - 89.94 GiB of 200 GiB
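
For a ballpark duration of the item-migration phase in prod, the throughput of the earlier failed run (2,524,200 items in 524m 7.3s) can be extrapolated to the full 10,172,915 item records. The sketch below assumes a constant rate and ignores the post-migration update steps, so it is an estimate only; the function names are illustrative.

```python
def to_minutes(minutes: int, seconds: float) -> float:
    """Convert an 'Xm Ys' duration to fractional minutes."""
    return minutes + seconds / 60.0


def items_per_minute(items: int, elapsed_minutes: float) -> float:
    """Average throughput over an observed run."""
    return items / elapsed_minutes


def eta_hours(total_items: int, rate_per_minute: float) -> float:
    """Hours needed to process total_items at a constant rate (assumption)."""
    return total_items / rate_per_minute / 60.0


# Figures reported in this thread for the first prod attempt.
rate = items_per_minute(2_524_200, to_minutes(524, 7.3))
estimate = eta_hours(10_172_915, rate)

print(f"Throughput: {rate:,.0f} items/min")
print(f"Estimated item-migration time: {estimate:.1f} hours")
```

The actual rate can differ between runs (disk, database load, container limits), so the figure is best used for planning a maintenance window rather than as a prediction.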

@swcurran

Contributor

How much of your time is this taking up, @WadeBarnes? Given that we are looking to revamp how data is fed into OrgBook (likely eliminating the DIDComm and AnonCreds processing), should we abandon this effort? It will likely be necessary to load the data from scratch into a new wallet.

@WadeBarnes

WadeBarnes commented Nov 27, 2024

Member Author

Currently just monitoring the migration process occasionally so we get metrics.

@WadeBarnes

WadeBarnes commented Nov 27, 2024

Member Author

Progress on OrgBook secure storage migration tests:

prod:

  • Still in progress

test:

  • Updated (post migration) 3,513,600 credentials of 3,659,022.
    • Failed after 2622m 20.1s (43.7 hours). The wallet container restarted, interrupting (possibly corrupting) the process.

Confirmed the upgrade/migration process does not continue where it left off. Restarting the process on the already partly upgraded and converted wallet encounters the following error:

Traceback (most recent call last):
  File "/usr/local/bin/askar-upgrade", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/__main__.py", line 207, in entrypoint
    asyncio.run(main(**vars(args)))
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/__main__.py", line 202, in main
    await strategy_inst.run()
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/strategies.py", line 553, in run
    await self.conn.pre_upgrade()
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/pg_connection.py", line 59, in pre_upgrade
    raise UpgradeError("No metadata table found: not an Indy wallet database")
acapy_wallet_upgrade.error.UpgradeError: No metadata table found: not an Indy wallet database

Not going to restart this test; we know it would take somewhere over 45 hours to complete.
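
Since an interrupted run cannot be resumed, a pre-flight check before any retry could confirm that the database still looks like an unconverted Indy wallet, and otherwise signal that the backup has to be restored first. The sketch below is illustrative only: it uses an in-memory SQLite database to stay self-contained (the wallets here are in PostgreSQL), and it assumes the table is literally named `metadata`, based solely on the wording of the error message. `has_table` and `preflight` are hypothetical helpers, not part of `askar-upgrade`.

```python
import sqlite3


def has_table(conn: sqlite3.Connection, name: str) -> bool:
    """Return True if a table with the given name exists (SQLite catalog)."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?",
        (name,),
    ).fetchone()
    return row is not None


def preflight(conn: sqlite3.Connection) -> str:
    """Decide whether a migration attempt can start on this database.

    Assumption: an unconverted Indy wallet has a 'metadata' table, and a
    partly converted one may not.
    """
    if has_table(conn, "metadata"):
        return "ready"
    return "restore-from-backup"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE metadata (id INTEGER PRIMARY KEY, value BLOB)")
    print(preflight(conn))  # table present
    conn.execute("DROP TABLE metadata")
    print(preflight(conn))  # table missing
```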

@WadeBarnes

Member Author

Progress on OrgBook secure storage migration tests:

prod:

  • Failed after 1801m 53.2s (~31 hours)
  • Migrated 10,171,178 item records of 10,172,915. The migration had just switched to the update (post-migration) steps when it failed.
  • Error:
Migrating items... 10171178
Opening wallet with Askar...
Updating keys... 42
Updating master secret(s)...Traceback (most recent call last):
  File "/usr/local/bin/askar-upgrade", line 8, in <module>
    sys.exit(entrypoint())
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/__main__.py", line 207, in entrypoint
    asyncio.run(main(**vars(args)))
  File "/usr/local/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/local/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/__main__.py", line 202, in main
    await strategy_inst.run()
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/strategies.py", line 562, in run
    await self.convert_items_to_askar(self.conn.uri, self.wallet_key)
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/strategies.py", line 436, in convert_items_to_askar
    await self.update_master_keys(store)
  File "/usr/local/lib/python3.10/site-packages/acapy_wallet_upgrade/strategies.py", line 260, in update_master_keys
    raise Exception("Encountered multiple master secrets")
Exception: Encountered multiple master secrets

So we know we wouldn't be able to upgrade/migrate the production wallet database without further investigation and testing, and the process would likely take well over 60 hours to complete. Again, I'm not going to restart this test.

@esune

esune commented Jan 28, 2025

Member

Thinking we can close this issue based on recent direction decisions. Thoughts/objections @swcurran ?

@swcurran

Contributor

Yup — let’s hold off.

