Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: multisig stale data after failed refresh #9863

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

SNeedlewoods
Copy link

Without this PR you should be able to reproduce the error with the following steps:

  1. 2/2 multisig-wallets W1 and W2 both own some enotes with partial key images
  2. use export_multisig_info for both wallets
  3. use import_multisig_info for W1
  4. turn off daemon (to reproduce the "no connection to daemon" error in next step, in the real world this happens by accident)
  5. attempt import_multisig_info for W2 (it fails as expected)
  6. turn on daemon
  7. use import_multisig_info for W2 (it looks like it worked, but it actually dropped enotes with partial key images from m_transfers, this becomes a problem e.g. if you try to sign a tx from this wallet)
  8. use transfer from W1
  9. use sign_multisig from W2, here we get Error: Multisig error: This signature was made with stale data: export fresh multisig data, which other participants must then use

@SNeedlewoods SNeedlewoods force-pushed the x_multisig_stale_data_bugfix branch 3 times, most recently from f6096cb to 10c6802 Compare March 25, 2025 16:54
@SNeedlewoods
Copy link
Author

SNeedlewoods commented Mar 25, 2025

Edit: this should fix it.


Revert previous commit that leads to failing functional test.

If anyone knows if we really need this, please let me know how to use it.

auto wiper = epee::misc_utils::create_scope_leave_handler([&](){for (auto &v: m_multisig_rescan_k) memwipe(v.data(), v.size() * sizeof(v[0]));});

@SNeedlewoods SNeedlewoods force-pushed the x_multisig_stale_data_bugfix branch from 10c6802 to f492351 Compare March 25, 2025 18:55
Copy link
Collaborator

@j-berman j-berman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an improvement to me as is. Nice work!

Would be nice to have @woodser test this out as well :)

Couple things to take into consideration

  • If refresh fails while m_multisig_rescan_k and m_multisig_rescan_info are still set, and the wallet then closes before refreshing again, it looks like that would resurface this bug.
    • One way to solve this would be to persist m_multisig_rescan_k and m_multisig_rescan_info to the wallet cache.
  • If m_multisig_rescan_k and m_multisig_rescan_info are still set after returning from import_multisig (since refreshfails to execute), wallets should probably be sure to complete a refresh before allowing the user to do any other multisig actions.
    • Looking at the multisig code, you'll see there is get_multisig_status which all multisig functions rely on.
    • We could add a boolean needs_refresh to multisig_account_status that gets set to true in get_multisig_status if m_multisig_rescan_k and m_multisig_rescan_info are non-empty.
    • Then all those functions that check get_multisig_status could also check for a falsey needs_refresh boolean where necessary.

I think this PR as is works as an improvement (because wallets should have UX that refreshes the wallet when it isn't synced or encourages refresh), and the above would be ok for a future PR.

@SNeedlewoods SNeedlewoods force-pushed the x_multisig_stale_data_bugfix branch from f492351 to c848e21 Compare March 28, 2025 12:13
@SNeedlewoods
Copy link
Author

Thank you for the review.

One thing I noticed, unrelated to this PR, I had the functional tests fail with Failed to start wallet or daemon. Figured out this is because it uses .bitmonero/bitmonero.conf. I think, depending on your config, this could also lead to unexpected behavior in the functional tests. To prevent this we may want to add --config-file flag in tests/functional_tests/functional_tests_rpc.py.

@rbrunner7
Copy link
Contributor

I tried to reproduce again, and failed again. I was careful to follow exactly the steps you outlined. No idea why it works for me.

Would you predict that the old code fails reliably, not only intermittently, or only rarely by sheer bad luck?

Maybe my failure to reproduce is unimportant, and a total fluke, but still mildly worries me.

Looking at your modifications, I understood technically what you did, but unfortunately I still don't understand yet what the core problem was, and why switching from pointers to some vectors to vectors as direct class members would change anything important. Can you maybe drop a few high-level hints?

@SNeedlewoods
Copy link
Author

Can you maybe drop a few high-level hints?

AFAIUI, in import_multisig() we detach the blockchain and remove enotes with partial key images from m_transfers (src). With the new imported multisig information we will add them back to m_transfers with full key images in refresh().

The problem is, if refresh() fails, we lose access to the imported multisig information here, so we can't get enotes with full key images into m_transfers and the enotes with partial key images are still missing.

The change to class members was made to extend the scope, because info (src) and k (src) are local variables in import_multisig() and so m_multisig_rescan_* (get set here) were pointing to garbage after the method finished, if we just remove the m_multisig_rescan_* = NULL; lines.

Would you predict that the old code fails reliably, not only intermittently, or only rarely by sheer bad luck?

From all the testing I did I would say it fails reliably if you're quick enough to call import_multisig_info before this message pops up, after the first attempt to import_multisig_info when the daemon got relaunched:

Height <height>, txid <txid>, <amount>, idx <minor>/<major>

I assume if you have a fast wallet <-> daemon connection it may be harder to reproduce, because the wallet connects and uses auto-refresh. It seems simplewallet::refresh_main() does restore m_transfers for enotes with partial key images, though I haven't investigated the code (my guess m_wallet->rescan_blockchain() before m_wallet->refresh() does the trick here)), but I tested using refresh before the second import_multisig_info to confirm you can then use sign_multisig afterwards without getting "stale data error".

Copy link
Contributor

@rbrunner7 rbrunner7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @SNeedlewoods, for the detailed explanation. Things make sense now.

I found that preventing refresh before the second import_multisig_info (step 7) is indeed crucial with my setup to reproduce the problem. After I did a set auto-refresh 0 in the wallet as a step 0 I was finally successful.

So I could reproduce the error with the master code, could not reproduce it anymore with the code from this branch, and also reviewed the code.

@iamamyth
Copy link

Please don't wait for my signoff, as my one comment has been addressed, and I don't have time to review the PR in full.

jeffro256 added a commit to jeffro256/monero that referenced this pull request Apr 1, 2025
Copy link
Contributor

@jeffro256 jeffro256 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good long term fix would be to modify the m_transfers in-place, instead of detaching the blockchain and requiring a live refresh, but the behavior introduced in this PR is significantly better than current. Good job!

@j-berman
Copy link
Collaborator

j-berman commented Apr 1, 2025

It looks like the live refresh is there because it's calculating key images for receives, and then refreshing from the first point of a calculated key image to identify respective spends (without revealing to the daemon which key images are the user's).

Sync via full balance view key would negate the need for the live refresh.

With the legacy view key, another option to avoid revealing the user's key images to a daemon would for multisig wallets to use "background sync" to make sure the wallet picks up all receives and potential spends, and then upon calculating key images via import_multisig, process the background cache. With FCMP++, since all txs should have at least one receive output back to the wallet, the concept of a "potential" spend is more straightforward / simpler to manage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants