File backup data gets corrupted on an interrupted nextcloud backup #867

Open
EnforcerE opened this issue Feb 10, 2025 · 22 comments · May be fixed by #875
Labels
needs info Requires more information from reporter

Comments

@EnforcerE

Every time the storage backup process starts, it says "Scanning for files..." while deleting everything on the Nextcloud server that makes up the previous backup. The feature for reusing existing app backups appears to be working fine, so I assume storage backups are probably intended to work the same way at some point.

Seedvault version: 15-5.2 on GrapheneOS

@EnforcerE EnforcerE marked this as a duplicate of #868 Feb 10, 2025
@grote
Collaborator

grote commented Feb 11, 2025

Can you please explain in more detail what behavior of the app you are observing and what you would expect instead?

@grote grote added the needs info Requires more information from reporter label Feb 11, 2025
@EnforcerE
Author

EnforcerE commented Feb 11, 2025

My file backup was around 80 GB (just for the files; apps were another 20 GB) and it completed successfully to my Nextcloud. I would expect Seedvault to detect the previous backup and only update the files that changed in the meantime, the way it does with applications and their data. The current behaviour is that it appears to wipe the entire file backup before starting again. Obviously this is quite problematic with the number of files I am looking to back up.

@grote
Collaborator

grote commented Feb 11, 2025

That is not really the expected behavior. How do you know it is wiping file backup data? When exactly is it wiping file backup data? Is it always happening reproducibly at the same point? Please provide more details!

@grote grote changed the title Seedvault storage backup never reuses existing backups. File backup data is not getting reused Feb 11, 2025
@EnforcerE
Author

EnforcerE commented Feb 12, 2025

The way I know that it's wiping my data is by refreshing the Nextcloud webpage and seeing the usage for the files folder go way down. After it erased 20 GB I stopped it, removed the backed-up files, and disabled file backup. There is absolutely no way that 20 GB of files changed between the two backups. The thing to note is that the Nextcloud client was still uploading a long time after the backup process ended, so it could be an issue with something timing out. I will try again with the roughly 80 GB, but it will take me a very long time. I have tried testing right now with 4.8 GB of files and everything seems to be in order: nothing is getting erased and everything is being reused. This leads me even further in the direction of a timeout, because the client is nowhere near done uploading the files by the time Seedvault is "done" backing up apps and declares success. In fact, the queue in the client was around 2500 files of mostly 15 MB each.

@EnforcerE
Author

EnforcerE commented Feb 13, 2025

Everything seems to be working properly right now (I'm at 91 GB). I can't replicate the issue right now and am out of ideas. The only difference that comes to mind is that perhaps I enabled file backup a lot more incrementally this time, unlike before when it was all at once.

@EnforcerE
Author

EnforcerE commented Feb 13, 2025

OK, I just encountered what seems to be an error which might be related. The notification says it is removing old backups, but the Nextcloud upload queue keeps getting bigger. If that's what happened last time, I probably assumed the upload was over and did something like restart the phone, and when the next backup came along it was in a broken state. That theory works if a broken state is resolved by removing the previous transaction (you tell me).

@EnforcerE
Author

OK, the backup is now over and I can in fact press "backup now" again; nevertheless the queue keeps growing. A bunch of files also just failed with "Local file not found".

@grote
Collaborator

grote commented Feb 13, 2025

The thing to note is that the Nextcloud client was still uploading

Let me stop you right here! Please don't use the Nextcloud app for such large backups. If you can't use the built-in WebDAV support for some reason, use at least the DavX5 app instead. I am tempted to declare the Nextcloud app unsupported due to the number of issues it causes.

@grote grote changed the title File backup data is not getting reused File backup data is not getting reused with Nextcloud app Feb 13, 2025
@EnforcerE
Author

EnforcerE commented Feb 13, 2025

Just tried the integrated WebDAV; the first thing it did was try to erase the previous incomplete backup. Seems pretty unavoidable.

@EnforcerE
Author

EnforcerE commented Feb 13, 2025

The issue is that a backup which is interrupted (presumably) corrupts previous backups, and then the client simply wipes them, which is of course really bad if the backup is, for example, interrupted by the phone dying. The previous backup for me was 91 GB as stated, so it shouldn't go lower than that, but the folder size has gone down to 0.

@EnforcerE EnforcerE changed the title File backup data is not getting reused with Nextcloud app File backup data gets corrupted on an interrupted backup Feb 13, 2025
@EnforcerE EnforcerE changed the title File backup data gets corrupted on an interrupted backup File backup data gets corrupted on an interrupted nextcloud backup Feb 13, 2025
@grote
Collaborator

grote commented Feb 13, 2025

Just tried the integrated WebDAV; the first thing it did was try to erase the previous incomplete backup. Seems pretty unavoidable.

I doubt this very much. Maybe something is wrong with your Nextcloud. How do you know it is deleting something? If you are sure it really deleted something, please export a log right after it deleted something for the first time. The log will show if it really deleted files and, if so, why.

@EnforcerE
Author

EnforcerE commented Feb 13, 2025

I managed to figure it out. Interrupting a backup whose uploads might be getting delivered out of order can result in no snapshot being uploaded, but the cache nevertheless gets deleted because the backup process on Seedvault's end is over. This results in no blobs being referenced and everything getting pruned. A fix for this would be to upload the snapshot immediately and give it an extension to indicate that it's incomplete, e.g. XXXXX.snappart, and rename it after the backup is complete. This would ensure that every single recoverable file can be recovered instead of being pruned or unusable, such as in the case of an actual device failure. This is also why you have few problems with integrated WebDAV, as it ensures uploads occur in the proper order. Though I still don't know why a previous snapshot wouldn't get detected; perhaps I just jumped the gun in giving up. In any case the issue seems architectural.
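A minimal sketch of the rename-on-completion idea described above (hypothetical names, not Seedvault's actual API; plain files stand in for the WebDAV backend):

```kotlin
import java.io.File

// Hypothetical sketch: upload the snapshot early under a ".snappart" name and
// only rename it to its final name once the backup has fully completed. An
// interrupted run then leaves a partial-but-present snapshot behind instead of
// unreferenced chunks that get pruned.
fun runBackup(dir: File, snapshotId: String, snapshotBytes: ByteArray, uploadChunks: () -> Unit) {
    val partial = File(dir, "$snapshotId.snappart")
    partial.writeBytes(snapshotBytes)          // snapshot is on the server from the start
    uploadChunks()                             // if this is interrupted, .snappart remains
    val finalFile = File(dir, "$snapshotId.snapshot")
    check(partial.renameTo(finalFile)) { "could not finalize snapshot $snapshotId" }
}
```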

@grote
Collaborator

grote commented Feb 14, 2025

Uploads should happen in the order we make them; how else would we know whether they completed successfully or not? The Nextcloud app is an offender here: it caches stuff on disk and with luck uploads it, or it doesn't, and it won't be able to tell us. We don't know what the app does, and thus its doing things out of order doesn't affect us.

The ChunksCache only gets cleared if there's a mismatch between what is in the cache and what we find on the server, which I guess could happen if an app misbehaves like Nextcloud does. However, it gets repopulated with what we can reconstruct from server data. It does delete chunks that are not referenced by a snapshot, which can also happen if Nextcloud is not uploading snapshots. In short: don't use the Nextcloud app for your backups; use DavX5 or the built-in support.

Just tried the integrated WebDAV, first thing it did was try to erase the previous incomplete backup. Seems pretty unavoidable.

Yeah, because the local cache isn't matching what we find on the server. This should resolve itself after the first backup. What it deletes from the server can't be used anyway, because there's no snapshot referencing it.
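An illustrative sketch of the pruning behaviour described here (simplified and hypothetical, not Seedvault's actual code): chunks that no snapshot references count as garbage, which is why an interrupted run that uploaded chunks but no snapshot loses all of them on the next pass.

```kotlin
// Simplified, hypothetical model of the pruning described above: any chunk on
// the server that is not referenced by at least one snapshot is deleted. If an
// interrupted backup uploaded chunks but never a snapshot, all of its chunks
// are unreferenced and will be removed.
fun chunksToPrune(chunksOnServer: Set<String>, snapshotChunkRefs: List<Set<String>>): Set<String> {
    val referenced: Set<String> = snapshotChunkRefs.flatten().toSet()
    return chunksOnServer - referenced
}
```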

@EnforcerE
Author

EnforcerE commented Feb 14, 2025

I still think that the behaviour should be changed to make server-side snapshots of partial backups available (and potentially the ability to copy them into a separate folder or something like that for manual recovery), just in case. If this feature is specifically something being planned against, then maybe "Strongly discouraged" rather than "Not recommended" should be added next to Nextcloud when selecting the backup location, and if possible some info button explaining why it is strongly discouraged; otherwise this failure mode is quite opaque to the end user. Either way, this is now a feature request.

@grote
Collaborator

grote commented Feb 14, 2025

I still think that the behaviour should be changed to make server-side snapshots of partial backups available (and potentially the ability to copy them into a separate folder or something like that for manual recovery), just in case.

Sounds like a big mess with big potential for user confusion. If people like you use unreliable and buggy backend connections, there would be lots of snapshots with partial data, and you would complain that they don't contain all the data. Also, if the backend connection isn't reliable and doesn't save everything, how can we save the partial snapshot?

"Not recommended" should be added next to nextcloud when selecting backup location

That's already there; maybe it doesn't show in all situations, like when you already have it installed.

@EnforcerE
Author

EnforcerE commented Feb 15, 2025

Perhaps this is just me, but to me "Not recommended" translates to "I wouldn't use it myself", rather than "I see literally no use case and many bugs are expected". That's why I think the language should be made stronger if you are, as you say yourself,

tempted to declare the Nextcloud app unsupported

Regarding the pre-uploading of snapshots, I realize that it would require literally precomputing all hashes, so it might not be practical unless that's what's already being done anyway. But there wouldn't be a bunch of incomplete snapshots on the server; there would be at most one, the most recent one, which can simply be overwritten once the next backup starts with a once again precomputed snapshot. This would avoid the problem that I think you're trying to avoid, of having complete backups pruned because there are too many partial ones.
The point I'm trying to make here is that for purposes of data recovery, having a file deleted or invisible is bad practice in my opinion. As for displaying it to the end user, it can just be marked as corrupted or incomplete, and instead of letting the user restore it normally, force exporting it to a user-specified folder for manual processing.
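A hypothetical sketch of how a restore UI could surface such an incomplete snapshot, as suggested above (names are illustrative, not Seedvault's actual model):

```kotlin
// Hypothetical model: an incomplete snapshot is shown but cannot be restored
// normally; the only offered action is exporting whatever is recoverable to a
// user-chosen folder for manual processing.
enum class SnapshotState { COMPLETE, INCOMPLETE }

data class SnapshotEntry(val id: String, val state: SnapshotState)

fun restoreActionFor(entry: SnapshotEntry): String = when (entry.state) {
    SnapshotState.COMPLETE -> "Restore normally"
    SnapshotState.INCOMPLETE -> "Export recoverable files to a user-specified folder"
}
```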

@grote
Collaborator

grote commented Feb 17, 2025

#874 will not allow the Nextcloud app to be chosen as a backup location again.

The point I'm trying to make here is that for purposes of data recovery, having a file deleted or invisible is bad practice in my opinion.

Note that the data that is being deleted is not referenced at all, so you are not actually losing important data, but useless data. Also, as I said above, you can't do partial snapshots when the backup fails, because the backend is unreliable. Also, partial snapshots would be confusing, and we would never be able to delete any old data from the backend, so people would complain about storage requirements always growing.

@EnforcerE
Author

EnforcerE commented Feb 17, 2025

I suppose that's one way to solve the Nextcloud-specific issue. Regarding the partial backups, it seems like I am saying one thing and you are hearing something else, so I've made a diagram to clear up any confusion about what I imagine should happen. Note that "Backup normally" here requires precomputing a full snapshot and uploading it immediately as a partial snapshot. I understand that this architecture would be quite complex to implement, but it would also be more robust in case of hardware or software failure, as the only requirement for new data to be recoverable is that an entire file can be uploaded properly, rather than a full backup. Partial backups of app data are going to break at most one app, since files are uploaded sequentially, so all others will have either full backups or nothing at all. Note that I left only one place in the diagram to delete data.

[Diagram: proposed backup flow (attached image)]

@grote
Collaborator

grote commented Feb 17, 2025

It seems I am talking about the difficulty and problems associated with even making partial snapshots and you seem to be talking about what to do when we already have partial snapshots.

Here's the documentation of the backup feature: https://github.com/seedvault-app/seedvault/blob/android15/storage/doc/design.md#overview

You can read this, and propose an implementation plan.

precomputing a full snapshot

You would need to iterate over all files and do all computations twice, throwing away the chunks.

Also if you'd do this, you need to fundamentally change lots of parts of the current implementation to account for the existence of partial snapshots without any data.

It seems to me that you want to solve one issue (that in theory is rare and should almost never happen) with an obvious, simple solution which in reality is a huge can of worms and not worth the effort to solve the original issue.
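A rough, hypothetical sketch of the double-iteration cost described above (simplified: real Seedvault splits files into chunks rather than hashing them whole, and its identifiers are keyed): a first pass would have to read and hash every file just to learn the IDs for a precomputed snapshot, throwing the data away, before the normal pass reads and uploads everything again.

```kotlin
import java.io.File
import java.security.MessageDigest

// Hypothetical first pass: read every file once just to compute identifiers for
// a precomputed snapshot; the actual data is discarded and must be read again
// during the real backup pass.
fun precomputeIds(files: List<File>): Map<File, String> =
    files.associateWith { file ->
        MessageDigest.getInstance("SHA-256")
            .digest(file.readBytes())
            .joinToString("") { "%02x".format(it) }
    }
```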

EnforcerE added a commit to EnforcerE/seedvault that referenced this issue Feb 17, 2025
@EnforcerE
Author

EnforcerE commented Feb 17, 2025

It seems to me that you want to solve one issue (that in theory is rare and should almost never happen) with an obvious, simple solution which in reality is a huge can of worms and not worth the effort to solve the original issue.

You are completely correct. It's just that my proposal seems to have gone unanswered up until today.

What it deletes from the server can't be used anyway, because there's no snapshot referencing it.

lots of snapshots with partial data

and we would never be able to delete any old data from the backend

Note that the data that is being deleted is not referenced at all, so you are not actually losing important data, but useless data

Glad we are finally on the same page. Regarding a formal proposal, I'm afraid that 403a19a is the best that I can do. With some 6500 files in the codebase, I am completely lost.

@EnforcerE EnforcerE linked a pull request Feb 17, 2025 that will close this issue
@grote
Collaborator

grote commented Feb 18, 2025

Please just use a backend that isn't randomly eating your data. Other modern backup systems for non-mobile platforms, like restic or borgbackup, also don't have magic tricks up their sleeves that can protect you from that.

@EnforcerE
Author

EnforcerE commented Feb 18, 2025

Smartswitch does if you use unencrypted backups, but that's a bad example. Anyway, I thought of a way to achieve partial backups. It should be much simpler to implement than the previous diagram I provided. All it requires are accurate timestamps, preferably in milliseconds (in case someone has very fast internet, or just forcing at least a second to pass between each upload), and that the files be uploaded sequentially (the Nextcloud client is out). Correct me if I'm wrong, but the chunk name is just the HMAC of its hash, so this would be a relatively easy solution. There is no need to have a local cache with this implementation (although it would be practical).

[Diagram: revised proposal for partial backups (attached image)]
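For reference, a sketch of the keyed chunk naming referred to above (an assumption based on the comment: chunk IDs computed as an HMAC-SHA256 keyed with a secret derived from the backup key; the exact construction in Seedvault may differ):

```kotlin
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Assumed scheme: the on-server chunk name is an HMAC-SHA256 over the chunk's
// content (or its hash), keyed with a secret derived from the backup key, so
// names are deterministic for deduplication but reveal nothing without the key.
fun chunkName(key: ByteArray, chunk: ByteArray): String {
    val mac = Mac.getInstance("HmacSHA256")
    mac.init(SecretKeySpec(key, "HmacSHA256"))
    return mac.doFinal(chunk).joinToString("") { "%02x".format(it) }
}
```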
