run caching and performance? #7
Hello @Pomax,

First of all, thank you for your feedback.

About caching, or any other way to preserve state: the main thing I wanted to avoid when writing imgsum and Deduplicator was producing side effects in the current environment; they were meant to work in isolation. I've seen similar Lightroom plug-ins, and their main flaw was in exactly that place: they left their data in the Lightroom catalog (which is actually a SQLite database). Unfortunately, the Lightroom SDK provided no way (at least at the time Deduplicator was written) for a plug-in to track database schema changes, so there is no way to completely clean up a plug-in's data on uninstallation; the leftovers stay in the catalog file forever.

Another approach is storing the state of the analysis separately, in some kind of cache file (sketched below), which is also not perfect: you might want a dedicated scratch dir, you might not have enough disk space, and so on. It's a painful trade-off, and since I couldn't come up with a rock-solid solution, the analysis was implemented as a stateless task; that's why it starts from scratch each time. There's no way to make it reliable without trade-offs. The only other way I can imagine is a web service storing image identifiers bound to a user ID, but that's another story.

A fairly simple workaround: run the analysis in parts (by year, month, place, etc.) manually. That obviously works, since it needs nothing from Deduplicator and the user keeps full control.

Now about performance: running a benchmark and publishing the numbers is a good idea. I could probably add that to both the imgsum and Deduplicator READMEs in a couple of days, for better transparency.

So that's it. Thanks again for your feedback. Please feel free to post any updates or other issues :)

PS. Not closing the issue, as there's no solution at the moment.
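For illustration only, here is a minimal sketch of that cache-file approach: a sidecar JSON file mapping image paths to fingerprints, written incrementally so an interrupted run can resume where it left off. The file location, JSON layout, and SHA-256 stand-in are all assumptions, not Deduplicator's actual code.

```python
import hashlib
import json
import os

# Assumed cache location; the real plugin keeps no such file.
CACHE_PATH = os.path.expanduser("~/.deduplicator-cache.json")

def load_cache():
    """Resume from a previous run if the cache file exists."""
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            return json.load(f)
    return {}

def analyze(paths):
    cache = load_cache()
    for path in paths:
        if path in cache:
            continue  # fingerprinted in an earlier run; skip it
        with open(path, "rb") as f:
            # Stand-in fingerprint; imgsum's real hashing may differ.
            cache[path] = hashlib.sha256(f.read()).hexdigest()
        # Persist after every file so an interrupted run loses nothing.
        with open(CACHE_PATH, "w") as f:
            json.dump(cache, f)
    return cache
```

The trade-offs mentioned above still apply: the file has to live somewhere, grows with the library, and can go stale if the library changes between runs.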
This would be a great solution though, as long as it comes with a checkbox that's disabled by default, so that after an update the plugin keeps doing exactly what it does right now, while giving people the option to accept the downsides if they consider that a price worth paying for the upside. E.g.:
(removing the file after a completed run, so as not to run into the problem of having to resync based on old data)

The idea that people who use Lightroom may not have enough disk space is probably overly cautious: I have hundreds of gigs of data (even a casual shooter using RAW will have tens of gigs), so Lightroom catalogs tend to live on large drives with plenty of space. If some app or plugin needs to write a temporary file that grows to several hundred MB, I will literally never notice. So I personally consider that perfectly fine, since every application and plugin already has its own settings files and folders.

I do agree that writing it into the lrcat sqlite3 file would be bad. No matter what table name(s) someone comes up with, the name is never guaranteed to be safe, and it's quite a lot harder to remove the orphaned data if you remove the plugin. Having a real file on disk, one the plugin tells you about up front, is much better (even if that file is also a .sqlite3 database rather than a flat text file; see the sketch below).
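A rough sketch of the sidecar-database variant described above, assuming a throwaway .sqlite3 file (all file, table, and column names here are hypothetical) that the plugin deletes after a completed run:

```python
import os
import sqlite3

CACHE_DB = "deduplicator-run.sqlite3"  # assumed name, announced up front

def open_cache(path=CACHE_DB):
    db = sqlite3.connect(path)
    db.execute("""CREATE TABLE IF NOT EXISTS fingerprints (
                      image_id    TEXT PRIMARY KEY,
                      fingerprint TEXT NOT NULL)""")
    return db

def already_done(db, image_id):
    """True if this catalog entry was fingerprinted in an earlier run."""
    return db.execute("SELECT 1 FROM fingerprints WHERE image_id = ?",
                      (image_id,)).fetchone() is not None

def record(db, image_id, fingerprint):
    db.execute("INSERT OR REPLACE INTO fingerprints VALUES (?, ?)",
               (image_id, fingerprint))
    db.commit()

def finish(db, path=CACHE_DB):
    # Delete the file after a completed run, so stale data never
    # feeds a future resync.
    db.close()
    os.remove(path)
```

Because it is a separate file the plugin owns outright, removing the plugin (or the file) leaves nothing orphaned inside the catalog.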
I got here via https://exchange.adobe.com/creativecloud.details.20053.deduplicator.html, which has a review that notes:
Can the README be updated with a performance benchmark, so folks have an idea of what to expect? And does this extension perform run-caching, so it doesn't "start from zero" but resumes with the catalog entries it hasn't checked yet? (If not, is that something that could be added?)
I'm currently using Duplicate Photo Finder outside of LR, with a script that patches the LR catalog sqlite file after duplicates get removed, but it's a pretty crappy workflow.
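For context, a hedged sketch of the kind of read-only query such a script starts from: listing catalog entries whose files no longer exist on disk. The AgLibraryFile / AgLibraryFolder / AgLibraryRootFolder names below are assumptions based on inspected .lrcat files and may differ between Lightroom versions; actually patching the catalog touches more tables than shown here.

```python
import os.path
import sqlite3

def missing_files(lrcat_path):
    """List files the catalog references that no longer exist on disk.

    Table and column names are assumptions; verify them against your
    own .lrcat (and Lightroom version) before relying on this.
    """
    db = sqlite3.connect(f"file:{lrcat_path}?mode=ro", uri=True)
    rows = db.execute("""
        SELECT rf.absolutePath || fo.pathFromRoot || f.idx_filename
        FROM AgLibraryFile AS f
        JOIN AgLibraryFolder AS fo ON f.folder = fo.id_local
        JOIN AgLibraryRootFolder AS rf ON fo.rootFolder = rf.id_local
    """).fetchall()
    db.close()
    return [path for (path,) in rows if not os.path.exists(path)]
```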