Implement deduplication #15

Open
PotcFdk opened this issue Apr 29, 2019 · 1 comment

Comments

@PotcFdk
Owner

PotcFdk commented Apr 29, 2019

While any video belongs to exactly one channel, we also support playlists, so there can be many cases where a video belongs to several different profiles.
However, each profile should be able to define a format (see #11), so it's not enough to just check for identical video IDs: two profiles might impose different requirements on the data to be stored.

One way to deal with this issue would be to only deduplicate if the format is identical, which might be good enough, seeing as "maximum video and audio quality" and perhaps separate audio-only profiles are the most likely use cases for this project.
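A rough sketch of what "only deduplicate if the format is identical" could look like, assuming each stored video can be described by a profile name, a video ID and a format string. The `Entry` type, the `group_duplicates` helper and the format strings are made up for illustration and are not part of the project:

```python
from collections import defaultdict
from typing import NamedTuple


class Entry(NamedTuple):
    profile: str   # hypothetical profile name
    video_id: str
    fmt: str       # hypothetical format string defined by the profile


def group_duplicates(entries):
    """Group profiles by (video_id, fmt) and keep only keys shared by several profiles."""
    groups = defaultdict(list)
    for entry in entries:
        groups[(entry.video_id, entry.fmt)].append(entry.profile)
    return {key: profiles for key, profiles in groups.items() if len(profiles) > 1}
```

With this key, the same video stored by two profiles in different formats is not treated as a duplicate; only entries that match on both the ID and the format are.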

Also, there are multiple ways of deduplicating.

  • The easiest would be to just store one real copy and symlink (or hardlink) everything else. Downside: Simply backing up a profile directory might not back up 100% of the data because duplicate videos might be stored outside of it.
  • A more complex and less portable way would be to use file system features. For example, btrfs is CoW and we could use the FIDEDUPERANGE ioctl to save disk space.
  • ?
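For the first option in the list above, a minimal sketch of replacing an already-downloaded duplicate with a link to the one real copy. It only covers the symlink/hardlink case, not the ioctl-based one. The helper name `replace_with_link` and the `.dedup-tmp` suffix are assumptions for illustration, not existing project code:

```python
import filecmp
import os


def replace_with_link(original, duplicate, symbolic=False):
    """Replace `duplicate` with a hardlink (or symlink) to `original`."""
    # Refuse to link files whose contents differ.
    if not filecmp.cmp(original, duplicate, shallow=False):
        raise ValueError("files differ, refusing to deduplicate")

    # Create the link under a temporary name first, then swap it into place,
    # so the duplicate path never disappears if something fails midway.
    tmp = duplicate + ".dedup-tmp"
    if symbolic:
        os.symlink(os.path.abspath(original), tmp)
    else:
        os.link(original, tmp)
    os.replace(tmp, duplicate)
```

The byte-for-byte comparison is slow for large videos; comparing by video ID and format beforehand would likely be enough in practice.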
@PotcFdk
Owner Author

PotcFdk commented Apr 29, 2019

Downside: Simply backing up a profile directory might not back up 100% of the data because duplicate videos might be stored outside of it.

This doesn't hold true for the hardlink case, since every hardlink is a regular directory entry for the same data.
Unless there's a better idea, my preference is supporting

  1. CoW-copies
  2. hardlinks
  3. symlinks

in descending order of preference, depending on which of those are supported by the filesystem and which ones we have the permissions for.
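A minimal sketch of that fallback chain, assuming Linux and that the CoW copy is made with the `FICLONE` ioctl (one possible way to get a reflink; the issue does not settle on a mechanism). `link_duplicate` and `_try_reflink` are hypothetical names:

```python
import fcntl
import os

# FICLONE ioctl request number on Linux; newer Python versions may expose
# it as fcntl.FICLONE, otherwise fall back to the numeric value.
FICLONE = getattr(fcntl, "FICLONE", 0x40049409)


def _try_reflink(src, dst):
    """Try to create dst as a CoW clone of src; return True on success."""
    try:
        with open(src, "rb") as fsrc, open(dst, "xb") as fdst:
            fcntl.ioctl(fdst.fileno(), FICLONE, fsrc.fileno())
        return True
    except FileExistsError:
        raise  # never touch a destination that already exists
    except OSError:
        # Cloning is unsupported here (or src and dst are on different
        # filesystems): remove the empty file we just created.
        if os.path.exists(dst):
            os.unlink(dst)
        return False


def link_duplicate(src, dst):
    """Create dst from src, preferring CoW copy, then hardlink, then symlink."""
    if _try_reflink(src, dst):
        return "reflink"
    try:
        os.link(src, dst)
        return "hardlink"
    except OSError:
        pass  # e.g. cross-device link or missing permission
    os.symlink(os.path.abspath(src), dst)
    return "symlink"
```

Trying each method and catching the error avoids having to detect filesystem capabilities and permissions up front; the returned string tells the caller which method was actually used.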
