
'Split and Stitch' Encoding #9

Open

in03 opened this issue Nov 12, 2021 · 3 comments · May be fixed by #257

in03 commented Nov 12, 2021

I've been messing around with running separate FFmpeg processes on segments of the same video file.
There are a bunch of benefits here:

  • Encoding speed is only limited by chunk-duration and worker-pool size
  • Greater resource utilisation
  • Chunked job structure lends itself to more reliable progress metrics

I've got this working reliably locally, though obviously with no performance gains since I'm running all the FFmpeg processes on the same machine.

To get this working here we'll need a few things:

  • A new task to parse the original job into segments
  • The encoding task's group will become a chord so we can join the segments after encode
  • Job must be pickled, not sent as JSON. We need to transport Resolve's PyRemoteObj MediaPoolItem as task results
  • Additional cleanup required for temporary files (original segments, temp folder structure, etc.)
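
Rough sketch of what the splitting step could look like (the chunk duration, paths and naming here are just illustrative, nothing is settled):

```python
# Illustrative only: split a source file into fixed-duration chunks with
# stream copy, so each chunk can be dispatched to a separate worker.
import subprocess
from pathlib import Path

CHUNK_SECONDS = 30  # hypothetical chunk duration


def split_into_chunks(source: Path, temp_dir: Path) -> list[Path]:
    """Losslessly split the source into ~CHUNK_SECONDS segments."""
    temp_dir.mkdir(parents=True, exist_ok=True)
    pattern = temp_dir / f"{source.stem}_%03d{source.suffix}"
    subprocess.run(
        [
            "ffmpeg", "-i", str(source),
            "-c", "copy", "-map", "0",
            "-f", "segment", "-segment_time", str(CHUNK_SECONDS),
            "-reset_timestamps", "1",
            str(pattern),
        ],
        check=True,
    )
    return sorted(temp_dir.glob(f"{source.stem}_*{source.suffix}"))
```

One caveat: with `-c copy` the segment muxer can only cut on keyframes, so chunks won't be exactly `CHUNK_SECONDS` long.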
@in03 in03 added the feature New feature or request label Nov 12, 2021
@in03 in03 self-assigned this Nov 12, 2021
@in03 in03 added this to the Improve encoding performance milestone Jun 2, 2022

in03 commented Oct 18, 2022

A lot of water under the bridge since this issue was opened!

Adding split and stitch encoding (a.k.a. chunking) is going to be a huge refactor of the way job objects are handled. The job objects should be refactored anyway, so adding chunking is a good reason to do it.

My list above is a little outdated now:

"Chunking" the jobs

  • A new task to parse the original job into segments

This doesn't need to be a task. We can chunk quickly and easily in the queuer, and we can even chunk only those jobs that exceed a certain duration. Since Celery's canvas primitives allow for some pretty flexible nesting, we can check the duration of the source media and, if it's over the threshold, split the job into a group of tasks with a callback (i.e. a "chord"). Then we can wrap plain-Jane unchunked-job tasks and chunked-job chords alike in a single group; see the sketch below.
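
Something like this, as a sketch of the nesting (the task names, duration threshold and job-dict keys are placeholders, not the actual job model; `split_into_chunks` is the splitting sketch from my first comment):

```python
# Sketch only: plain tasks for short jobs, chords for chunked jobs,
# everything wrapped in a single group on the queuer side.
from pathlib import Path

from celery import Celery, group, chord

app = Celery("chunked_encoder", broker="redis://localhost")  # illustrative broker

CHUNK_THRESHOLD_SECONDS = 120  # only chunk jobs longer than this


@app.task
def encode_whole(job):
    ...  # run FFmpeg over the whole source file


@app.task
def encode_segment(segment_path):
    ...  # run FFmpeg over one chunk, return the encoded chunk's path


@app.task
def stitch_segments(encoded_paths, job):
    ...  # chord callback: concat the encoded chunks, clean up temp files


def build_canvas(jobs):
    """Queuer side: build one group containing plain tasks and chords alike."""
    signatures = []
    for job in jobs:
        if job["duration"] > CHUNK_THRESHOLD_SECONDS:
            # chunked job: a group of segment encodes with a stitch callback
            segments = split_into_chunks(Path(job["source"]), Path(job["temp_dir"]))
            signatures.append(
                chord(
                    [encode_segment.s(str(p)) for p in segments],
                    stitch_segments.s(job),
                )
            )
        else:
            # unchunked, plain-Jane job
            signatures.append(encode_whole.s(job))
    return group(signatures)
```

Calling `build_canvas(jobs).apply_async()` then gives back one result object whose nested children can each be polled individually.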

  • The encoding task's group will become a chord so we can join the segments after encode

Our chunked-job callbacks can be Celery tasks themselves, doing the necessary FFmpeg join and cleanup. We can add a custom Celery task state for concatenating and removing temporary files, so this shows up in queuer-side progress too.
Each primitive has its own results, and those results are accessible asynchronously, even when deeply nested. This makes it possible to accommodate chunking in our queuer-side progress bar with little modification.
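
A sketch of what that callback could look like, using FFmpeg's concat demuxer plus `update_state` for the custom state (the "CONCATENATING" state name and the `output_path` job key are just illustrative):

```python
# Sketch only: chord callback that concats the encoded chunks, reports a
# custom task state, then removes the temporary segment folder.
import shutil
import subprocess
from pathlib import Path

from celery import Celery

app = Celery("chunked_encoder", broker="redis://localhost")  # illustrative broker


@app.task(bind=True)
def stitch_segments(self, encoded_paths, job):
    # custom state so the queuer-side progress bar can show the join/cleanup phase
    self.update_state(state="CONCATENATING")

    temp_dir = Path(encoded_paths[0]).parent
    concat_list = temp_dir / "concat.txt"
    # the concat demuxer wants a text file listing the inputs in order
    concat_list.write_text("".join(f"file '{p}'\n" for p in encoded_paths))

    subprocess.run(
        [
            "ffmpeg", "-f", "concat", "-safe", "0",
            "-i", str(concat_list),
            "-c", "copy",
            job["output_path"],  # illustrative key on the job dict
        ],
        check=True,
    )

    shutil.rmtree(temp_dir)  # original segments, temp folder structure, etc.
    return job["output_path"]
```

Queuer side, walking the group's nested results will surface that custom state alongside the per-chunk encode states.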

Pickling for postencode

  • Job must be pickled, not sent as JSON. We need to transport Resolve's PyRemoteObj MediaPoolItem as task results

PyRemoteObjs are in-memory references. There is no way for media pool items to survive the round-trip between queuer and worker, except by using an ID to reference the in-memory objects on the queuer.

Maybe now, with Resolve 18, it would be possible to use the new unique_id attributes to retain a reference to a media pool item across machines, but even so that would mean iterating all databases, projects, timelines, timeline items and finally media pool items by their respective IDs. That may be just fine for a SQL database, but with the Python API it's slow, and made worse by the need for Resolve to actually open the projects and timelines to do it.

We're keeping it simple: link if the same project is open, and leave it to the user to re-iterate the timelines manually the next time the project is open.
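
For reference, "link if the same project is open" could look roughly like this: the job only carries the media pool item's unique ID (Resolve 18+), and we look it back up in the current project when results come home. The scripting entry point and folder walk are the standard API, but treat the GetUniqueId() usage and the naive recursion as assumptions:

```python
# Sketch only: resolve a stored unique ID back to a MediaPoolItem in the
# currently open project. Returns None if no project is open or no match
# is found, in which case the user re-links later.
import DaVinciResolveScript as dvr


def find_media_pool_item(unique_id: str):
    resolve = dvr.scriptapp("Resolve")
    project = resolve.GetProjectManager().GetCurrentProject()
    if project is None:
        return None

    def walk(folder):
        for clip in folder.GetClipList() or []:
            if clip.GetUniqueId() == unique_id:
                return clip
        for sub in folder.GetSubFolderList() or []:
            found = walk(sub)
            if found is not None:
                return found
        return None

    return walk(project.GetMediaPool().GetRootFolder())
```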


in03 commented Feb 10, 2023

/cib

github-actions bot commented Feb 10, 2023

Branch feature/issue-9--Split-and-Stitch-Encoding created!

github-actions bot pushed a commit that referenced this issue Feb 10, 2023
@github-actions github-actions bot linked a pull request Feb 10, 2023 that will close this issue