Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix motion vector computation after #17688. #17717

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

pcwalton
Copy link
Contributor

@pcwalton pcwalton commented Feb 7, 2025

PR #17688 broke motion vector computation, and therefore motion blur, because it enabled retention of MeshInputUniforms, and MeshInputUniforms contain the indices of the previous frame's transform and the previous frame's skinned mesh joint matrices. On frame N, if a MeshInputUniform is retained on GPU from the previous frame, the previous_input_index and previous_skin_index would refer to the indices for frame N - 2, not the index for frame N - 1.

This patch fixes the problems. It solves these issues in two different ways, one for transforms and one for skins:

  1. To fix transforms, this patch supplies the frame index to the shader as part of the view uniforms, and specifies which frame index each mesh's previous transform refers to. So, in the situation described above, the frame index would be N, the previous frame index would be N - 1, and the previous_input_frame_number would be N - 2. The shader can now detect this situation and infer that the mesh has been retained, and can therefore conclude that the mesh's transform hasn't changed.

  2. To fix skins, this patch replaces the explicit previous_skin_index with an invariant that the index of the joints for the current frame and the index of the joints for the previous frame are the same. This means that the MeshInputUniform never has to be updated even if the skin is animated. The downside is that we have to copy joint matrices from the previous frame's buffer to the current frame's buffer in extract_skins.

The rationale behind (2) is that we currently have no mechanism to detect when joints that affect a skin have been updated, short of comparing all the transforms and setting a flag for extract_meshes_for_gpu_building to consume, which would regress performance as we want extract_skins and
extract_meshes_for_gpu_building to be able to run in parallel.

To test this change, use cargo run --example motion_blur.

PR bevyengine#17688 broke motion vector computation, and therefore motion blur,
because it enabled retention of `MeshInputUniform`s, and
`MeshInputUniform`s contain the indices of the previous frame's
transform and the previous frame's skinned mesh joint matrices. On frame
N, if a `MeshInputUniform` is retained on GPU from the previous frame,
the `previous_input_index` and `previous_skin_index` would refer to the
indices for frame N - 2, not the index for frame N - 1.

This patch fixes the problems. It solves these issues in two different
ways, one for transforms and one for skins:

1. To fix transforms, this patch supplies the *frame index* to the
   shader as part of the view uniforms, and specifies which frame index
   each mesh's previous transform refers to. So, in the situation
   described above, the frame index would be N, the previous frame index
   would be N - 1, and the `previous_input_frame_number` would be N - 2.
   The shader can now detect this situation and infer that the mesh has
   been retained, and can therefore conclude that the mesh's transform
   hasn't changed.

2. To fix skins, this patch replaces the explicit `previous_skin_index`
   with an invariant that the index of the joints for the current frame
   and the index of the joints for the previous frame are the same. This
   means that the `MeshInputUniform` never has to be updated even if the
   skin is animated. The downside is that we have to copy joint matrices
   from the previous frame's buffer to the current frame's buffer in
   `extract_skins`.

The rationale behind (2) is that we currently have no mechanism to
detect when joints that affect a skin have been updated, short of
comparing all the transforms and setting a flag for
`extract_meshes_for_gpu_building` to consume, which would regress
performance as we want `extract_skins` and
`extract_meshes_for_gpu_building` to be able to run in parallel.

To test this change, use `cargo run --example motion_blur`.
@pcwalton pcwalton added C-Bug An unexpected or incorrect behavior P-Regression Functionality that used to work but no longer does. Add a test for this! A-Rendering Drawing game state to the screen S-Needs-Review Needs reviewer attention (from anyone!) to move forward labels Feb 7, 2025
Copy link
Contributor

@tychedelia tychedelia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Tested on macOS.

@aevyrie
Copy link
Member

aevyrie commented Feb 8, 2025

Motion blur seems to be broken, not sure if introduced here though
image

The car is not moving relative to the camera, but it is blurred. For comparison, this is what it should look like (from 0.15):

image

After poking for a bit, it looks like motion vectors are just the camera motion?

Getting rid of previous_input_is_valid makes the output more correct - the car is not blurred, but meshes like trees are super blurred and flicker.

Okay, even more interesting. If I get rid of previous_input_is_valid, the first few moments look correct - everything is blurred as expected, until more trees come into view, at which point they become massively over blurred. This almost seems like an indexing error?

Ex: everything is correct except the trees in the distance on the left:

image

Then, once the car rounds the corner and a bunch of meshes come into view, all the newly visible/disoccluded meshes have broken motion vectors:
image

Huh, it also appears the previous_input_frame_count > view.frame_count always returns true, which seems very wrong?

@alice-i-cecile alice-i-cecile added this to the 0.16 milestone Feb 10, 2025
Copy link
Member

@aevyrie aevyrie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issues appear to be resolved in latest commit.

pub previous_skin_index: u32,
/// The material and lightmap indices, packed into 32 bits.
///
/// Low 16 bits: index of the material inside the bind group data.
/// High 16 bits: index of the lightmap in the binding array.
pub material_and_lightmap_bind_group_slot: u32,
pub timestamp: u32,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why call it timestamp if it's a frame count?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe frame_count_when_updated or something like that? That seems to be the semantics of it.

@@ -232,3 +233,18 @@ pub fn no_automatic_skin_batching(
commands.entity(entity).try_insert(NoAutomaticBatching);
}
}

pub fn mark_meshes_as_changed_if_their_skins_changed(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency:

Suggested change
pub fn mark_meshes_as_changed_if_their_skins_changed(
pub fn mark_3d_meshes_as_changed_if_their_skins_changed(

It would be nice to port all this to 2D as well... :)

Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a minor variable naming thing for clarity. Otherwise LGTM.

pcwalton added a commit to pcwalton/bevy that referenced this pull request Feb 12, 2025
Currently, Bevy rebuilds the buffer containing all the transforms for
joints every frame, during the extraction phase. This is inefficient in
cases in which many skins are present in the scene and their joints
don't move, such as the Caldera test scene.

To address this problem, this commit switches skin extraction to use a
set of retained GPU buffers with allocations managed by the offset
allocator. I use fine-grained change detection in order to determine
which skins need updating. Note that the granularity is on the level of
an entire skin, not individual joints. Using the change detection at
that level would yield poor performance in common cases in which an
entire skin is animated at once. Also, this patch yields additional
performance from the fact that changing joint transforms no longer
requires the skinned mesh to be re-extracted.

Note that this optimization can be a double-edged sword. In
`many_foxes`, fine-grained change detection regressed the performance of
`extract_skins` by 3.4x. This is because every joint is updated every
frame in that example, so change detection is pointless and is pure
overhead. Because the `many_foxes` workload is actually representative
of animated scenes, this patch includes a heuristic that disables
fine-grained change detection if the number of transformed entities in
the frame exceeds a certain fraction of the total number of joints.
Currently, this threshold is set to 25%. Note that this is a crude
heuristic, because it doesn't distinguish between the number of
transformed *joints* and the number of transformed *entities*; however,
it should be good enough to yield the optimum code path most of the
time.

Finally, this patch fixes a bug whereby skinned meshes are actually
being incorrectly retained if the buffer offsets of the joints of those
skinned meshes changes from frame to frame. To fix this without
retaining skins, we would have to re-extract every skinned mesh every
frame. Doing this was a significant regression on Caldera. With this PR,
by contrast, mesh joints stay at the same buffer offset, so we don't
have to update the `MeshInputUniform` containing the buffer offset every
frame. This also makes PR bevyengine#17717 easier to implement, because that PR
uses the buffer offset from the previous frame, and the logic for
calculating that is simplified if the previous frame's buffer offset is
guaranteed to be identical to that of the current frame.

On Caldera, this patch reduces the time spent in `extract_skins` from
1.79 ms to near zero. On `many_foxes`, this patch regresses the
performance of `extract_skins` by approximately 10%-25%, depending on
the number of foxes. This has only a small impact on frame rate.
@pcwalton
Copy link
Contributor Author

I'd like #17818 to land first as that will simplify things.

@pcwalton pcwalton added the S-Blocked This cannot move forward until something else changes label Feb 12, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Bug An unexpected or incorrect behavior P-Regression Functionality that used to work but no longer does. Add a test for this! S-Blocked This cannot move forward until something else changes S-Needs-Review Needs reviewer attention (from anyone!) to move forward
Projects
Status: No status
Development

Successfully merging this pull request may close these issues.

6 participants