Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🚑 Fix performance form definition update #5085

Merged

Conversation

sergei-maertens
Copy link
Member

Closes #5084

Changes

  • Fixed the time spent in Python figuring out which variables to create/update/delete
  • Optimized the UPSERT query to avoid flooding it with data we can already ignore in Python

Checklist

Check off the items that are completed or not relevant.

  • Impact on features

    • Checked copying a form
    • Checked import/export of a form
    • Config checks in the configuration overview admin page
    • Problem detection in the admin email digest is handled
  • Release management

    • I have labelled the PR as "needs-backport" accordingly
  • I have updated the translations assets (you do NOT need to provide translations)

    • Ran ./bin/makemessages_js.sh
    • Ran ./bin/compilemessages_js.sh
  • Dockerfile/scripts

    • Updated the Dockerfile with the necessary scripts from the ./bin folder
  • Commit hygiene

    • Commit messages refer to the relevant Github issue
    • Commit messages explain the "why" of change, not the how

@sergei-maertens sergei-maertens added the needs-backport Fix must be backported to stable release branch label Feb 5, 2025
@sergei-maertens
Copy link
Member Author

Performance numbers on my local dev environment:

Before patch

# components in FD # forms using FD resulting # form variables Time (s) Query time (ms)
95 1 95 1.5 60
103 71 7313 176 1150

After patch

# components in FD # forms using FD resulting # form variables Time (s) Query time (ms)
95 1 95 0.20 55
103 71 7313 0.55 42

In this case, nothing was actually changed in the formio configuration, so the UPSERT query is essentially skipped (which made the bad performance all the more worse in the old situation before the patch).

We noticed 3min wall clock time when updating a form definition that
results in 7000 form variables being updated (local dev environment),
which is completely unacceptable.

Isolating the culprit/hot code path revealed that for 1000 form
variables about 10-12 seconds was being spent, so a conservative time
limit of 5 seconds is set up in tests to catch future performance
regressions. However, realistically this should really take less than
a second, but nobody like flaky tests so we pick something in the
middle to account for different hardware in CI.

Updates with small diffs should even take substantially less time,
as we can ignore work that needs not to be done and avoid creating
instances in the first place.
There are a number of aspects to this performance patch.

1. Avoid database load

The UPSERT query when sending all of the desired form variables
leads to hefty queries (250ms for 1100 variables, 1100ms for ~7K).
Splitting this up in work that can be ignore brings down the query
duration to about 50-60ms for 1000 variables.

2. Avoid expensive formio definition processing

By far the most expensive operation is scanning whether a component
is inside an editgrid ("repeating group") through the utility
component_in_editgrid, which was parsing the same configuration over
and over again. Instead, we can use the already-existing flag of
iter_components to not recurse into edit grids in the first place.

This fixes most of the time spent in Python.

3. Replace deepcopy with shallow copy

This is probably the most controversial one - when deepcopying
a django model instance, it goes through all the motions to serialize
it for picking, which means that it must figure out the reconstruction
function to use and capture all the necessary data, and deepcopy
recurses, which means it also goes into the related form_definition
and thus the formio configuration object which is full of lists/dicts
that are expensive to deep copy.

The replacement with a shallow copy should be fine because:

* we're not copying any existing database instances (pk=None)
* all the kwargs in initial instantiation are calculated and
  provided explicitly, there is no reliance on mutation
* when 'copying' the desired variables for each form, we assign
  the optimized form_id attribute and don't do any mutations,
  i.e. all operations remain on the shallow level

This is covered with some tests to hopefully prevent future
regressions.

Other ideas considered:

* don't store FormVariables in the database, but instead create them
  in memory on the fly. This will be a problem once we no longer store
  prefill configuration in the component, we will require actual DB
  instances. It's also not very intuitive

* offload to celery again. This is what we initially patched as it
  was leading to race conditions and dead locks and general performance
  issues too. It's may also strangely affect existing submissions.
  Given the complexity and the performance gains, this was not further
  explored.

On my local machine, this brings the worst case insert in the test
(1000 form variables from a form definition with 1000 components)
from 10+ seconds down to 400ms, so about a factor 25 improvement.
@sergei-maertens sergei-maertens force-pushed the issue/5084-fix-performance-form-definition-update branch from 4fcbe35 to 86b83cc Compare February 6, 2025 08:39
Copy link

codecov bot commented Feb 6, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 96.73%. Comparing base (7c10f99) to head (86b83cc).
Report is 3 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5085   +/-   ##
=======================================
  Coverage   96.73%   96.73%           
=======================================
  Files         771      771           
  Lines       26597    26619   +22     
  Branches     3460     3463    +3     
=======================================
+ Hits        25729    25751   +22     
  Misses        606      606           
  Partials      262      262           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@sergei-maertens sergei-maertens merged commit 87c96f1 into master Feb 6, 2025
32 checks passed
@sergei-maertens sergei-maertens deleted the issue/5084-fix-performance-form-definition-update branch February 6, 2025 09:06
sergei-maertens added a commit that referenced this pull request Feb 6, 2025
There are a number of aspects to this performance patch.

1. Avoid database load

The UPSERT query when sending all of the desired form variables
leads to hefty queries (250ms for 1100 variables, 1100ms for ~7K).
Splitting this up in work that can be ignore brings down the query
duration to about 50-60ms for 1000 variables.

2. Avoid expensive formio definition processing

By far the most expensive operation is scanning whether a component
is inside an editgrid ("repeating group") through the utility
component_in_editgrid, which was parsing the same configuration over
and over again. Instead, we can use the already-existing flag of
iter_components to not recurse into edit grids in the first place.

This fixes most of the time spent in Python.

3. Replace deepcopy with shallow copy

This is probably the most controversial one - when deepcopying
a django model instance, it goes through all the motions to serialize
it for picking, which means that it must figure out the reconstruction
function to use and capture all the necessary data, and deepcopy
recurses, which means it also goes into the related form_definition
and thus the formio configuration object which is full of lists/dicts
that are expensive to deep copy.

The replacement with a shallow copy should be fine because:

* we're not copying any existing database instances (pk=None)
* all the kwargs in initial instantiation are calculated and
  provided explicitly, there is no reliance on mutation
* when 'copying' the desired variables for each form, we assign
  the optimized form_id attribute and don't do any mutations,
  i.e. all operations remain on the shallow level

This is covered with some tests to hopefully prevent future
regressions.

Other ideas considered:

* don't store FormVariables in the database, but instead create them
  in memory on the fly. This will be a problem once we no longer store
  prefill configuration in the component, we will require actual DB
  instances. It's also not very intuitive

* offload to celery again. This is what we initially patched as it
  was leading to race conditions and dead locks and general performance
  issues too. It's may also strangely affect existing submissions.
  Given the complexity and the performance gains, this was not further
  explored.

On my local machine, this brings the worst case insert in the test
(1000 form variables from a form definition with 1000 components)
from 10+ seconds down to 400ms, so about a factor 25 improvement.

Backport-of: #5085
sergei-maertens added a commit that referenced this pull request Feb 6, 2025
There are a number of aspects to this performance patch.

1. Avoid database load

The UPSERT query when sending all of the desired form variables
leads to hefty queries (250ms for 1100 variables, 1100ms for ~7K).
Splitting this up in work that can be ignore brings down the query
duration to about 50-60ms for 1000 variables.

2. Avoid expensive formio definition processing

By far the most expensive operation is scanning whether a component
is inside an editgrid ("repeating group") through the utility
component_in_editgrid, which was parsing the same configuration over
and over again. Instead, we can use the already-existing flag of
iter_components to not recurse into edit grids in the first place.

This fixes most of the time spent in Python.

3. Replace deepcopy with shallow copy

This is probably the most controversial one - when deepcopying
a django model instance, it goes through all the motions to serialize
it for picking, which means that it must figure out the reconstruction
function to use and capture all the necessary data, and deepcopy
recurses, which means it also goes into the related form_definition
and thus the formio configuration object which is full of lists/dicts
that are expensive to deep copy.

The replacement with a shallow copy should be fine because:

* we're not copying any existing database instances (pk=None)
* all the kwargs in initial instantiation are calculated and
  provided explicitly, there is no reliance on mutation
* when 'copying' the desired variables for each form, we assign
  the optimized form_id attribute and don't do any mutations,
  i.e. all operations remain on the shallow level

This is covered with some tests to hopefully prevent future
regressions.

Other ideas considered:

* don't store FormVariables in the database, but instead create them
  in memory on the fly. This will be a problem once we no longer store
  prefill configuration in the component, we will require actual DB
  instances. It's also not very intuitive

* offload to celery again. This is what we initially patched as it
  was leading to race conditions and dead locks and general performance
  issues too. It's may also strangely affect existing submissions.
  Given the complexity and the performance gains, this was not further
  explored.

On my local machine, this brings the worst case insert in the test
(1000 form variables from a form definition with 1000 components)
from 10+ seconds down to 400ms, so about a factor 25 improvement.

Backport-of: #5085
@sergei-maertens
Copy link
Member Author

Backports:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs-backport Fix must be backported to stable release branch
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Saving a (re-usable) form definition with many components is unreasonably slow
2 participants