Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libmbedtls: use mempool_calloc() for temporary memory #7177

Merged
merged 4 commits into from
Dec 16, 2024

Conversation

jenswi-linaro
Copy link
Contributor

mbedtls_mpi_exp_mod_optionally_safe() need a large chunk of temporary memory for the mbedtls_mpi_core_exp_mod() function. The amount of memory is too much to to reliably allocate from the heap. So use mempool_calloc() instead of mbedtls_calloc(), similar to using mbedtls_mpi_init_mempool() instead of mbedtls_mpi_init().

@jenswi-linaro
Copy link
Contributor Author

This should fix the recent random errors with MEASURED_BOOT_FTPM=y

@jforissier
Copy link
Contributor

Good catch!

Acked-by: Jerome Forissier <[email protected]>

Would you mind also pushing a temporary commit to enable fTPM in the CI? I mean similar to this one: 718c357 but with MEASURED_BOOT_FTPM=y instead of =n. If all is good then we know we can re-instate =y by default in build.git.

I am also running my release tests. I will post a Tested-by once they're good. Thanks!

@jenswi-linaro
Copy link
Contributor Author

It looks like fTPM uses a lot of heap:

xtest --stats --alloc (with MEASURED_BOOT_FTPM=n)
Pool:                Heap       
Bytes allocated:     20704

xtest --stats --alloc (with MEASURED_BOOT_FTPM=y)
Pool:                Heap
Bytes allocated:     44288

I'm trying to figure out where the memory went.

@etienne-lms
Copy link
Contributor

fTPM has quite many objects kept opened. See microsoft/ms-tpm-20-ref@1b288eb. IHIH.

@jenswi-linaro
Copy link
Contributor Author

It looks like fTPM has a few objects open in secure storage. There are many small allocations that may have caused fragmentation of the heap.

@jenswi-linaro
Copy link
Contributor Author

Thanks, Etienne, there are those 20kB I also noted.

@jenswi-linaro
Copy link
Contributor Author

@etienne-lms, can you create a PR against https://github.com/OP-TEE/optee_ftpm with those patches?

@etienne-lms
Copy link
Contributor

Ok, i'll do.

@jenswi-linaro
Copy link
Contributor Author

Replacing another malloc() with mempool_alloc().

@jforissier
Copy link
Contributor

For "core: pta: secstore: use mempool_alloc()":

Reviewed-by: Jerome Forissier <[email protected]>

Copy link
Contributor

@etienne-lms etienne-lms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed-by: Etienne Carriere <[email protected]> for commits
"libmbedtls: use mempool_calloc() for temporary memory" (s/need/needs/ in commit message)
and "core: pta: secstore: use mempool_alloc()".

@@ -54,6 +54,7 @@ endif

CFG_WITH_STATS ?= y
CFG_ENABLE_EMBEDDED_TESTS ?= y
CFG_CORE_BGET_BESTFIT ?= y
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enable BestFit or increase the heap size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to increase the heap also.

Copy link
Contributor

@etienne-lms etienne-lms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think using the secure DDR mempool for allocating a 8kByte buffer makes sense, especially when the heap is quite small, likely to be when pager is enabled. I would be in favor of not reverting "core: pta: secstore: use mempool_alloc()".

@jenswi-linaro
Copy link
Contributor Author

That commit introduced a possible deadlock discovered with CFG_LOCKDEP=y. The problem is that normally, while the tadb_mutex is held, we also use the mempool (taking mempool_default->mu), but in install_ta(), the order is reversed.

I'm waiting for the CI checks to pass. Assuming they do what should be the next step?
The fTPM TA is active due to probing by the kernel during a few of the first tests with make check so fTPM is likely to peak in memory usage. I've just tried with a 1kB buffer in install_ta() and I didn't notice that the function took much more time. So 1Kb should be fine and that's also a reasonable heap allocation.

@etienne-lms
Copy link
Contributor

I agree 1kB in the heap is affordable. It would multiply by 8 the TEE/REE world switches when installing a TA but I think it's ok since installing a TA is not performance critical.

@jenswi-linaro
Copy link
Contributor Author

The "CI / make check (QEMUv8) 1 / 2 (pull_request)" error is with CFG_FTRACE_SUPPORT=y CFG_SYSCALL_FTRACE=y and the secure world is quite slow with this. Probing fTPM takes a long time and keyctl tests times out because of this. I think we should disable fTPM with CFG_FTRACE_SUPPORT=y CFG_SYSCALL_FTRACE=y.

@jforissier
Copy link
Contributor

The "CI / make check (QEMUv8) 1 / 2 (pull_request)" error is with CFG_FTRACE_SUPPORT=y CFG_SYSCALL_FTRACE=y and the secure world is quite slow with this. Probing fTPM takes a long time and keyctl tests times out because of this. I think we should disable fTPM with CFG_FTRACE_SUPPORT=y CFG_SYSCALL_FTRACE=y.

Sounds reasonable.

Copy link
Contributor

@jforissier jforissier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "ci: QEMUv8: disable fTPM with ftrace":
Reviewed-by: Jerome Forissier <[email protected]>

Copy link
Contributor

@jforissier jforissier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For "core: pta: secstore: decrease TA buffer":
Acked-by: Jerome Forissier <[email protected]>

mbedtls_mpi_exp_mod_optionally_safe() needs a large chunk of temporary
memory for the mbedtls_mpi_core_exp_mod() function. The amount of memory
is too much to reliably allocate from the heap. So use mempool_calloc()
instead of mbedtls_calloc(), similar to using mbedtls_mpi_init_mempool()
instead of mbedtls_mpi_init().

Signed-off-by: Jens Wiklander <[email protected]>
Acked-by: Jerome Forissier <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
@jenswi-linaro
Copy link
Contributor Author

Tags applied and fingers crossed. :-)

@jenswi-linaro
Copy link
Contributor Author

The core heap usage is increased by around 20kB with fTPM enabled so it makes sense if this has to be compensated.

@jforissier
Copy link
Contributor

For "plat-vexpress: increase QEMU heap size":
Reviewed-by: Jerome Forissier <[email protected]>

The core heap usage is increased by around 20kB with fTPM enabled so it makes sense if this has to be compensated.

Perhaps mention this in the commit message?

Copy link
Contributor

@etienne-lms etienne-lms left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed-by: Etienne Carriere <[email protected]> for commit
"core: pta: secstore: decrease TA buffer".

Acked-by: Etienne Carriere <[email protected]> for commit
"ci: QEMUv8: disable fTPM with ftrace".

Regarding "plat-vexpress: increase QEMU heap size": better explicit why it's increased as @jforissier suggests in case you want to step back becasue ftpm P-R#4 (or like) gets merged and this extra heap will no longer be needed.

@jforissier
Copy link
Contributor

I could not reproduce the CI failure locally (timeout in regression_1006 when pager is enabled). I restarted the job. We might have some race condition or yet another memory shortage perhaps?

@jenswi-linaro
Copy link
Contributor Author

regression_1006 might be executed while fTPM is still performing self-tests, which will put some pressure on the pager. Can we increase the timeout value when the pager is enabled?

install_ta() uses a buffer allocated from the heap while hashing a TA
while installing it. The buffer size is 8kB which is a bit large to
reliably allocate from the heap, so decrease it to 1kB.

Signed-off-by: Jens Wiklander <[email protected]>
Acked-by: Jerome Forissier <[email protected]>
Reviewed-by: Etienne Carriere <[email protected]>
Disable fTPM because it takes too long to probe with ftrace enabled in
OP-TEE.

Signed-off-by: Jens Wiklander <[email protected]>
Reviewed-by: Jerome Forissier <[email protected]>
Acked-by: Etienne Carriere <[email protected]>
The core heap usage is increased by around 20kB with fTPM enabled so it
makes sense if this has to be compensated.

Increase heap size for the QEMU variants:
- QEMU v7 from 64kB to 96kB
- QEMU v8 from 128kB to 192kB

Signed-off-by: Jens Wiklander <[email protected]>
Reviewed-by: Jerome Forissier <[email protected]>
@jenswi-linaro
Copy link
Contributor Author

Comments addressed and tags applied. I've dropped "[DO NOT MERGE] ci: QEMUv8: set MEASURED_BOOT_FTPM=y".
We could disable tests with fTPM and paging if it turns out that we have sporadic timeouts with that configuration.

@jforissier
Copy link
Contributor

This is OK for merging. @jenswi-linaro could you please enable fTPM again in build.git? Thanks!

@jforissier jforissier merged commit b4ed37a into OP-TEE:master Dec 16, 2024
10 checks passed
@jenswi-linaro jenswi-linaro deleted the mbedtls-fix branch December 16, 2024 14:05
jenswi-linaro added a commit to jenswi-linaro/build that referenced this pull request Dec 16, 2024
By default, enable MEASURED_BOOT_FTPM. The previous issues when MEASURED_BOOT_FTPM was enabled has been resolved in [1].

Link: [1] OP-TEE/optee_os#7177
Signed-off-by: Jens Wiklander <[email protected]>
Acked-by: Jerome Forissier <[email protected]>
@jenswi-linaro
Copy link
Contributor Author

Sure, OP-TEE/build#797

jforissier pushed a commit to OP-TEE/build that referenced this pull request Dec 16, 2024
By default, enable MEASURED_BOOT_FTPM. The previous issues when MEASURED_BOOT_FTPM was enabled has been resolved in [1].

Link: [1] OP-TEE/optee_os#7177
Signed-off-by: Jens Wiklander <[email protected]>
Acked-by: Jerome Forissier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants