mcu_mgmt: Memory corruption (cborattr suspected) - test case with smp_svr #7924
Comments
@oliviermartin first thing I would try here is to increase your stack sizes. Try to replicate at the very least the sizes that you get when building the |
@oliviermartin ^^ Any update on that? |
I am on holiday for the next week. But increasing the stack to hide the memory corruption seems to be a bad idea to me. |
Are you sure it is memory corruption rather than the app running out of available stack? |
I am almost sure it is memory corruption. I checked the state of the worker thread with gdb and there is plenty of space. When adding stack canaries, it is not the stack of the current function that is corrupted but the stacks of other functions. |
Unfortunately I couldn't reproduce exactly this behavior due to problems with the BLE interfaces on my desktop (I'm using a VM with Linux as guest, and for some reason the VM stopped connecting to any of the BLE interfaces it used to, and I was unable to resolve this malfunction within a few hours). So I tried to reproduce this behavior via a serial connection, but I didn't get exactly the same result: I observed a timeout of the image upload command, but did not land in the stack fault handler (or any other fault handler). After that the app was deaf to further commands. I will debug this further on Friday. |
Have you enabled stack canaries? I suspect the reason the device could not process more commands is the memory corruption. I was lucky that in my case the cborattr function overwrote the stack canaries, otherwise I would not have seen that the issue came from Zephyr's code. I do not have access to the code. It would help if you could explain (or even better, add comments in the code) how memory is allocated for CborAttrByteStringType in cbor_internal_read_object. |
@oliviermartin - can you try to extract the problem? I have tried (and I will) - but due to other duties I have only a limited amount of time to act. |
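For readers following the canary discussion, here is a minimal sketch of why a one-byte overrun only becomes visible once a guard word sits next to the buffer. It is not Zephyr's `CONFIG_STACK_CANARIES` implementation and not the cborattr code. The names `guarded_region`, `region_init`, `region_canary_intact` and `buggy_copy` are hypothetical. The buffer and guard word share one array so that the overrun stays well defined in C.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 8
#define CANARY  0xDEADBEEFu

/* Hypothetical layout: BUF_LEN payload bytes followed by a 4-byte guard word. */
struct guarded_region {
    uint8_t bytes[BUF_LEN + sizeof(uint32_t)];
};

/* Clear the payload area and place the guard word right after it. */
static void region_init(struct guarded_region *r)
{
    const uint32_t canary = CANARY;

    memset(r->bytes, 0, BUF_LEN);
    memcpy(&r->bytes[BUF_LEN], &canary, sizeof(canary));
}

/* Returns 1 if the guard word still holds its original value, 0 otherwise. */
static int region_canary_intact(const struct guarded_region *r)
{
    uint32_t canary;

    memcpy(&canary, &r->bytes[BUF_LEN], sizeof(canary));
    return canary == CANARY;
}

/* Assumed bug pattern: copy len bytes, then append a terminator without
 * checking that the terminator still fits inside the BUF_LEN payload area. */
static void buggy_copy(struct guarded_region *r, const uint8_t *src, size_t len)
{
    memcpy(r->bytes, src, len);
    r->bytes[len] = '\0'; /* hits the guard word when len == BUF_LEN */
}
```

Calling `buggy_copy` with `len == BUF_LEN - 1` leaves the guard word intact, while `len == BUF_LEN` clears its first byte. Without a guard word, that same write would silently land on whatever happens to be stored next to the buffer, which matches the observation that other functions' stacks were the ones being corrupted.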
I potentially have one fix:
As the comment says In its current implementation, the null-character is added at the end of the buffer (and not after the end of the byte string). In our case, the image I still have a timing issue (in my Zephyr application, I do not know if it exists |
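The patch itself is not preserved in this thread, so the following is only a sketch of the general idea in the remark about where the null character goes. It assumes a caller-supplied destination capacity and a hypothetical helper `copy_bytes_terminated`, which is part of neither cborattr nor tinycbor. The terminator is written directly after the copied bytes, and only when there is room for it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy src_len bytes into dst (capacity dst_cap).
 * Returns 0 on success, -1 if the data does not fit.
 * A '\0' is placed right after the copied data only if dst has a spare byte,
 * so nothing is ever written at or beyond dst[dst_cap]. */
static int copy_bytes_terminated(uint8_t *dst, size_t dst_cap,
                                 const uint8_t *src, size_t src_len,
                                 size_t *out_len)
{
    if (src_len > dst_cap) {
        return -1; /* refuse instead of truncating or overrunning */
    }

    memcpy(dst, src, src_len);

    if (src_len < dst_cap) {
        dst[src_len] = '\0'; /* terminator follows the data, not the buffer end */
    }

    *out_len = src_len;
    return 0;
}
```

With a 4-byte destination and 2 bytes of data, the terminator lands at index 2 and index 3 is left untouched. With exactly 4 bytes of data, no terminator is written at all, so a fixed-size field such as an image hash can fill its buffer completely without touching the neighbouring memory.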
This update to the latest master of mcumgr fixes a memory corruption in the image management and updates the readme. Fixes zephyrproject-rtos#7924 Origin: mcumgr License: Apache 2.0 URL: https://github.com/apache/mynewt-mcumgr commit: a837a731b94927c6198e39744cd6d979be23942a Purpose: Fix memory corruption Maintained-by: External Signed-off-by: Johannes Hutter <[email protected]>
@carlescufi As I mentioned earlier, my fix does not fix the issue. It still does not work. For some reason, I cannot re-open the issue. Should I create a new one? |
@oliviermartin no need, reopened now |
@oliviermartin - can you recheck whether it is still visible with the newest mcumgr fixes (#8937, #8711 - apache/mynewt-mcumgr#5), i.e. on master? I was unable to reproduce it using this version. |
@nvlsianpu I saw your patch and was planning to test it to see if it fixes my issue. I will try to test it in the next 10 days 👍 I will leave a message in this issue and hopefully close it! |
@nvlsianpu At least with the latest fixes I do not see a crash anymore. I have a strange issue but it has nothing to do with this specific github ticket. I will investigate it later. This github ticket can be closed (for some reason I cannot close it myself). FYI, here is my issue:
|
What you observed is the expected behavior; see the very last lines of the doc: |
I reported an issue to |
I was trying to enable FOTA on my Nordic nRF52832 based device. I initially tested it with `zephyr/samples/subsys/mgmt/mcumgr/smp_svr` and it was working fine. But when trying to integrate the `mcumgr` module into my Zephyr application, the update was crashing with `***** Stack Check Fail! *****`.

It took me a while to understand why it works with `smp_svr`. But after enabling `CONFIG_STACK_CANARIES` in `smp_svr` it was crashing as well with the same `***** Stack Check Fail! *****`. I tried to narrow down the code responsible and I suspect `zephyr/samples/subsys/mgmt/mcumgr/cborattr`.

To duplicate the issue I added to `zephyr/samples/subsys/mgmt/mcumgr/smp_svr/prj.conf`:

The `mcumgr` command that seems to trigger the issue is `image test` for me (whatever the image signature).

The stack corruption occurs in `img_mgmt_state_write`. But I tried commenting out some code in `cbor_internal_read_object` and, depending on which lines I commented out, there was or was not a stack corruption. It looks like the type `CborAttrByteStringType` might cause the issue (and maybe other variable-size CBOR types, to be confirmed).

I noticed a recently raised issue related to `mcumgr` and memory corruption: #7722
But from the comments it looked like a false positive.

Maybe issue #7613 might also be due to the memory corruption. If it is really a `zephyr/samples/subsys/mgmt/mcumgr/cborattr` issue then all the MCUMgr commands will be affected.