[T2][202405] Orchagent crashing while running create_switch with SAI_STATUS_INVALID_PARAMETER #21032

Open
arista-nwolfe opened this issue Dec 4, 2024 · 3 comments
Labels: Chassis 🤖 (Modular chassis support), Triaged (this issue has been triaged)

Comments

@arista-nwolfe

During various reboot and config reload sonic-mgmt tests, orchagent occasionally crashes while creating the switch.

2024 Dec  2 20:54:34.757031 cmp227-4 ERR syncd1#syncd: [07:00.0] SAI_API_SWITCH:_brcm_sai_dnx_cos_mmu_instru_synced_counter_init:10721 instru synced counters config set failed with error Invalid parameter (0xfffffffc).
2024 Dec  2 20:54:34.757095 cmp227-4 ERR syncd1#syncd: [07:00.0] SAI_API_SWITCH:brcm_sai_dnx_create_switch:9107 DNX cos mmu init failed with error -5.
2024 Dec  2 20:54:34.757158 cmp227-4 ERR syncd1#syncd: :- sendApiResponse: api SAI_COMMON_API_CREATE failed in syncd mode: SAI_STATUS_INVALID_PARAMETER
2024 Dec  2 20:54:34.757158 cmp227-4 INFO syncd1#supervisord: syncd #015#015
2024 Dec  2 20:54:34.757158 cmp227-4 INFO syncd1#supervisord: syncd 0:bcm_dnx_instru_synced_triggers_enable_set:  Error 'Invalid parameter' indicated, absolute time cannot be older then current time.#015
2024 Dec  2 20:54:34.757158 cmp227-4 INFO syncd1#supervisord: syncd #015#015
2024 Dec  2 20:54:34.757158 cmp227-4 INFO syncd1#supervisord: syncd 0:bcm_dnx_instru_synced_counters_config_set:  Error 'Invalid parameter' indicated ; #015#015
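
The log shows two numeric codes for the same failure: `0xfffffffc` on the first line and `-5` on the second. A minimal sketch of how the hex value reads as a signed 32-bit status is below. The helper names `toSignedStatus` and `parseLoggedStatus` are hypothetical and not part of SAI or the Broadcom SDK. Read this way, `0xfffffffc` is `-4`, which the log labels "Invalid parameter". The `-5` on the next line is consistent with the `SAI_STATUS_INVALID_PARAMETER` reported by `sendApiResponse`, assuming the usual negative SAI status encoding.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: reinterpret a 32-bit value, as printed in hex by the
// log, as a signed two's-complement status code.
constexpr std::int32_t toSignedStatus(std::uint32_t raw)
{
    const std::int64_t wide = static_cast<std::int64_t>(raw);
    // If the sign bit is set, subtract 2^32 to get the negative value.
    return static_cast<std::int32_t>(
        (raw & 0x80000000u) ? wide - (std::int64_t{1} << 32) : wide);
}

// Hypothetical helper: parse a hex string such as "0xfffffffc" and return
// the signed status it encodes.
inline std::int32_t parseLoggedStatus(const std::string& hex)
{
    return toSignedStatus(static_cast<std::uint32_t>(std::stoul(hex, nullptr, 16)));
}

// The value from the first syncd error line decodes to -4.
static_assert(toSignedStatus(0xfffffffcu) == -4, "0xfffffffc is -4 as int32");
```

For example, `parseLoggedStatus("0xfffffffc")` yields `-4`, so the SDK-level code and the SAI-level `-5` are two different numbers describing the same "invalid parameter" condition at different layers.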

Here is the orchagent backtrace:

#0  0x00007f61b2a5cebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7f61b22d3a40 (LWP 57))]
(gdb) bt
#0  0x00007f61b2a5cebc in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007f61b2a0dfb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007f61b29f8472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x000055928d6e84ad in handleSaiFailure (abort_on_failure=abort_on_failure@entry=true) at ./orchagent/saihelper.cpp:834
#4  0x000055928d739635 in main (argc=<optimized out>, argv=<optimized out>) at ./orchagent/main.cpp:681

This is the create_switch SAI call which is failing:
https://github.com/sonic-net/sonic-swss/blob/202405/orchagent/main.cpp#L677

    status = sai_switch_api->create_switch(&gSwitchId, (uint32_t)attrs.size(), attrs.data());
    if (status != SAI_STATUS_SUCCESS)
    {
        SWSS_LOG_ERROR("Failed to create a switch, rv:%d", status);
        handleSaiFailure(true);
    }
    SWSS_LOG_NOTICE("Create a switch, id:%" PRIu64, gSwitchId);
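
To make the failure path explicit, here is a self-contained sketch of the same control flow with mocked pieces. Everything in it is a hypothetical stand-in (`MockStatus`, `kMockInvalidParameter`, `createSwitchOrFail`, and the injected callbacks); it is not the orchagent implementation, only an illustration of how a non-success status from `create_switch` leads straight to the failure handler.

```cpp
#include <cstdint>
#include <cstdio>
#include <functional>

// Hypothetical stand-ins for the SAI status type and values.
using MockStatus = std::int32_t;
constexpr MockStatus kMockSuccess = 0;
constexpr MockStatus kMockInvalidParameter = -5;  // value taken from the syncd log

// Mimics the shape of the quoted code: call create, and on any non-success
// status log the value and invoke the failure handler with abort requested.
// Returns true when the (mock) switch was created.
inline bool createSwitchOrFail(const std::function<MockStatus()>& createSwitch,
                               const std::function<void(bool)>& onFailure)
{
    const MockStatus status = createSwitch();
    if (status != kMockSuccess)
    {
        std::fprintf(stderr, "Failed to create a switch, rv:%d\n", status);
        onFailure(true);  // corresponds to abort_on_failure=true in the backtrace
        return false;
    }
    return true;
}
```

In the backtrace above, the real handler goes on to call `abort()` (frames #2 and #3), which is why a failure reported by syncd shows up as an orchagent core rather than a recoverable error.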

Some of the sonic-mgmt tests we've reproduced this on are:

platform_tests/test_reload_config.py::test_reload_config::test_reload_configuration_checks
platform_tests/test_reboot.py::test_cold_reboot
platform_tests/test_link_down.py::test_link_status_on_host_reboot
@rlhui

rlhui commented Dec 4, 2024

is this a SAI issue? @arista-nwolfe

@arista-nwolfe

> is this a SAI issue? @arista-nwolfe

I believe so. I'm going to open a CSP with Broadcom shortly and will update this issue with the ID when I have it.

@arista-nwolfe

Opened CS00012381504 with BRCM for this issue

tjchadaga added the Chassis 🤖 (Modular chassis support) and Triaged (this issue has been triaged) labels on Dec 18, 2024