[BUG] Qual tool gives inaccurate speedup score of `1.0` when there are failed stages #1551

kuhushukla · 2025-02-19T19:08:19Z

Describe the bug
For a given query run, take two eventlogs, one of which had stage failures (e.g. fetch failures), and run the qual tool on both. The eventlog with failed stages gets a speedup score of `1.0`, while the other gets a legitimate score. This is an understood limitation (works as intended) of the tool's model: we don't want stage-failure-related information to muddy the data points for the different features. However, large enterprise customers send us a very large number of eventlogs with one failure or another (often outside the app dev team's full control), and we should try to handle this case better.

Steps/Code to reproduce bug
See the description above.
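To tell which of the two eventlogs contains failed stages before running the qual tool, a rough check along these lines may help. This is only a sketch and not part of the qual tool: `failed_stage_ids` is a hypothetical helper, and it assumes an uncompressed, line-delimited JSON eventlog in which a failed stage shows up as a `SparkListenerStageCompleted` event whose `Stage Info` carries a `Failure Reason` field.

```python
import json
from typing import Iterable, List


def failed_stage_ids(event_lines: Iterable[str]) -> List[int]:
    """Return the IDs of stages whose completion event records a failure.

    Assumption: each line is one JSON event, and a failed stage is a
    SparkListenerStageCompleted event with "Failure Reason" in "Stage Info".
    """
    failed = []
    for line in event_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(event, dict):
            continue
        if event.get("Event") != "SparkListenerStageCompleted":
            continue
        info = event.get("Stage Info") or {}
        if "Failure Reason" in info:
            failed.append(info.get("Stage ID"))
    return failed
```

Passing an open file handle works, for example `failed_stage_ids(open(path))`, where `path` points to one of the two eventlogs. A non-empty result suggests the eventlog is one that would hit the `1.0` score described above.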

Expected behavior
The qual tool should attempt to give a valid speedup score for eventlogs that contain failed stages.

Environment details (please complete the following information)

Any, but in this particular case, YARN.

kuhushukla added the `? - Needs Triage` and `bug` labels Feb 19, 2025

mattahrens assigned sayedbilalbari and amahussein Feb 28, 2025

mattahrens removed the ? - Needs Triage label Feb 28, 2025
