Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Checkable: Don't skip redundancy group checks for parent dependencies #10228

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

yhabteab
Copy link
Member

@yhabteab yhabteab commented Nov 12, 2024

Previously, all parent dependencies were recursively checked right at the beginning of Checkable::IsReachable() for reachability and immediately returned false if one of them fails. However, since the introduction of redundancy_group, that failed parent might be within a redundancy group, which shouldn't necessarily cause the dependency to fail completely. This PR changes this insofar as not all parents are checked at the beginning of the method, but only a single parent per a dependency object is evaluated.

Tests

Icinga 2 Config
include <itl>

object CheckerComponent "checker" {}
object ApiListener "api" {}

object ApiUser "root" {
  password = "icinga"
  permissions = [ "*" ]
}

template Host "default" default {
	check_command = "dummy"
	max_check_attempts = 1
	check_interval = 10s

	vars.dummy_state = 0
	vars.dummy_text = "I'm just testing something"
}

object Host "paternal_grandfather" {
    vars.dummy_state = 2
}
object Host "maternal_grandfather" {}
object Host "father" {}
object Host "mother" {
    vars.dummy_state = 2
}
object Host "son" {}

object Dependency "paternal_grandfather" {
    parent_host_name = "paternal_grandfather"
    child_host_name = "father"
}

object Dependency "father" {
    parent_host_name = "father"
    child_host_name = "son"
    redundancy_group = "parents"
}
object Dependency "mother" {
    parent_host_name = "mother"
    child_host_name = "son"
    redundancy_group = "parents"
}

Before

<63> => get_host("paternal_grandfather").last_check_result.state
2.000000
<64> => get_host("father").last_check_result.vars_after.reachable
false
<65> => get_host("mother").last_check_result.state
2.000000
<66> => get_host("son").last_check_result.vars_after.reachable
false

Now, change the dummy state for the mother Host to vars.dummy_state = 0.

<55> => get_host("paternal_grandfather").last_check_result.state
2.000000
<56> => get_host("father").last_check_result.vars_after.reachable
false
<57> => get_host("mother").last_check_result.state
0.000000
<58> => get_host("son").last_check_result.vars_after.reachable
false

After

<30> => get_host("paternal_grandfather").last_check_result.state
2.000000
<31> => get_host("father").last_check_result.vars_after.reachable
false
<32> => get_host("mother").last_check_result.state
2.000000
<33> => get_host("son").last_check_result.vars_after.reachable
false

Now, change the dummy state for the mother Host to vars.dummy_state = 0.

<40> => get_host("paternal_grandfather").last_check_result.state
2.000000
<41> => get_host("father").last_check_result.vars_after.reachable
false
<42> => get_host("mother").last_check_result.state
0.000000
<43> => get_host("son").last_check_result.vars_after.reachable
true

fixes #10014

@yhabteab yhabteab added this to the 2.15.0 milestone Nov 12, 2024
@yhabteab yhabteab self-assigned this Nov 12, 2024
@icinga-probot icinga-probot bot added area/runtime Downtimes, comments, dependencies, events bug Something isn't working labels Nov 12, 2024
@cla-bot cla-bot bot added the cla/signed label Nov 12, 2024
@yhabteab yhabteab force-pushed the evaluate-parent-dependencies-correctly branch from 4ad59ed to 492621a Compare November 12, 2024 16:11
lib/icinga/checkable-dependency.cpp Show resolved Hide resolved
lib/icinga/checkable-dependency.cpp Outdated Show resolved Hide resolved
lib/icinga/checkable-dependency.cpp Outdated Show resolved Hide resolved
lib/icinga/checkable-dependency.cpp Outdated Show resolved Hide resolved
std::string redundancy_group = dep->GetRedundancyGroup();
bool parentReachable{true};
if (Checkable::Ptr parent = dep->GetParent(); parent.get() != this) {
parentReachable = parent->IsReachable(dt, failedDependency, rstack + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly other construction area, but while on it, you could increment rstack just once before the loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, because currently the sum of all direct and intermediate dependencies is limited to 256. You want me to change this so that the limit applies to each dependency object individually? I can do this, but as you already said, it's not related to this issue.

lib/icinga/checkable-dependency.cpp Outdated Show resolved Hide resolved
@yhabteab yhabteab force-pushed the evaluate-parent-dependencies-correctly branch from 492621a to ddc9e46 Compare November 13, 2024 15:39
@yhabteab yhabteab force-pushed the evaluate-parent-dependencies-correctly branch from ddc9e46 to cab27fc Compare November 14, 2024 13:15
lib/icinga/checkable-dependency.cpp Outdated Show resolved Hide resolved
lib/icinga/checkable-dependency.cpp Outdated Show resolved Hide resolved
@yhabteab yhabteab force-pushed the evaluate-parent-dependencies-correctly branch from cab27fc to af33327 Compare November 14, 2024 15:38
Previously, all parent dependencies were recursively checked right at
the beginning of `Checkable::IsReachable()` for reachability and
immediately returned `false` if one of them fails. However, since the
introduction of `redundancy_group`, that failed parent might be within a
redundancy group, which shouldn't necessarily cause the dependency to
fail completely. This PR changes this insofar as not all parents are
checked at the beginning of the method, but only a single parent per
a dependency object is evaluated.
@yhabteab yhabteab force-pushed the evaluate-parent-dependencies-correctly branch from af33327 to d7936cd Compare November 27, 2024 09:00
Copy link
Member

@Al2Klimov Al2Klimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/runtime Downtimes, comments, dependencies, events bug Something isn't working cla/signed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Child host becomes unreachable in redundant group when parent is unreachable
2 participants