docs: add atm troubleshooting doc #293

Merged: 2 commits, May 7, 2025

Conversation

zhiying-lin
Contributor

What type of PR is this?

/kind documentation

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Requirements:

How has this code been tested

Special notes for your reviewer

kaito-pr-agent bot commented Apr 25, 2025

Title

(Describe updated until commit 14f9887)

Add DNS Based Global Load Balancing Troubleshooting Guide


Description

  • Added troubleshooting guide for DNS Based Global Load Balancing

  • Included explanations for TrafficManagerProfile and TrafficManagerBackend issues

  • Provided examples and error messages for clarity


Changes walkthrough 📝

Relevant files

Documentation

README.md: Update README with weight calculation details
docs/concepts/DNSBasedGlobalLoadBalancing/README.md (+12/-0)

  • Added detailed explanation of how weights are calculated for Azure
    Traffic Manager endpoints
  • Explained how to disable traffic for specific clusters or services

DNSBasedGlobalLoadBalancing.md: Add DNS Based Global Load Balancing Troubleshooting Guide
docs/toubleshooting/DNSBasedGlobalLoadBalancing.md (+117/-0)

  • Created new troubleshooting guide for DNS Based Global Load Balancing
  • Included common issues and solutions for TrafficManagerProfile and
    TrafficManagerBackend
  • Provided sample status messages and error codes


    kaito-pr-agent bot commented Apr 25, 2025

    PR Reviewer Guide 🔍

    (Review updated until commit 14f9887)

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Recommended focus areas for review

    Formatting Issue

    The sample status YAML in the document is not properly indented, which may cause readability issues.

    sample status:
    ```yaml
    status:
      conditions:
      - lastTransitionTime: "2025-04-29T02:57:33Z"
        message: |
          Invalid profile: GET https://management.azure.com/subscriptions/xxx/resourceGroups/your-fleet-atm-rg/providers/Microsoft.Network/trafficmanagerprofiles/fleet-34ec2e40-5cc4-4a30-8c09-4b787169cef0
          --------------------------------------------------------------------------------
          RESPONSE 403: 403 Forbidden
          ERROR CODE: AuthorizationFailed
          --------------------------------------------------------------------------------
          {
            "error": {
              "code": "AuthorizationFailed",
              "message": "The client 'xxx' with object id 'xxx' does not have authorization to perform action 'Microsoft.Network/trafficmanagerprofiles/read' over scope '/subscriptions/xxx/resourceGroups/your-fleet-atm-rg/providers/Microsoft.Network/trafficmanagerprofiles/fleet-34ec2e40-5cc4-4a30-8c09-4b787169cef0' or the scope is invalid. If access was recently granted, please refresh your credentials."
            }
          }
    ```

    Formatting Issue

    The sample status YAML in the document is not properly indented, which may cause readability issues.

    ```yaml
     status:
        conditions:
        - lastTransitionTime: "2025-04-29T06:39:10Z"
          message: Domain name is not available. Please choose a different profile name
            or namespace
          observedGeneration: 2
          reason: DNSNameNotAvailable
          status: "False"
          type: Programmed
    ```
    
    [Formatting Issue](https://github.com/Azure/fleet-networking/pull/293/files#diff-29a8d18842958c88964cda2c80f9d979f84fd0792562c21193bb03904c1c5662R61-R70)

    The sample status YAML in the document is not properly indented, which may cause readability issues.

    ```yaml
    status:
     conditions:
     - lastTransitionTime: "2025-04-29T06:43:57Z"
     message: TrafficManagerProfile "invalid-profile" is not found
     observedGeneration: 1
     reason: Invalid
     status: "False"
     type: Accepted
    ```
    
    [Formatting Issue](https://github.com/Azure/fleet-networking/pull/293/files#diff-29a8d18842958c88964cda2c80f9d979f84fd0792562c21193bb03904c1c5662R87-R95)

    The sample status YAML in the document is not properly indented, which may cause readability issues.

    ```yaml
    status:
     conditions:
     - lastTransitionTime: "2025-04-29T07:50:49Z"
       message: ServiceImport "invalid-service" is not found
       observedGeneration: 1
       reason: Invalid
       status: "False"
       type: Accepted
    ```
    
    [Formatting Issue](https://github.com/Azure/fleet-networking/pull/293/files#diff-29a8d18842958c88964cda2c80f9d979f84fd0792562c21193bb03904c1c5662R99-R109)

    The sample status YAML in the document is not properly indented, which may cause readability issues.

    ```yaml
    status:
     conditions:
     - lastTransitionTime: "2025-04-29T07:56:05Z"
       message: '1 service(s) exported from clusters cannot be exposed as the Azure
         Traffic Manager, for example, service exported from aks-member-5 is invalid:
         unsupported service type "ClusterIP"'
       observedGeneration: 1
       reason: Invalid
       status: "False"
       type: Accepted
    ```
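The status snippets quoted in the reviewer guide share one shape: a list of conditions, each with `type`, `status`, `reason`, and `message`. As a rough sketch (not part of the PR), the Python helper below picks out the conditions that report `status: "False"` from an object that has already been loaded into a dict. The helper name `failed_conditions` and the sample dict are illustrative assumptions; the field values are copied from one of the quoted snippets.

```python
from typing import Any


def failed_conditions(resource: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the status conditions whose status is the string "False"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return [c for c in conditions if c.get("status") == "False"]


# Sample object mirroring one of the status snippets quoted above.
backend = {
    "status": {
        "conditions": [
            {
                "lastTransitionTime": "2025-04-29T07:50:49Z",
                "message": 'ServiceImport "invalid-service" is not found',
                "observedGeneration": 1,
                "reason": "Invalid",
                "status": "False",
                "type": "Accepted",
            }
        ]
    }
}

for cond in failed_conditions(backend):
    # Print the condition type, its reason, and the human-readable message.
    print(f'{cond["type"]}={cond["status"]} ({cond["reason"]}): {cond["message"]}')
```

An object without a `status` field simply yields an empty list, so the helper can be run against resources that have not been reconciled yet.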
    

    kaito-pr-agent bot commented Apr 25, 2025

    PR Code Suggestions ✨

    Latest suggestions up to 14f9887

    Explore these optional code suggestions:

    Category: General
    Add verification step

    Suggest checking Azure portal or using Azure CLI to verify profile existence.

    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md [84]

    -- Ensure that the Azure traffic manager profile exists, which could be manually deleted by other users. To recover this profile, you need to delete the `TrafficManagerBackend` and re-create it.
    +- Ensure that the Azure traffic manager profile exists, which could be manually deleted by other users. Verify its existence in the Azure portal or using Azure CLI. To recover this profile, you need to delete the `TrafficManagerBackend` and re-create it.
    Suggestion importance [1-10]: 6

    Why: Adding a verification step enhances the troubleshooting process by providing clear actions for the user.

    Impact: Low
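One way to act on the "verify using Azure CLI" suggestion is sketched below. It wraps `az network traffic-manager profile show` in a small Python helper. The function names are hypothetical, the sketch assumes `az` is installed and logged in, and it treats any non-zero exit code as "not found", which can also mean a login or permission problem such as the `AuthorizationFailed` error quoted earlier.

```python
import subprocess


def build_show_command(profile_name: str, resource_group: str) -> list[str]:
    """Build the Azure CLI call that looks up a Traffic Manager profile."""
    return [
        "az", "network", "traffic-manager", "profile", "show",
        "--name", profile_name,
        "--resource-group", resource_group,
    ]


def profile_exists(profile_name: str, resource_group: str) -> bool:
    """Return True when the lookup succeeds.

    Assumption: a non-zero exit code means the profile is missing. It can
    also indicate missing credentials or insufficient permissions.
    """
    result = subprocess.run(
        build_show_command(profile_name, resource_group),
        capture_output=True,
        text=True,
        check=False,  # inspect the exit code instead of raising
    )
    return result.returncode == 0
```

For example, `profile_exists("fleet-34ec2e40-5cc4-4a30-8c09-4b787169cef0", "your-fleet-atm-rg")` would check the profile and resource group named in the sample status message.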
    Specify service type

    Clarify that the service must be of type LoadBalancer.

    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md [97]

    -- Ensure that the exported `Service` is load balancer type and exposed via an Azure public IP address, which must have a DNS name assigned to be used in a Traffic Manager profile.
    +- Ensure that the exported `Service` is of type LoadBalancer and exposed via an Azure public IP address, which must have a DNS name assigned to be used in a Traffic Manager profile.
    Suggestion importance [1-10]: 6

    Why: Specifying the service type improves clarity and ensures users understand the requirement for a LoadBalancer service.

    Impact: Low
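The service-type requirement can be checked mechanically. The sketch below is a minimal, hypothetical helper (not taken from the PR) that inspects a Service manifest already loaded into a dict. It only covers the `spec.type` part of the requirement; whether the public IP address has a DNS name assigned is not checked here.

```python
def is_load_balancer_service(service: dict) -> bool:
    """Return True when the Service manifest declares spec.type LoadBalancer."""
    # A Service that omits spec.type is treated by Kubernetes as ClusterIP,
    # so a missing field counts as "not a LoadBalancer" here.
    return service.get("spec", {}).get("type") == "LoadBalancer"


# Hypothetical manifests used only for illustration.
cluster_ip_service = {"kind": "Service", "spec": {"type": "ClusterIP"}}
load_balancer_service = {"kind": "Service", "spec": {"type": "LoadBalancer"}}

for name, svc in [("cluster-ip", cluster_ip_service), ("lb", load_balancer_service)]:
    print(name, is_load_balancer_service(svc))
```

A `ClusterIP` Service like the first one corresponds to the `unsupported service type "ClusterIP"` message quoted in the reviewer guide.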
    Specify ceiling application

    Clarify that the ceiling function is applied to the calculated weight.

    docs/concepts/DNSBasedGlobalLoadBalancing/README.md [63]

    -The weight of actual Azure Traffic Manager endpoint created for a single cluster is the ceiling value of a number computed as `trafficManagerBackend` weight/(sum of all `serviceExport` weights behind the `trafficManagerBackend`)
    +The weight of the actual Azure Traffic Manager endpoint created for a single cluster is the ceiling value of the number computed as `trafficManagerBackend` weight / (sum of all `serviceExport` weights behind the `trafficManagerBackend`).
    Suggestion importance [1-10]: 5

    Why: The suggestion clarifies the application of the ceiling function, improving readability but does not address a critical issue.

    Impact: Low
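The arithmetic in the quoted line can be illustrated with a short Python sketch. It implements only the expression visible in the diff, the ceiling of the `trafficManagerBackend` weight divided by the sum of the `serviceExport` weights. Any further factor the full README may apply is not visible in this thread and is not modeled. The function name and sample weights are assumptions.

```python
def endpoint_weight(backend_weight: int, export_weights: list[int]) -> int:
    """Ceiling of backend_weight / sum(export_weights), as quoted in the diff line."""
    total = sum(export_weights)
    if total <= 0:
        raise ValueError("sum of serviceExport weights must be positive")
    # Integer ceiling division: avoids floating-point rounding surprises.
    return -(-backend_weight // total)


# Hypothetical weights chosen to show the rounding-up behaviour.
print(endpoint_weight(100, [50, 50, 50]))  # 100 / 150 rounds up to 1
print(endpoint_weight(100, [30]))          # 100 / 30 rounds up to 4
```

Because of the ceiling, a positive backend weight does not round down to zero in this expression.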

    Previous suggestions

    Suggestions up to commit 3dc8c93
    Category: General
    Specify log location

    Clarify the location of the logs for better user understanding.

    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md [17]

    -Please check the `status` field of the `TrafficManagerProfile` or the `trafficmanagerprofile/controller.go` file in hub-net-controller-manager logs for more information.
    +Please check the `status` field of the `TrafficManagerProfile` or the `trafficmanagerprofile/controller.go` file in the hub-net-controller-manager pod logs for more information.
    Suggestion importance [1-10]: 2

    Why: The suggestion clarifies the location of the logs but offers only a minor improvement in user understanding. It does not address any critical issues or significantly enhance functionality.

    Impact: Low
    Suggestions up to commit 3dc8c93
    Category: General
    Clarify log location

    Clarify the location of the logs for better user understanding.

    docs/toubleshooting/DNSBasedGlobalLoadBalancing.md [17]

    -Please check the `status` field of the `TrafficManagerProfile` or the `trafficmanagerprofile/controller.go` file in hub-net-controller-manager logs for more information.
    +Please check the `status` field of the `TrafficManagerProfile` or the `trafficmanagerprofile/controller.go` logs in the hub-net-controller-manager for more information.
    Suggestion importance [1-10]: 2

    Why: The suggestion is correct but offers only a minor improvement in clarity. It does not significantly impact the functionality or correctness of the documentation.

    Impact: Low
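Both log-location suggestions point readers at lines emitted from `trafficmanagerprofile/controller.go` in the hub-net-controller-manager logs. As a hedged sketch, the Python filter below keeps only log lines that mention that source file. The function name is hypothetical and the sample lines are made-up placeholders, not real controller output.

```python
from typing import Iterable, Iterator


def controller_lines(
    lines: Iterable[str],
    source: str = "trafficmanagerprofile/controller.go",
) -> Iterator[str]:
    """Yield only the log lines that mention the given source file."""
    for line in lines:
        if source in line:
            yield line.rstrip("\n")


# Placeholder lines for illustration only; real log lines will look different.
sample_lines = [
    "<timestamp> trafficmanagerprofile/controller.go <placeholder message A>",
    "<timestamp> some/other/file.go <placeholder message B>",
    "<timestamp> trafficmanagerprofile/controller.go <placeholder message C>\n",
]

for line in controller_lines(sample_lines):
    print(line)
```

The same function accepts any iterable of strings, so it can be fed `sys.stdin` when log output is piped into a script.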

    zhiying-lin merged commit 935e776 into Azure:main on May 7, 2025
    7 checks passed
    zhiying-lin deleted the add-tsg branch on May 7, 2025 01:11