Server disappears from iOS app for no apparent reason #3141

Open
jaredgisin opened this issue Nov 10, 2024 · 37 comments
Comments

@jaredgisin

jaredgisin commented Nov 10, 2024

iOS device model, version and app version

Model Name: iPhone 13 Pro Max
Software Version: iOS 18.0.1
App version: 2024.9.4

Home Assistant Core Version
2024.11.1

Describe the bug

I have a server on my local network that appears in the "Add Server" list. It appears there with a DNS name that works on my local network. I can add the server, authenticate, and log in just fine. The server shows up in the list with the cloud icon next to it. However, at some point afterwards (I'm not sure what triggers this) the server just disappears from the list.

To Reproduce
See steps above.

Expected behavior
Once added, a server should NEVER disappear from the list of servers that I have added. It's a horrific UX to have items that the user adds to a list simply disappear for no logical reason. If there's a problem communicating with the server, an authentication issue, a network problem, whatever, the server should stay in the list and indicate that it's in an error state, and the UI should provide a way for the user to correct the problem. It should NOT just disappear from the list for ANY reason whatsoever. Any server that was valid when it was added should persist in that list even if it later becomes invalid for any reason.
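
To make the expectation concrete, here is a minimal Swift sketch of a server list whose entries carry a status and are only ever removed by an explicit user action. `ServerStatus`, `ServerEntry`, and `ServerList` are hypothetical names used for illustration; they are not the Companion app's actual types.

```swift
// Hypothetical model for illustration only, not the Companion app's real types.
enum ServerStatus: Equatable {
    case connected
    case needsReauthentication
    case unreachable(reason: String)
}

struct ServerEntry: Equatable {
    let identifier: String
    let name: String
    var status: ServerStatus = .connected
}

struct ServerList {
    private(set) var entries: [ServerEntry] = []

    mutating func add(_ entry: ServerEntry) {
        entries.append(entry)
    }

    /// Records a failure on the matching entry instead of deleting it.
    mutating func markFailure(identifier: String, status: ServerStatus) {
        guard let index = entries.firstIndex(where: { $0.identifier == identifier }) else {
            return
        }
        entries[index].status = status
    }

    /// The only path that removes an entry: an explicit user action.
    mutating func removeByUser(identifier: String) {
        entries.removeAll { $0.identifier == identifier }
    }
}
```

With a model along these lines, an authentication or network failure would call `markFailure` and the UI could show the entry in an error state with a way to fix it.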

Screenshots

Additional context

This HA instance is a Docker container behind an NGINX reverse proxy that uses a Let's Encrypt certificate with a custom DNS name. The HA instance itself is NOT directly exposed to my network but instead sits behind NGINX.

I have the following in my configuration.yaml file:

homeassistant:
  internal_url: https://custom-dns.name.here.this.is.an.example

@jaredgisin changed the title from "Server disappears from iOS app for no apparently reason" to "Server disappears from iOS app for no apparent reason" on Nov 10, 2024
@bgoncal
Member

bgoncal commented Nov 11, 2024

Any chance that your instance returns "unauthorized" at any point? If the app receives "unauthorized" from the host, it reopens the onboarding flow and asks you to log in again.

@guidau

guidau commented Nov 14, 2024

I have had the same problem for about 4 days. I use Nabu Casa and a virtual HAOS image. I have the same behavior on my iPhone and iPad.

@jaredgisin
Author

> Any chance that your instance returns "unauthorized" at any point? If the app receives "unauthorized" from the host, it reopens the onboarding flow and asks you to log in again.

I don't actually know, but I highly doubt it. I am able to log in the first time just fine when the server is added. The server stays in the app and functions normally, and then at some point it just disappears.

Even if it were getting "unauthorized" at any point, at no point do I ever see the onboarding flow reopened. I am never prompted to re-authenticate. If that were the case then I wouldn't have opened this issue.

To be clear, this issue is about the fact that the iOS app DROPS the server from the list. It flat out DISAPPEARS from the list of servers, which is a significant usability problem because it doesn't give me any chance to correct issues with it. I argue that the correct behavior is this: if I add a server to the list and am able to authenticate the first time so that it appears in the list, there should never be any circumstance in which the iOS app decides to drop the server from that list. If it starts failing authentication, or being unreachable, or whatever the hell, DO NOT DROP THE SERVER FROM THE LIST. The user should see the server in the list at all times, be notified that there is an issue with it, and be given the ability to reauthenticate it, restore connectivity to it, or delete it themselves.

So the question here, and the issue that needs to be resolved, is to figure out under what circumstances the iOS app simply deletes the server record from the servers list, and to stop that from happening. Secondarily, we can then figure out if there is some connectivity or authentication issue with my setup. I would be far less upset about this, and more likely to resolve any networking or authentication issue, if the server were to remain in the list and show me an error about why the app is unable to connect. As it stands now, the server is flat out dropped with zero explanation, so I don't even know where to start to figure out if I have an issue.

@bgoncal
Member

bgoncal commented Nov 15, 2024

Do you have more than 1 server? Because if the server vanishes from the list and there are no other servers, you should see the onboarding again. Can you share a screen recording when that happens again?

@ChristopherGerdes

ChristopherGerdes commented Nov 15, 2024

I'm having this issue, and I do not have a second server. This started happening about a month to a month and a half ago on every single device that uses the app (two iPhones, an iPad). It has, to my knowledge, only ever happened when I'm away from home. When it happens, the server just completely disappears from the list and the app behaves as if it's never been connected to a server.

To make matters worse, for some reason when this happens I'm completely unable to connect to the server when I'm not physically at home on my network. If while not on my network I try to re-add the device by manually giving it the external address I will see the device show back up in home assistant (telling me it's able to make a connection) but I will get the error "Alamofire.AFError 9" and it won't connect. This holds true even if I VPN into my network on the device. If I re-add the device when back on my network it discovers the instance and adds it just fine. After adding it back I can use it when not on my network without issue until the app decides to randomly delete the HA instance again. When I've attempted to troubleshoot the Alamofire error I've found numerous people getting the error but almost no answers to how to fix it. The few answers that did come were basic things that I'd already tried like deleting the app and reinstalling etc. When I posted myself to the homeassistant subreddit I got one reply, which was someone with the same issue asking if I'd solved it and one telling me to restart my phone and change the default browser neither of which worked.

As the person above said, regardless of the reason for it losing connectivity, the behavior of the app is a bad user experience, and this is something that has changed for the worse. No matter what, I shouldn't have the entire instance disappear from the app and the app behave as if it's never been used before. I should be told there's been some sort of error, what the error is, and given the opportunity to correct it. Because of this user experience I don't even have a way of troubleshooting and solving this issue, because I have no clue WHAT is happening. All I know is I open the app and my entire HA instance is gone. That's all the info any of us can give you.

Edit: I just had this happen again on my iPad, which has not left my house since the last time, so this definitely isn't related to being off my local network.

@guidau

guidau commented Nov 15, 2024

It seems I have solved my problem! In my case, the unauthorized access was triggered via Nabu Casa, the local IP 127.0.0.1 was blocked.
I have found the following help for this: [https://nabucasa.com/config/troubleshooting/403_forbidden/]
However, I still can't understand why the server entry in the configuration is simply deleted in the event of access problems!

@bgoncal
Member

bgoncal commented Nov 15, 2024

Message received that this is bad UX. Now we need to find out what is triggering it. As per my initial theory, at least for @guidau the cause was the unauthorized response.

@bgoncal
Member

bgoncal commented Nov 15, 2024

@ChristopherGerdes yes you can find out what is happening because if you export your companion app logs there should be something there to help.
Feel free to submit here as well https://forms.gle/Uoqz127Phx4mMTpS6

@ChristopherGerdes

> It seems I have solved my problem! In my case, the unauthorized access was triggered via Nabu Casa, the local IP 127.0.0.1 was blocked. I have found the following help for this: [https://nabucasa.com/config/troubleshooting/403_forbidden/] However, I still can't understand why the server entry in the configuration is simply deleted in the event of access problems!

OK, I may be encountering this. Did you also need to add anything to your config to prevent it from being banned in the future? I don't see anything in the instructions about that, but it seems like it would need to be added to trusted proxies or something to keep it from happening again.

@ChristopherGerdes

> @ChristopherGerdes yes you can find out what is happening because if you export your companion app logs there should be something there to help. Feel free to submit here as well https://forms.gle/Uoqz127Phx4mMTpS6

Unfortunately, I just reconnected the server. Will I still be able to export this log file from the companion app if it is in new user mode where it's telling me I need to add a server? If so where do I go to do that in that situation? I can grab the log next time it deletes the server.

@bgoncal
Member

bgoncal commented Nov 15, 2024

@guidau in case you receive "unauthorized" and you are led to the onboarding, the server will return if you sign in again. Having the server in the list but not working is also useless, right?

Again, the UX CAN and WILL be improved, I'm just brainstorming with you so we find a nice approach

@ChristopherGerdes

> @guidau in case you receive "unauthorized" and you are led to the onboarding, the server will return if you sign in again. Having the server in the list but not working is also useless, right?
>
> Again, the UX CAN and WILL be improved, I'm just brainstorming with you so we find a nice approach

Thank you for this. You may be hearing frustration from us simply because I think a number of us have been facing this for a while and it's like slamming your head against the wall. I know I definitely appreciate the help.

@bgoncal
Member

bgoncal commented Nov 15, 2024

@ChristopherGerdes after the server gets deleted, add it again then go to companion app settings >> debugging >> export logs

@guidau

guidau commented Nov 15, 2024

> @guidau in case you receive "unauthorized" and you are led to the onboarding, the server will return if you sign in again. Having the server in the list but not working is also useless, right?
>
> Again, the UX CAN and WILL be improved, I'm just brainstorming with you so we find a nice approach

I think if you are only local with your cell phone, it's okay, but if you have to access your HA on the road via the Internet, all devices lose their configuration and you have to log in again with UserID, Pwd and 2FA on all devices after the problem has been solved. That's really annoying.

@ChristopherGerdes

> @ChristopherGerdes yes you can find out what is happening because if you export your companion app logs there should be something there to help. Feel free to submit here as well https://forms.gle/Uoqz127Phx4mMTpS6

Thank you for this! I've grabbed the logs from my iPad and submitted them.

@JManch

JManch commented Nov 15, 2024

For anyone running into this issue whilst hosting Home Assistant behind their own reverse proxy, I was able to work around it by responding with a 404 instead of a 403 in the case of unauthorized access.

case .serverError(400 ... 403, _, _) = underlying {
    /// Server rejected the refresh token. All is lost.
    let event = ClientEvent(
        text: "Refresh token is invalid, showing onboarding",
        type: .networkRequest,
        payload: [
            "error": String(describing: underlying),
        ]
    )
    Current.clientEventStore.addEvent(event).cauterize()
    Current.servers.remove(identifier: server.identifier)
    Current.onboardingObservation.needed(.error)
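
As a rough illustration of why the 404 workaround sidesteps this branch, the sketch below mirrors only the `400 ... 403` range pattern from the excerpt. `AuthError`, `BranchResult`, and `branchResult(for:)` are hypothetical stand-ins rather than the app's real API, and how the app treats other status codes is not shown in the excerpt.

```swift
// Hypothetical stand-in for the app's authentication error type.
enum AuthError: Error {
    case serverError(Int)
}

// Hypothetical result type: did the status code hit the server-removal branch?
enum BranchResult: Equatable {
    case removesServer   // the excerpt's branch: server removed, onboarding shown
    case notMatched      // the excerpt's branch is skipped
}

func branchResult(for error: AuthError) -> BranchResult {
    // Same closed-range pattern as the excerpt: only 400, 401, 402 and 403 match.
    if case .serverError(400 ... 403) = error {
        return .removesServer
    }
    return .notMatched
}
```

Under these assumptions, a proxy that answers 403 lands in the removal branch, while one that answers 404 falls outside the range, which is consistent with the workaround described above.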

@jaredgisin
Author

> Do you have more than 1 server? Because if the server vanishes from the list and there are no other servers, you should see the onboarding again. Can you share a screen recording when that happens again?

Yes, I have more than one server. Two of them never go away. They are running the HAOS/supervisor setup on Raspberry Pi, and I have them added using the IP address of each host. They never disappear from the iOS app.

There is no way to do any screen recording here because, as others have stated, the server simply disappears at some point. It may be when I leave the local network, but that seems strange too: when the servers are connected locally they show the cloud icon next to them, and when they are added they show that they are using the internal URL, so I would expect them to keep working when I am off the local network by switching to cloud. That doesn't seem to happen, but I can't say for sure that this is when or why they disappear, as I have never tested it.

@jaredgisin
Author

jaredgisin commented Nov 16, 2024

> @guidau in case you receive "unauthorized" and you are led to the onboarding, the server will return if you sign in again. Having the server in the list but not working is also useless, right?
>
> Again, the UX CAN and WILL be improved, I'm just brainstorming with you so we find a nice approach

I would imagine that simple code inspection should reveal all of the places in the code where the server is removed from the list, where the configuration is deleted. As a developer myself, I would look at how that can happen in the code and remove that functionality. That should also point to the reason why it can happen.

I do greatly appreciate your investigation into this so we can fix this issue. It's been a problem for months. Thanks for your help!

@bgoncal
Member

bgoncal commented Nov 16, 2024

The cloud icon is just a label that shows if you have HA Cloud configured, nothing else.

Between your 3 servers, what's the difference from the 2 that never get deleted to the third one in terms of configuration? Does any of them have a user with local access only? Any kind of security measure that can prevent the remote access?

@bgoncal
Member

bgoncal commented Nov 16, 2024

It's not as simple as "just" removing the code that deletes/hides the server; many things and flows take this behavior into consideration, so they all need to be changed as well.
Before moving forward I want to validate whether all cases where this happened were due to unauthorized access. So far, all were.

@bgoncal
Member

bgoncal commented Nov 16, 2024

> For anyone running into this issue whilst hosting Home Assistant behind their own reverse proxy, I was able to work around it by responding with a 404 instead of a 403 in the case of unauthorized access.
>
> case .serverError(400 ... 403, _, _) = underlying {
>     /// Server rejected the refresh token. All is lost.
>     let event = ClientEvent(
>         text: "Refresh token is invalid, showing onboarding",
>         type: .networkRequest,
>         payload: [
>             "error": String(describing: underlying),
>         ]
>     )
>     Current.clientEventStore.addEvent(event).cauterize()
>     Current.servers.remove(identifier: server.identifier)
>     Current.onboardingObservation.needed(.error)

Thanks for pointing that out.

@jaredgisin
Author

jaredgisin commented Nov 17, 2024

> The cloud icon is just a label that shows if you have HA Cloud configured, nothing else.
>
> Between your 3 servers, what's the difference from the 2 that never get deleted to the third one in terms of configuration? Does any of them have a user with local access only? Any kind of security measure that can prevent the remote access?

The difference is that the 2 that never get deleted were manually added using the IPv4 address of the Raspberry Pi 4 that each is running on in my local network. These are 192.168.x.x addresses. Those never get deleted and are always there.

The one that gets deleted is running as a Docker container behind an Nginx reverse proxy on the same host. That system has a local DNS name such as ha.blah.network that my local Unifi device serves DNS for. That DNS name is then configured as a host in the Nginx reverse proxy, which directs traffic over the local Docker network within that host to the HA container. This setup uses a Let's Encrypt certificate so that I can use https://ha.blah.network

This all used to work very well until a few months ago. Nothing in my setup has changed in a year or so, but sometime over the summer the server started being dropped.

When I look at the configuration in the mobile app after I add the server back, it looks right to me.

Connected via: Internal URL
Version: 2024.11.1
WebSocket: Connected
Local Push: Disabled

Internal URL: https://ha.blah.network --- not the real DNS name
External URL: Home Assistant Cloud

When I click into the Internal URL settings, I see the SSID of my local network listed correctly and "Local Push" is enabled there.

So it all looks like it should once it's added. Then at some point, I guarantee I'll open the app and it will be gone.

@ChristopherGerdes

> The one that gets deleted is running as a Docker container behind an Nginx reverse proxy on the same host. That system has a local DNS name such as ha.blah.network that my local Unifi device serves DNS for. That DNS name is then configured as a host in the Nginx reverse proxy, which directs traffic over the local Docker network within that host to the HA container. This setup uses a Let's Encrypt certificate so that I can use https://ha.blah.network

Mine is behind an nginx proxy with a Let's Encrypt cert as well, so we seem to have that in common.

@bgoncal
Member

bgoncal commented Nov 18, 2024

> > @ChristopherGerdes yes you can find out what is happening because if you export your companion app logs there should be something there to help. Feel free to submit here as well https://forms.gle/Uoqz127Phx4mMTpS6
>
> Unfortunately, I just reconnected the server. Will I still be able to export this log file from the companion app if it is in new user mode where it's telling me I need to add a server? If so where do I go to do that in that situation? I can grab the log next time it deletes the server.

I was checking your logs, and right away when I open your HA cloud URL I get "403". Did you disable remote access?

@bgoncal
Member

bgoncal commented Nov 18, 2024

@jaredgisin I couldn't find your logs, have you submitted them?

@bgoncal
Member

bgoncal commented Nov 18, 2024

I'm experimenting in this PR with showing a message when this happens: #3171

@ChristopherGerdes

> > > @ChristopherGerdes yes you can find out what is happening because if you export your companion app logs there should be something there to help. Feel free to submit here as well https://forms.gle/Uoqz127Phx4mMTpS6
>
> > Unfortunately, I just reconnected the server. Will I still be able to export this log file from the companion app if it is in new user mode where it's telling me I need to add a server? If so where do I go to do that in that situation? I can grab the log next time it deletes the server.
>
> I was checking your logs, and right away when I open your HA cloud URL I get "403". Did you disable remote access?

I access home assistant via reverse proxy and do not use the gateway functionality offered by HA cloud. Is that what you're asking?

@ChristopherGerdes

> I'm experimenting in this PR with showing a message when this happens: #3171

In the past I've gotten a message similar to this when my lets encrypt cert expires (I think). It is an annoyance but eventually it gets renewed and starts working again. Is it possible that the app has changed how it reacts to this situation and that's what's causing this?

@bgoncal
Member

bgoncal commented Nov 18, 2024

@ChristopherGerdes got it. Weirdly, your logs show it trying to connect to HA cloud. Can you send me a screenshot of how your external URL is configured in the app? Of course you can cover any private info.

@ChristopherGerdes

> @ChristopherGerdes got it. Weirdly, your logs show it trying to connect to HA cloud. Can you send me a screenshot of how your external URL is configured in the app? Of course you can cover any private info.

Yup, I think this is what you're wanting:
image

Incidentally, is there a way to get to these settings without killing the app and doing the race to hit the config icon before it disappears? It usually takes me 5 or more tries to get into the menu.

@bgoncal
Member

bgoncal commented Nov 18, 2024

@ChristopherGerdes the message you saw was probably a red background message saying your SSL certificate was invalid, this message still exists and it is unrelated to this issue as far as I can tell

@bgoncal
Member

bgoncal commented Nov 18, 2024

@ChristopherGerdes swipe up with 3 fingers to access server switcher/access to settings

@ChristopherGerdes

> @ChristopherGerdes the message you saw was probably a red background message saying your SSL certificate was invalid, this message still exists and it is unrelated to this issue as far as I can tell

Gotcha, and ya, that sounds like the message. Was just sharing in case it helped.

@ChristopherGerdes

> @ChristopherGerdes swipe up with 3 fingers to access server switcher/access to settings

ohh, this is super helpful, thank you!

@bgoncal
Member

bgoncal commented Nov 21, 2024

New TestFlight beta available

@poldim

poldim commented Nov 22, 2024

I also had this issue a few times while traveling in EU this week. Unfortunately, I have no idea how to replicate it.

I didn't implement any different settings on my HA/RP config so I'm guessing it's something app based?

@jaredgisin
Author

> New TestFlight beta available

So far so good on the new beta.
