Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoiding getting stuck when failing due to uncaught exception #435

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Sp3EdeR
Copy link

@Sp3EdeR Sp3EdeR commented Apr 21, 2024

When an exception propagated outside of the script, the script failed to run for a while, exiting with the message:

Another iteration is currently running! Exiting...

This avoids such a case by resetting the last run time in case of an uncaught exception

@derekantrican
Copy link
Owner

Can you clarify why this is needed? What exception are you seeing?

At the very start (just below the Another iteration... log message) is where the LastRun prop is set:

PropertiesService.getUserProperties().setProperty('LastRun', new Date().getTime());

There shouldn't need to be any resetting of this property because it's just a timestamp. The script will run the next iteration when that timestamp is "too old".

@Sp3EdeR
Copy link
Author

Sp3EdeR commented Apr 22, 2024

Hello @derekantrican,
The timeout is currently set to 6 minutes. This can cause issues when the synchronization frequency is set to a more frequent time than this, by causing instantly exiting runs within the log. Furthermore, when testing or debugging a modification, having to wait 6 minutes between failing runs seems inefficient. With this modification, the script can be deterministically rerun right away, regardless of whether it succeeded or failed due to an exception.

@derekantrican
Copy link
Owner

derekantrican commented Apr 22, 2024

It is 6 minutes right now because that is the maximum execution time allowed by Google Apps Script: https://developers.google.com/apps-script/guides/services/quotas#current_limitations

We don't usually encourage syncing calendars faster than 5 minutes (I usually even encourage 15 minutes as this is still MUCH faster than Google Calendar's current speeds of many hours),

EDIT: removing my suggestion. I think I'm going to classify this PR suggestion in one of two buckets: 1) if you're getting some sort of exception, we should investigate the exception and see if we can fix it or 2) if your calendar is taking more time to sync than your howFrequent setting, then that is user error and you need to adjust your howFrequent setting to longer.

As to the point of "when testing or debugging...", I think if you are frequently testing changes with the script, you can figure out how to comment out this if block while testing.

@Sp3EdeR
Copy link
Author

Sp3EdeR commented Apr 22, 2024

Oh, I already fixed my issues. There is a branch in my fork that does some specialized work for my use-case. My primary problem was that Facebook randomly sends a reply that is HTTP 200, but does not parse as a valid ICS. In this case, the synchronization removed deleted all events from the affected calendar, which was a big problem for the service I based on the affected calendar. I did not go into this deep enough to try to log and catch what they send when it is an invalid file. I just modified the code so that invalid ICS files interrupt the sync run (since in my use-case, there is only one calendar to be synced anyway). This initially went through as an uncaught exception, for which I added specific handling in the callWithBackoff function, since essentially that was the behaviour I needed here. But before then, this behaviour caused one extra unnecessary testing round for me, so I thought I'd contribute it back, as it could be helpful.

As for commenting that code while debugging... I didn't really want to remove this check in a "live" deployment scenario, and the intermittentness of the issue + this behaviour made it a bit more annoying to sort out than I felt was necessary. No biggie though.

I understand the time recommendation, and it makes sense in most cases of course. But since the original code resets the LastRun on successful completion, this feature doesn't do anything about that (nor should it, I suppose). The only thing that this try..catch does is that it makes the fail-behaviour the same as the current success-behaviour is.

@darylmonge
Copy link

I had the same problem, with our cycling club calendar feed (strautomator.com) throwing a 500 error that was not caught by callWithBackoff and deleting all our (future) calendar entries. This was especially nasty because we have a zapier.com task that emails deleted (cancelled) ride events.

Oh, I already fixed my issues. There is a branch in my fork that does some specialized work for my use-case. My primary problem was that Facebook randomly sends a reply that is HTTP 200, but does not parse as a valid ICS. In this case, the synchronization removed deleted all events from the affected calendar, which was a big problem for the service I based on the affected calendar. I did not go into this deep enough to try to log and catch what they send when it is an invalid file. I just modified the code so that invalid ICS files interrupt the sync run (since in my use-case, there is only one calendar to be synced anyway). This initially went through as an uncaught exception, for which I added specific handling in the callWithBackoff function, since essentially that was the behaviour I needed here. But before then, this behaviour caused one extra unnecessary testing round for me, so I thought I'd contribute it back, as it could be helpful.

As for commenting that code while debugging... I didn't really want to remove this check in a "live" deployment scenario, and the intermittentness of the issue + this behaviour made it a bit more annoying to sort out than I felt was necessary. No biggie though.

I understand the time recommendation, and it makes sense in most cases of course. But since the original code resets the LastRun on successful completion, this feature doesn't do anything about that (nor should it, I suppose). The only thing that this try..catch does is that it makes the fail-behaviour the same as the current success-behaviour is.

@Sp3EdeR
Copy link
Author

Sp3EdeR commented Apr 24, 2024

I had the same problem, with our cycling club calendar feed (strautomator.com) throwing a 500 error that was not caught by callWithBackoff and deleting all our (future) calendar entries. This was especially nasty because we have a zapier.com task that emails deleted (cancelled) ride events.

If your issue persists, my customization for this is here: Sp3EdeR@5b3081b. Disclaimer, this does not handle the case when maxtries is reached, and callWithBackoff gives up. If the issue is that bad, it may be useful to throw an exception in the callWithBackoff ( tries > maxRetries) condition to fully fail the script run. This is off-topic for this PR though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants