pgBoss.stop doesn't remove active jobs #303

dolegi · 2022-02-04T14:37:06Z

Hey,
First off, thanks so much for pgboss; it is an extremely useful library!

When calling pgBoss.stop() and waiting for the stopped event, jobs that take longer than the timeout get stuck in an active state.

What currently happens

We have some singleton jobs that run for anywhere from ~10 minutes to just over 1 hour, so we have set them to expire only after 120 minutes. When we re-deploy our job workers, the active jobs stay in pgboss until they expire, so a job doesn't get re-triggered until the stale active job (which no worker is working on) expires.

Request

Ideally, when re-deploying, we could catch the SIGTERM and call pgboss.stop({timeout: x}), which would stop the worker and remove any active jobs.

TL;DR Request

Have pgBoss.stop() delete/update active jobs when the worker stops.

Or should we be manually deleting active jobs by tracking job IDs and updating the pgboss.job table ourselves? Is there a recommended way to approach this?

Related issues

#268

Thanks!

timgit · 2022-02-05T00:48:49Z

Hey, thanks! I agree with your suggestion, which is pretty similar to the expiration promise that is started along with jobs in the worker. I will look into an ideal way of opting into this.

Also, have you considered listening for SIGTERM in your worker callback function so you can fail the job yourself?
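
A minimal sketch of that idea, under stated assumptions: a shared AbortController is aborted on SIGTERM, and a wrapper makes the handler reject when that happens. pg-boss marks a job as failed when its handler throws or rejects, so the job should not sit in the active state until it expires. The names `failOnShutdown` and `shutdown` are illustrative and not part of pg-boss.

```javascript
// One shared controller for the whole process, aborted when SIGTERM arrives.
const shutdown = new AbortController();
process.once("SIGTERM", () => shutdown.abort());

// Hypothetical wrapper (not a pg-boss API): returns a handler that rejects
// as soon as `signal` aborts, so the job is reported as failed instead of
// staying active. The wrapped handler also receives the signal so it can
// stop its own work cooperatively.
function failOnShutdown(handler, signal) {
  return async (job) => {
    if (signal.aborted) throw new Error("worker shutting down");
    let onAbort;
    const aborted = new Promise((_, reject) => {
      onAbort = () => reject(new Error("worker shutting down"));
      signal.addEventListener("abort", onAbort, { once: true });
    });
    try {
      // Whichever settles first wins: the real work or the shutdown signal.
      return await Promise.race([handler(job, signal), aborted]);
    } finally {
      // Avoid accumulating listeners across many jobs.
      signal.removeEventListener("abort", onAbort);
    }
  };
}

// Usage sketch (assumes an existing pg-boss instance `boss`; the handler
// argument shape differs between pg-boss versions, so adapt as needed):
// boss.work("long-queue", failOnShutdown(async (job, signal) => {
//   /* long-running work that checks signal.aborted */
// }, shutdown.signal));
```

Note that the race only makes the wrapper reject; the original handler keeps running unless it checks the signal itself, and the process has to stay alive long enough for the failure to be written to the database.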

dolegi · 2022-02-07T13:54:37Z

Hi Tim, thanks for looking into it. We are considering updating the job statuses directly, but it feels wrong and contrary to how pgboss is meant to be used.

UPDATE pgboss.job SET state = '<abandoned>' WHERE state = 'active' AND id IN (<ids from the instance worker>);

We have to be careful to only update the job ids from the current instance, since other instance workers could still be actively processing jobs.
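
A rough sketch of how the per-instance tracking could look, assuming each worker process records the ids of the jobs it is currently handling. `ActiveJobTracker` and its methods are hypothetical names, not part of pg-boss. The target state `failed` and the `uuid` id type are assumptions about the pg-boss schema, and the table name `pgboss.job` is taken from this thread.

```javascript
// Hypothetical helper: remembers which job ids THIS process is working on,
// so a shutdown routine never touches jobs owned by other instances.
class ActiveJobTracker {
  constructor() {
    this.ids = new Set();
  }

  // Wraps a job handler so the id is tracked only while the handler runs.
  wrap(handler) {
    return async (job) => {
      this.ids.add(job.id);
      try {
        return await handler(job);
      } finally {
        this.ids.delete(job.id);
      }
    };
  }

  // Builds a parameterized query restricted to this instance's jobs.
  // Returns null when nothing is in flight. The default state "failed"
  // is an assumption; pick whatever state suits your retry policy.
  buildReleaseQuery(targetState = "failed") {
    if (this.ids.size === 0) return null;
    return {
      text:
        "UPDATE pgboss.job SET state = $1 " +
        "WHERE state = 'active' AND id = ANY($2::uuid[])",
      values: [targetState, [...this.ids]],
    };
  }
}
```

The `{ text, values }` shape matches what the `pg` client accepts in `client.query(...)`. Passing the ids as a bound array parameter avoids building the `IN (...)` list by string concatenation.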

StarpTech · 2023-08-27T10:05:17Z

Hi @timgit, any updates on this?

timgit · 2023-08-30T22:05:07Z

No work is being planned for this request right now. There is a reason SQS doesn't allow you to hold on to a message for hours, first of all. But long-running promises aside, I think the best approach would be to fail the jobs after the timeout. They would be eligible for retry at that point by another worker.
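
Until something like this exists in the library, the same behavior could be approximated in user code. The sketch below is only an illustration: `drainOrFail`, `inFlight`, and `failJob` are hypothetical names. It waits up to a grace period for in-flight handlers, then calls a caller-supplied `failJob` callback for each job still running, which could delegate to pg-boss's `fail` method (its signature differs between versions).

```javascript
// Waits up to `timeoutMs` for the given in-flight jobs, then fails the rest.
// inFlight: Map of jobId -> Promise for handlers running in THIS process.
// failJob:  caller-supplied async function, e.g. (id) => boss.fail(id)
//           (assumption: adapt to the fail() signature of your version).
// Resolves to the list of job ids that had to be failed.
async function drainOrFail(inFlight, failJob, timeoutMs) {
  const done = new Set();
  // Record each job as done whether its handler resolves or rejects.
  const watchers = [...inFlight].map(([id, promise]) =>
    Promise.resolve(promise).then(
      () => done.add(id),
      () => done.add(id)
    )
  );

  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(resolve, timeoutMs);
  });

  // Continue when every job has settled or the grace period is over.
  await Promise.race([Promise.all(watchers), timeout]);
  clearTimeout(timer);

  const stuck = [...inFlight.keys()].filter((id) => !done.has(id));
  await Promise.all(stuck.map((id) => failJob(id)));
  return stuck;
}
```

This would need to run before the database connection is closed, so that the failures are persisted and the jobs become eligible for retry by another worker.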

timgit · 2023-08-31T14:17:42Z

I'll consider adding this into v10

dolegi changed the title ~~stop doesn't remove active jobs~~ pgBoss.stop doesn't remove active jobs Feb 4, 2022

timgit added the enhancement label Feb 25, 2022
