
Jobs got stuck - need advice #2797

@antepetkovic0

Description


Hi guys,

We’re experiencing issues with Redis memory usage and job states in our queue system after enabling automatic scaling on our API services.
Our services automatically scale up and down based on the number of incoming requests. Each API instance creates and connects to a queue to process incoming jobs. However, we haven’t implemented a graceful shutdown process, meaning queues aren’t explicitly closed (queue.close()) when a service shuts down.
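For context, this is roughly the graceful shutdown we are considering adding (a sketch only; the queue name and SIGTERM handling are assumptions, not our exact setup):

```ts
// Sketch of the graceful shutdown we plan to add (assumes Bull).
// The queue name and the SIGTERM hook are illustrative.
import Queue from 'bull';

const jobQueue = new Queue('api-jobs', process.env.REDIS_URL ?? 'redis://127.0.0.1:6379');

process.on('SIGTERM', async () => {
  // close() waits for the currently active job to finish and then
  // releases the Redis connections held by this instance.
  await jobQueue.close();
  process.exit(0);
});
```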

During scale-up events, we noticed Redis memory usage steadily increased but never returned to its original level. Although we remove both completed and failed jobs, memory usage kept growing. Eventually, Redis reached 100% memory utilization, causing downtime because no new jobs could be processed.
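For reference, removal of finished jobs in our setup looks roughly like this (a simplified sketch with placeholder names; the exact per-queue options differ):

```ts
// Simplified sketch of how we already drop finished jobs (assumes Bull).
// Even with these options, memory kept growing, presumably because the
// waiting (not yet processed) jobs themselves kept accumulating.
import Queue from 'bull';

const jobQueue = new Queue('api-jobs', process.env.REDIS_URL ?? 'redis://127.0.0.1:6379');

await jobQueue.add(
  { payload: 'example' },
  {
    removeOnComplete: true, // delete the job data from Redis on success
    removeOnFail: true,     // and on failure
  }
);
```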

After increasing the Redis memory limit and disabling auto-scaling, everything stabilized. However, we now have around 4 million jobs in a “stuck” state (job.getState() === "stuck").

Questions

  • Is there any way to move these "stuck" jobs back to the "waiting" or "active" state?
  • If not, what’s the best way to safely delete them: should we filter by timestamp, or by checking job.getState() === "stuck" (the documentation advises that this is not performant)? Rough sketches of both approaches follow this list.
  • Could this issue be caused by improper scale-down behaviour (e.g., shutting down services without properly closing their queues)?
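To make the second question concrete, this is how we imagine both cleanup approaches would look (a sketch only, assuming Bull with its default incrementing numeric job IDs; the ID range and cutoff are placeholders, and we’re not sure either approach is safe at ~4M jobs):

```ts
// Sketch of the two cleanup approaches we are weighing (assumes Bull).
// FIRST_ID, LAST_ID and CUTOFF are placeholders, not real values.
import Queue from 'bull';

const jobQueue = new Queue('api-jobs', process.env.REDIS_URL ?? 'redis://127.0.0.1:6379');

const FIRST_ID = 1;                                  // placeholder
const LAST_ID = 4_000_000;                           // placeholder
const CUTOFF = Date.now() - 7 * 24 * 60 * 60 * 1000; // placeholder: "older than a week"

for (let id = FIRST_ID; id <= LAST_ID; id++) {
  const job = await jobQueue.getJob(id);
  if (!job) continue;

  // Approach 1: filter purely by creation timestamp (job.timestamp is set
  // when the job is added), avoiding the extra getState() round trip.
  const tooOld = job.timestamp < CUTOFF;

  // Approach 2: check the state explicitly, which the documentation warns
  // is not performant at this scale.
  const isStuck = (await job.getState()) === 'stuck';

  if (tooOld || isStuck) {
    await job.remove();
  }
}
```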

Additional Notes

  • All of the stuck jobs belong to rate-limited queues (configured roughly as in the sketch after this list), so we assume the 4M jobs were added to the queue before the memory limit was reached and were simply waiting to be processed
  • We have other queues that are not rate-limited, and all of their jobs were processed
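For clarity, the rate-limited queues are configured roughly like this (illustrative values only; our real max/duration are different):

```ts
// Illustrative sketch of a rate-limited queue (assumes Bull's limiter option).
import Queue from 'bull';

const rateLimitedQueue = new Queue('rate-limited-jobs', process.env.REDIS_URL ?? 'redis://127.0.0.1:6379', {
  limiter: {
    max: 100,       // at most 100 jobs processed...
    duration: 1000, // ...per 1000 ms
  },
});
```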
