# Queues in Postgres

Problem: We've got a queue of tasks that workers consume and work on. The database stores all the tasks; a worker comes along, takes the oldest available task, assigns itself to it, then begins its work.

Let's create a simple schema to represent the list of tasks and workers:

CREATE TABLE task (
  id SERIAL PRIMARY KEY,
  worker_id INTEGER
);

CREATE TABLE worker (
  id SERIAL PRIMARY KEY
);
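
To make the examples below concrete, here's one way the tables might be seeded (the three-task seed is purely illustrative). Since every column either has a default or is nullable, DEFAULT VALUES suffices:

-- register a worker
INSERT INTO worker DEFAULT VALUES;

-- enqueue three tasks; worker_id defaults to NULL, i.e. unassigned
INSERT INTO task DEFAULT VALUES;
INSERT INTO task DEFAULT VALUES;
INSERT INTO task DEFAULT VALUES;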

Assumptions:

  1. Task ids increase monotonically, so the oldest unassigned task is the one with the smallest id.
  2. A task whose worker_id is NULL is unassigned and available to be taken.

How do we assign workers to tasks?

# Naive Implementation

Here's a simple implementation:

UPDATE task
SET worker_id = ?
WHERE id = (
  SELECT MIN(id) AS task_id
  FROM task
  WHERE worker_id IS NULL
  LIMIT 1
);

The subquery selects the oldest task id currently in the queue that isn't assigned to a worker. If a task is found, we set its worker to be the one executing the query (using the ? placeholder for the worker's id).

On the surface this query is correct, however, how would it behave if there are multiple workers competing for tasks?

# Implementation 2: Fix Concurrency Bugs

Try and spot some concurrency bugs in this implementation. Can you find any?

...

Here's one: two workers work on the same task. But how? Let's see if we can reproduce this scenario.

-- Worker 0:
BEGIN;

-- Worker 1:
BEGIN;

-- Worker 0:
UPDATE task
SET worker_id = 0
WHERE id = (
  SELECT MIN(id) AS task_id
  FROM task
  WHERE worker_id IS NULL
  LIMIT 1
);

-- Worker 1 (blocks on the row locked by Worker 0):
UPDATE task
SET worker_id = 1
WHERE id = (
  SELECT MIN(id) AS task_id
  FROM task
  WHERE worker_id IS NULL
  LIMIT 1
);

-- Worker 0:
COMMIT;

-- Worker 1 (now unblocked):
COMMIT;

Result: the task ends up assigned to worker 1, and Worker 0's assignment is silently overwritten!

What we experienced here is a lost update. Implicit in this query is that it runs at the READ COMMITTED transaction isolation level -- the default in Postgres [1].

When Worker 1's UPDATE executes, it blocks on the row that Worker 0 updated until Worker 0 commits. Once that commit completes, Worker 1's UPDATE rereads the latest version of the row and rechecks the UPDATE's WHERE clause against it. Crucially, the subquery is not re-evaluated -- only the outer condition (id = task_id) is rechecked, and it still holds -- so Worker 1's UPDATE succeeds and commits over Worker 0's assignment.

The Postgres documentation on READ COMMITTED describes this behavior in full.

The change is pretty simple: just add an extra condition on the WHERE clause to ensure the worker_id is still NULL before doing the UPDATE:

UPDATE task
SET worker_id = ?
WHERE id = (
    SELECT MIN(id) AS task_id
    FROM task
    WHERE worker_id IS NULL
    LIMIT 1
  )
  AND worker_id IS NULL;

# Implementation 3: Better Performance

We're still faced with a subtle (but important) problem: blocking. As the documentation indicates, when a transaction updates (or deletes) a row that has been changed by another pending transaction, it must block until that transaction completes, then load the new version of the row if it changed. If the pending transaction rolled back instead, the UPDATE simply locks the row and changes it as usual.

For a highly-contended task queue, this scales poorly when more workers are competing for tasks to work on.

Is there a way to prevent the worker from waiting?

Postgres provides explicit locking clauses that let us choose certain behaviors when locking on rows. Two of them (as of Postgres 17) are NOWAIT and SKIP LOCKED.

# NOWAIT

Using this clause, if a row-level lock cannot be acquired the moment it is needed, Postgres will abort the transaction and issue an error to the client.

This may be useful for client-side retry logic and error handling, but for most queues we'll likely want the second option.
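
As a sketch, here's what the claim query could look like with NOWAIT. Note that FOR UPDATE can't be applied to an aggregate like MIN(), so the subquery orders by id and takes the first row instead:

UPDATE task
SET worker_id = ?
WHERE id = (
  SELECT id
  FROM task
  WHERE worker_id IS NULL
  ORDER BY id
  LIMIT 1
  FOR UPDATE NOWAIT
);
-- if the oldest unassigned row is already locked, this fails with
-- ERROR: could not obtain lock on row in relation "task"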

# SKIP LOCKED

This option will skip a row if the row-level lock cannot be acquired.

For queueing systems, this is likely the behavior that would be desired; the worker will simply give up on that task and attempt to find the next one to work on (if there are any). Since only one worker will be able to lock a row at a given time, it's not possible for a lost update to occur because any other competing workers would have lost the race and moved on.

Here's what the new query looks like:

UPDATE task
SET worker_id = ?
WHERE id = (
  SELECT id
  FROM task
  WHERE worker_id IS NULL
  ORDER BY id
  LIMIT 1
  FOR UPDATE SKIP LOCKED
);

(One structural change: FOR UPDATE cannot be combined with aggregate functions like MIN, so the subquery now orders by id and takes the first row, which is equivalent.)

Three things to observe with these changes:

  1. FOR UPDATE grants the transaction an exclusive row-level lock on the selected task's row. If the row was changed by a concurrent transaction that has since committed and released its lock (while the current transaction still sees the old row in its snapshot), SELECT ... FOR UPDATE reloads and locks the latest version of the row to ensure it still satisfies the WHERE clause's conditions.
  2. SKIP LOCKED will skip this row if the transaction is unable to acquire a lock.
  3. The extra worker_id IS NULL condition on the UPDATE is gone: the selected row is exclusively locked, so no other worker's selection query (which also uses FOR UPDATE SKIP LOCKED) can return it.

To summarize, this query won't cause transactions to block on a task that may have been taken by another. So for example, if 50 workers had tried to access this task at the same time, the winner will get to take ownership of the task, while the other 49 will move on to the next oldest task to take, and so on.
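
Putting it together, a worker's claim-and-complete cycle might look like the sketch below. The RETURNING clause and the DELETE-on-completion step are illustrative additions, not part of the schema above:

BEGIN;

-- claim the oldest unassigned task, skipping any that are locked
UPDATE task
SET worker_id = 1          -- this worker's id
WHERE id = (
  SELECT id
  FROM task
  WHERE worker_id IS NULL
  ORDER BY id
  LIMIT 1
  FOR UPDATE SKIP LOCKED
)
RETURNING id;              -- tells the worker which task it claimed

COMMIT;

-- ... the worker performs the task outside the transaction ...

-- remove the finished task from the queue
DELETE FROM task WHERE id = ?;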

# Isolation Levels

While the condition to take a task is pretty simple, in your production system you may have more requirements on how workers should take a task to work on.

Depending on the complexity of your task selection, you might need the stronger guarantees that levels such as REPEATABLE READ and SERIALIZABLE provide. At those levels, transaction rollbacks are common and can come at a performance cost for your application.
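
As a sketch, running the claim at a stricter level just means declaring it on the transaction and being prepared to retry:

BEGIN ISOLATION LEVEL REPEATABLE READ;

UPDATE task
SET worker_id = ?
WHERE id = (
  SELECT MIN(id)
  FROM task
  WHERE worker_id IS NULL
);

COMMIT;
-- if a concurrent transaction updated the row first, this fails with
-- ERROR: could not serialize access due to concurrent update
-- and the client must retry the whole transaction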

At work, I saw hundreds of thousands of transaction rollbacks per day when our task selection query was both expensive and running at the REPEATABLE READ isolation level. Once we refactored the query and were able to drop down to READ COMMITTED, our average task queue times dropped significantly. (Task queue time is how long a task waits before being picked up and owned by a worker.)

Now because we dropped down to this level, our task selection logic had to accept stale reads in certain circumstances, therefore affecting how workers accepted tasks. This is a change we were fine with for our use case.

So to put it simply: if your task selection process is complicated, choosing a stronger isolation level trades performance for correctness, and that cost can be significant. If your application can retry quickly after a rollback, the cost of a higher isolation level may be acceptable. Conversely, if rollbacks and retries are too slow, dropping down to READ COMMITTED with a locking strategy like Implementation 3's is likely more beneficial.

# Other Implementations

Postgres also provides LISTEN and NOTIFY for alerting workers to new tasks. This provides an opportunity for a different task selection implementation.
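
As a sketch of that approach (the channel name task_channel and the function name are arbitrary choices, not from a standard), a trigger can notify listening workers whenever a task is enqueued:

-- notify listeners whenever a task is inserted
CREATE FUNCTION notify_new_task() RETURNS trigger AS $$
BEGIN
  NOTIFY task_channel;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER task_inserted
AFTER INSERT ON task
FOR EACH ROW EXECUTE FUNCTION notify_new_task();

-- each worker subscribes with:
LISTEN task_channel;
-- and attempts to claim a task whenever a notification arrives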