Spring Top

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE

1. Introduction

Asynchronous messaging is a type of loosely-coupled distributed communication that is becoming increasingly popular for implementing event-driven architectures. Fortunately, the Spring Framework provides the Spring AMQP project allowing us to build AMQP-based messaging solutions.

On the other hand, dealing with errors in such environments can be a non-trivial task. So in this tutorial, we'll cover different strategies for handling errors.

2. Environment Setup

For this tutorial, we’ll use RabbitMQ which implements the AMQP standard. Also, Spring AMQP provides the spring-rabbit module which makes integration really easy.

Let's run RabbitMQ as a standalone server. We’ll run it in a Docker container by executing the following command:

docker run -d -p 5672:5672 -p 15672:15672 --name my-rabbit rabbitmq:3-management

For detailed configuration and project dependencies setup, please refer to our Spring AMQP article.

3. Failure Scenario

Usually, there are more types of errors that can occur in messaging-based systems compared to a monolith or single-packaged applications due to its distributed nature.

We can point out some of the types of exceptions:

  • Network- or I/O-related – general failures of network connections and I/O operations
  • Protocol- or infrastructure-related – errors that usually represent misconfiguration of the messaging infrastructure
  • Broker-related – failures that warn about improper configuration between clients and an AMQP broker. For instance, reaching defined limits or threshold, authentication or invalid policies configuration
  • Application- and message-related – exceptions that usually indicate a violation of some business or application rules

Certainly, this list of failures is not exhaustive but contains the most common type of errors.

We should note that Spring AMQP handles connection-related and low-level issues out of the box, for example by applying retry or requeue policies. Additionally, most of the failures and faults are converted into an AmqpException or one of its subclasses.

In the next sections, we’ll mostly focus on application-specific and high-level errors and then cover global error handling strategies.

4. Project Setup

Now, let’s define a simple queue and exchange configuration to start:

public static final String QUEUE_MESSAGES = "baeldung-messages-queue";
public static final String EXCHANGE_MESSAGES = "baeldung-messages-exchange";

@Bean
Queue messagesQueue() {
    return QueueBuilder.durable(QUEUE_MESSAGES)
      .build();
}
 
@Bean
DirectExchange messagesExchange() {
    return new DirectExchange(EXCHANGE_MESSAGES);
}
 
@Bean
Binding bindingMessages() {
    return BindingBuilder.bind(messagesQueue()).to(messagesExchange()).with(QUEUE_MESSAGES);
}

Next, let’s create a simple producer:

public void sendMessage() {
    rabbitTemplate
      .convertAndSend(SimpleDLQAmqpConfiguration.EXCHANGE_MESSAGES,
        SimpleDLQAmqpConfiguration.QUEUE_MESSAGES, "Some message id:" + messageNumber++);
}

And finally, a consumer that throws an exception:

@RabbitListener(queues = SimpleDLQAmqpConfiguration.QUEUE_MESSAGES)
public void receiveMessage(Message message) throws BusinessException {
    throw new BusinessException();
}

By default, all failed messages will be immediately requeued at the head of the target queue over and over again.

Let's run our sample application by executing the next Maven command:

mvn spring-boot:run -Dstart-class=com.baeldung.springamqp.errorhandling.ErrorHandlingApp

Now we should see the similar resulting output:

WARN 22260 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler :
  Execution of Rabbit message listener failed.
Caused by: com.baeldung.springamqp.errorhandling.errorhandler.BusinessException: null

Consequently, by default, we will see an infinite number of such messages in the output.

To change this behavior we have two options:

  • Set the default-requeue-rejected option to false on the listener side – spring.rabbitmq.listener.simple.default-requeue-rejected=false
  • Throw an AmqpRejectAndDontRequeueException – this might be useful for messages that won’t make sense in the future, so they can be discarded.

Now, let’s discover how to process failed messages in a more intelligent way.

5. Dead Letter Queue

A Dead Letter Queue (DLQ) is a queue that holds undelivered or failed messages. A DLQ allows us to handle faulty or bad messages, monitor failure patterns and recover from exceptions in a system.

More importantly, this helps to prevent infinite loops in queues that are constantly processing bad messages and degrading system performance.

Altogether, there are two main concepts: Dead Letter Exchange (DLX) and a Dead Letter Queue (DLQ) itself. In fact, DLX is a normal exchange that we can define as one of the common types: direct, topic or fanout.

It's very important to understand that a producer doesn’t know anything about queues. It's only aware of exchanges and all produced messages are routed according to the exchange configuration and the message routing key.

Now let’s see how to handle exceptions by applying the Dead Letter Queue approach.

5.1. Basic Configuration

In order to configure a DLQ we need to specify additional arguments while defining our queue:

@Bean
Queue messagesQueue() {
    return QueueBuilder.durable(QUEUE_MESSAGES)
      .withArgument("x-dead-letter-exchange", "")
      .withArgument("x-dead-letter-routing-key", QUEUE_MESSAGES_DLQ)
      .build();
}
 
@Bean
Queue deadLetterQueue() {
    return QueueBuilder.durable(QUEUE_MESSAGES_DLQ).build();
}

In the above example, we’ve used two additional arguments: x-dead-letter-exchange and x-dead-letter-routing-key. The empty string value for the x-dead-letter-exchange option tells the broker to use the default exchange.

The second argument is as equally important as setting routing keys for simple messages. This option changes the initial routing key of the message for further routing by DLX.

5.2. Failed Messages Routing

So, when a message fails to deliver, it's routed to the Dead Letter Exchange. But as we’ve already noted, DLX is a normal exchange. Therefore, if the failed message routing key doesn’t match the exchange, it won’t be delivered to the DLQ.

Exchange: (AMQP default)
Routing Key: baeldung-messages-queue.dlq

So, if we omit the x-dead-letter-routing-key argument in our example, the failed message will be stuck in an infinite retry loop.

Additionally, the original meta information of the message is available in the x-death header:

x-death:
  count: 1
  exchange: baeldung-messages-exchange
  queue: baeldung-messages-queue 
  reason: rejected
  routing-keys: baeldung-messages-queue 
  time: 1571232954

The information above is available in the RabbitMQ management console usually running locally on port 15672.

Besides this configuration, if we are using Spring Cloud Stream we can even simplify the configuration process by leveraging configuration properties republishToDlq and autoBindDlq.

5.3. Dead Letter Exchange

In the previous section, we’ve seen that the routing key is changed when a message is routed to the dead letter exchange. But this behavior is not always desirable. We can change it by configuring DLX by ourselves and defining it using the fanout type:

public static final String DLX_EXCHANGE_MESSAGES = QUEUE_MESSAGES + ".dlx";
 
@Bean
Queue messagesQueue() {
    return QueueBuilder.durable(QUEUE_MESSAGES)
      .withArgument("x-dead-letter-exchange", DLX_EXCHANGE_MESSAGES)
      .build();
}
 
@Bean
FanoutExchange deadLetterExchange() {
    return new FanoutExchange(DLX_EXCHANGE_MESSAGES);
}
 
@Bean
Queue deadLetterQueue() {
    return QueueBuilder.durable(QUEUE_MESSAGES_DLQ).build();
}
 
@Bean
Binding deadLetterBinding() {
    return BindingBuilder.bind(deadLetterQueue()).to(deadLetterExchange());
}

This time we’ve defined a custom exchange of the fanout type, so messages will be sent to all bounded queues. Furthermore, we've set the value of the x-dead-letter-exchange argument to the name of our DLX. At the same time, we’ve removed the x-dead-letter-routing-key argument.

Now if we run our example the failed message should be delivered to the DLQ, but without changing the initial routing key:

Exchange: baeldung-messages-queue.dlx
Routing Key: baeldung-messages-queue

5.4. Processing Dead Letter Queue Messages

Of course, the reason we moved them to the Dead Letter Queue is so they can be reprocessed at another time.

Let’s define a listener for the Dead Letter Queue:

@RabbitListener(queues = QUEUE_MESSAGES_DLQ)
public void processFailedMessages(Message message) {
    log.info("Received failed message: {}", message.toString());
}

If we run our code example now, we should see the log output:

WARN 11752 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler :
  Execution of Rabbit message listener failed.
INFO 11752 --- [ntContainer#1-1] c.b.s.e.consumer.SimpleDLQAmqpContainer  : 
  Received failed message:

We’ve got a failed message, but what should we do next? The answer depends on specific system requirements, the kind of the exception or type of the message.

For instance, we can just requeue the message to the original destination:

@RabbitListener(queues = QUEUE_MESSAGES_DLQ)
public void processFailedMessagesRequeue(Message failedMessage) {
    log.info("Received failed message, requeueing: {}", failedMessage.toString());
    rabbitTemplate.send(EXCHANGE_MESSAGES, 
      failedMessage.getMessageProperties().getReceivedRoutingKey(), failedMessage);
}

But such exception logic is not dissimilar from the default retry policy:

INFO 23476 --- [ntContainer#0-1] c.b.s.e.c.RoutingDLQAmqpContainer        :
  Received message: 
WARN 23476 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler :
  Execution of Rabbit message listener failed.
INFO 23476 --- [ntContainer#1-1] c.b.s.e.c.RoutingDLQAmqpContainer        : 
  Received failed message, requeueing:

A common strategy may need to retry processing a message for n times and then reject it. Let’s implement this strategy by leveraging message headers:

public void processFailedMessagesRetryHeaders(Message failedMessage) {
    Integer retriesCnt = (Integer) failedMessage.getMessageProperties()
      .getHeaders().get(HEADER_X_RETRIES_COUNT);
    if (retriesCnt == null) retriesCnt = 1;
    if (retriesCnt > MAX_RETRIES_COUNT) {
        log.info("Discarding message");
        return;
    }
    log.info("Retrying message for the {} time", retriesCnt);
    failedMessage.getMessageProperties()
      .getHeaders().put(HEADER_X_RETRIES_COUNT, ++retriesCnt);
    rabbitTemplate.send(EXCHANGE_MESSAGES, 
      failedMessage.getMessageProperties().getReceivedRoutingKey(), failedMessage);
}

At first, we are getting the value of the x-retries-count header, then we compare this value with the maximum allowed value. Subsequently, if the counter reaches the attempts limit number the message will be discarded:

WARN 1224 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler : 
  Execution of Rabbit message listener failed.
INFO 1224 --- [ntContainer#1-1] c.b.s.e.consumer.DLQCustomAmqpContainer  : 
  Retrying message for the 1 time
WARN 1224 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler : 
  Execution of Rabbit message listener failed.
INFO 1224 --- [ntContainer#1-1] c.b.s.e.consumer.DLQCustomAmqpContainer  : 
  Retrying message for the 2 time
WARN 1224 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler : 
  Execution of Rabbit message listener failed.
INFO 1224 --- [ntContainer#1-1] c.b.s.e.consumer.DLQCustomAmqpContainer  : 
  Discarding message

We should add that we can also make use of the x-message-ttl header to set a time after that the message should be discarded. This might be helpful for preventing queues to grow infinitely.

5.5. Parking Lot Queue

On the other hand, consider a situation when we cannot just discard a message, it could be a transaction in the banking domain for example. Alternatively, sometimes a message may require manual processing or we simply need to record messages that failed more than n times.

For situations like this, there is a concept of a Parking Lot Queue. We can forward all messages from the DLQ, that failed more than the allowed number of times, to the Parking Lot Queue for further processing.

Let’s now implement this idea:

public static final String QUEUE_PARKING_LOT = QUEUE_MESSAGES + ".parking-lot";
public static final String EXCHANGE_PARKING_LOT = QUEUE_MESSAGES + "exchange.parking-lot";
 
@Bean
FanoutExchange parkingLotExchange() {
    return new FanoutExchange(EXCHANGE_PARKING_LOT);
}
 
@Bean
Queue parkingLotQueue() {
    return QueueBuilder.durable(QUEUE_PARKING_LOT).build();
}
 
@Bean
Binding parkingLotBinding() {
    return BindingBuilder.bind(parkingLotQueue()).to(parkingLotExchange());
}

Secondly, let’s refactor the listener logic to send a message to the parking lot queue:

@RabbitListener(queues = QUEUE_MESSAGES_DLQ)
public void processFailedMessagesRetryWithParkingLot(Message failedMessage) {
    Integer retriesCnt = (Integer) failedMessage.getMessageProperties()
      .getHeaders().get(HEADER_X_RETRIES_COUNT);
    if (retriesCnt == null) retriesCnt = 1;
    if (retriesCnt > MAX_RETRIES_COUNT) {
        log.info("Sending message to the parking lot queue");
        rabbitTemplate.send(EXCHANGE_PARKING_LOT, 
          failedMessage.getMessageProperties().getReceivedRoutingKey(), failedMessage);
        return;
    }
    log.info("Retrying message for the {} time", retriesCnt);
    failedMessage.getMessageProperties()
      .getHeaders().put(HEADER_X_RETRIES_COUNT, ++retriesCnt);
    rabbitTemplate.send(EXCHANGE_MESSAGES, 
      failedMessage.getMessageProperties().getReceivedRoutingKey(), failedMessage);
}

Eventually, we also need to process messages that arrive at the parking lot queue:

@RabbitListener(queues = QUEUE_PARKING_LOT)
public void processParkingLotQueue(Message failedMessage) {
    log.info("Received message in parking lot queue");
    // Save to DB or send a notification.
}

Now we can save the failed message to the database or perhaps send an email notification.

Let's test this logic by running our application:

WARN 14768 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler : 
  Execution of Rabbit message listener failed.
INFO 14768 --- [ntContainer#1-1] c.b.s.e.c.ParkingLotDLQAmqpContainer     : 
  Retrying message for the 1 time
WARN 14768 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler : 
  Execution of Rabbit message listener failed.
INFO 14768 --- [ntContainer#1-1] c.b.s.e.c.ParkingLotDLQAmqpContainer     : 
  Retrying message for the 2 time
WARN 14768 --- [ntContainer#0-1] s.a.r.l.ConditionalRejectingErrorHandler : 
  Execution of Rabbit message listener failed.
INFO 14768 --- [ntContainer#1-1] c.b.s.e.c.ParkingLotDLQAmqpContainer     : 
  Sending message to the parking lot queue
INFO 14768 --- [ntContainer#2-1] c.b.s.e.c.ParkingLotDLQAmqpContainer     : 
  Received message in parking lot queue

As we can see from the output, after several failed attempts, the message was sent to the Parking Lot Queue.

6. Custom Error Handling

In the previous section, we’ve seen how to handle failures with dedicated queues and exchanges. However, sometimes we may need to catch all errors, for example for logging or persisting them to the database.

6.1. Global ErrorHandler

Until now, we’ve used the default SimpleRabbitListenerContainerFactory and this factory by default uses ConditionalRejectingErrorHandler. This handler catches different exceptions and transforms them into one of the exceptions within the AmqpException hierarchy.

It's important to mention that if we need to handle connection errors, then we need to implement the ApplicationListener interface.

Simply put, ConditionalRejectingErrorHandler decides whether to reject a specific message or not. When the message that caused an exception is rejected, it won’t be requeued.

Let’s define a custom ErrorHandler that will simply requeue only BusinessExceptions:

public class CustomErrorHandler implements ErrorHandler {
    @Override
    public void handleError(Throwable t) {
        if (!(t.getCause() instanceof BusinessException)) {
            throw new AmqpRejectAndDontRequeueException("Error Handler converted exception to fatal", t);
        }
    }
}

Furthermore, as we are throwing the exception inside our listener method it is wrapped in a ListenerExecutionFailedException. So, we need to call the getCause method to get a source exception.

6.2. FatalExceptionStrategy

Under the hood, this handler uses the FatalExceptionStrategy to check whether an exception should be considered fatal. If so, the failed message will be rejected.

By default these exceptions are fatal:

  • MessageConversionException
  • MessageConversionException
  • MethodArgumentNotValidException
  • MethodArgumentTypeMismatchException
  • NoSuchMethodException
  • ClassCastException

Instead of implementing the ErrorHandler interface, we can just provide our FatalExceptionStrategy:

public class CustomFatalExceptionStrategy 
      extends ConditionalRejectingErrorHandler.DefaultExceptionStrategy {
    @Override
    public boolean isFatal(Throwable t) {
        return !(t.getCause() instanceof BusinessException);
    }
}

Finally, we need to pass our custom strategy to the ConditionalRejectingErrorHandler constructor:

@Bean
public SimpleRabbitListenerContainerFactory rabbitListenerContainerFactory(
  ConnectionFactory connectionFactory,
  SimpleRabbitListenerContainerFactoryConfigurer configurer) {
      SimpleRabbitListenerContainerFactory factory = 
        new SimpleRabbitListenerContainerFactory();
      configurer.configure(factory, connectionFactory);
      factory.setErrorHandler(errorHandler());
      return factory;
}
 
@Bean
public ErrorHandler errorHandler() {
    return new ConditionalRejectingErrorHandler(customExceptionStrategy());
}
 
@Bean
FatalExceptionStrategy customExceptionStrategy() {
    return new CustomFatalExceptionStrategy();
}

7. Conclusion

In this tutorial, we've discussed different ways of handling errors while using Spring AMQP, and RabbitMQ in particular.

Every system needs a specific error handling strategy. We've covered the most common ways of error handling in event-driven architectures. Furthermore, we've seen that we can combine multiple strategies to build a more comprehensive and robust solution.

As always, the full source code of the article is available over on GitHub.

Spring bottom

I just announced the new Learn Spring course, focused on the fundamentals of Spring 5 and Spring Boot 2:

>> CHECK OUT THE COURSE
Comments are closed on this article!