Request Routing and Snitches in Cassandra

Last updated: January 25, 2024

Written by: baeldung

1. Overview

In this tutorial, we’ll learn about the job of a snitch and how Cassandra uses it to efficiently route requests. We’ll also look at various types of snitches available in Cassandra.

2. What Is a Snitch?

A snitch simply reports the rack and datacenter to which each node belongs – essentially, it determines and informs Cassandra about the network topology of the cluster.

With this knowledge of the cluster’s topology, including relative proximity between nodes, Cassandra is able to efficiently route requests to the appropriate nodes within the cluster.

2.1. Snitch on Write Operation

Cassandra uses information from the snitch to group nodes into racks and datacenters. To avoid correlated failures, such as an entire rack going down at once, Cassandra tries not to store more than one replica of the same data on the same rack during a write operation.
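To make this concrete, here's a minimal Python sketch of rack-aware placement. It's a simplification under our own assumptions, not Cassandra's actual code: the function name place_replicas, the node names, and the single-datacenter ring are all hypothetical. We walk the ring in order, prefer nodes on racks we haven't used yet, and only fall back to an already-used rack when there are fewer racks than replicas:

```python
def place_replicas(ring, rack_of, replication_factor):
    """Pick replica nodes from `ring` (nodes in ring order), spreading them across racks.

    Hypothetical helper for illustration only; not Cassandra's implementation.
    """
    chosen, seen_racks, skipped = [], set(), []
    for node in ring:
        if len(chosen) == replication_factor:
            break
        rack = rack_of[node]
        if rack in seen_racks:
            skipped.append(node)      # same rack as an earlier replica: defer it
        else:
            chosen.append(node)
            seen_racks.add(rack)
    # Fewer racks than replicas: fill the remaining slots from the deferred nodes.
    for node in skipped:
        if len(chosen) == replication_factor:
            break
        chosen.append(node)
    return chosen


ring = ["n1", "n2", "n3", "n4"]
rack_of = {"n1": "RAC1", "n2": "RAC1", "n3": "RAC2", "n4": "RAC2"}
print(place_replicas(ring, rack_of, 2))  # ['n1', 'n3']
```

With two replicas, the second copy skips n2 because it shares RAC1 with n1 and lands on n3 instead.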

2.2. Snitch on Read Operation

During a read operation, Cassandra must contact a number of replica nodes that depends on the consistency level of the request. To make the read efficient, Cassandra uses information from the snitch to identify the replica that is likely to respond fastest.

This node is then queried for the full row data. Cassandra then queries the other replica nodes only for hash values (digests) of the same data, to ensure the latest data is returned.
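The idea of one full read plus digest checks can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the read function, the latency table, and the in-memory replicas dictionary are hypothetical, and what Cassandra does when digests disagree (reconciling by timestamp and repairing replicas) is not modeled here:

```python
import hashlib


def digest(row: bytes) -> str:
    # Any stable hash works for this sketch.
    return hashlib.md5(row).hexdigest()


def read(replicas, latency_ms, required):
    """Read from the `required` closest replicas.

    Returns the full row from the fastest replica and a flag telling
    whether the digests from the other contacted replicas matched it.
    """
    ordered = sorted(replicas, key=lambda name: latency_ms[name])[:required]
    data_replica, digest_replicas = ordered[0], ordered[1:]
    row = replicas[data_replica]                      # full data request
    consistent = all(digest(replicas[name]) == digest(row)
                     for name in digest_replicas)     # digest-only requests
    return row, consistent


replicas = {"a": b"v2", "b": b"v2", "c": b"v1"}       # replica "c" is stale
latency_ms = {"a": 5, "b": 2, "c": 9}
print(read(replicas, latency_ms, 2))  # (b'v2', True)
```

Contacting all three replicas instead would surface the stale copy on "c" as a digest mismatch.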

3. Types of Snitches

The default snitch, SimpleSnitch, is not topology-aware. That is, it doesn’t know the racks and datacenters for the cluster. Therefore, it’s not suitable for multiple-datacenter deployments.

For this reason, Cassandra provides several other snitches that cover different deployment needs. We configure the snitch through the endpoint_snitch property in the cassandra.yaml configuration file.
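For instance, selecting a snitch in cassandra.yaml looks like the fragment below. The choice of GossipingPropertyFileSnitch is only an illustration; any of the snitches covered in this section can be named here:

```yaml
# cassandra.yaml
endpoint_snitch: GossipingPropertyFileSnitch
```

All nodes in a cluster should be configured with the same snitch.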

Let’s look at a few types of snitches and how they work.

3.1. PropertyFileSnitch

The PropertyFileSnitch is a rack-aware snitch. We provide the cluster topology as key-value properties in a file called cassandra-topology.properties. In this file, we specify the datacenter and rack to which each node belongs.

We can use any names for the racks and datacenters. However, we need to make sure the datacenter names match the ones used in the NetworkTopologyStrategy replication settings of the keyspace definition.

Here's an example of the cassandra-topology.properties file:

# Cassandra Node IP=Data Center:Rack 
172.86.22.125=DC1:RAC1 
172.80.23.120=DC1:RAC1 
172.84.25.127=DC1:RAC1 

192.53.34.122=DC1:RAC2 
192.55.36.134=DC1:RAC2 
192.57.30.112=DC1:RAC2

# default for unknown nodes 
default=DC1:RAC1

In the above example, there are two racks (RAC1, RAC2) and one datacenter (DC1). Any node IP not covered will fall into the default datacenter (DC1) and rack (RAC1).

One drawback of this snitch is that we must keep the cassandra-topology.properties file identical on every node in the cluster.

3.2. GossipingPropertyFileSnitch

GossipingPropertyFileSnitch is also a rack-aware snitch. To avoid the manual syncing of racks and datacenters that PropertyFileSnitch requires, we define only each node's own datacenter and rack names in that node's cassandra-rackdc.properties file.

Each node's rack and datacenter information is then exchanged with all other nodes using the gossip protocol.

Here's an example of the cassandra-rackdc.properties file:

dc=DC1
rack=RAC1

3.3. Ec2Snitch

As the name suggests, Ec2Snitch is meant for cluster deployments on Amazon Web Services (AWS) EC2 within a single region. The region name is treated as the datacenter and the availability zone name is considered the rack.

If we only need a single-datacenter deployment, no property configuration is required. For multiple-datacenter deployments, we can specify the dc_suffix in the cassandra-rackdc.properties file.

For example, in the us-east region, if we need two datacenters, we can set the dc_suffix on the nodes of each datacenter as:

# on the nodes of the first datacenter
dc_suffix=_1_DC1
# on the nodes of the second datacenter
dc_suffix=_1_DC2

Each setting above goes on a different set of nodes, never both in the same file. The suffix is appended to the region name, so the resulting datacenter names are us-east_1_DC1 and us-east_1_DC2.

3.4. Ec2MultiRegionSnitch

In the case of multi-region cluster deployment in AWS, we should use the Ec2MultiRegionSnitch. Moreover, we need to configure both cassandra.yaml and cassandra-rackdc.properties.

We have to configure the node's public IP address as the broadcast_address and, if the node is a seed, use that public IP address in the seeds list in cassandra.yaml. Additionally, we must configure the private IP address as the listen_address. Finally, we have to open the storage_port or ssl_storage_port on the public IP firewall.
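As a rough sketch, the relevant part of cassandra.yaml for one node could look like the fragment below. All IP addresses are hypothetical placeholders, and the port values shown are the commonly used defaults, so they should be checked against the actual deployment:

```yaml
# cassandra.yaml (illustrative values only)
endpoint_snitch: Ec2MultiRegionSnitch

listen_address: 10.0.1.15          # private IP of this node (hypothetical)
broadcast_address: 203.0.113.10    # public IP of this node (hypothetical)

seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # seeds are listed by public IP so nodes in other regions can reach them
      - seeds: "203.0.113.10,203.0.113.20"

storage_port: 7000                 # inter-node port to open on the public firewall
ssl_storage_port: 7001             # used instead when inter-node encryption is enabled
```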

These settings allow the nodes to communicate across AWS regions, thus enabling multiple-datacenter deployments across regions. Nodes within the same region switch to their private IP addresses for communication once the connection is established.

The dc_suffix setting in cassandra-rackdc.properties works the same way for each node as it does with Ec2Snitch.

3.5. GoogleCloudSnitch

As the name suggests, GoogleCloudSnitch is for cluster deployments in the Google Cloud Platform across one or more regions. Similar to AWS, the region name is considered the datacenter and the availability zone is the rack.

We don’t need any configuration in the case of a single-datacenter deployment. Conversely, in the case of multiple-datacenter deployments, similar to Ec2Snitch, we can set the dc_suffix in the cassandra-rackdc.properties file.

3.6. RackInferringSnitch

RackInferringSnitch infers the topology from the nodes' IP addresses: the second octet identifies the datacenter and the third octet identifies the rack.
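The octet mapping is easy to see with a small Python sketch. The function name infer_location and the sample address are our own, purely for illustration:

```python
import ipaddress
from typing import Tuple


def infer_location(ip: str) -> Tuple[str, str]:
    """Return (datacenter, rack) inferred from an IPv4 address.

    Datacenter = second octet, rack = third octet.
    Raises ValueError if `ip` is not a valid IPv4 address.
    """
    octets = ipaddress.IPv4Address(ip).packed
    return str(octets[1]), str(octets[2])


print(infer_location("110.100.200.105"))  # ('100', '200')
```

Here, the node 110.100.200.105 is placed in datacenter 100 and rack 200, so this snitch only makes sense when the IP addressing scheme really mirrors the physical layout.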

4. Dynamic Snitching

By default, Cassandra wraps whichever snitch we configure in the cassandra.yaml file with another snitch called the DynamicEndpointSnitch. This dynamic snitch gets the basic cluster topology from the underlying, configured snitch. It then monitors the read latency of nodes, even keeping track of compaction operations on each node.

The dynamic snitch then uses this performance data to select the best replica node for a read query. This way, Cassandra avoids routing read requests to bad or busy (slow-performing) replica nodes.

The dynamic snitch uses a modified version of the Phi accrual failure detection mechanism used by gossip to determine the best replica node for a read request. The badness threshold is a configurable parameter that determines how much worse a preferred node must perform, compared to the best-performing node, before it loses its preferential status.
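The role of the badness threshold can be illustrated with a simplified Python sketch. It's not Cassandra's implementation: the function order_replicas, the score values, and the rule of comparing only the first preferred node against the best score are our own assumptions, with lower scores meaning better performance:

```python
def order_replicas(preferred_order, scores, badness_threshold):
    """Keep the underlying snitch's order unless its first choice is too slow.

    preferred_order: replicas as ordered by the underlying snitch
    scores: replica -> latency-based score (lower is better)
    badness_threshold: e.g. 0.1 tolerates a preferred node up to 10% worse
    """
    best = min(scores[replica] for replica in preferred_order)
    preferred = preferred_order[0]
    if scores[preferred] > best * (1 + badness_threshold):
        # Preferred node lost its status: order purely by measured performance.
        return sorted(preferred_order, key=lambda replica: scores[replica])
    return list(preferred_order)


print(order_replicas(["a", "b"], {"a": 1.0, "b": 0.95}, 0.1))  # ['a', 'b']
print(order_replicas(["a", "b"], {"a": 2.0, "b": 1.0}, 0.1))   # ['b', 'a']
```

In the first call, node a is only slightly slower than b, so it stays preferred. In the second, it's twice as slow and gets demoted.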

Cassandra periodically resets the performance scores of all nodes, giving a poorly performing node the chance to recover and reclaim its preferential status.
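These behaviors are tunable in cassandra.yaml. The fragment below uses the property names found in Cassandra releases before 4.1, while newer releases may spell the interval properties differently. The values are illustrative rather than recommendations:

```yaml
# cassandra.yaml (illustrative values; check the names against the installed version)
dynamic_snitch_update_interval_in_ms: 100      # how often node scores are recalculated
dynamic_snitch_reset_interval_in_ms: 600000    # how often all scores are reset
dynamic_snitch_badness_threshold: 0.1          # how much worse a preferred node may be
```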

5. Conclusion

In this tutorial, we learned what a snitch is and went through the snitch types available for us to configure in a Cassandra cluster deployment, along with the dynamic snitch that wraps them.
