A Guide to GemFire with Spring Data

Last updated: January 8, 2024

Written by: baeldung

Reviewed by: Michal Aibin

1. Overview

GemFire is a high-performance distributed data management infrastructure that sits between an application cluster and back-end data sources.

With GemFire, data can be managed in memory, which makes access faster. Spring Data provides easy configuration of, and access to, GemFire from a Spring application.

In this article, we’ll take a look at how we can use GemFire to meet our application’s caching requirements.

IMPORTANT UPDATE:

Ownership of the spring-data-gemfire project has moved within VMware from the Spring team to the GemFire team. You can now use the Spring Data for VMware GemFire project.

The functionality is largely the same but the dependency group ID, artifact ID, version, and Maven repository have changed. To learn how to add the updated dependency to your project, check out the guide here.

2. Maven Dependencies

To use the Spring Data GemFire support, we first need to add the following dependency to our pom.xml:

<dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-gemfire</artifactId>
    <version>1.9.1.RELEASE</version>
</dependency>

The latest version of this dependency can be found here.

3. GemFire Basic Features

3.1. Cache

The cache in GemFire provides the essential data management services and manages the connectivity to other peers.

The cache configuration (cache.xml) describes how the data will be distributed among different nodes:

<cache>
    <region name="region">
        <region-attributes>
            <cache-listener>
                <class-name>
                ...
                </class-name>
            </cache-listener>
        </region-attributes>
    </region>
    ...
</cache>

3.2. Regions

Data regions are a logical grouping within a cache for a single data set.

Simply put, a region lets us store data across multiple VMs in the system without having to consider which node in the cluster holds the data.

Regions are classified into three broad categories:

A replicated region holds the complete data set on each node. It gives high read performance, but write operations are slower because each update needs to be propagated to every node:
```
<region name="myRegion" refid="REPLICATE"/>
```
A partitioned region distributes the data so that each node stores only part of the region’s contents. A redundant copy of the data can also be kept on one of the other nodes. It provides good write performance:
```
<region name="myRegion" refid="PARTITION"/>
```
A local region resides only on the defining member node. There is no connectivity with other nodes in the cluster:
```
<region name="myRegion" refid="LOCAL"/>
```
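
To make the difference between replicated and partitioned placement concrete, here is a minimal plain-Java sketch. It is a toy model and not the GemFire API: the class name RegionPlacementSketch, the use of HashMap instances as "nodes", and the hash-based owner selection are all assumptions made for illustration, and redundant copies are left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of replicated vs. partitioned placement; not the GemFire API. */
final class RegionPlacementSketch {

    /** Replicated: every node ends up holding every entry. */
    static List<Map<String, String>> replicate(Map<String, String> data, int nodeCount) {
        List<Map<String, String>> nodes = newNodes(nodeCount);
        for (Map<String, String> node : nodes) {
            node.putAll(data); // each write has to reach all nodes
        }
        return nodes;
    }

    /** Partitioned: each entry lands on exactly one node, chosen by key hash. */
    static List<Map<String, String>> partition(Map<String, String> data, int nodeCount) {
        List<Map<String, String>> nodes = newNodes(nodeCount);
        for (Map.Entry<String, String> entry : data.entrySet()) {
            int owner = Math.floorMod(entry.getKey().hashCode(), nodeCount);
            nodes.get(owner).put(entry.getKey(), entry.getValue());
        }
        return nodes;
    }

    /** Creates the requested number of empty maps, one per simulated node. */
    private static List<Map<String, String>> newNodes(int nodeCount) {
        List<Map<String, String>> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
        return nodes;
    }

    public static void main(String[] args) {
        Map<String, String> data = Map.of("a", "1", "b", "2", "c", "3", "d", "4");
        System.out.println("replicated:  " + replicate(data, 3));
        System.out.println("partitioned: " + partition(data, 3));
    }
}
```

In this model a read on a replicated region can be served by any node, while a write to a partitioned region touches a single owner, which mirrors the read/write trade-off described above.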

3.3. Query the Cache

GemFire provides a query language called OQL (Object Query Language) that allows us to refer to the objects stored in GemFire data regions. Its syntax is very similar to SQL. Let’s see what a very basic query looks like (note that region paths start with a forward slash):

SELECT DISTINCT * FROM /exampleRegion

GemFire’s QueryService provides methods to create the query object.

3.4. Data Serialization

For serialization and deserialization, GemFire provides alternatives to Java serialization that offer higher performance, greater flexibility for data storage and transfer, and support for different languages.

With that in mind, GemFire defines the Portable Data eXchange (PDX) data format. PDX is a cross-language format that speeds up serialization and deserialization by storing data in named fields, which can be accessed directly without fully deserializing the object.
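
The idea of reading one named field without decoding the whole object can be sketched in plain Java. This is only an illustration of the concept and not the real PDX wire format or API: the class NamedFieldSketch, its field index of offsets and lengths, and the choice to store every value as a UTF-8 string are assumptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy named-field encoding; NOT the real PDX format. */
final class NamedFieldSketch {

    private final byte[] payload;                // all field values, back to back
    private final Map<String, int[]> fieldIndex; // field name -> {offset, length}

    private NamedFieldSketch(byte[] payload, Map<String, int[]> fieldIndex) {
        this.payload = payload;
        this.fieldIndex = fieldIndex;
    }

    /** Writes each value into one byte array and remembers where it starts. */
    static NamedFieldSketch serialize(Map<String, String> fields) {
        Map<String, int[]> index = new LinkedHashMap<>();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            byte[] bytes = field.getValue().getBytes(StandardCharsets.UTF_8);
            index.put(field.getKey(), new int[] {out.size(), bytes.length});
            out.write(bytes, 0, bytes.length);
        }
        return new NamedFieldSketch(out.toByteArray(), index);
    }

    /** Decodes only the requested field; the other fields stay as raw bytes. */
    String readField(String name) {
        int[] slot = fieldIndex.get(name);
        if (slot == null) {
            throw new IllegalArgumentException("Unknown field: " + name);
        }
        return new String(payload, slot[0], slot[1], StandardCharsets.UTF_8);
    }
}
```

For example, after serializing a map with the fields name and salary, calling readField("salary") decodes just that slice of the payload.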

3.5. Function Execution

In GemFire, a function can reside on a server and be invoked from a client application or another server without the need to send the function code itself.

The caller can direct a data-dependent function to operate on a particular data set, or direct a data-independent function to run on a particular server, member, or member group.
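
The key point, that the caller sends only a function ID and arguments while the code stays on the server, can be sketched with a small plain-Java registry. This is a conceptual stand-in and not GemFire's function service: FunctionRegistrySketch and its register and execute methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Toy model of invoking server-resident functions by id; not the GemFire API. */
final class FunctionRegistrySketch {

    // Functions live on the "server", keyed by their id.
    private final Map<String, Function<String, String>> functions = new HashMap<>();

    /** Server side: deploy a function under an id. */
    void register(String id, Function<String, String> function) {
        functions.put(id, function);
    }

    /** Caller side: only the id and the argument travel, never the code. */
    String execute(String id, String argument) {
        Function<String, String> function = functions.get(id);
        if (function == null) {
            throw new IllegalArgumentException("No function registered with id: " + id);
        }
        return function.apply(argument);
    }

    public static void main(String[] args) {
        FunctionRegistrySketch server = new FunctionRegistrySketch();
        server.register("greeting", message -> "Hello, " + message);
        System.out.println(server.execute("greeting", "GemFire"));
    }
}
```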

3.6. Continuous Querying

With continuous querying, clients subscribe to server-side events using SQL-like query filtering. The server sends every event that modifies the query results. Event delivery uses the client/server subscription framework.

The syntax for a continuous query is similar to that of basic queries written in OQL. For example, a query that provides the latest stock data from the Stock region can be written as:

SELECT * FROM /StockRegion s WHERE s.stockStatus='active';

To receive status updates from this query, an implementation of CqListener needs to be attached to the StockRegion:

<cache>
    <region name="StockRegion">
        <region-attributes refid="REPLICATE">
            ...
            <cache-listener>
                <class-name>...</class-name>
            </cache-listener>
        ...
        </region-attributes>
    </region>
</cache>
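
How event delivery relates to the query filter can be sketched in plain Java. This is a toy model and not GemFire's continuous query API: ContinuousQuerySketch, its subscribe and put methods, and the rule "notify when the entry matched before or matches now" are assumptions chosen to mirror the statement that the server sends events that modify the query results.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Predicate;

/** Toy model of continuous-query event delivery; not the GemFire API. */
final class ContinuousQuerySketch<K, V> {

    /** One registered query: a filter plus the listener to notify. */
    private static final class Subscription<K, V> {
        final Predicate<V> filter;
        final BiConsumer<K, V> listener;

        Subscription(Predicate<V> filter, BiConsumer<K, V> listener) {
            this.filter = filter;
            this.listener = listener;
        }
    }

    private final Map<K, V> region = new HashMap<>();
    private final List<Subscription<K, V>> subscriptions = new ArrayList<>();

    /** Registers a filter (the "query") and the listener that receives events. */
    void subscribe(Predicate<V> filter, BiConsumer<K, V> listener) {
        subscriptions.add(new Subscription<>(filter, listener));
    }

    /** Stores the value and notifies listeners whose result set is affected. */
    void put(K key, V value) {
        V previous = region.put(key, value);
        for (Subscription<K, V> subscription : subscriptions) {
            boolean matchedBefore = previous != null && subscription.filter.test(previous);
            boolean matchesNow = subscription.filter.test(value);
            if (matchedBefore || matchesNow) {
                subscription.listener.accept(key, value);
            }
        }
    }
}
```

With a filter such as status -> status.equals("active"), a put that makes an entry active or moves an active entry to another status triggers the listener, while a put of a new inactive entry does not.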

4. Spring Data GemFire Support

4.1. Java Configuration

To simplify configuration, Spring Data GemFire provides various annotations for configuring core GemFire components:

@Configuration
public class GemfireConfiguration {

    @Bean
    Properties gemfireProperties() {
        Properties gemfireProperties = new Properties();
        gemfireProperties.setProperty("name","SpringDataGemFireApplication");
        gemfireProperties.setProperty("mcast-port", "0");
        gemfireProperties.setProperty("log-level", "config");
        return gemfireProperties;
    }

    @Bean
    CacheFactoryBean gemfireCache() {
        CacheFactoryBean gemfireCache = new CacheFactoryBean();
        gemfireCache.setClose(true);
        gemfireCache.setProperties(gemfireProperties());
        return gemfireCache;
    }

    @Bean(name="employee")
    LocalRegionFactoryBean<String, Employee> getEmployee(final GemFireCache cache) {
        LocalRegionFactoryBean<String, Employee> employeeRegion = new LocalRegionFactoryBean<>();
        employeeRegion.setCache(cache);
        employeeRegion.setName("employee");
        // ...
        return employeeRegion;
    }
}

To set up the GemFire cache and region, we first have to set up a few specific properties. Here mcast-port is set to zero, which disables multicast discovery and distribution for this GemFire node. These properties are then passed to CacheFactoryBean to create a GemFireCache instance.

Using the GemFireCache bean, an instance of LocalRegionFactoryBean is created, which represents the region within the cache for the Employee instances.

4.2. Entity Mapping

The library provides support for mapping objects to be stored in the GemFire grid. The mapping metadata is defined using annotations on the domain classes:

@Region("employee")
public class Employee {

    @Id
    public String name;
    public double salary;

    @PersistenceConstructor
    public Employee(String name, double salary) {
        this.name = name;
        this.salary = salary;
    }

    // standard getters/setters
}

In the example above, we used the following annotations:

@Region, to specify the region instance of the Employee class
@Id, to annotate the property that shall be utilized as a cache key
@PersistenceConstructor, to mark the constructor that will be used to create entities when multiple constructors are available

4.3. GemFire Repositories

Next, let’s have a look at a central component in Spring Data – the repository:

@Configuration
@EnableGemfireRepositories(basePackages
  = "com.baeldung.spring.data.gemfire.repository")
public class GemfireConfiguration {

    @Autowired
    EmployeeRepository employeeRepository;
    
    // ...
}

4.4. OQL Query Support

The repositories allow the definition of query methods to efficiently run the OQL queries against the region the managed entity is mapped to:

@Repository
public interface EmployeeRepository extends   
  CrudRepository<Employee, String> {

    Employee findByName(String name);

    Iterable<Employee> findBySalaryGreaterThan(double salary);

    Iterable<Employee> findBySalaryLessThan(double salary);

    Iterable<Employee> 
      findBySalaryGreaterThanAndSalaryLessThan(double salary1, double salary2);
}
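
To clarify what these query methods are expected to return, here is a plain-Java, in-memory stand-in for three of them. It is not Spring Data and runs no OQL: DerivedQuerySketch, its nested Employee class, and the list passed in place of the region are assumptions for illustration. Note that GreaterThan and LessThan are strict comparisons, so the boundary values are excluded.

```java
import java.util.List;
import java.util.stream.Collectors;

/** In-memory stand-in for the derived query methods; not Spring Data itself. */
final class DerivedQuerySketch {

    /** Minimal illustrative entity with the same two fields as the article's Employee. */
    static final class Employee {
        final String name;
        final double salary;

        Employee(String name, double salary) {
            this.name = name;
            this.salary = salary;
        }
    }

    /** Mirrors findByName: the first employee with that name, or null if none. */
    static Employee findByName(List<Employee> all, String name) {
        return all.stream().filter(e -> e.name.equals(name)).findFirst().orElse(null);
    }

    /** Mirrors findBySalaryGreaterThan: strictly greater than the given salary. */
    static List<Employee> findBySalaryGreaterThan(List<Employee> all, double salary) {
        return all.stream().filter(e -> e.salary > salary).collect(Collectors.toList());
    }

    /** Mirrors findBySalaryGreaterThanAndSalaryLessThan: both bounds are exclusive. */
    static List<Employee> findBySalaryGreaterThanAndSalaryLessThan(
            List<Employee> all, double salary1, double salary2) {
        return all.stream()
                .filter(e -> e.salary > salary1 && e.salary < salary2)
                .collect(Collectors.toList());
    }
}
```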

4.5. Function Execution Support

Annotation support is also available to simplify working with GemFire function execution.

There are two concerns to address when we use functions: the implementation and the execution.

Let’s see how a POJO can be exposed as a GemFire function using Spring Data annotations:

@Component
public class FunctionImpl {

    @GemfireFunction
    public void greeting(String message){
        // some logic
    }
 
    // ...
}

We need to activate the annotation processing explicitly for @GemfireFunction to work:

@Configuration
@EnableGemfireFunctions
public class GemfireConfiguration {
    // ...
}

For function execution, a process invoking a remote function needs to provide the calling arguments, a function ID, and the execution target (onServer, onRegion, onMember, etc.):

@OnRegion(region="employee")
public interface FunctionExecution {
 
    @FunctionId("greeting")
    public void execute(String message);
    
    // ...
}

To enable function execution annotation processing, we need to activate it using Spring’s component scanning capabilities:

@Configuration
@EnableGemfireFunctionExecutions(
  basePackages = "com.baeldung.spring.data.gemfire.function")
public class GemfireConfiguration {
    // ...
}