Consider a typical microservice deployment where an order processing service runs across multiple instances. A common requirement is a daily job that scans for the previous day's successfully processed orders and notifies downstream analytics services.
A naive implementation using Spring’s @Scheduled annotation leads to duplicate execution in production:
@Scheduled(cron = "0 0 10 * * ?")
public void sendOrder() {
    List<Order> orders = orderRepository.findSuccessOrdersFromYesterday();
    orders.forEach(this::notifyAnalyticsService);
}
In a two-instance setup, both nodes trigger this job at 10:00 AM — resulting in double notifications and potential data inconsistency.
A basic distributed lock approach (e.g., Redis SETNX or ZooKeeper ephemeral nodes) ensures only one instance executes the logic:
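@Scheduled(cron = "0 0 10 * * ?")
public void sendOrder() {
    // redisLock stands in for whatever lock client the project uses
    if (redisLock.tryLock("order-job-lock", Duration.ofMinutes(5))) {
        try {
            List<Order> orders = orderRepository.findSuccessOrdersFromYesterday();
            orders.forEach(this::notifyAnalyticsService);
        } finally {
            // Release the same key that was acquired above
            redisLock.unlock("order-job-lock");
        }
    }
}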
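The lock semantics the snippet relies on (acquire only if the key is free or expired, release when done) can be sketched without a Redis server. The class below is an in-memory stand-in, not a Redis client: InMemoryLockSketch and its method signatures are hypothetical, and the current time is passed in explicitly so the expiry behaviour is easy to follow.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory stand-in for a SETNX-style lock with expiry (hypothetical, for illustration only). */
class InMemoryLockSketch {
    // key -> expiry instant; plays the role of a Redis key with a TTL
    private final Map<String, Instant> locks = new ConcurrentHashMap<>();

    /** Acquires the lock if the key is absent or its expiry has passed. */
    boolean tryLock(String key, Duration ttl, Instant now) {
        boolean[] acquired = {false};
        // compute() runs atomically per key, so two callers cannot both win
        locks.compute(key, (k, currentExpiry) -> {
            if (currentExpiry == null || !currentExpiry.isAfter(now)) {
                acquired[0] = true;
                return now.plus(ttl);
            }
            return currentExpiry;
        });
        return acquired[0];
    }

    /** Releases the lock; a production lock would also verify the owner. */
    void unlock(String key) {
        locks.remove(key);
    }

    public static void main(String[] args) {
        InMemoryLockSketch lock = new InMemoryLockSketch();
        Instant tenAm = Instant.parse("2024-01-01T10:00:00Z");
        // Two instances fire at the same moment; only the first wins the lock
        System.out.println(lock.tryLock("order-job-lock", Duration.ofMinutes(5), tenAm)); // true
        System.out.println(lock.tryLock("order-job-lock", Duration.ofMinutes(5), tenAm)); // false
    }
}
```

The expiry matters because a node that crashes while holding the lock never reaches its finally block; without a TTL the job would stay blocked on every later run.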
This solves duplication but introduces a bottleneck: every order (1 million in this example) is handled by a single node while the other instances sit idle.
ElasticJob addresses both concerns via job sharding, enabling horizontal scaling of scheduled work. It assigns shard items to the running instances dynamically; the job code only needs to map its shard item to a slice of the data.
Here's a minimal working configuration using elasticjob-lite-spring-boot-starter:
pom.xml
<dependency>
    <groupId>org.apache.shardingsphere.elasticjob</groupId>
    <artifactId>elasticjob-lite-spring-boot-starter</artifactId>
    <version>3.0.1</version>
</dependency>
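Custom Job Implementation
import java.util.List;
import org.apache.shardingsphere.elasticjob.api.ShardingContext;
import org.apache.shardingsphere.elasticjob.simple.job.SimpleJob;
import org.springframework.stereotype.Component;

@Component
public class OrderNotificationJob implements SimpleJob {
    // orderRepository and notifyAnalyticsService are application code, omitted here

    @Override
    public void execute(ShardingContext context) {
        int shardIndex = context.getShardingItem();
        int totalShards = context.getShardingTotalCount();
        // Value configured in shardingItemParameters, e.g. "shard-0"; useful for logging
        String shardParam = context.getShardingParameter();
        // Partition logic: process orders where id % totalShards == shardIndex
        List<Order> batch = orderRepository.findSuccessOrdersFromYesterday(
            shardIndex,
            totalShards
        );
        batch.forEach(this::notifyAnalyticsService);
    }
}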
application.yml
elasticjob:
  regcenter:
    serverlists: 127.0.0.1:2181
    namespace: elasticjob-demo
  jobs:
    orderJob:
      elasticJobClass: com.example.OrderNotificationJob
      cron: "0 0 10 * * ?"
      shardingTotalCount: 2
      shardingItemParameters: "0=shard-0,1=shard-1"
When deployed on two instances, ElasticJob assigns each node a distinct sharding item: one receives 0, the other 1. Each uses its shard index to compute which subset of the full dataset it processes, e.g., orders with id % 2 == 0 versus id % 2 == 1.
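The modulo rule can be checked in isolation. The sketch below filters an in-memory list of ids instead of querying a database; ShardPartitionSketch and idsForShard are hypothetical names, and a real repository would more likely push the same predicate into its query.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates the id % totalShards == shardIndex rule on an in-memory list. */
class ShardPartitionSketch {

    /** Returns the ids that belong to the given shard. */
    static List<Long> idsForShard(List<Long> allIds, int shardIndex, int totalShards) {
        List<Long> result = new ArrayList<>();
        for (long id : allIds) {
            // floorMod keeps the remainder in [0, totalShards) even for negative ids
            if (Math.floorMod(id, (long) totalShards) == shardIndex) {
                result.add(id);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L, 5L, 6L);
        System.out.println(idsForShard(ids, 0, 2)); // [2, 4, 6]
        System.out.println(idsForShard(ids, 1, 2)); // [1, 3, 5]
    }
}
```

Every id lands in exactly one shard, so the two instances together cover the full dataset without overlap.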
Scaling to three instances requires only setting shardingTotalCount to 3 and updating shardingItemParameters to "0=shard-0,1=shard-1,2=shard-2". ElasticJob detects the new topology via ZooKeeper’s /instances ephemeral nodes and triggers automatic rebalancing through its leader-elected /leader/sharding coordination path.
The framework relies on ZooKeeper for:
- Instance discovery: /instances/{ip:pid} (ephemeral)
- Shard assignment: /sharding/{item}/instance
- Leader election: /leader/election
- Runtime configuration: /config (modifiable at runtime)
Rebalancing occurs when:
- New instances join or existing ones exit
- Shard count changes
- Leader fails over
Internally, ShardingService.shardingIfNecessary() performs the re-sharding on the elected leader only. The leader reads the current /instances, computes the shard distribution, and writes the assignments to /sharding/{item}/instance. Non-leader instances skip the computation, wait for the leader to finish, and then read their assigned shard items before executing.
This design ensures:
- Each logical shard is assigned to exactly one instance per trigger
- Automatic failover (if instance A dies, its shards migrate to surviving nodes)
- Zero-downtime rescaling (change config → ZooKeeper notifies all nodes → next execution uses new topology)
- No shared database locks or custom coordination code
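The leader's distribution step and the failover behaviour can be illustrated with a small even-split allocator. This is a sketch of one plausible strategy (split evenly, hand leftover items to the first instances), not ElasticJob's actual code; AverageAllocationSketch, allocate, and the instance names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Even-split shard allocation; assumes instance names are distinct. */
class AverageAllocationSketch {

    /** Maps each instance to the shard items it should run. */
    static Map<String, List<Integer>> allocate(List<String> instances, int shardingTotalCount) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        if (instances.isEmpty()) {
            return result;
        }
        int perInstance = shardingTotalCount / instances.size();
        int next = 0;
        // First pass: give every instance the same number of consecutive items
        for (String instance : instances) {
            List<Integer> items = new ArrayList<>();
            for (int i = 0; i < perInstance; i++) {
                items.add(next++);
            }
            result.put(instance, items);
        }
        // Second pass: hand out leftover items one by one, starting with the first instance
        int idx = 0;
        while (next < shardingTotalCount) {
            result.get(instances.get(idx++)).add(next++);
        }
        return result;
    }

    public static void main(String[] args) {
        // Three live instances, three shards
        System.out.println(allocate(List.of("node-a", "node-b", "node-c"), 3));
        // node-b has gone away: its shard moves to a surviving node
        System.out.println(allocate(List.of("node-a", "node-c"), 3));
    }
}
```

Recomputing the allocation with the surviving instances is all that failover requires in this model: no shard is left unassigned, and no shard is given to two instances.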
For large datasets like 1M orders, sharding spreads the load roughly evenly: with N instances and at least N shards, each instance handles about 1/N of the workload, which improves throughput while keeping every order in exactly one shard.