Distributed synchronization is a common requirement in service-oriented architectures, where a set of distributed
workers need to signal completion to a coordinator. Java's built-in CountDownLatch only works within a single JVM.
Distributed Latch extends this concept across process and machine boundaries, providing count-based coordination backed by a pluggable distributed storage layer.
Supported latch flavours:
- Count Down Latch — uni-directional permit movements (decrement only). Ideal for fan-out / fan-in patterns.
- Count Up Down Latch — bi-directional permit movements (increment and decrement). Useful when total work is not known upfront.
<dependency>
<groupId>com.phonepe</groupId>
<artifactId>distributed-latch</artifactId>
<version>${distributed-latch.version}</version>
</dependency>Replace
${distributed-latch.version}with the latest version from Maven Central or GitHub Releases.
IDistributedCountDownLatch latch = DistributedLatchFactory.createCountDownLatch(
"CLIENT_ID",
LatchLevel.DC,
"FA1",
"LATCH_ID",
AerospikeLatchStorageContext.builder()
.aerospikeClient(aerospikeClient)
.namespace("NAMESPACE")
.setSuffix("distributed_latch")
.storageType(StorageType.AEROSPIKE)
.ttl(3600)
.build(),
5 // initial count
);IDistributedCountDownLatch latch = DistributedLatchFactory.createCountDownLatch(
"CLIENT_ID",
LatchLevel.DC,
"FA1",
"LATCH_ID",
HBaseLatchStorageContext.builder()
.connection(connection) // HBase connection reference
.tableSuffix("distributed_latch")
.storageType(StorageType.HBASE)
.ttl(3600)
.build(),
5 // initial count
);PS: For optimum performance, DO NOT pre-create the HBase table. The library will create it for you with the correct schema and pre-split configuration.
-
countDown()- Decrements the latch count by 1 in the distributed store. The operation is atomic.
-
getCount()- Returns the current count from the distributed store.
-
await()- Blocks the calling thread until the latch count reaches zero. A background watcher polls the store every 5 seconds.
-
await(timeout, unit)- Blocks until the count reaches zero or the timeout expires. Returns
trueif count reached zero,falseotherwise.
- Blocks until the count reaches zero or the timeout expires. Returns
Extends CountDown Latch with an additional operation:
countUp()- Increments the latch count by 1 in the distributed store.
IDistributedCountDownLatch latch = DistributedLatchFactory.createCountDownLatch(
"order-service", LatchLevel.DC, "dc1", "batch-job-123", storageContext, 3);
// Dispatch 3 tasks to workers, then wait
boolean completed = latch.await(60, TimeUnit.SECONDS);
if (!completed) {
// handle timeout
}IDistributedCountDownLatch latch = DistributedLatchFactory.getCountDownLatch(
"order-service", LatchLevel.DC, "dc1", "batch-job-123", storageContext);
try {
processTask();
} finally {
latch.countDown();
}IDistributedCountUpDownLatch latch = DistributedLatchFactory.getOrCreateCountUpDownLatch(
"order-service", LatchLevel.DC, "dc1", "dynamic-job-456", storageContext);
// Dynamically add work
latch.countUp();
// ... do work ...
latch.countDown();- DC — Latch scoped within a single data center. Count is stored per
farmId. - XDC — Latch scoped across data centers. Count is aggregated across all farms.
Caution: Reading an XDC count aggregates across all farms. Due to storage replication lag, the count may be temporarily inconsistent across data centers. For strong consistency with XDC, use a multi-site Aerospike cluster or synchronous HBase replication.
A latch exists only within the scope of a client represented by CLIENT_ID. The internal key is D_LTCH#<clientId>#<latchId>, so different clients can use the same latch ID without conflict.