Description
Here's some pseudocode covering the critical path of the event lifecycle.
Current design
Idempotency logic:
type Idempotence interface {
    Exec(ctx context.Context, key string, exec func() error) error
}

impl:
options:
    timeoutTTL time.Duration // "processing"
    successfulTTL time.Duration // "processed"
Exec:
    acquired = SET key "processing" NX EX timeoutTTL
    if acquired {
        err = exec()
        if err != nil {
            DEL key // release so a retry can re-acquire
            return err
        }
        SET key "processed" EX successfulTTL
        return nil
    } else {
        status = GET key
        if status == nil
            CONSIDERATION: how to handle this? (key expired between SET and GET)
        else if status == "processed"
            return nil
        else (status == "processing")
            time.Sleep(timeoutTTL)
            status = GET key
            if status == "processed"
                return nil
            else return ErrConflict
    }
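For concreteness, here's a minimal Go sketch of this implementation. The go-redis client and the `ErrConflict`/`redisIdempotence` names are assumptions for illustration; the design above doesn't pin down a client library:

```go
package idempotence

import (
	"context"
	"errors"
	"time"

	"github.com/redis/go-redis/v9"
)

// ErrConflict is returned when another worker appears to still be processing the key.
var ErrConflict = errors.New("idempotence: conflicting execution in progress")

type redisIdempotence struct {
	client        *redis.Client
	timeoutTTL    time.Duration // TTL of the "processing" marker
	successfulTTL time.Duration // TTL of the "processed" marker
}

func (i *redisIdempotence) Exec(ctx context.Context, key string, exec func() error) error {
	// SET key "processing" NX EX timeoutTTL: acquire only if the key is absent.
	acquired, err := i.client.SetNX(ctx, key, "processing", i.timeoutTTL).Result()
	if err != nil {
		return err
	}
	if acquired {
		if err := exec(); err != nil {
			i.client.Del(ctx, key) // release so a retry can re-acquire
			return err
		}
		return i.client.Set(ctx, key, "processed", i.successfulTTL).Err()
	}

	status, err := i.client.Get(ctx, key).Result()
	if errors.Is(err, redis.Nil) {
		// Key expired between SETNX and GET; treated as a conflict here
		// (this is the open CONSIDERATION above).
		return ErrConflict
	}
	if err != nil {
		return err
	}
	if status == "processed" {
		return nil
	}

	// status == "processing": wait out the timeout and check once more.
	time.Sleep(i.timeoutTTL)
	status, err = i.client.Get(ctx, key).Result()
	if err == nil && status == "processed" {
		return nil
	}
	return ErrConflict
}
```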
PublishMQ consumer:
idempotence.Exec(ctx, "idempotency:publishmq:"+event.ID, handler)
idempotence opts:
assuming visibility timeout is 5s
timeoutTTL: 5s (or 5s+1s=6s)
successfulTTL: 10s (2x visibility timeout)
handler logic:
destinations = listDestinations(event) // list from tenant ID, destinationID, topic, etc
for destination in destinations {
deliveryEvent = newDeliveryEvent(event, destination)
err = deliverymq.Publish(deliveryEvent) // TODO: parallelize this op
if err != nil {
// NOTE_1
return err
}
}
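Putting the consumer wiring and the TTL options above together, a sketch might look like this (`NewRedisIdempotence`, the option names, and `handlePublish` are hypothetical; only the key format and TTL values come from the design above):

```go
// Hypothetical wiring for the PublishMQ consumer, assuming a 5s visibility timeout.
publishIdempotence := NewRedisIdempotence(redisClient,
	WithTimeoutTTL(6*time.Second),     // visibility timeout (5s) + 1s buffer
	WithSuccessfulTTL(10*time.Second), // 2x visibility timeout
)

err := publishIdempotence.Exec(ctx, "idempotency:publishmq:"+event.ID, func() error {
	return handlePublish(ctx, event) // the handler logic above
})
```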
DeliveryMQ consumer:
idempotence.Exec(ctx, "idempotency:deliverymq:"+deliveryEvent.ID, handler)
idempotence opts:
assuming visibility timeout is 30s
timeoutTTL: 30s (or 30s+1s=31s)
successfulTTL: 24hr
handler logic:
deliveryEvent.result = destination.Publish(event)
err = logmq.Publish(deliveryEvent)
if err != nil {
return err
}
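As a minimal Go sketch of that handler (`DeliveryEvent`, `Destination.Publish`, and `logmq.Publish` are the hypothetical APIs from the pseudocode above):

```go
func deliveryHandler(ctx context.Context, deliveryEvent *DeliveryEvent) error {
	// Attempt the actual delivery and record the outcome on the event.
	deliveryEvent.Result = deliveryEvent.Destination.Publish(ctx, deliveryEvent.Event)
	// Enqueue the result to LogMQ; an error here Nacks the message so the
	// whole handler (including the delivery attempt) is retried.
	return logmq.Publish(ctx, deliveryEvent)
}
```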
Problem & potential solution
At NOTE_1, how should we handle this error better? Say an event is supposed to be delivered to 10 different destinations. If one destination fails, the current logic nacks the message, and upon retry we attempt to re-enqueue the event to all 10 destinations, including those that already succeeded. This causes duplicate event deliveries.
Solution:
One solution is to wrap the enqueue itself with an idempotence handler. In that case we can consider skipping the consumer-level idempotency logic, since it may no longer be necessary (a latency vs. Redis memory usage trade-off).
Here's the updated logic for PublishMQ consumer:
idempotence.Exec(ctx, "idempotency:publishmq:"+event.ID, handler) // optional, should we still?
idempotence opts:
assuming visibility timeout is 5s
timeoutTTL: 5s (or 5s+1s=6s)
successfulTTL: 10s (2x visibility timeout)
+ deliveryIdempotenceOptions: 5s timeoutTTL, 1hr successfulTTL (or 24hr?)
handler logic:
destinations = listDestinations(event) // list from tenant ID, destinationID, topic, etc
var errs []error
// assuming this loop is parallelized
for destination in destinations {
deliveryEvent = newDeliveryEvent(event, destination)
err = deliveryIdempotence.Exec(ctx, "idempotence:event_destination:"+event.ID+"_"+destination.ID, func() error { return deliverymq.Publish(deliveryEvent) })
if err != nil {
errs = append(errs, err)
}
}
if len(errs) > 0 { Nack } else { Ack }
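Here's a minimal Go sketch of this updated handler, assuming the loop is parallelized with golang.org/x/sync/errgroup (the errgroup choice is an assumption, and `listDestinations`, `newDeliveryEvent`, `deliverymq.Publish`, and `deliveryIdempotence` are the hypothetical names from the pseudocode above):

```go
func publishHandler(ctx context.Context, event Event) error {
	destinations, err := listDestinations(ctx, event) // tenant ID, destinationID, topic, etc.
	if err != nil {
		return err
	}

	var g errgroup.Group // no cancellation: let every enqueue run to completion
	for _, destination := range destinations {
		destination := destination // capture loop variable (pre-Go 1.22)
		g.Go(func() error {
			deliveryEvent := newDeliveryEvent(event, destination)
			key := "idempotence:event_destination:" + event.ID + "_" + destination.ID
			// Skip the enqueue if this (event, destination) pair already
			// succeeded on a previous attempt.
			return deliveryIdempotence.Exec(ctx, key, func() error {
				return deliverymq.Publish(deliveryEvent)
			})
		})
	}

	// A non-nil error Nacks the message; on redelivery, destinations that were
	// enqueued successfully are skipped via their idempotency keys.
	return g.Wait()
}
```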
Let's evaluate the memory key usage for an event lifecycle. Say 1 event is to be delivered to 10 destinations:
- 1 publishmq key: `idempotence:publishmq:<event_id>` (10s successfulTTL - could be negligible given the short TTL)
- 10 delivery event keys: `idempotence:event_destination:<event_id>_<destination_id>` (1hr successfulTTL)
- 10 deliverymq keys: `idempotence:deliverymq:<delivery_event_id>` (24hr successfulTTL)
- also optionally another 10 logmq keys as well
We can consider further optimizing Redis memory usage with an "eventLifecycleIdempotence" implementation that shares the long-term keys across each event delivery. Instead of 1 publishmq key + 30 keys, we'd have just 1 publishmq key + 10 keys. We may need separate ephemeral "processing" keys on top of the 10 long-term (24hr) keys, but given their short TTL, the overhead should be negligible, all things considered.
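A hypothetical key scheme for this shared implementation could look like the sketch below; one long-lived key per (event, destination) pair serves both stages, with a short-lived marker while a stage is in flight. All names here are assumptions, not existing APIs:

```go
// Long-lived key holding the lifecycle stage: "enqueued" after the PublishMQ
// stage, "delivered" after the DeliveryMQ stage (24hr TTL).
func lifecycleKey(eventID, destinationID string) string {
	return "idempotence:event_destination:" + eventID + "_" + destinationID
}

// Ephemeral "processing" marker with a short TTL while a stage is executing.
func processingKey(eventID, destinationID string) string {
	return lifecycleKey(eventID, destinationID) + ":processing"
}
```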
Summary
Here are the next action items:
- parallelize deliverymq enqueue ops
- employ idempotence implementation when enqueueing to deliverymq to avoid duplicate deliveries
- (optional) custom event lifecycle idempotence implementation to optimize Redis memory usage
Would appreciate your input on this design! Let me know if I'm missing anything and what the next steps should be. Specifically, I'd love to hear your thoughts on the TTLs for these idempotency keys.