Distributed Locks For Payments
ADR 005: Distributed Locks for Payment Idempotency
Date: 2026-04
Status: Accepted
Context: Payment recording must be idempotent; multiple systems (webhook, polling, retry) may trigger it simultaneously.
Problem
Payment confirmation can arrive through multiple paths:
- Primer webhook (real-time)
- Polling worker (fallback)
- Retry request from client (if first confirmation got lost)
Without synchronization, concurrent requests could:
- Double-record the same payment
- Activate subscription twice
- Create duplicate credits
- Race with cache invalidation
How do we guarantee exactly-once semantics?
Decision
Acquire a distributed lock on the invoice ID before checking/recording payment.
func (p *Processor) RecordPayment(ctx context.Context, req *PaymentRequest) (*PaymentResult, error) {
    locker := p.repo.Locker()
    lockKey := locks.InvoiceLock(req.InvoiceID)

    // Try to acquire lock FIRST
    lock, err := locker.TryLock(ctx, lockKey)
    if err != nil {
        return nil, fmt.Errorf("failed to acquire lock: %w", err)
    }
    if lock == nil {
        // Another process holds the lock; return idempotent result
        log.Info().Msg("invoice locked, returning idempotent result")
        return &PaymentResult{
            Success:          true,
            AlreadyProcessed: true,
        }, nil
    }
    defer lock.Release(ctx)

    // Inside lock: check duplicate + record payment
    isDuplicate, err := p.IsDuplicate(ctx, req.TransactionID)
    if err != nil {
        // Do not record a payment when the duplicate check itself failed
        return nil, fmt.Errorf("failed to check duplicate: %w", err)
    }
    if isDuplicate {
        return &PaymentResult{
            Success:          true,
            AlreadyProcessed: true,
        }, nil
    }

    // Record payment (safe; no other thread can interfere)
    invoiceInfo, err := p.recordChargebeePayment(ctx, req)
    // ... activate subscription, invalidate cache ...
}

Rationale
Why Distributed Locks?
- TOCTOU prevention: The check-then-act sequence is atomic while the lock is held
- Single writer: Only one process or goroutine handles a given invoice at a time
- Idempotent return: If locked, return “already processed” safely
- Low contention: The lock is held briefly (~100ms for the API call), so it is not a bottleneck
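The points above can be illustrated with a small in-memory simulation. This is only a sketch under stated assumptions: `Ledger`, `Result`, `Record`, and `RunConcurrent` are hypothetical names invented for the example, the lock is a plain map rather than a distributed lock, and the payment store is a counter. It shows that when the duplicate check and the write happen under a per-invoice lock, simultaneous confirmations for the same payment produce a single record.

```go
package main

import "sync"

// Result mirrors the two outcomes described above. Hypothetical type.
type Result struct {
	Success          bool
	AlreadyProcessed bool
}

// Ledger is a hypothetical in-memory stand-in for both the lock
// service and the payment store.
type Ledger struct {
	mu       sync.Mutex
	locked   map[string]bool // invoice IDs whose lock is currently held
	recorded map[string]int  // transaction ID -> number of times recorded
}

func NewLedger() *Ledger {
	return &Ledger{locked: map[string]bool{}, recorded: map[string]int{}}
}

// tryLock is non-blocking: it reports false if the invoice is already locked.
func (l *Ledger) tryLock(invoiceID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.locked[invoiceID] {
		return false
	}
	l.locked[invoiceID] = true
	return true
}

func (l *Ledger) unlock(invoiceID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.locked, invoiceID)
}

func (l *Ledger) isDuplicate(txID string) bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.recorded[txID] > 0
}

func (l *Ledger) record(txID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.recorded[txID]++
}

// Record runs lock -> check -> record -> release for one confirmation.
func (l *Ledger) Record(invoiceID, txID string) Result {
	if !l.tryLock(invoiceID) {
		return Result{Success: true, AlreadyProcessed: true}
	}
	defer l.unlock(invoiceID)

	// The check and the write are separate steps; only the invoice lock
	// prevents another caller from slipping in between them.
	if l.isDuplicate(txID) {
		return Result{Success: true, AlreadyProcessed: true}
	}
	l.record(txID)
	return Result{Success: true}
}

// RunConcurrent sends n simultaneous confirmations for the same payment.
// It returns how many callers recorded it, how many were told "already
// processed", and how many times the transaction ended up in the store.
func RunConcurrent(n int) (fresh, already, stored int) {
	ledger := NewLedger()
	results := make([]Result, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(idx int) {
			defer wg.Done()
			results[idx] = ledger.Record("inv_1", "txn_1")
		}(i)
	}
	wg.Wait()
	for _, r := range results {
		if r.AlreadyProcessed {
			already++
		} else if r.Success {
			fresh++
		}
	}
	return fresh, already, ledger.recorded["txn_1"]
}
```

Calling `RunConcurrent(10)` yields one fresh result, nine idempotent results, and a single stored record regardless of goroutine scheduling: whichever caller first wins the lock records the payment, and every other caller either fails `tryLock` or finds the duplicate.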
Lock Semantics
- Scope: Per-invoice (granular, not global lock)
- Duration: Held for entire payment recording (lock → check → record → release)
- Timeout: TryLock is non-blocking; returns immediately if locked
- Implementation: Redis or in-memory (depends on infrastructure)
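The per-invoice, non-blocking contract described above can be sketched with an in-memory locker. This is a minimal illustration, not the project's actual implementation: `MemLocker`, `MemLock`, and the key strings are hypothetical, and this `TryLock` omits the `context.Context` argument and error return used elsewhere in this ADR. A Redis-backed variant would presumably need to honor the same contract, returning nil immediately when the key is already held.

```go
package main

import "sync"

// MemLocker is a hypothetical in-memory locker keyed by string.
// It only mirrors the TryLock contract described above; it is not
// distributed and has no expiry.
type MemLocker struct {
	mu   sync.Mutex
	held map[string]bool
}

// MemLock represents one successfully acquired key.
type MemLock struct {
	locker *MemLocker
	key    string
	once   sync.Once
}

func NewMemLocker() *MemLocker {
	return &MemLocker{held: make(map[string]bool)}
}

// TryLock never blocks: it returns nil if the key is already held,
// and a lock handle otherwise.
func (l *MemLocker) TryLock(key string) *MemLock {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.held[key] {
		return nil
	}
	l.held[key] = true
	return &MemLock{locker: l, key: key}
}

// Release frees the key; calling it more than once is a no-op.
func (k *MemLock) Release() {
	k.once.Do(func() {
		k.locker.mu.Lock()
		defer k.locker.mu.Unlock()
		delete(k.locker.held, k.key)
	})
}
```

Because the map is keyed per invoice, a held lock on one invoice does not affect `TryLock` on another, which is the granularity the Scope bullet asks for.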
Without Locks (Unsafe Antipattern)
// ❌ WRONG: Race condition
isDuplicate, _ := p.IsDuplicate(ctx, req.TransactionID) // T1 checks: not duplicate
// ... context switch ...
isDuplicate, _ := p.IsDuplicate(ctx, req.TransactionID) // T2 checks: not duplicate
// ... T1 records payment ...
// ... T2 records same payment AGAIN (double-charge!)

Consequences
Positive
- Atomic semantics: Payment recording cannot be partially executed
- Exactly-once guarantees: No double-charging, no missing credits
- Graceful degradation: If the invoice is locked, the caller receives an idempotent result instead of an error and can safely retry later
- Observable: Lock acquisition logged; easy to debug contention
Negative
- Lock infrastructure required: Redis or similar (already in sirloin)
- Slightly increased latency: ~10-100ms for lock acquisition
- Debugging complexity: Lock failures may mask underlying issues
Testing
// Test: Concurrent calls are serialized
var wg sync.WaitGroup
results := make([]*PaymentResult, 10)
errs := make([]error, 10)

for i := 0; i < 10; i++ {
    idx := i
    wg.Add(1)
    go func() {
        defer wg.Done()
        results[idx], errs[idx] = processor.RecordPayment(ctx, request)
    }()
}
wg.Wait()

// Verify: Exactly one call recorded the payment; the others got the idempotent result
successes := 0
alreadyProcessed := 0
for i, result := range results {
    // Fail on errors up front so a nil result is never dereferenced
    require.NoError(t, errs[i])
    if result.Success && !result.AlreadyProcessed {
        successes++
    } else if result.AlreadyProcessed {
        alreadyProcessed++
    }
}
require.Equal(t, 1, successes)
require.Equal(t, 9, alreadyProcessed)

// Verify: Payment recorded only once
purchases := repo.GetPurchases(userID)
require.Len(t, purchases, 1)

Related ADRs
- 001: Deferred Activation — Lock protects activation transition
- 002: Polling Over Webhooks — Lock serializes webhook + polling
- 006: Saga Pattern — Locks enable safe compensation