Batch Processing Case Study
Rate-Limited CSV Batch Processing Pipeline
A production-grade CSV ingestion + queue pipeline that processes up to 1 million rows safely under rate-limited dependencies. Built with Laravel queues, atomic status claiming for idempotency, and Kubernetes-scheduled dispatch.
What this solves
- Accepts large client uploads without timing out or exhausting memory.
- Protects rate-limited downstream APIs from burst traffic.
- Makes retries safe through idempotency and atomic “claiming” semantics.
- Gives operators clear batch/row status, error visibility, and recovery tools.
Core design choices
- Split ingestion and processing: CSV → persisted work rows → queued row jobs.
- Atomic status transitions (pending → processing) as a lock-free idempotency mechanism.
- Kubernetes-scheduled dispatch with concurrencyPolicy: Forbid.
- Encrypted storage of row payloads and request context to minimize data exposure.
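The atomic pending → processing transition can be sketched as a single conditional UPDATE: whichever worker's statement actually modifies the row wins the claim. A minimal sketch using SQLite (table and column names are illustrative, not the production schema):

```python
import sqlite3

def claim_row(conn: sqlite3.Connection, row_id: int) -> bool:
    """Try to claim a pending row. The UPDATE only matches while the row
    is still 'pending', so exactly one worker sees rowcount == 1."""
    cur = conn.execute(
        "UPDATE csv_rows SET status = 'processing' "
        "WHERE id = ? AND status = 'pending'",
        (row_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

A second worker retrying the same row gets `False` and exits without side effects, which is what makes queue redelivery safe without any explicit locks.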
Case Study Description
Clients needed to upload large CSVs to trigger an invite/check workflow, but the underlying dependency services and internal APIs were rate-limited. I built a pipeline that validates input, stores the file in object storage, extracts rows into a durable work table, and then processes each row via queue jobs. Work dispatch is controlled by a Kubernetes CronJob to throttle work per client and prevent API bursts. The entire pipeline is designed to be retry-safe, observable, and operable with manual fallback modes and stuck-job recovery.
Outcomes
- Scalability: Handled uploads of up to 1 million rows with row-by-row processing and durable job queuing.
- Reliability: Retry-safe behavior through atomic claiming and completion checks at row and batch levels.
- Rate-limit safety: Controlled per-client dispatch to avoid burst traffic and dependency throttling.
- Operations: Manual run modes plus stuck-row reconciliation to keep production work moving.
Architecture Diagram (System View)
flowchart LR
  U[API Client<br/>CSV upload] --> V[Upload + Validate<br/>Headers + rows]
  V --> S[Object Storage<br/>CSV file]
  V --> B[(Batch record<br/>status=pending)]
  B --> J1[Queue Job<br/>Extract rows]
  J1 --> R[(Row table<br/>pending rows)]
  J1 --> B2[(Batch<br/>status=queued)]
  C[Kubernetes CronJob<br/>every minute] --> D[Dispatcher Command<br/>per-client limit]
  D --> Q[Queue row jobs<br/>N per client per run]
  Q --> J2[Row Processor Job<br/>pending→processing]
  J2 --> A1["API call: address-links"]
  A1 --> A2["API call: get-addresses"]
  A2 --> A3["API call: invite/create check"]
  J2 --> R2[(Row status<br/>completed/failed)]
  R2 --> F[Batch completion check<br/>atomic update]
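The dispatcher's per-client throttling ("N per client per run") can be sketched as selecting at most N pending rows per owner each run, so one client's huge batch cannot starve the others. The real dispatcher is a Laravel artisan command; this is an illustrative Python/SQLite sketch with a hypothetical schema:

```python
import sqlite3
from collections import defaultdict

def select_dispatchable(conn: sqlite3.Connection, per_owner_limit: int) -> list[int]:
    """Pick up to `per_owner_limit` pending row ids per owner, oldest
    first, to keep dispatch fair across clients."""
    picked: dict[int, int] = defaultdict(int)
    row_ids: list[int] = []
    for row_id, owner_id in conn.execute(
        "SELECT id, owner_id FROM csv_rows "
        "WHERE status = 'pending' ORDER BY id"
    ):
        if picked[owner_id] < per_owner_limit:
            picked[owner_id] += 1
            row_ids.append(row_id)
    return row_ids
```

Because the CronJob runs with `concurrencyPolicy: Forbid`, at most one dispatcher pass runs at a time, so the per-run limit directly bounds the request rate hitting the downstream APIs.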
Sequence Diagram (Technical Flow)
sequenceDiagram
participant Client as API Client
participant API as Laravel API
participant Store as Object Storage
participant DB as DB (batch + rows)
participant Q as Queue Workers
participant Cron as K8s CronJob
participant Dep as Rate-limited APIs
Client->>API: POST /v2/client/users/batch/invite (CSV)
API->>API: Validate headers + rows (max 1 million)
API->>Store: Stream upload CSV
API->>DB: Create batch (pending) + store request context
API->>Q: Enqueue ExtractRows job (batch)
Q->>DB: Claim batch (pending→processing)
Q->>Store: Download CSV
Q->>DB: Insert row work items (pending)
Q->>DB: Mark batch queued
Cron->>API: Run dispatcher command
API->>DB: Select owners + pending rows (limit per owner)
API->>Q: Enqueue row jobs
Q->>DB: Claim row (pending→processing)
Q->>Dep: address-links
Q->>Dep: get-addresses
Q->>Dep: invite/create
alt success
Q->>DB: row completed + check_id
else failure
Q->>DB: row failed + error payload
end
Q->>DB: attempt complete batch if all rows finished
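The final "attempt complete batch" step can be sketched as a guarded UPDATE that only flips the batch when no unfinished rows remain, so every finishing worker can safely attempt it and exactly one succeeds (illustrative schema, not the production tables):

```python
import sqlite3

def try_complete_batch(conn: sqlite3.Connection, batch_id: int) -> bool:
    """Flip the batch to 'completed' only if no row is still pending or
    processing. Idempotent: repeat calls after completion do nothing."""
    cur = conn.execute(
        "UPDATE batches SET status = 'completed' "
        "WHERE id = ? AND status != 'completed' AND NOT EXISTS ("
        "  SELECT 1 FROM csv_rows "
        "  WHERE batch_id = ? AND status IN ('pending', 'processing'))",
        (batch_id, batch_id),
    )
    conn.commit()
    return cur.rowcount == 1
```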
Manual fallback mode
# Re-run a batch extraction (manual mode)
client:batch-csv-invite-processing {batchId} --manual-run
# Queue a single row for processing
client:csv-invite-processing {csvRowId}
# Reconcile stuck processing rows (dry-run unless --apply)
client:process-stuck-csv-invite
client:process-stuck-csv-invite --apply
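The stuck-row reconciliation behind `client:process-stuck-csv-invite` can be sketched as finding rows that were claimed but never finished within a threshold, reporting them in dry-run mode and resetting them to pending only when applied. A sketch under assumed column names (`claimed_at` is hypothetical), not the production command:

```python
import sqlite3

def reconcile_stuck(conn: sqlite3.Connection, cutoff: str, apply: bool = False) -> list[int]:
    """List rows stuck in 'processing' since before `cutoff` (ISO-8601
    string). With apply=True, reset them to 'pending' so the scheduled
    dispatcher picks them up again."""
    stuck = [r for (r,) in conn.execute(
        "SELECT id FROM csv_rows "
        "WHERE status = 'processing' AND claimed_at < ?", (cutoff,))]
    if apply and stuck:
        conn.executemany(
            "UPDATE csv_rows SET status = 'pending' WHERE id = ?",
            [(r,) for r in stuck])
        conn.commit()
    return stuck
```

Resetting to pending (rather than re-queuing directly) keeps the rate limiter honest: recovered rows re-enter through the same throttled dispatch path as everything else.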
Test plan (runnable anytime)
- Upload a small CSV and verify batch and row statuses transition correctly.
- Simulate API failures and confirm rows become failed with captured error payloads.
- Retry row jobs and confirm atomic claiming prevents double-processing.
- Pause dispatcher and confirm no burst traffic reaches dependencies.
Idempotency notes
- Row-level claiming uses atomic status updates to ensure only one worker processes a row.
- Batch-level claiming prevents two workers extracting the same CSV concurrently.
- Batch completion is atomic: it only flips when no unfinished rows remain.
Safety controls
- Strict header validation + row-by-row validation to fail fast on bad uploads.
- Rate control via scheduled dispatch and per-client selection limits.
- Encrypted-at-rest storage for CSV row payloads and request context.
- Structured logs + alert hooks for production failures.
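The fail-fast header check in the first bullet can be sketched as comparing the upload's header row against the expected column set before any storage or queue work happens (the expected header names here are illustrative, not the real template):

```python
import csv
import io

EXPECTED_HEADERS = ["email", "first_name", "last_name"]  # illustrative columns

def validate_headers(csv_text: str) -> None:
    """Reject an upload immediately if the header row does not match,
    before the file is stored or any row jobs are created."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader, None)
    if headers != EXPECTED_HEADERS:
        raise ValueError(f"unexpected CSV headers: {headers}")
```

Failing before the file touches object storage keeps bad uploads cheap: no batch record, no extracted rows, nothing for operators to clean up.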
Need systems that can handle volume + constraints?
I design and ship resilient pipelines for startup teams that need speed without sacrificing reliability.