DevOps & Tooling

CronJob Scheduler

Distributed task scheduling with Temporal.io workflow orchestration

Backend Engineer · Sep 2024 — Present
Go · go-zero · Temporal.io · MySQL/GORM · Redis · Kafka

Background

A distributed job scheduler that orchestrates recurring batch operations, including VIP recalculation, report generation, cache warming, and data cleanup. Each scheduled run must execute exactly once even when multiple scheduler pods are running, and schedules must be timezone-aware.

Architecture

Configurable cron definitions → Temporal.io workflow engine → activity workers (go-zero) → distributed lock (Redis) → batch operations → MySQL/Redis/Kafka side effects. Health monitoring watches workflow execution and alerts on failure.

Key Implementations

1. Temporal.io Workflow Orchestration

Each batch job is modeled as a Temporal workflow with retry policies, timeouts, and visibility into execution history.

Why: Temporal provides durable execution guarantees, automatic retries, and a built-in UI for monitoring long-running batch operations.

2. Distributed Lock-Protected Execution

Critical batch jobs acquire a Redis distributed lock before execution to prevent concurrent runs across multiple scheduler pods.

Why: Financial operations such as VIP recalculation and settlement must not be processed twice. The lock ensures that only one pod runs a given job at a time, which is a prerequisite for exactly-once execution.

3. Health Monitoring and Alerting

Monitors workflow execution states and triggers alerts when jobs fail, time out, or miss their scheduled window.

Why: A batch job that fails silently can leave data inconsistent, and later jobs then build on that bad state; proactive alerting surfaces the failure before it cascades.

Technical Decisions

| Decision | Chosen | Alternative | Reason |
| --- | --- | --- | --- |
| Job orchestration engine | Temporal.io | Custom cron with goroutines | Temporal provides durable execution, automatic retries, and workflow visibility out of the box, eliminating the need to build these guarantees manually. |
| Concurrency control | Redis distributed lock + Temporal single-worker queue | Database-level locking | Layering Redis locks with Temporal's task queue semantics provides defense-in-depth against duplicate execution across pods. |