Cron Monitoring
Monitor scheduled jobs with heartbeat-based checks. Learn about schedules, grace periods, and integration patterns.
Cron monitoring (also called heartbeat monitoring) works by having your scheduled jobs send a ping to Ionhour after each successful run. If a ping doesn't arrive on time, Ionhour knows something is wrong and creates an incident.
This is the inverse of traditional monitoring — instead of Ionhour checking your service, your service checks in with Ionhour.
How It Works
- You create a check in Ionhour with a schedule (e.g., every 5 minutes).
- Ionhour gives you a unique ping URL.
- Your cron job hits that URL after each successful run.
- If Ionhour doesn't receive a ping within the expected window (schedule + grace period), it transitions the check to LATE, then DOWN, and creates an incident.
Creating a Heartbeat Check
When creating a check, set the check mode to Inbound (the default). You'll need to configure:
| Setting | Description | Constraints |
|---|---|---|
| Name | A human-readable label for the check | 1–255 characters |
| Interval | How often your job runs | 300–3,600 seconds (5 min to 1 hour) |
| Grace period | Extra time to wait before alerting | 5–60 seconds |
The minimum interval for inbound checks is 5 minutes (300 seconds).
Once created, Ionhour generates a unique token for the check. Your ping URL will be:
`https://signal.ionhour.com/api/signals/ping/{token}`
Sending Pings
Simple GET Request
The simplest integration — append a curl call to the end of your cron job:
```bash
# Your scheduled job
./run-backup.sh && curl -s https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN
```
The GET endpoint returns `{"message": "OK"}` on success.
POST with Payload
Send a POST request with a JSON body (up to 10 KB) to include metadata about the run:
```bash
curl -X POST https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{"status": "completed", "rows_processed": 1500}'
```
The payload is stored with the signal and visible in the check's signal history. This is useful for debugging — you can see exactly what your job reported on each run.
Reporting Duration
Include a duration field in your payload to report how long your job took to run:
```bash
curl -X POST https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{"duration": 1500}'
```
The duration value is in milliseconds and must be a non-negative number. Ionhour stores it alongside the signal and uses it to compute average duration over time in the check's analytics view.
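To report a measured duration rather than a hardcoded one, time the job and build the payload from the result. A sketch using only the Python standard library; `timed_payload` and `send_ping` are illustrative helpers, not part of any Ionhour SDK:

```python
import json
import time
import urllib.request

def timed_payload(fn):
    """Run the job callable and build a ping payload with its runtime in ms."""
    start = time.monotonic()
    fn()
    return {"duration": int((time.monotonic() - start) * 1000)}

def send_ping(url, payload):
    """POST the JSON payload to the ping URL."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req, timeout=10)
```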
Reporting Dependencies
If your job depends on external services, you can report their status in the ping payload:
```bash
curl -X POST https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{
    "dependencies": [
      { "name": "PostgreSQL", "status": "ok" },
      { "name": "Redis", "status": "down" }
    ]
  }'
```
Ionhour evaluates dependency status and can create separate DEPENDENCY_DOWN incidents when a dependency is reported as unavailable.
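Building that payload from your own health checks might look like this (a sketch; `dependency_payload` is an illustrative helper, not an Ionhour API):

```python
def dependency_payload(statuses):
    """Convert {"PostgreSQL": True, "Redis": False} into the ping payload shape."""
    return {
        "dependencies": [
            {"name": name, "status": "ok" if up else "down"}
            for name, up in statuses.items()
        ]
    }
```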
Schedule and Grace Period
The interval defines how often Ionhour expects a ping. The grace period adds a buffer on top of that.
For example, with a 5-minute interval and a 30-second grace period:
- Ping received at 10:00:00
- Next ping expected by 10:05:00
- No ping by 10:05:00 → the check transitions to LATE
- The grace period extends the alert deadline to 10:05:30
- Still no ping by 10:05:30 → the check transitions to DOWN
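The two deadlines fall directly out of the last ping time. A minimal sketch (field names are ours, not Ionhour's internals):

```python
from datetime import datetime, timedelta

def late_at(last_ping, interval_s):
    """When the check turns LATE: one interval after the last ping."""
    return last_ping + timedelta(seconds=interval_s)

def down_at(last_ping, interval_s, grace_s):
    """When the check turns DOWN: interval plus grace after the last ping."""
    return last_ping + timedelta(seconds=interval_s + grace_s)
```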
Choosing the Right Grace Period
- Short grace period (5–15s): For time-critical jobs where even small delays matter.
- Medium grace period (15–30s): Good default for most cron jobs. Absorbs minor network jitter without false alarms.
- Long grace period (30–60s): For jobs with variable execution time, or when running on shared infrastructure where startup delays are common.
Check Status Lifecycle
Every check follows this state machine:
| Status | Meaning |
|---|---|
| NEW | Just created, no pings received yet |
| OK | Receiving pings on schedule |
| LATE | A ping is overdue (within grace period window) |
| DOWN | Grace period exceeded, incident created |
| PAUSED | Manually paused or paused by a deployment window |
When a check goes DOWN, Ionhour automatically creates an incident and sends alerts to your configured notification channels. When the next ping arrives, the check recovers to OK and the incident is resolved.
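The table above can be sketched as a pure function of the time since the last ping (a simplified model; PAUSED, which is set manually or by a deployment window, is omitted):

```python
def check_status(seconds_since_ping, interval_s, grace_s):
    """Illustrative status evaluation mirroring the lifecycle table."""
    if seconds_since_ping is None:
        return "NEW"    # no pings received yet
    if seconds_since_ping <= interval_s:
        return "OK"     # pinging on schedule
    if seconds_since_ping <= interval_s + grace_s:
        return "LATE"   # overdue, still within grace
    return "DOWN"       # grace exceeded, incident created
```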
Integration Examples
Crontab

```bash
# Run backup every hour, ping Ionhour on success
0 * * * * /usr/local/bin/backup.sh && curl -s https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN > /dev/null
```

Node.js

```javascript
const https = require('https');

async function runJob() {
  // ... your job logic ...

  // Ping Ionhour on success
  https.get('https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN');
}
```

Python

```python
import requests

def run_job():
    # ... your job logic ...

    # Ping Ionhour on success
    requests.get("https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN")
```

GitHub Actions

```yaml
steps:
  - name: Run scheduled task
    run: ./scripts/nightly-build.sh
  - name: Ping Ionhour
    if: success()
    run: curl -s https://signal.ionhour.com/api/signals/ping/${{ secrets.IONHOUR_TOKEN }}
```

Docker

```dockerfile
HEALTHCHECK --interval=5m --timeout=10s --retries=1 \
  CMD curl -sf https://signal.ionhour.com/api/signals/ping/YOUR_TOKEN || exit 1
```

Badges
Each check has a public badge URL that returns a shields.io-style status badge. No authentication required.
`https://api.ionhour.com/api/checks/{id}/badge`

Embed it in your README or dashboard:
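In a markdown README, the embed might look like this (label and alt text are your choice; replace `{id}` with your check's id):

```markdown
![Ionhour status](https://api.ionhour.com/api/checks/{id}/badge)
```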
The badge color reflects the current check status: green for OK, red for DOWN.
Signal Behavior
Deduplication
Ionhour deduplicates signals within a 1-second window. If two pings with the same token and signal type arrive within 1 second of each other, the second one is silently ignored. This prevents accidental duplicate signals from retries or misconfigured job runners.
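The deduplication rule amounts to a simple window check. A sketch of the logic only — the real check runs server-side:

```python
def should_accept(last_signal_at, now, window_s=1.0):
    """Reject a signal if one with the same token and type arrived within the window."""
    return last_signal_at is None or (now - last_signal_at) >= window_s
```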
Drift Tracking
For every successful signal, Ionhour calculates drift — how early or late the ping arrived compared to the expected schedule.
- Formula: `drift = now - (lastPingAt + intervalSeconds)`
- Positive drift: the signal arrived late (e.g., `+5000ms` means 5 seconds late).
- Negative drift: the signal arrived early.
Drift is recorded on each signal and visible in the signal history. It helps you identify jobs that are gradually drifting out of schedule.
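As a quick sanity check of the formula (times in seconds since an arbitrary epoch; the helper is ours):

```python
def drift_ms(now, last_ping_at, interval_s):
    """drift = now - (lastPingAt + intervalSeconds), reported in milliseconds."""
    return int(round((now - (last_ping_at + interval_s)) * 1000))
```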
Rate Limiting
The ping endpoint allows 30 requests per 60 seconds per token. If you exceed this limit, you'll receive a 429 Too Many Requests response. This protects against misconfigured jobs that ping in a tight loop.
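If a misbehaving scheduler might trip the limit, the client can back off on 429 instead of retrying immediately. A sketch with an injectable `send` callable returning an HTTP status code (not an Ionhour SDK function):

```python
import time

def ping_with_backoff(send, attempts=4, base_delay=1.0):
    """Call send(); on a 429 response, sleep with exponential backoff and retry."""
    for attempt in range(attempts):
        status = send()
        if status != 429:
            return status
        if attempt < attempts - 1:
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    return 429
```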
Signal Retention
Signals are retained for 90 days. After 90 days, signals are automatically deleted. If you need longer retention for compliance, export signal data via the API before it expires.
Best Practices
- Ping only on success. If your job fails, don't send the ping. The missed ping is what triggers the alert.
- Place the ping at the end. Make sure your job has fully completed before sending the heartbeat.
- Use POST with payloads for jobs where you want debugging context (row counts, durations, error summaries).
- Set grace periods realistically. If your job takes 10–30 seconds to run, a 5-second grace period will cause false alarms.
- One check per job. Don't reuse a single check token across multiple independent jobs — you won't know which one failed.
- Keep tokens secret. Treat your ping token like an API key. Don't commit it to public repositories. Store it in environment variables or your secrets manager (e.g., GitHub Secrets, Vault).
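For example, reading the token from an environment variable in Python (`IONHOUR_PING_TOKEN` is an assumed variable name, not an Ionhour convention):

```python
import os

# Assumed env var name; set it via your secrets manager, never hardcode it.
token = os.environ.get("IONHOUR_PING_TOKEN", "")
ping_url = f"https://signal.ionhour.com/api/signals/ping/{token}"
```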