IonHour Docs

Cron Monitoring

Monitor scheduled jobs with heartbeat-based checks. Learn about schedules, grace periods, and integration patterns.

Cron monitoring (also called heartbeat monitoring) works by having your scheduled jobs send a ping to IonHour after each successful run. If a ping doesn't arrive on time, IonHour knows something is wrong and creates an incident.

This is the inverse of traditional monitoring — instead of IonHour checking your service, your service checks in with IonHour.

How It Works

  1. You create a check in IonHour with a schedule (e.g., every 5 minutes).
  2. IonHour gives you a unique ping URL.
  3. Your cron job hits that URL after each successful run.
  4. If IonHour doesn't receive a ping within the expected window (schedule + grace period), it transitions the check to LATE, then DOWN, and creates an incident.
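The flow above (run the job, ping only on success) can be sketched in Python. This is an illustrative wrapper, not an IonHour SDK; `PING_URL` is a placeholder, and the `ping` parameter exists only to make the sketch testable:

```python
import subprocess
import urllib.request

PING_URL = "https://app.failsignal.com/api/signals/ping/YOUR_TOKEN"  # from step 2

def send_ping(url: str = PING_URL) -> None:
    """Hit the check's ping URL (a plain GET is enough)."""
    urllib.request.urlopen(url, timeout=10)

def run_and_ping(cmd: list[str], ping=send_ping) -> bool:
    """Run the job, then ping only after it exits successfully (steps 3-4)."""
    result = subprocess.run(cmd)
    if result.returncode == 0:
        ping()
        return True
    return False  # missed ping -> IonHour alerts after interval + grace
```

If the job fails, no ping is sent, and the missed ping is exactly what triggers the alert.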

Creating a Heartbeat Check

When creating a check, set the check mode to Inbound (the default). You'll need to configure:

| Setting      | Description                            | Constraints                          |
|--------------|----------------------------------------|--------------------------------------|
| Name         | A human-readable label for the check   | 1–255 characters                     |
| Interval     | How often your job runs                | 300–3,600 seconds (5 min to 1 hour)  |
| Grace period | Extra time to wait before alerting     | 5–60 seconds                         |

The minimum interval for inbound checks is 5 minutes (300 seconds).

Once created, IonHour generates a unique token for the check. Your ping URL will be:

https://app.failsignal.com/api/signals/ping/{token}

Sending Pings

Simple GET Request

The simplest integration — append a curl call to the end of your cron job:

# Your scheduled job
./run-backup.sh && curl -s https://app.failsignal.com/api/signals/ping/YOUR_TOKEN

The GET endpoint returns {"message": "OK"} on success.

POST with Payload

Send a POST request with a JSON body (up to 10 KB) to include metadata about the run:

curl -X POST https://app.failsignal.com/api/signals/ping/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{"status": "completed", "rows_processed": 1500}'

The payload is stored with the signal and visible in the check's signal history. This is useful for debugging — you can see exactly what your job reported on each run.
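The same POST ping can be sent from Python using only the standard library. The `build_payload` helper and its size guard are illustrative (the field names mirror the curl example above; the 10 KB figure comes from the limit stated earlier):

```python
import json
import urllib.request

MAX_PAYLOAD_BYTES = 10 * 1024  # ping bodies are limited to 10 KB

def build_payload(status: str, rows_processed: int) -> bytes:
    """Serialize run metadata, enforcing the documented size limit."""
    body = json.dumps({"status": status, "rows_processed": rows_processed}).encode()
    if len(body) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds the 10 KB limit")
    return body

def ping(token: str, payload: bytes) -> None:
    """POST the payload to the check's ping URL."""
    req = urllib.request.Request(
        f"https://app.failsignal.com/api/signals/ping/{token}",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    urllib.request.urlopen(req, timeout=10)
```

Checking the size before sending avoids a rejected ping — and a rejected ping looks like a missed one.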

Reporting Dependencies

If your job depends on external services, you can report their status in the ping payload:

curl -X POST https://app.failsignal.com/api/signals/ping/YOUR_TOKEN \
  -H "Content-Type: application/json" \
  -d '{
    "dependencies": [
      { "name": "PostgreSQL", "status": "ok" },
      { "name": "Redis", "status": "down" }
    ]
  }'

IonHour evaluates dependency status and can create separate DEPENDENCY_DOWN incidents when a dependency is reported as unavailable.
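A minimal sketch of that evaluation — not IonHour's actual implementation — is to collect every dependency whose reported status isn't "ok":

```python
def down_dependencies(payload: dict) -> list[str]:
    """Return names of dependencies not reported as "ok" in a ping payload."""
    return [
        dep["name"]
        for dep in payload.get("dependencies", [])
        if dep.get("status") != "ok"
    ]
```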

Schedule and Grace Period

The interval defines how often IonHour expects a ping. The grace period adds a buffer on top of that.

For example, with a 5-minute interval and a 30-second grace period:

  • Ping received at 10:00:00
  • Next ping expected by 10:05:00
  • If no ping by 10:05:00 → check transitions to LATE
  • The grace period extends the alerting deadline to 10:05:30
  • If still no ping by 10:05:30 → check transitions to DOWN and an incident is created
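The interval-plus-grace arithmetic above can be sketched as a simple classifier (a simplified model, not IonHour's actual code):

```python
from datetime import datetime, timedelta

def check_status(last_ping: datetime, now: datetime,
                 interval_s: int, grace_s: int) -> str:
    """Classify a check from its most recent ping time."""
    deadline = last_ping + timedelta(seconds=interval_s)
    if now <= deadline:
        return "OK"
    if now <= deadline + timedelta(seconds=grace_s):
        return "LATE"  # overdue, but still inside the grace window
    return "DOWN"      # grace exceeded -> incident created
```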

Choosing the Right Grace Period

  • Short grace period (5–15s): For time-critical jobs where even small delays matter.
  • Medium grace period (15–30s): Good default for most cron jobs. Absorbs minor network jitter without false alarms.
  • Long grace period (30–60s): For jobs with variable execution time, or when running on shared infrastructure where startup delays are common.

Check Status Lifecycle

Every check follows this state machine:

Inbound check state diagram

| Status | Meaning                                            |
|--------|----------------------------------------------------|
| NEW    | Just created, no pings received yet                |
| OK     | Receiving pings on schedule                        |
| LATE   | A ping is overdue (within grace period window)     |
| DOWN   | Grace period exceeded, incident created            |
| PAUSED | Manually paused or paused by a deployment window   |

When a check goes DOWN, IonHour automatically creates an incident and sends alerts to your configured notification channels. When the next ping arrives, the check recovers to OK and the incident is resolved.
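One way to picture the lifecycle is as a transition table. The event names here (`ping`, `deadline_missed`, `grace_exceeded`) are this sketch's invention, and the assumption that a PAUSED check ignores timer events is ours, not documented behavior:

```python
TRANSITIONS = {
    # (current status, event) -> next status
    ("NEW", "ping"): "OK",
    ("OK", "ping"): "OK",
    ("LATE", "ping"): "OK",
    ("DOWN", "ping"): "OK",              # recovery: the incident is resolved
    ("OK", "deadline_missed"): "LATE",
    ("LATE", "grace_exceeded"): "DOWN",  # incident created, alerts sent
}

def next_status(current: str, event: str) -> str:
    """Apply one event; unknown combinations leave the status unchanged."""
    return TRANSITIONS.get((current, event), current)
```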

Integration Examples

Crontab

# Run backup every hour, ping IonHour on success
0 * * * * /usr/local/bin/backup.sh && curl -s https://app.failsignal.com/api/signals/ping/YOUR_TOKEN > /dev/null

Node.js

const https = require('https');

function runJob() {
  // ... your job logic ...

  // Ping IonHour on success; log (rather than crash) if the ping itself fails
  https.get('https://app.failsignal.com/api/signals/ping/YOUR_TOKEN')
    .on('error', (err) => console.error('ping failed:', err.message));
}

Python

import requests

def run_job():
    # ... your job logic ...

    # Ping IonHour on success (a timeout keeps a hung ping from stalling the job)
    requests.get("https://app.failsignal.com/api/signals/ping/YOUR_TOKEN", timeout=10)

GitHub Actions

steps:
  - name: Run scheduled task
    run: ./scripts/nightly-build.sh

  - name: Ping IonHour
    if: success()
    run: curl -s https://app.failsignal.com/api/signals/ping/${{ secrets.FAILSIGNAL_TOKEN }}

Docker Healthcheck

HEALTHCHECK --interval=5m --timeout=10s --retries=1 \
  CMD curl -sf https://app.failsignal.com/api/signals/ping/YOUR_TOKEN || exit 1

Signal history in the check detail view

Badges

Each check has a public badge URL that returns a shields.io-style status badge. No authentication required.

https://app.failsignal.com/api/checks/{id}/badge

Embed it in your README or dashboard:

![Status](https://app.failsignal.com/api/checks/123/badge)

The badge color reflects the current check status: green for OK, red for DOWN.

Best Practices

  • Ping only on success. If your job fails, don't send the ping. The missed ping is what triggers the alert.
  • Place the ping at the end. Make sure your job has fully completed before sending the heartbeat.
  • Use POST with payloads for jobs where you want debugging context (row counts, durations, error summaries).
  • Set grace periods realistically. If your job takes 10–30 seconds to run, a 5-second grace period will cause false alarms.
  • One check per job. Don't reuse a single check token across multiple independent jobs — you won't know which one failed.
  • Keep tokens secret. Treat your ping token like an API key. Don't commit it to public repositories.