Cron Setup
These scheduled jobs are required for a fully operational ARCHER instance. Run crontab -e and paste the block below; it is the canonical source of truth for what should be scheduled. Each entry is explained in the reference table beneath it.
Crontab block
# ── Analytics ─────────────────────────────────────────────────────────────────
0 3 1 * * bash /home/jay/analytics/update-geo.sh >> /home/jay/analytics/geo-update.log 2>&1
# ── ARCHER ────────────────────────────────────────────────────────────────────
0 8 * * * /home/jay/Projects/ARCHER/scripts/archer_morning_digest.sh >> /tmp/archer_digest_cron.log 2>&1
0 * * * * . /home/jay/Projects/ARCHER/archer_env/bin/activate && python3 /home/jay/Projects/ARCHER/scripts/issue_snapshot.py >> /tmp/archer_issue_snapshot.log 2>&1
0 * * * * /home/jay/scripts/archer-backup.sh
17 3 * * * . ~/.bash_secrets && /home/jay/Projects/ARCHER/archer_env/bin/python /home/jay/Projects/ARCHER/scripts/propose_objectives.py --sync-nvd >> /tmp/archer_nvd_sync.log 2>&1
0 2 * * 0 cd /home/jay/Projects/ARCHER && bash testenv/autorun.sh >> testenv/autorun.log 2>&1
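After pasting, it can help to confirm that every job actually landed in the installed crontab. The following is a minimal sketch, not part of ARCHER: the function name missing_jobs is hypothetical, and it only checks that each script name from the block above appears on some non-commented line.

```shell
#!/bin/sh
# Sketch: print the expected job scripts that are absent from a crontab listing.
# Script names are taken from the crontab block; missing_jobs is a hypothetical helper.
missing_jobs() {
    # Read the whole crontab listing from stdin once.
    listing=$(cat)
    for script in update-geo.sh archer_morning_digest.sh issue_snapshot.py \
                  archer-backup.sh propose_objectives.py autorun.sh; do
        # Drop commented-out lines, then search for the name as a fixed string.
        printf '%s\n' "$listing" | grep -v '^[[:space:]]*#' | grep -qF "$script" \
            || printf '%s\n' "$script"
    done
}
```

Run it as crontab -l | missing_jobs. It prints nothing when all six entries are present, and one script name per line otherwise. It does not validate the schedule fields or the log redirections.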
Job reference
| Schedule | Script | Purpose | Log |
|---|---|---|---|
| Monthly (1st, 03:00) | analytics/update-geo.sh | Refresh MaxMind GeoLite2 DB used by session geo-tagging | analytics/geo-update.log |
| Daily (08:00) | scripts/archer_morning_digest.sh | Build and push daily health digest: RED/AMBER metrics, last eval pass rate, health summary line | /tmp/archer_digest_cron.log |
| Hourly | scripts/issue_snapshot.py | Snapshot open issue count and re-render burndown chart in docs/assets/ | /tmp/archer_issue_snapshot.log |
| Hourly | scripts/archer-backup.sh | rsync ARCHER training data to external SSD (issue #156) | syslog via rsync |
| Daily (03:17) | scripts/propose_objectives.py --sync-nvd | Pull NVD change feed since last sync; refresh per-CVE cache files that changed. Writes manifest to /tmp/archer_proposals/nvd_sync_manifest.json. Staggered 17 min after geo update to avoid I/O contention. | /tmp/archer_nvd_sync.log |
| Weekly (Sun, 02:00) | testenv/autorun.sh | Full eval harness run + regression watcher. Skips cleanly if lab range is unreachable (preflight ping). Exits 1 and logs if a regression is detected. | testenv/autorun.log |
Notes
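Virtualenv: propose_objectives.py is run with the venv interpreter directly (archer_env/bin/python) because it depends on uvicorn and other packages that are not installed for the system Python. issue_snapshot.py sources archer_env/bin/activate first for the same reason, so the python3 in that entry resolves to the venv interpreter. Don't drop the activation or replace either interpreter with the system python3.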
NVD sync cold start: On a fresh machine the per-CVE NVD cache (/tmp/archer_proposals/nvd_CVE-*.json) is empty. The first Regenerate run in the training UI will populate it; ~23k CVEs at 0.6 s/req takes roughly 3.8 hours. The daily --sync-nvd cron only refreshes existing cache entries, so it will produce 0 updates until the cold-start run completes.
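The cold-start duration follows directly from the two figures quoted in these notes. As a rough worked example, the sketch below multiplies the approximate CVE count by the per-request delay, for both the 0.6 s/req rate and the 6 s/req unauthenticated rate mentioned under Secrets. The helper name estimate_hours is hypothetical, and the result ignores download and parsing time.

```shell
#!/bin/sh
# estimate_hours COUNT SECONDS_PER_REQUEST: print the total time in hours (one decimal).
# Hypothetical helper; the inputs below are the approximate figures from the notes.
estimate_hours() {
    LC_ALL=C awk -v n="$1" -v s="$2" 'BEGIN { printf "%.1f\n", n * s / 3600 }'
}

estimate_hours 23000 0.6   # with an API key: prints 3.8
estimate_hours 23000 6     # unauthenticated: prints 38.3
```

The tenfold difference is why a missing NVD_API_KEY matters most during the initial population of the cache.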
autorun.sh range dependency: The weekly eval requires the GOAD-Light lab range to be up (192.168.56.103 reachable). If the range is down, the script exits 0 silently and does not retry. Start the range manually before Sunday 02:00 if a weekly eval result is needed.
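To check reachability by hand before the Sunday run, a single ping with a short timeout is usually enough. This is a sketch of such a check, not the preflight that autorun.sh itself performs. It assumes Linux iputils ping, where -W is a timeout in seconds. The function name range_is_up and the PING_BIN override are hypothetical; the override exists only so the function can be exercised without network access.

```shell
#!/bin/sh
# range_is_up HOST: succeed if HOST answers one ping within 2 seconds.
# Assumes Linux iputils ping (-W takes seconds). PING_BIN is a hypothetical
# override for trying the function out without touching the network.
range_is_up() {
    "${PING_BIN:-ping}" -c 1 -W 2 "$1" >/dev/null 2>&1
}
```

For example: range_is_up 192.168.56.103 || echo "lab range unreachable, weekly eval would be skipped".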
Log rotation: The /tmp/ logs are not rotated automatically and will grow unboundedly on a long-lived machine. Add logrotate entries or periodically truncate if disk space is a concern.
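Where a full logrotate setup is more than the machine needs, periodic truncation can be as simple as keeping the tail of each file. The helper below is a minimal sketch under that assumption. The name trim_log and any line limit passed to it are illustrative, not part of ARCHER.

```shell
#!/bin/sh
# trim_log FILE MAX_LINES: keep only the last MAX_LINES lines of FILE.
# Hypothetical helper. The file is rewritten in place rather than replaced.
trim_log() {
    [ -f "$1" ] || return 0            # nothing to do if the log does not exist yet
    tmp=$(mktemp) || return 1
    tail -n "$2" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Example use (the 5000-line limit is an arbitrary choice):
# for f in /tmp/archer_*.log; do trim_log "$f" 5000; done
```

Lines appended by a cron job while the trim is in progress can be lost, so it is best scheduled at a minute when none of the jobs above run.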
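Secrets: archer_morning_digest.sh sources ~/.bash_secrets at runtime to pick up ARCHER_NTFY_TOPIC. The NVD sync reads NVD_API_KEY from the environment; archer_live.py injects it at server startup, but the cron job runs outside that process. If the key is not in the cron environment, the sync falls back to the unauthenticated rate (6 s/req). The NVD entry in the crontab block above avoids this by sourcing ~/.bash_secrets before invoking Python, which works only if that file exports NVD_API_KEY (a plain assignment without export is not passed on to the Python process). If an installed crontab lacks the prefix, replace the entry with: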
17 3 * * * . ~/.bash_secrets && /home/jay/Projects/ARCHER/archer_env/bin/python /home/jay/Projects/ARCHER/scripts/propose_objectives.py --sync-nvd >> /tmp/archer_nvd_sync.log 2>&1
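Whether the key really reaches the Python process can be tested without waiting for 03:17. The sketch below sources a secrets file under a stripped-down environment similar to the one cron provides, then asks a child process whether the variable is visible. The function name var_reaches_child is hypothetical, and the check assumes cron uses /bin/sh with a PATH of /usr/bin:/bin, which is a common default but not guaranteed on every system.

```shell
#!/bin/sh
# var_reaches_child FILE VAR: succeed if sourcing FILE in a cron-like
# environment leaves VAR exported to child processes.
# Hypothetical helper; /bin/sh and the minimal PATH are assumptions about cron.
var_reaches_child() {
    env -i PATH=/usr/bin:/bin /bin/sh -c \
        '. "$1" && printenv "$2" >/dev/null' sh "$1" "$2"
}
```

For example: var_reaches_child "$HOME/.bash_secrets" NVD_API_KEY && echo "key visible to cron job". A failure means the file does not export the variable, or it uses syntax that /bin/sh cannot source.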