18 Commits

Author SHA1 Message Date
a629291722 Auto-clear stale batch markers in filter before submitting
When a batch completes but scores aren't written back (collection
error), jobs get stuck with filter_batch_id set and never re-submitted.
Now checks: if no filter_state.json exists (no batch in flight) but
jobs have batch markers without scores, clear them so they get
re-submitted on the next run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 16:41:08 -08:00
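The rule described in this commit can be sketched as a small pure function. This is a minimal sketch, not the repository's code: `clearStaleBatchMarkers` and the `batchInFlight` flag are hypothetical names, and only `filter_batch_id`, `filter_score` and `filter_state.json` come from the log.

```javascript
// Hypothetical helper; the real function name and queue shape are not shown in the log.
// Removes batch markers from jobs that were submitted but never scored,
// but only when no batch is currently in flight.
function clearStaleBatchMarkers(jobs, batchInFlight) {
  // A batch is still pending: its markers are legitimate, so leave them alone.
  if (batchInFlight) return 0;

  let cleared = 0;
  for (const job of jobs) {
    const hasMarker = job.filter_batch_id != null;
    const hasScore = job.filter_score != null;
    if (hasMarker && !hasScore) {
      delete job.filter_batch_id; // job becomes eligible for the next submit
      cleared += 1;
    }
  }
  return cleared;
}
```

A caller would presumably derive `batchInFlight` from whether `filter_state.json` exists, for example with `fs.existsSync`.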
87cfce8eca Filter: stop spamming Telegram on submit, collect+submit in one run
- Removed Telegram notification on batch submit (only notify on collect
  when results are ready)
- After collecting, immediately submit remaining unscored jobs in the
  same run instead of waiting for next cron cycle

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 13:39:19 -08:00
c99ea10585 Richer search and filter summaries
Search: show per-track breakdown (found/added per track name)
Filter: show top 5 scoring jobs with score, title, company and cost

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 13:34:21 -08:00
4419363b3c Fix process exit: use process.exit() directly instead of logStream.end callback
logStream.end() callback wasn't firing reliably, leaving processes hanging.
process.exit() is synchronous and forces exit regardless of open handles.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:21:55 -08:00
d43e2025b2 Fix process not exiting after run, detect closed job listings
- All entry points with log tee now call logStream.end() + process.exit()
  (log stream kept event loop alive, blocking next cron run)
- easy_apply: detect "no longer accepting applications" and similar closed
  listing text before reporting as unsupported

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:19:00 -08:00
51ca354c52 Audit fixes: remove dead code, fix run timeout bug, add log tee to all entry points
- Remove unused APPLY_PRIORITY array (replaced by score-based sort)
- Fix run timeout only breaking inner loop — now breaks outer platform loop too
- Remove dead lastProgress variable in easy_apply step loop
- Add stdout/stderr log tee to job_searcher, job_filter, telegram_poller

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 12:13:01 -08:00
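The run timeout fix above (a `break` that only left the inner loop) is commonly solved with a labeled `break`. The sketch below illustrates that technique only; `runWithDeadline`, the `platforms` shape and the injectable `now` clock are assumptions, and the commit may have used a flag or an early return instead.

```javascript
// Hypothetical sketch: process jobs platform by platform until a deadline passes.
// `now` is injectable so the behaviour can be exercised without real time passing.
function runWithDeadline(platforms, deadlineMs, now = Date.now) {
  const processed = [];
  platformLoop: for (const platform of platforms) {
    for (const job of platform.jobs) {
      // A plain `break` here would only leave the inner loop and the next
      // platform would still start; the label exits both loops at once.
      if (now() >= deadlineMs) break platformLoop;
      processed.push(`${platform.name}:${job}`);
    }
  }
  return processed;
}
```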
b1528ac0ad refactor: extract magic numbers to constants, fix audit issues
- Centralize all magic numbers/strings in lib/constants.mjs
- Fix double-replaced import names in filter.mjs
- Consolidate duplicate fs imports in job_applier/job_searcher
- Remove empty JSDoc block in job_searcher
- Update keywords.mjs model from claude-3-haiku to claude-haiku-4-5
- Extract Anthropic API URLs to constants
- Convert :has-text() selectors to page.locator() API
- Fix SIGTERM handler conflict — move partial-run notification into lock.onShutdown
- Remove unused exports (LOCAL_USER_AGENT, DEFAULT_REVIEW_WINDOW_MINUTES)
- Fix variable shadowing (b -> v) in job_filter reduce callback
- Replace SKILL.md PM2 references with system cron

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 08:45:17 -08:00
37b95b6b85 feat: track token usage and estimated cost per filter run in filter_runs.json
2026-03-06 16:22:14 +00:00
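One way to derive a per-run cost estimate from token counts is sketched below. `estimateRunCost`, the `rates` object and the output field names are hypothetical, and prices are left as caller-supplied parameters because the log does not state them; `input_tokens` and `output_tokens` are assumed to follow the usage object returned with each API result.

```javascript
// Hypothetical sketch: sum token usage across results and price it
// with caller-supplied USD rates per million tokens.
function estimateRunCost(usages, rates) {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const usage of usages) {
    inputTokens += usage.input_tokens ?? 0;
    outputTokens += usage.output_tokens ?? 0;
  }
  const estimatedCostUsd =
    (inputTokens / 1e6) * rates.inputPerMTok +
    (outputTokens / 1e6) * rates.outputPerMTok;
  return {
    input_tokens: inputTokens,
    output_tokens: outputTokens,
    estimated_cost_usd: estimatedCostUsd,
  };
}
```

The returned object could then be appended to the run's entry in `filter_runs.json`.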
c9b527c83a feat: find-all → filter → dedup flow
- addJobs: allows same job on multiple tracks (dedup key = track::id)
- Cross-track copies get composite id (job.id_track) to avoid batch collisions
- dedupeAfterFilter(): after collect, keeps highest-scored copy per URL, marks rest as 'duplicate'
- Called automatically at end of collect phase
2026-03-06 15:55:00 +00:00
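The dedup step described above (keep the highest-scored copy per URL, mark the rest as 'duplicate') might look roughly like this. It is a sketch under assumptions: the `url` field name and the tie-breaking rule (first copy wins) are not stated in the log, and the function is named `dedupeAfterFilterSketch` to make clear it is not the repository's `dedupeAfterFilter()`.

```javascript
// Hypothetical sketch of post-filter dedup across tracks.
// Scored jobs sharing a URL are compared; only the best copy keeps its status.
function dedupeAfterFilterSketch(jobs) {
  const bestByUrl = new Map(); // url -> highest-scored job seen so far
  for (const job of jobs) {
    if (job.filter_score == null) continue; // unscored jobs are left untouched
    const best = bestByUrl.get(job.url);
    // Strict ">" means the earliest copy wins a tie (an assumption).
    if (!best || job.filter_score > best.filter_score) bestByUrl.set(job.url, job);
  }

  let marked = 0;
  for (const job of jobs) {
    if (job.filter_score == null) continue;
    if (bestByUrl.get(job.url) !== job) {
      job.status = 'duplicate';
      marked += 1;
    }
  }
  return marked;
}
```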
c88a71fc20 feat: one batch per track — separate GTM/AE batches with their own system prompts
- submitBatch → submitBatches: groups jobs by track, submits one batch each
- filter_state.json now stores batches[] array instead of single batch_id
- Collect waits for all batches to finish before processing
- Each track gets its own cached system prompt = better caching + cleaner scoring
- Idempotent collect: skips already-scored jobs
2026-03-06 11:35:15 +00:00
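Grouping jobs by track before submitting one batch each, and waiting for every batch before collecting, could be sketched as follows. `groupByTrack`, `allBatchesDone` and the `done` flag on each `batches[]` entry are hypothetical; the log only says that `filter_state.json` stores a `batches[]` array.

```javascript
// Hypothetical sketch: bucket jobs by their track name.
function groupByTrack(jobs) {
  const groups = new Map(); // track -> jobs for that track
  for (const job of jobs) {
    const list = groups.get(job.track) ?? [];
    list.push(job);
    groups.set(job.track, list);
  }
  return groups;
}

// Collect should only proceed once every per-track batch has finished.
// An empty list means nothing was submitted, so there is nothing to collect.
function allBatchesDone(batches) {
  return batches.length > 0 && batches.every((batch) => batch.done === true);
}
```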
85c88f9eed fix: make filter idempotent - skip already-scored jobs on collect, exclude by filter_score on submit
2026-03-06 11:25:19 +00:00
56eb645e73 fix: import saveQueue statically instead of dynamic import; was causing queue writes to silently fail
2026-03-06 11:22:09 +00:00
64748d5889 fix: stamp filter_batch_id on submitted jobs; exclude already-submitted/filtered from resubmit
- Submit phase now excludes jobs with filter_batch_id set (already in a batch)
- After submitting, stamps each job with filter_batch_id = batchId
- Filtered jobs already excluded by status='filtered'
- Prevents duplicate submissions when batch errors cause state to clear without scores
2026-03-06 11:13:10 +00:00
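The submit-side rules in this commit amount to a filter followed by a stamp. A minimal sketch, assuming a flat job array; `selectForSubmit` and `stampBatch` are hypothetical names, while `filter_batch_id` and `status='filtered'` come from the commit message.

```javascript
// Hypothetical sketch: pick jobs that are not already in a batch
// and have not already been filtered out.
function selectForSubmit(jobs) {
  return jobs.filter(
    (job) => job.filter_batch_id == null && job.status !== 'filtered'
  );
}

// After a successful submit, mark each job with the batch that contains it
// so a later run does not submit it again.
function stampBatch(jobs, batchId) {
  for (const job of jobs) job.filter_batch_id = batchId;
}
```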
85038b6ce1 fix: batch collect O(n²) → single queue write; correct model to claude-3-haiku-20240307
- updateJobStatus was called 4,652 times causing ~4,652 file reads/writes
- Now loads queue once, applies all updates in memory, saves once
- Model was using OpenClaw alias (sonnet-4-6) not native Anthropic ID
- Only claude-3-haiku-20240307 is available on this API key; update settings.example.json
2026-03-06 10:56:54 +00:00
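The load-once, update-in-memory, save-once pattern described above can be sketched like this. `applyUpdatesOnce` and the injected `loadQueue`/`saveQueue` callbacks are hypothetical stand-ins; the real queue helpers and their signatures are not shown in the log.

```javascript
// Hypothetical sketch: apply many status updates with a single read and a single write.
// `updates` is an iterable of [jobId, patch] pairs.
function applyUpdatesOnce(loadQueue, saveQueue, updates) {
  const queue = loadQueue(); // one read for the whole collect
  const byId = new Map(queue.map((job) => [job.id, job])); // O(1) lookup per update
  for (const [id, patch] of updates) {
    const job = byId.get(id);
    if (job) Object.assign(job, patch); // mutate in memory only
  }
  saveQueue(queue); // one write at the end
  return queue;
}
```

With n results this does one read and one write instead of n of each, which is the difference the commit describes for the 4,652-job collect.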
728e0773b9 fix: sanitize Unicode surrogates in job descriptions, handle custom_id > 64 chars
2026-03-06 10:18:54 +00:00
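Both fixes in this commit can be illustrated with plain string handling. This is one possible approach, not necessarily the one committed: replacing lone surrogates with U+FFFD and shortening long ids with a prefix plus an FNV-1a hash are assumptions, as are the names `sanitizeSurrogates` and `toCustomId`. Only the 64-character limit comes from the commit message.

```javascript
// Matches a high surrogate not followed by a low one, or a low surrogate
// not preceded by a high one. Valid pairs (e.g. emoji) are left intact.
const LONE_SURROGATE =
  /[\uD800-\uDBFF](?![\uDC00-\uDFFF])|(?<![\uD800-\uDBFF])[\uDC00-\uDFFF]/g;

// Replace unpaired surrogates so the text serializes as valid UTF-8 JSON.
function sanitizeSurrogates(text) {
  return text.replace(LONE_SURROGATE, '\uFFFD');
}

// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
function fnv1aHex(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Ids within the limit pass through; longer ids keep a readable prefix
// plus a hash of the full id so distinct jobs stay distinct.
function toCustomId(jobId, maxLength = 64) {
  if (jobId.length <= maxLength) return jobId;
  const suffix = '_' + fnv1aHex(jobId);
  return jobId.slice(0, maxLength - suffix.length) + suffix;
}
```

Because a shortened id no longer equals the job id, a caller would need to keep a map from `custom_id` back to the job when matching results.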
d610060dbb feat: persistent run history logs for searcher and filter
- search_runs.json: append-only history of every searcher run
  (started_at, finished, added, seen, platforms, lookback_days)
- search_progress_last.json: snapshot of final progress state after
  each completed run — answers 'what keywords/tracks were searched?'
- filter_runs.json: append-only history of every filter batch
  (batch_id, submitted/collected timestamps, model, passed/filtered/errors)
Fixes the 'did the 90-day run complete?' ambiguity going forward
2026-03-06 10:16:06 +00:00
dbe9967713 feat: rewrite filter to use Anthropic Batch API
- Batch API = 50% cost savings vs synchronous calls
- Prompt caching on system prompt (profile + criteria shared across all jobs)
- One request per job with custom_id = job ID for result matching
- Two-phase state machine: submit → poll/collect (hourly cron safe)
- filter_state.json tracks pending batch ID between runs
- Model configurable via settings.filter.model (default: claude-sonnet-4-6)
- Telegram notifications on submit + collect
- Errors pass through — never block applications due to filter failure
- --stats flag for queue overview
2026-03-06 10:12:47 +00:00
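The two-phase flow and the one-request-per-job layout described above might be sketched as below. `decidePhase` and `buildBatchRequest` are hypothetical names, `max_tokens` is an arbitrary placeholder, and the job fields `title` and `description` are assumptions. The request shape follows the Anthropic Message Batches API as I understand it (a `custom_id` plus `params`, with `cache_control` on the system block); it should be checked against the current API reference before use.

```javascript
// Decide what a cron run should do, given the persisted state
// (parsed filter_state.json, or null if the file is absent).
function decidePhase(state, batchEnded) {
  if (!state || !state.batch_id) return 'submit'; // nothing pending
  return batchEnded ? 'collect' : 'wait'; // pending batch: collect only when finished
}

// Build one batch entry per job. The shared system prompt carries the
// cache marker so it can be reused across all requests in the batch.
function buildBatchRequest(job, model, systemPrompt) {
  return {
    custom_id: job.id, // used to match each result back to its job
    params: {
      model,
      max_tokens: 256, // placeholder value, not taken from the repository
      system: [
        { type: 'text', text: systemPrompt, cache_control: { type: 'ephemeral' } },
      ],
      messages: [
        { role: 'user', content: `${job.title}\n\n${job.description}` },
      ],
    },
  };
}
```

Because each run performs at most one phase and then exits, an hourly cron job can call the same entry point repeatedly without tracking anything beyond `filter_state.json`.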
9bf904dada feat: AI job filter — score jobs 0-10 with Claude Haiku before applying
- lib/filter.mjs: batch scoring engine (10 jobs/call, Claude Haiku)
- job_filter.mjs: standalone CLI with --dry-run and --stats flags
- Threshold configurable globally + per-search in search_config.json (filter_min_score, default 5)
- Job profiles (gtm/ae) passed as context via settings.filter.job_profiles
- Filtered jobs get status='filtered' with filter_score + filter_reason
- Filter errors pass jobs through (never block applications)
- status.mjs: added 'AI filtered' line to report
2026-03-06 10:01:15 +00:00
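The threshold lookup, the 10-jobs-per-call batching and the pass-through on error described in this first version could be sketched as follows. All function names are hypothetical, and the assumption that a score below the threshold is what triggers `status='filtered'` is inferred from the commit message rather than stated in it.

```javascript
// Per-search threshold wins over the global one; 5 is the documented default.
function resolveMinScore(search, globalConfig) {
  return search?.filter_min_score ?? globalConfig?.filter_min_score ?? 5;
}

// Split jobs into groups of `size` (10 per call in the commit message).
function chunk(items, size = 10) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) chunks.push(items.slice(i, i + size));
  return chunks;
}

// Record the score; mark the job filtered only when it falls below the threshold.
// A null result models a filter error: the job passes through unchanged.
function applyScore(job, result, minScore) {
  if (result == null) return job;
  job.filter_score = result.score;
  job.filter_reason = result.reason;
  if (result.score < minScore) job.status = 'filtered';
  return job;
}
```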