Update all docs: README, SKILL.md, SPEC.md for current architecture

- Add Telegram answer learning flow (poller + applier safety net)
- Add AI filtering, job scoring, cross-track dedup
- Add browser crash recovery, fuzzy select matching, shadow DOM details
- Update file structure with all new modules
- Update job statuses (no_modal, stuck, filtered, duplicate)
- Update scheduling info (OpenClaw crons, not crontab/PM2)
- Update roadmap

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 11:42:52 -08:00
parent 0920554dad
commit 0695d61954
3 changed files with 339 additions and 128 deletions

README.md (181 lines changed)

@@ -7,10 +7,12 @@ Built for [OpenClaw](https://openclaw.dev) but runs standalone with Node.js.
## What it does ## What it does
- **Searches** LinkedIn and Wellfound on a schedule with your configured keywords and filters - **Searches** LinkedIn and Wellfound on a schedule with your configured keywords and filters
- **Filters** jobs using Claude AI batch scoring — only applies to roles that match your profile
- **Applies** to matching jobs automatically via LinkedIn Easy Apply and Wellfound's native flow - **Applies** to matching jobs automatically via LinkedIn Easy Apply and Wellfound's native flow
- **Learns** — when it hits a question it can't answer, it messages you on Telegram, saves your reply, and never asks again - **Learns** — when it hits a question it can't answer, it asks Claude for a suggestion, messages you on Telegram, and saves your reply for all future jobs
- **Deduplicates** across runs so you never apply to the same job twice - **Deduplicates** across runs and search tracks so you never apply to the same job twice
- **Retries** failed applications up to a configurable number of times before giving up - **Retries** failed applications up to a configurable number of times before giving up
- **Recovers** from browser crashes, session timeouts, and network errors automatically
## Quick start ## Quick start
@@ -42,16 +44,21 @@ claw-apply uses [Kernel](https://kernel.sh) for stealth browser sessions that by
```bash ```bash
npm install -g @onkernel/cli npm install -g @onkernel/cli
kernel login
# Create a residential proxy # Create a residential proxy
kernel proxies create --type residential --country US --name "claw-apply-proxy" kernel proxies create --type residential --country US --name "claw-apply-proxy"
# Create authenticated browser profiles (follow prompts to log in) # Create authenticated browser profiles
kernel auth create --name "LinkedIn-YourName" kernel auth connections create --profile-name "LinkedIn-YourName" --domain linkedin.com
kernel auth create --name "WellFound-YourName" kernel auth connections create --profile-name "WellFound-YourName" --domain wellfound.com
# Complete initial login flows
kernel auth connections login <linkedin-connection-id>
kernel auth connections login <wellfound-connection-id>
``` ```
Add the profile names and proxy ID to `config/settings.json`. Add the profile names, connection IDs, and proxy ID to `config/settings.json`.
### 3. Set up Telegram notifications ### 3. Set up Telegram notifications
@@ -60,33 +67,46 @@ Add the profile names and proxy ID to `config/settings.json`.
3. Message [@userinfobot](https://t.me/userinfobot) to get your user ID 3. Message [@userinfobot](https://t.me/userinfobot) to get your user ID
4. Add it to `settings.json` -> `notifications.telegram_user_id` 4. Add it to `settings.json` -> `notifications.telegram_user_id`
### 4. Verify setup ### 4. Create .env
```bash ```bash
KERNEL_API_KEY=your_key node setup.mjs echo "KERNEL_API_KEY=your_kernel_api_key" > .env
echo "ANTHROPIC_API_KEY=your_anthropic_api_key" >> .env
``` ```
This validates your config, tests LinkedIn and Wellfound logins, and sends a test Telegram message. The `.env` file is gitignored. `ANTHROPIC_API_KEY` is optional but enables AI keyword generation and AI-suggested answers.
### 5. Run ### 5. Verify setup
```bash ```bash
# Search for jobs node setup.mjs
KERNEL_API_KEY=your_key node job_searcher.mjs
# Preview what's in the queue before applying
KERNEL_API_KEY=your_key node job_applier.mjs --preview
# Apply to queued jobs
KERNEL_API_KEY=your_key node job_applier.mjs
``` ```
For automated runs, set up cron or use OpenClaw's scheduler: Validates config, tests logins, and sends a test Telegram message.
### 6. Run
```bash
node job_searcher.mjs # search now
node job_filter.mjs # AI filter + score jobs
node job_applier.mjs --preview # preview queue without applying
node job_applier.mjs # apply now
node telegram_poller.mjs # process Telegram answer replies
node status.mjs # show queue + run status
``` ```
Search: 0 * * * * (hourly)
Apply: 0 */6 * * * (every 6 hours) ### 7. Schedule (OpenClaw crons)
```
Scheduling is managed via OpenClaw cron jobs:
| Job | Schedule | Description |
|-----|----------|-------------|
| Searcher | `0 */12 * * *` | Search every 12 hours |
| Filter | `30 * * * *` | AI filter every hour at :30 |
| Applier | disabled by default | Enable when ready |
| Telegram Poller | `* * * * *` | Process answer replies every minute |
The lockfile mechanism ensures only one instance of each agent runs at a time.
## How it works ## How it works
@@ -94,32 +114,48 @@ Apply: 0 */6 * * * (every 6 hours)
1. Runs your configured keyword searches on LinkedIn and Wellfound 1. Runs your configured keyword searches on LinkedIn and Wellfound
2. Paginates through results (LinkedIn) and infinite-scrolls (Wellfound) 2. Paginates through results (LinkedIn) and infinite-scrolls (Wellfound)
3. Filters out excluded keywords and companies 3. Classifies each job: Easy Apply, external ATS (Greenhouse, Lever, etc.), or recruiter-only
4. Deduplicates against the existing queue by job ID and URL 4. Filters out excluded keywords and companies
5. Saves new jobs to `data/jobs_queue.json` with status `new` 5. Deduplicates against the existing queue by job ID and URL
6. Sends a Telegram summary 6. Saves new jobs to `data/jobs_queue.json` with status `new`
7. Sends a Telegram summary
### Filter flow
1. Submits jobs to Claude AI via Anthropic Batch API (50% cost savings)
2. Scores each job 1-10 based on match to your profile and search track
3. Jobs below the minimum score (default 5) are marked `filtered`
4. Cross-track deduplication keeps the highest-scoring copy
5. Two-phase design: submit batch → collect results (designed for cron)
### Apply flow ### Apply flow
1. Picks up all `new` and `needs_answer` jobs from the queue (up to `max_applications_per_run`) 1. Processes Telegram replies first — saves new answers, flips answered jobs back to `new`
2. Opens a stealth browser session per platform 2. Picks up all `new` and `needs_answer` jobs, sorted by priority (Easy Apply first)
3. For each job: 3. Reloads `answers.json` before each job (picks up Telegram replies mid-run)
- **LinkedIn Easy Apply**: navigates to job, clicks Easy Apply, fills the multi-step modal, submits 4. Opens a stealth browser session per platform (LinkedIn, Wellfound, external)
5. For each job:
- **LinkedIn Easy Apply**: navigates to job, clicks Easy Apply, fills the multi-step modal (Next → Review → Submit), handles post-submit confirmation dialogs
- **Wellfound**: navigates to job, clicks Apply, fills the form, submits - **Wellfound**: navigates to job, clicks Apply, fills the form, submits
- Detects and skips recruiter-only listings, external ATS jobs, and honeypot questions - Detects and skips recruiter-only listings, external ATS jobs, and honeypot questions
4. On unknown required fields, messages you on Telegram and moves on - Selects resume from previously uploaded resumes (radio buttons) or uploads via file input
5. Failed jobs are retried on the next run (up to `max_retries`, default 2) 6. On unknown required fields: asks Claude for a suggested answer, messages you on Telegram with the question + AI suggestion, moves on
6. Sends a summary with counts: applied, failed, needs answer, skipped 7. Failed jobs are retried on the next run (up to `max_retries`, default 2)
8. Browser crash recovery: detects dead sessions and creates fresh browsers automatically
9. Sends a summary with counts: applied, failed, needs answer, skipped
### Self-learning answers ### Self-learning answers
When the applier encounters a form question it doesn't know how to answer: When the applier encounters a form question it doesn't know how to answer:
1. Marks the job as `needs_answer` with the question text 1. Claude generates a suggested answer based on your profile and resume
2. Sends you a Telegram message with the question 2. Telegram message sent with the question, options (if select), and AI suggestion
3. You reply with the answer 3. You reply with your answer, or reply "ACCEPT" to use the AI suggestion
4. The answer is saved to `config/answers.json` as a pattern match 4. The Telegram poller (cron, every minute) saves your answer to `answers.json` and flips the job back to `new`
5. Next run, it retries the job and fills in the answer automatically 5. Next applier run retries the job with the saved answer
6. **Every future job** with the same question is answered automatically
Over time, all common questions get answered and the applier runs fully autonomously.
Patterns support regex: Patterns support regex:
@@ -137,8 +173,9 @@ Patterns support regex:
| Key | Default | Description | | Key | Default | Description |
|-----|---------|-------------| |-----|---------|-------------|
| `max_applications_per_run` | no limit | Cap applications per run (optional, set to avoid rate limits) | | `max_applications_per_run` | no limit | Cap applications per run |
| `max_retries` | `2` | Times to retry a failed application before marking it permanently failed | | `max_retries` | `2` | Times to retry a failed application |
| `enabled_apply_types` | `["easy_apply"]` | Which apply types to process |
| `browser.provider` | `"kernel"` | `"kernel"` for stealth browsers, `"local"` for local Playwright | | `browser.provider` | `"kernel"` | `"kernel"` for stealth browsers, `"local"` for local Playwright |
### Search filters ### Search filters
@@ -156,23 +193,30 @@ Patterns support regex:
``` ```
claw-apply/ claw-apply/
├── job_searcher.mjs Search agent ├── job_searcher.mjs Search agent
├── job_filter.mjs AI filter + scoring agent
├── job_applier.mjs Apply agent ├── job_applier.mjs Apply agent
├── telegram_poller.mjs Telegram answer reply processor
├── setup.mjs Setup wizard ├── setup.mjs Setup wizard
├── status.mjs Queue status report ├── status.mjs Queue status report
├── lib/ ├── lib/
│ ├── constants.mjs Shared constants and defaults │ ├── constants.mjs Shared constants and defaults
│ ├── browser.mjs Kernel/Playwright browser factory │ ├── browser.mjs Kernel/Playwright browser factory
│ ├── session.mjs Kernel Managed Auth session refresh │ ├── session.mjs Kernel Managed Auth session refresh
│ ├── form_filler.mjs Generic form filling with pattern matching │ ├── env.mjs .env loader (no dotenv dependency)
│ ├── keywords.mjs AI-generated search keywords via Claude │ ├── form_filler.mjs Form filling with pattern matching
│ ├── ai_answer.mjs AI answer generation via Claude
│ ├── filter.mjs AI job scoring via Anthropic Batch API
│ ├── keywords.mjs AI-generated search keywords
│ ├── linkedin.mjs LinkedIn search + job classification │ ├── linkedin.mjs LinkedIn search + job classification
│ ├── wellfound.mjs Wellfound search │ ├── wellfound.mjs Wellfound search
│ ├── queue.mjs Job queue and config management │ ├── queue.mjs Job queue with atomic writes
│ ├── lock.mjs Process lock to prevent parallel runs │ ├── lock.mjs PID-based process lock
│ ├── notify.mjs Telegram notifications with rate limiting │ ├── notify.mjs Telegram Bot API (send, getUpdates, reply)
│ ├── search_progress.mjs Per-platform search resume tracking
│ ├── telegram_answers.mjs Telegram reply → answers.json processing
│ └── apply/ │ └── apply/
│ ├── index.mjs Apply handler registry │ ├── index.mjs Apply handler registry + status normalization
│ ├── easy_apply.mjs LinkedIn Easy Apply │ ├── easy_apply.mjs LinkedIn Easy Apply (multi-step modal)
│ ├── wellfound.mjs Wellfound apply │ ├── wellfound.mjs Wellfound apply
│ ├── greenhouse.mjs Greenhouse ATS (stub) │ ├── greenhouse.mjs Greenhouse ATS (stub)
│ ├── lever.mjs Lever ATS (stub) │ ├── lever.mjs Lever ATS (stub)
@@ -187,7 +231,8 @@ claw-apply/
│ └── settings.json Your settings (gitignored) │ └── settings.json Your settings (gitignored)
└── data/ └── data/
├── jobs_queue.json Job queue (auto-managed) ├── jobs_queue.json Job queue (auto-managed)
└── applications_log.json Application history (auto-managed) ├── applications_log.json Application history (auto-managed)
└── telegram_offset.json Telegram polling offset (auto-managed)
``` ```
## Job statuses ## Job statuses
@@ -199,27 +244,47 @@ claw-apply/
| `needs_answer` | Blocked on unknown question, waiting for your reply | | `needs_answer` | Blocked on unknown question, waiting for your reply |
| `failed` | Failed after max retries | | `failed` | Failed after max retries |
| `already_applied` | Duplicate detected, previously applied | | `already_applied` | Duplicate detected, previously applied |
| `filtered` | Below AI score threshold |
| `duplicate` | Cross-track duplicate (lower-scoring copy) |
| `skipped_honeypot` | Honeypot question detected | | `skipped_honeypot` | Honeypot question detected |
| `skipped_recruiter_only` | LinkedIn recruiter-only listing | | `skipped_recruiter_only` | LinkedIn recruiter-only listing |
| `skipped_external_unsupported` | External ATS (Greenhouse, Lever, etc. — stubs ready) | | `skipped_external_unsupported` | External ATS (Greenhouse, Lever, etc.) |
| `skipped_easy_apply_unsupported` | LinkedIn job without Easy Apply button | | `skipped_easy_apply_unsupported` | LinkedIn job without Easy Apply button |
| `skipped_no_apply` | No apply button, modal, or submit found on page | | `skipped_no_apply` | No apply button found on page |
| `stuck` | Modal progress stalled | | `no_modal` | Easy Apply button found but modal didn't open |
| `incomplete` | Ran out of modal steps without submitting | | `stuck` | Modal progress stalled after repeated clicks |
| `incomplete` | Modal flow didn't reach submit |
## ATS support
| Platform | Status |
|---|---|
| LinkedIn Easy Apply | Full |
| Wellfound | Full |
| Greenhouse | Stub |
| Lever | Stub |
| Workday | Stub |
| Ashby | Stub |
| Jobvite | Stub |
External ATS jobs are queued and classified — stubs will be promoted to full implementations based on usage data.
## Roadmap ## Roadmap
- [x] LinkedIn Easy Apply - [x] LinkedIn Easy Apply (multi-step modal)
- [x] Wellfound apply - [x] Wellfound apply
- [x] Kernel stealth browsers + residential proxy - [x] Kernel stealth browsers + residential proxy
- [x] Self-learning answer bank - [x] AI job filtering via Anthropic Batch API
- [x] Retry logic for transient failures - [x] Self-learning answer bank with Telegram Q&A loop
- [x] AI-suggested answers via Claude
- [x] Telegram answer polling (instant save + applier safety net)
- [x] Browser crash recovery
- [x] Retry logic with configurable max retries
- [x] Preview mode (`--preview`) - [x] Preview mode (`--preview`)
- [x] Configurable application caps and retry limits - [x] Configurable application caps and retry limits
- [ ] Indeed support - [ ] External ATS support (Greenhouse, Lever, Workday, Ashby, Jobvite)
- [ ] External ATS support (Greenhouse, Lever, Workday, Ashby, Jobvite — stubs ready)
- [ ] Job scoring and ranking
- [ ] Per-job cover letter generation via LLM - [ ] Per-job cover letter generation via LLM
- [ ] Indeed support
## License ## License

SKILL.md

@@ -1,20 +1,20 @@
--- ---
name: claw-apply name: claw-apply
description: Automated job search and application for LinkedIn and Wellfound. Searches for matching roles every 12 hours, applies automatically every 6 hours using Playwright + Kernel stealth browsers. Handles LinkedIn Easy Apply multi-step modals and Wellfound applications. Self-learning — asks you via Telegram when it hits an unknown question, saves your answer, and never asks again. Retries failed applications automatically. Preview mode lets you review the queue before applying. description: Automated job search and application for LinkedIn and Wellfound. Searches for matching roles every 12 hours, AI-filters and scores them, applies automatically using Playwright + Kernel stealth browsers. Handles LinkedIn Easy Apply multi-step modals and Wellfound applications. Self-learning — asks you via Telegram when it hits an unknown question, suggests an AI answer, saves your reply, and never asks again. Recovers from browser crashes and retries failed applications automatically.
--- ---
# claw-apply # claw-apply
Automated job search and application. Finds matching roles on LinkedIn and Wellfound, applies automatically, and learns from every unknown question. Automated job search and application. Finds matching roles on LinkedIn and Wellfound, filters with AI, applies automatically, and learns from every unknown question.
## Requirements ## Requirements
- Node.js 18+ - Node.js 18+
- [Kernel](https://kernel.sh) account — stealth browsers + bot detection bypass (required) - [Kernel](https://kernel.sh) account — stealth browsers + bot detection bypass (required)
- Kernel CLI: `npm install -g @onkernel/cli` — see [kernel/skills](https://github.com/kernel/skills) for CLI + auth guidance - Kernel CLI: `npm install -g @onkernel/cli` — see [kernel/skills](https://github.com/kernel/skills) for CLI + auth guidance
- Telegram bot for notifications ([BotFather](https://t.me/BotFather)) - Telegram bot for notifications and interactive Q&A ([BotFather](https://t.me/BotFather))
- Anthropic API key (optional — enables AI-enhanced keyword generation) - Anthropic API key (optional — enables AI filtering, keyword generation, and suggested answers)
- OpenClaw (optional — enables auto-scheduling via `setup.mjs`) - OpenClaw (optional — enables auto-scheduling via crons)
> **Note:** Playwright is installed automatically via `npm install` as a library for browser connectivity. You don't need to install it globally or manage browsers yourself — Kernel handles all browser execution. > **Note:** Playwright is installed automatically via `npm install` as a library for browser connectivity. You don't need to install it globally or manage browsers yourself — Kernel handles all browser execution.
@@ -77,7 +77,7 @@ Create a `.env` file in the project root (gitignored — never commit this):
```bash ```bash
KERNEL_API_KEY=your_kernel_api_key KERNEL_API_KEY=your_kernel_api_key
ANTHROPIC_API_KEY=your_anthropic_api_key # optional, for AI keywords ANTHROPIC_API_KEY=your_anthropic_api_key # optional, for AI features
``` ```
### 5. Verify ### 5. Verify
@@ -94,57 +94,68 @@ Setup will:
### 6. Schedule with OpenClaw crons ### 6. Schedule with OpenClaw crons
Scheduling is managed via OpenClaw cron jobs (not system crontab). Run `setup.mjs` to register them, or add manually: Scheduling is managed via OpenClaw cron jobs (not system crontab):
- **Searcher**: `0 */12 * * *` America/Los_Angeles — every 12 hours | Job | Schedule | Description |
- **Filter**: `30 * * * *` America/Los_Angeles — every hour at :30 |-----|----------|-------------|
- **Applier**: disabled by default — enable manually when ready | Searcher | `0 */12 * * *` America/Los_Angeles | Search every 12 hours |
| Filter | `30 * * * *` America/Los_Angeles | AI filter every hour at :30 |
| Applier | disabled by default | Enable when ready to auto-apply |
| Telegram Poller | `* * * * *` America/Los_Angeles | Process answer replies every minute |
Do not use system crontab (`crontab -e`) — OpenClaw crons provide Telegram delivery, isolated sessions, and proper logging. The lockfile mechanism ensures only one instance of each agent runs at a time.
The lockfile mechanism ensures only one instance runs at a time — if a searcher is already running, the cron invocation exits immediately.
### 7. Run manually ### 7. Run manually
```bash ```bash
node job_searcher.mjs # search now node job_searcher.mjs # search now
node job_filter.mjs # AI filter + score jobs
node job_applier.mjs --preview # preview queue without applying node job_applier.mjs --preview # preview queue without applying
node job_applier.mjs # apply now node job_applier.mjs # apply now
node telegram_poller.mjs # process Telegram answer replies
node status.mjs # show queue + run status node status.mjs # show queue + run status
``` ```
## How it works ## How it works
**Search** — runs your keyword searches on LinkedIn and Wellfound, paginates through results, inline-classifies each job (Easy Apply vs external ATS), filters exclusions, deduplicates, and queues new jobs. First run searches 90 days back; subsequent runs search 2 days. **Search** — runs your keyword searches on LinkedIn and Wellfound, paginates through results, classifies each job (Easy Apply vs external ATS), filters exclusions, deduplicates, and queues new jobs. First run searches 90 days back; subsequent runs search 2 days.
**Apply** — picks up queued jobs sorted by priority (Easy Apply first), opens stealth browser sessions, fills forms using your profile + learned answers, and submits. Auto-refreshes Kernel auth sessions if login expires. Retries failed jobs (default 2 retries). **Filter** — submits jobs to Claude AI via Anthropic Batch API for scoring (1-10). Jobs below the threshold are filtered out. Cross-track deduplication keeps the highest-scoring copy. Two-phase design for cron compatibility.
**Learn** — on unknown questions, messages you on Telegram. You reply, the answer is saved to `answers.json` with regex pattern matching, and the job is retried next run. **Apply** — picks up queued jobs sorted by priority (Easy Apply first), opens stealth browser sessions, fills forms using your profile + learned answers, and submits. Processes Telegram replies at start of each run. Reloads answers.json before each job. Auto-recovers from browser crashes. Retries failed jobs (default 2 retries). Per-job timeout of 10 minutes.
**Lockfile** — prevents parallel runs. If searcher is running, a second invocation exits immediately. **Learn** — on unknown questions, Claude suggests an answer and you're messaged on Telegram. Reply with your answer or "ACCEPT" the AI suggestion. The Telegram poller saves it to `answers.json` instantly and the job is retried next run. Over time, all questions get answered and the system runs fully autonomously.
**Lockfile** — prevents parallel runs. If an agent is already running, a second invocation exits immediately.
## File structure ## File structure
``` ```
claw-apply/ claw-apply/
├── job_searcher.mjs Search agent ├── job_searcher.mjs Search agent
├── job_filter.mjs AI filter + scoring agent
├── job_applier.mjs Apply agent ├── job_applier.mjs Apply agent
├── setup.mjs Setup wizard + cron registration ├── telegram_poller.mjs Telegram answer reply processor
├── setup.mjs Setup wizard
├── status.mjs Queue + run status report ├── status.mjs Queue + run status report
├── lib/ ├── lib/
│ ├── browser.mjs Kernel stealth browser factory │ ├── browser.mjs Kernel stealth browser factory
│ ├── session.mjs Auth session refresh via Kernel API │ ├── session.mjs Auth session refresh via Kernel API
│ ├── linkedin.mjs LinkedIn search + Easy Apply │ ├── env.mjs .env loader
│ ├── linkedin.mjs LinkedIn search + job classification
│ ├── wellfound.mjs Wellfound search + apply │ ├── wellfound.mjs Wellfound search + apply
│ ├── form_filler.mjs Form filling with pattern matching │ ├── form_filler.mjs Form filling with pattern matching
│ ├── queue.mjs Job queue + config management │ ├── ai_answer.mjs AI answer generation via Claude
│ ├── filter.mjs AI job scoring via Anthropic Batch API
│ ├── keywords.mjs AI-enhanced keyword generation │ ├── keywords.mjs AI-enhanced keyword generation
│ ├── queue.mjs Job queue with atomic writes
│ ├── lock.mjs PID lockfile + graceful shutdown │ ├── lock.mjs PID lockfile + graceful shutdown
│ ├── notify.mjs Telegram notifications │ ├── notify.mjs Telegram Bot API (send, getUpdates, reply)
│ ├── telegram_answers.mjs Telegram reply → answers.json processing
│ ├── search_progress.mjs Per-platform search resume tracking │ ├── search_progress.mjs Per-platform search resume tracking
│ ├── constants.mjs Shared constants + ATS patterns │ ├── constants.mjs Shared constants + ATS patterns
│ └── apply/ │ └── apply/
│ ├── index.mjs Handler registry │ ├── index.mjs Handler registry + status normalization
│ ├── easy_apply.mjs LinkedIn Easy Apply (full) │ ├── easy_apply.mjs LinkedIn Easy Apply (full)
│ ├── wellfound.mjs Wellfound apply (full) │ ├── wellfound.mjs Wellfound apply (full)
│ ├── greenhouse.mjs Greenhouse (stub) │ ├── greenhouse.mjs Greenhouse (stub)
@@ -160,7 +171,7 @@ claw-apply/
## answers.json — self-learning Q&A ## answers.json — self-learning Q&A
When the applier can't answer a question, it messages you on Telegram. Your reply is saved and reused forever: When the applier can't answer a question, it asks Claude for a suggestion and messages you on Telegram. Your reply is saved and reused forever:
```json ```json
[ [
@@ -176,12 +187,12 @@ Patterns are matched case-insensitively and support regex. First match wins.
| Platform | Status | | Platform | Status |
|---|---| |---|---|
| LinkedIn Easy Apply | Full | | LinkedIn Easy Apply | Full |
| Wellfound | Full | | Wellfound | Full |
| Greenhouse | 🚧 Stub | | Greenhouse | Stub |
| Lever | 🚧 Stub | | Lever | Stub |
| Workday | 🚧 Stub | | Workday | Stub |
| Ashby | 🚧 Stub | | Ashby | Stub |
| Jobvite | 🚧 Stub | | Jobvite | Stub |
External ATS jobs are queued and classified — stubs will be promoted to full implementations based on usage data. External ATS jobs are queued and classified — stubs will be promoted to full implementations based on usage data.

SPEC.md (215 lines changed)

@@ -1,34 +1,60 @@
# claw-apply — Technical Spec # claw-apply — Technical Spec
Automated job search and application engine. Searches LinkedIn and Wellfound for matching roles, applies automatically using Playwright + Kernel stealth browsers, and self-learns from unknown questions. Automated job search and application engine. Searches LinkedIn and Wellfound for matching roles, AI-filters and scores them, applies automatically using Playwright + Kernel stealth browsers, and self-learns from unknown questions via Telegram.
--- ---
## Architecture ## Architecture
### Two agents, shared queue ### Four agents, shared queue
**JobSearcher** (`job_searcher.mjs`) **JobSearcher** (`job_searcher.mjs`)
- Runs on schedule (default: hourly) - Runs on schedule (default: every 12 hours)
- Searches configured platforms with configured keywords - Searches configured platforms with configured keywords
- LinkedIn: paginates through up to 40 pages of results - LinkedIn: paginates through up to 40 pages of results
- Wellfound: infinite-scrolls up to 10 times to load all results - Wellfound: infinite-scrolls up to 10 times to load all results
- Classifies each job: Easy Apply, external ATS (with platform detection), recruiter-only
- Filters out excluded roles/companies - Filters out excluded roles/companies
- Deduplicates by job ID and URL against existing queue - Deduplicates by job ID and URL against existing queue
- Cross-track duplicate IDs get composite IDs (`{id}_{track}`)
- Writes new jobs to `jobs_queue.json` with status `new` - Writes new jobs to `jobs_queue.json` with status `new`
- Sends Telegram summary - Sends Telegram summary
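The deduplication and composite-ID rules in the JobSearcher list can be sketched roughly as follows. This is an illustration only, not the code in `job_searcher.mjs`: the helper name `enqueueNewJobs` and the job fields `id`, `url`, `track`, and `status` are assumptions, as is the reading that a same-track match is skipped while a cross-track ID collision gets the `{id}_{track}` form.

```javascript
// Hypothetical sketch: field names and helper name are assumptions.
function enqueueNewJobs(queue, found) {
  const result = [...queue];
  for (const job of found) {
    const compositeId = `${job.id}_${job.track}`;
    // Same track plus same ID (plain or composite) or same URL: true duplicate.
    const sameTrackDup = result.some(
      (q) =>
        q.track === job.track &&
        (q.id === job.id || q.id === compositeId || q.url === job.url)
    );
    if (sameTrackDup) continue;
    // Same ID already queued under another track: store under a composite ID.
    const idTaken = result.some((q) => q.id === job.id);
    result.push({ ...job, id: idTaken ? compositeId : job.id, status: 'new' });
  }
  return result;
}
```

Keeping the cross-track copy in the queue (rather than dropping it) is what lets the filter stage later compare scores per track and keep the best one.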
**JobFilter** (`job_filter.mjs`)
- Runs on schedule (default: every hour at :30)
- Two-phase: submit batch → collect results (designed for cron)
- Submits unscored jobs to Claude AI via Anthropic Batch API (50% cost savings)
- One batch per search track for prompt caching efficiency
- Scores each job 1-10 based on match to profile and search track
- Jobs below threshold (default 5) marked `filtered`
- Cross-track deduplication: groups by URL, keeps highest score
- State persisted in `data/filter_state.json` between phases
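The threshold and cross-track deduplication steps can be illustrated with a small pure function. It is a sketch under assumptions: `applyScores` is a hypothetical name, jobs are assumed to carry `url`, `score`, and `status`, the threshold is assumed to be applied before deduplication, and ties are assumed to go to the first job seen.

```javascript
// Hypothetical sketch of scoring post-processing; not the project's filter.mjs.
function applyScores(jobs, minScore = 5) {
  // Copy each job and mark anything below the threshold as filtered.
  const out = jobs.map((job) => ({
    ...job,
    status: job.score < minScore ? 'filtered' : job.status,
  }));

  // Among the survivors, remember the highest-scoring job per URL.
  const bestByUrl = new Map();
  for (const job of out) {
    if (job.status === 'filtered') continue;
    const best = bestByUrl.get(job.url);
    if (!best || job.score > best.score) bestByUrl.set(job.url, job);
  }

  // Every other survivor with the same URL is a cross-track duplicate.
  for (const job of out) {
    if (job.status !== 'filtered' && bestByUrl.get(job.url) !== job) {
      job.status = 'duplicate';
    }
  }
  return out;
}
```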
**JobApplier** (`job_applier.mjs`) **JobApplier** (`job_applier.mjs`)
- Runs on schedule (default: every 6 hours) - Runs on schedule (disabled by default until ready)
- Reads queue for status `new` + `needs_answer` - Processes Telegram replies at start (safety net for answer learning)
- Respects `max_applications_per_run` cap - Reloads `answers.json` before each job (picks up mid-run Telegram replies)
- LinkedIn: navigates directly to job URL, detects apply type (Easy Apply / external / recruiter-only), fills multi-step modal - Reads queue for status `new` + `needs_answer`, sorted by priority
- Wellfound: navigates to job, fills form, submits - Respects `max_applications_per_run` cap and `enabled_apply_types` filter
- Detects honeypot questions and skips - Groups jobs by platform to share browser sessions
- On unknown required fields: messages user via Telegram, marks `needs_answer` - LinkedIn Easy Apply: multi-step modal with shadow DOM handling
- Wellfound: form fill and submit
- On unknown required fields: generates AI answer, messages user via Telegram, marks `needs_answer`
- Browser crash recovery: detects dead page, creates fresh browser session
- Per-job timeout: 10 minutes. Overall run timeout: 45 minutes
- On error: retries up to `max_retries` (default 2) before marking `failed` - On error: retries up to `max_retries` (default 2) before marking `failed`
- Sends summary with granular skip reasons - Sends summary with granular skip reasons
**TelegramPoller** (`telegram_poller.mjs`)
- Runs every minute via OpenClaw cron
- Polls Telegram `getUpdates` for replies to question messages
- Matches replies via `reply_to_message_id` stored on jobs
- "ACCEPT" → use AI-suggested answer. Anything else → use reply text
- Saves answer to `answers.json` (reused for ALL future jobs)
- Flips job back to `new` for retry
- Sends confirmation reply on Telegram
- Lightweight: single HTTP call, exits immediately if no updates
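How a single reply might be turned into a saved answer can be sketched as below. The `message.reply_to_message.message_id` path follows the Telegram Bot API update shape; everything else is assumed for illustration: the helper names, the job fields `question_message_id`, `pending_question`, and `ai_suggestion`, the `{ pattern, answer }` entry shape, and the choice to treat "ACCEPT" case-insensitively.

```javascript
// Escape regex metacharacters so a literal question can be stored as a pattern.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical sketch: mutates `jobs` and `answers`, returns the saved answer or null.
function applyReply(update, jobs, answers) {
  const msg = update.message;
  const repliedTo = msg && msg.reply_to_message && msg.reply_to_message.message_id;
  if (!repliedTo) return null; // not a reply to anything

  // Find the job whose question message this reply points at.
  const job = jobs.find(
    (j) => j.status === 'needs_answer' && j.question_message_id === repliedTo
  );
  if (!job) return null;

  const text = (msg.text || '').trim();
  // "ACCEPT" takes the AI suggestion; any other text is the answer itself.
  const answer = text.toUpperCase() === 'ACCEPT' ? job.ai_suggestion : text;
  if (!answer) return null;

  answers.push({ pattern: escapeRegex(job.pending_question), answer });
  job.status = 'new'; // picked up again on the next applier run
  return answer;
}
```

Escaping the question before storing it matters because patterns are treated as regex, so a literal `?` or `(` in a question would otherwise change what the pattern matches.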
**Preview mode** (`--preview`): shows queued jobs without applying. **Preview mode** (`--preview`): shows queued jobs without applying.
### Shared modules ### Shared modules
@@ -37,11 +63,66 @@ Automated job search and application engine. Searches LinkedIn and Wellfound for
|--------|---------------|
| `lib/constants.mjs` | All timeouts, selectors, defaults — no magic numbers in code |
| `lib/browser.mjs` | Browser factory — Kernel stealth (default) with local Playwright fallback |
| `lib/session.mjs` | Kernel Managed Auth session refresh |
| `lib/env.mjs` | .env loader (no dotenv dependency) |
| `lib/form_filler.mjs` | Form filling — custom answers, built-in profile matching, fuzzy select matching |
| `lib/ai_answer.mjs` | AI answer generation via Claude (profile + resume context) |
| `lib/filter.mjs` | AI job scoring via Anthropic Batch API |
| `lib/keywords.mjs` | AI-generated search keywords via Claude |
| `lib/queue.mjs` | Queue CRUD with in-memory caching, atomic writes, config validation |
| `lib/notify.mjs` | Telegram Bot API — send, getUpdates, reply (with rate limiting) |
| `lib/telegram_answers.mjs` | Telegram reply processing — matches to jobs, saves answers |
| `lib/search_progress.mjs` | Per-platform search resume tracking |
| `lib/lock.mjs` | PID-based lockfile with graceful shutdown |
| `lib/apply/index.mjs` | Apply handler registry with status normalization |
| `lib/apply/easy_apply.mjs` | LinkedIn Easy Apply — shadow DOM, multi-step modal, post-submit detection |
---
## LinkedIn Easy Apply — Technical Details
LinkedIn renders the Easy Apply modal inside **shadow DOM**. This means:
- `document.querySelector()` inside `page.evaluate()` **cannot** find modal elements
- `page.$()` and ElementHandle methods **pierce** shadow DOM and work correctly
- All modal operations use ElementHandle-based operations, never `evaluate` with `document.querySelector`
### Button detection
`findModalButton()` uses three strategies in order:
1. CSS selector via `page.$()` — aria-label exact match (pierces shadow DOM)
2. CSS selector via `page.$()` — aria-label substring match
3. `modal.$$('button')` + `btn.evaluate()` — text content matching
Check order per step: **Next → Review → Submit** (submit only when no forward nav exists).
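The check order can be sketched as a small pure function. This is an illustration only: `chooseModalAction` and the label strings are hypothetical and are not taken from `lib/apply/easy_apply.mjs`.

```javascript
// Hypothetical helper: decide which modal button to act on for the current step.
// `labels` holds the visible button labels (illustrative strings, not
// necessarily LinkedIn's exact wording).
const CHECK_ORDER = ['next', 'review', 'submit'];

function chooseModalAction(labels) {
  const normalized = labels.map((label) => label.trim().toLowerCase());
  for (const action of CHECK_ORDER) {
    // First match wins, so "submit" is reached only when no forward nav exists.
    if (normalized.some((label) => label.includes(action))) {
      return action;
    }
  }
  return null; // nothing actionable on this step
}
```

Ordering the checks this way keeps the applier from submitting early on a step that still offers forward navigation.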
### Modal flow
```
Easy Apply click → [fill fields → Next] × N → Review → Submit application
```
- Progress tracked via `<progress>` element (not `[role="progressbar"]`)
- Stuck detection: re-reads progress value after clicking Next, triggers after 3 unchanged clicks
- Submit verification: `waitForSelector(state: 'detached', timeout: 8s)` — event-driven, not fixed sleep
- Post-submit: checks for success text, absent Submit button, or validation errors
- Multiple `[role="dialog"]` elements: `findApplyModal()` identifies the apply modal and tags it with `data-claw-apply-modal`
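Stuck detection as described above could look roughly like the following. `createStuckDetector` and `recordClick` are hypothetical names; the threshold of 3 comes from the list above, and reading the `<progress>` value is left to the caller.

```javascript
// Hypothetical tracker: counts consecutive Next clicks that leave the
// <progress> value unchanged.
function createStuckDetector(threshold = 3) {
  let unchangedClicks = 0;
  return {
    // `before` / `after`: progress value read before and after one Next click.
    // Returns true once `threshold` consecutive clicks made no progress.
    recordClick(before, after) {
      unchangedClicks = after === before ? unchangedClicks + 1 : 0;
      return unchangedClicks >= threshold;
    },
  };
}
```

Any click that moves the progress value resets the counter, so a slow but advancing modal is not flagged as `stuck`.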
### Form filling
- Labels found by walking up ancestor DOM (LinkedIn doesn't use `label[for="id"]`)
- Label deduplication for doubled text (e.g. "Phone country codePhone country code")
- Resume selection: selects first radio if none checked, falls back to file upload
- Select matching: `selectOptionFuzzy()` — exact → case-insensitive → substring → value
- Phone always overwritten (LinkedIn pre-fills wrong numbers)
- EEO/voluntary fields auto-select "Prefer not to disclose"
- Honeypot detection: questions containing "digit code", "secret word", etc.
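Two of the rules above, label deduplication and the fuzzy select order, are sketched below on plain data. This is a guess at the shape of the logic, not the code in `lib/form_filler.mjs`: `dedupeLabel` and `pickOptionFuzzy` are hypothetical names, options are assumed to be `{ label, value }` objects, and "substring" is assumed to mean the option label contains the answer.

```javascript
// Collapse a label whose text is doubled with no separator,
// e.g. "Phone country codePhone country code".
function dedupeLabel(text) {
  const trimmed = text.trim();
  const half = trimmed.length / 2;
  if (half > 0 && Number.isInteger(half) &&
      trimmed.slice(0, half) === trimmed.slice(half)) {
    return trimmed.slice(0, half);
  }
  return trimmed;
}

// Try each matching strategy in order: exact -> case-insensitive -> substring -> value.
function pickOptionFuzzy(options, answer) {
  const wanted = answer.trim();
  if (!wanted) return null; // an empty answer would match every label as a substring
  const lower = wanted.toLowerCase();
  const strategies = [
    (o) => o.label.trim() === wanted,                // 1. exact label
    (o) => o.label.trim().toLowerCase() === lower,   // 2. case-insensitive label
    (o) => o.label.toLowerCase().includes(lower),    // 3. label contains the answer
    (o) => String(o.value).toLowerCase() === lower,  // 4. option value
  ];
  for (const matches of strategies) {
    const hit = options.find(matches);
    if (hit) return hit;
  }
  return null;
}
```

Running the strategies strictly in order means a looser rule is consulted only when every stricter rule found nothing.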
### Dismiss flow
Always discards — never leaves drafts:
1. Click Dismiss/Close button or press Escape
2. Wait for Discard confirmation dialog
3. Click Discard (by `data-test-dialog-primary-btn` or text scan scoped to dialogs)
---
@@ -103,6 +184,7 @@ All user config is gitignored. Example templates are committed.
{
  "max_applications_per_run": 50,
  "max_retries": 2,
  "enabled_apply_types": ["easy_apply"],
  "notifications": {
    "telegram_user_id": "YOUR_TELEGRAM_USER_ID",
    "bot_token": "YOUR_TELEGRAM_BOT_TOKEN"
@@ -112,6 +194,10 @@ All user config is gitignored. Example templates are committed.
    "profiles": {
      "linkedin": "LinkedIn-YourName",
      "wellfound": "WellFound-YourName"
    },
    "connection_ids": {
      "linkedin": "YOUR_LINKEDIN_CONNECTION_ID",
      "wellfound": "YOUR_WELLFOUND_CONNECTION_ID"
    }
  },
  "browser": {
@@ -145,6 +231,7 @@ Flat array of pattern-answer pairs. Patterns are matched case-insensitively and
  "id": "li_4381658809",
  "platform": "linkedin",
  "track": "ae",
  "apply_type": "easy_apply",
  "title": "Senior Account Executive",
  "company": "Acme Corp",
  "url": "https://linkedin.com/jobs/view/4381658809/",
@@ -153,6 +240,8 @@ Flat array of pattern-answer pairs. Patterns are matched case-insensitively and
  "status_updated_at": "2026-03-05T22:00:00Z",
  "retry_count": 0,
  "pending_question": null,
  "ai_suggested_answer": null,
  "telegram_message_id": null,
  "applied_at": null,
  "notes": null
}
@@ -165,28 +254,50 @@ Flat array of pattern-answer pairs. Patterns are matched case-insensitively and
|--------|---------|-------------| |--------|---------|-------------|
| `new` | Found, waiting to apply | Applier picks it up |
| `applied` | Successfully submitted | Done |
| `needs_answer` | Blocked on unknown question | Telegram poller saves reply, flips to `new` |
| `failed` | Failed after max retries | Manual review |
| `already_applied` | Duplicate detected | Permanent skip |
| `filtered` | Below AI score threshold | Permanent skip |
| `duplicate` | Cross-track duplicate (lower score) | Permanent skip |
| `skipped_honeypot` | Honeypot question detected | Permanent skip |
| `skipped_recruiter_only` | LinkedIn recruiter-only | Permanent skip |
| `skipped_external_unsupported` | External ATS | Saved for future ATS support |
| `skipped_easy_apply_unsupported` | No Easy Apply button | Permanent skip |
| `skipped_no_apply` | No apply button found | Permanent skip |
| `no_modal` | Button found but modal didn't open | Retried |
| `stuck` | Modal progress stalled | Retried |
| `incomplete` | Modal didn't reach submit | Retried |
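The `duplicate` status implies that, when the same job is found on more than one track, only the higher-scoring entry stays eligible. A minimal sketch of that rule follows, under two loudly stated assumptions: duplicates are recognized by a shared `id`, and each job carries a numeric `score` field. Neither detail is confirmed by this spec, and `markCrossTrackDuplicates` is a hypothetical name.

```javascript
// Assumption: entries for the same posting share an `id`, and `score` is the
// AI filter score. Both are illustrative, not taken from lib/filter.mjs.
function markCrossTrackDuplicates(jobs) {
  const best = new Map(); // id -> highest-scoring job seen so far
  for (const job of jobs) {
    const current = best.get(job.id);
    if (!current || job.score > current.score) best.set(job.id, job);
  }
  // Winners are returned untouched; every other entry is marked `duplicate`.
  return jobs.map((job) =>
    best.get(job.id) === job ? job : { ...job, status: 'duplicate' }
  );
}
```

On a tie the entry seen first wins, which keeps the result deterministic for a given queue order.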
### `applications_log.json`
Append-only history of every application attempt with outcome, timestamps, and metadata.
### `telegram_offset.json`
Stores the Telegram `getUpdates` offset to avoid reprocessing old messages.
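For context, the Telegram Bot API expects the `offset` passed to `getUpdates` to be one greater than the highest `update_id` already received. A sketch of computing the value to persist is shown below; `nextOffset` is a hypothetical helper, and the on-disk layout of `telegram_offset.json` is not specified here.

```javascript
// Compute the offset to store after processing a batch of updates.
// `updates` is the `result` array returned by getUpdates.
function nextOffset(updates, currentOffset = 0) {
  if (updates.length === 0) return currentOffset; // nothing new: keep the old offset
  const highest = Math.max(...updates.map((u) => u.update_id));
  return highest + 1; // Telegram then treats everything up to `highest` as confirmed
}
```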
### `filter_state.json`
Persists batch IDs between filter submit and collect phases.
---
## Self-learning answer flow
1. Applier encounters a required field with no matching answer
2. Claude generates a suggested answer using profile + resume context
3. Telegram message sent: question text, options (if select), AI suggestion
4. Job marked `needs_answer` with `telegram_message_id` stored
5. User replies on Telegram: their answer, or "ACCEPT" for the AI suggestion
6. Telegram poller (every minute) picks up the reply:
   - Matches the reply to its job via `reply_to_message_id`
   - Saves answer to `answers.json` as pattern match
   - Flips job status back to `new`
   - Sends confirmation reply
7. Next applier run: reloads answers, retries the job, fills the field automatically
8. All future jobs with the same question pattern are answered automatically
Safety net: applier also calls `processTelegramReplies()` at start of each run.
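The reply-handling step of this flow might be sketched as follows. This is not the real `processTelegramReplies()`: `applyReply` is a hypothetical name, the `{ pattern, answer }` entry shape is an assumption about `answers.json`, and case-insensitive handling of "ACCEPT" is also assumed. In updates received from Telegram, the message being replied to appears as `reply_to_message.message_id`.

```javascript
// Hypothetical handler for one incoming Telegram message.
// `jobs` is the queue array, `answers` the in-memory answers.json array.
function applyReply(jobs, answers, message) {
  const repliedTo = message.reply_to_message?.message_id;
  const job = jobs.find(
    (j) => j.status === 'needs_answer' && j.telegram_message_id === repliedTo
  );
  if (!job) return null; // not a reply to a pending question

  const text = (message.text ?? '').trim();
  const answer = text.toUpperCase() === 'ACCEPT' ? job.ai_suggested_answer : text;
  if (!answer) return null; // empty reply, or ACCEPT with no stored suggestion

  answers.push({ pattern: job.pending_question, answer }); // assumed entry shape
  job.status = 'new'; // the next applier run retries the job
  return job;
}
```

Because the handler only acts on jobs still marked `needs_answer`, running it from both the poller and the applier does not save the same answer twice.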
---
@@ -199,6 +310,8 @@ When an application fails due to a transient error (timeout, network issue, page
3. After `max_retries` (default 2) failures, job is marked `failed` permanently
4. Failed jobs are logged to `applications_log.json` with error details
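One reading of the retry rule is sketched below: the job is marked `failed` once `retry_count` reaches `max_retries`, and returns to `new` otherwise. Whether the real applier counts attempts or retries is not stated in this section, so treat the comparison as an assumption. `recordFailure` is a hypothetical name.

```javascript
// Hypothetical bookkeeping after one failed attempt.
// Returns an updated copy; the input job is left untouched.
function recordFailure(job, maxRetries = 2) {
  const retryCount = job.retry_count + 1;
  // Assumption: reaching max_retries failures means "failed permanently".
  const status = retryCount >= maxRetries ? 'failed' : 'new';
  return { ...job, retry_count: retryCount, status };
}
```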
Browser crash recovery: after an error, the applier checks if the page is still alive via `page.evaluate(() => true)`. If dead, it creates a fresh browser session and continues with the remaining jobs.
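The liveness probe can be sketched as below. `page.evaluate()` is a real Playwright method that rejects when the page or browser has gone away; `ensureLivePage` and its `createSession` argument are hypothetical stand-ins for whatever `lib/browser.mjs` exposes.

```javascript
// Returns true if the page still responds; a dead page or browser makes
// evaluate() reject, which is treated as "not alive".
async function isPageAlive(page) {
  try {
    await page.evaluate(() => true);
    return true;
  } catch {
    return false;
  }
}

// Hypothetical recovery wrapper: reuse the page if it is alive, otherwise
// ask the caller-supplied factory for a fresh session.
async function ensureLivePage(page, createSession) {
  if (await isPageAlive(page)) return page;
  return createSession();
}
```

Calling the probe only after an error keeps the happy path free of extra round trips to the browser.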
---
## File structure
@@ -208,23 +321,43 @@ claw-apply/
├── README.md                 Documentation
├── SKILL.md                  OpenClaw skill manifest
├── SPEC.md                   This file
├── claw.json                 OpenClaw skill metadata
├── package.json              npm manifest
├── job_searcher.mjs          Search agent
├── job_filter.mjs            AI filter + scoring agent
├── job_applier.mjs           Apply agent
├── telegram_poller.mjs       Telegram answer reply processor
├── setup.mjs                 Setup wizard
├── status.mjs                Queue + run status report
├── lib/
│   ├── constants.mjs         Shared constants and defaults
│   ├── browser.mjs           Kernel/Playwright browser factory
│   ├── session.mjs           Kernel Managed Auth session refresh
│   ├── env.mjs               .env loader
│   ├── form_filler.mjs       Form filling with fuzzy select matching
│   ├── ai_answer.mjs         AI answer generation via Claude
│   ├── filter.mjs            AI job scoring via Anthropic Batch API
│   ├── keywords.mjs          AI-generated search keywords
│   ├── linkedin.mjs          LinkedIn search + job classification
│   ├── wellfound.mjs         Wellfound search + apply
│   ├── queue.mjs             Queue management with atomic writes
│   ├── lock.mjs              PID lockfile + graceful shutdown
│   ├── notify.mjs            Telegram Bot API (send, getUpdates, reply)
│   ├── telegram_answers.mjs  Telegram reply → answers.json processing
│   ├── search_progress.mjs   Per-platform search resume tracking
│   └── apply/
│       ├── index.mjs         Handler registry + status normalization
│       ├── easy_apply.mjs    LinkedIn Easy Apply (shadow DOM, multi-step)
│       ├── wellfound.mjs     Wellfound apply
│       ├── greenhouse.mjs    Greenhouse ATS (stub)
│       ├── lever.mjs         Lever ATS (stub)
│       ├── workday.mjs       Workday ATS (stub)
│       ├── ashby.mjs         Ashby ATS (stub)
│       └── jobvite.mjs       Jobvite ATS (stub)
├── config/
│   ├── *.example.json        Templates (committed)
│   └── *.json                User config (gitignored)
└── data/                     Runtime data (gitignored, auto-managed)
```
---
@@ -232,20 +365,22 @@ claw-apply/
## Roadmap
### v1 (current)
- [x] LinkedIn Easy Apply (multi-step modal, shadow DOM)
- [x] Wellfound apply (infinite scroll)
- [x] Kernel stealth browsers + residential proxy
- [x] AI job filtering via Anthropic Batch API
- [x] Self-learning answer bank with Telegram Q&A loop
- [x] AI-suggested answers via Claude
- [x] Telegram answer polling (instant save + applier safety net)
- [x] Browser crash recovery
- [x] Retry logic with configurable max retries
- [x] Preview mode (`--preview`)
- [x] Configurable application caps and retry limits
- [x] Constants extracted — no magic numbers in code
- [x] Atomic file writes for queue corruption prevention
- [x] Cross-track deduplication after AI scoring
### v2 (planned)
- [ ] External ATS support (Greenhouse, Lever, Workday, Ashby, Jobvite)
- [ ] Per-job cover letter generation via LLM
- [ ] Indeed support