Facility Pipeline — design spec (revised)
Canonical design after review. For the decision record see ADR-0040; for the build steps see 03-implementation-plan.md.
1. Goal & shape
Turn the 576 NY prospect facilities into a worked, prioritized BD pipeline with a daily worklist,
a defensible priority order, and an automated follow-up rhythm — by adding three layers to the
existing CRM (ADR-0033) rather than a parallel system. The feature lives on the existing
/admin/crm surface (Business-development nav group), adding tabs and reusing the Kanban,
facility detail, crm_leads, crm_activities, and outbound_emails.
The thesis: a SNF posting for an MDS Coordinator / PDPM-RAC nurse / chart reviewer / QM clerk is a time-bound buying signal for exactly what Nightingale sells. Combined with public CMS quality data and the NY Medicaid case-mix transition (MDS accuracy starts driving Medicaid rates ~fall 2026), it lets us score and tier every facility and drive a research-backed cadence.
2. Current capabilities reused (don't rebuild)
facilities (581 rows; 576 NY prospect, 5 active) already carries CMS Five-Star + prospect fields
(0023) and next_follow_up_date/last_contacted_date (0031). crm_leads/crm_activities (0032)
are the Kanban + activity log (empty — the UI is the missing piece). facility_contacts (559),
facility_notes, outbound_emails (0035), the app.* RLS/audit helpers (0001/0003), pg_cron
(0019 → /api/cron/*, ADR-0024), the AI orchestrator (src/server/ai/, ADR-0026), and the
scripts/import-prospects.mjs ingestion pattern all exist.
3. Layer 1 — Job-signal ingestion
scheduled fetch → job_raw (append-only) → normalize (role + urgency) → entity-resolve to facility → upsert job_signals → raise/refresh lead + recompute score → enqueue follow-up.
- Sources (compliant only): CareerOneStop free API (start here), schema.org
JobPostingon robots-allowed SNF chain career pages, optional licensed vendor APIs (Techmap/Coresignal), manual AAPACN CSV. Never Indeed/LinkedIn scraping. Store the existence of a posting, not candidate PII. - Normalization (
src/lib/pipeline/job-signal.ts, done):classifyRoleFamily,scoreUrgency(0-4). Entity resolution (src/lib/pipeline/match-facility.ts, done): CCN exact match → blended token+trigram name similarity within state, +0.08 same-city; no confident match →facility_id NULLand the Signal Inbox manual-match queue (never dropped).facility_aliases(management-co/DBA → facility) shrinks the queue over time. - Semantic dedup: a role re-posted weekly refreshes the existing signal's
last_seen_at(looked up by facility+role within 7 days in the ingest action), not a new row. 30-day expiry; extend to 45 days for roles re-seen within 14 days (chronic gap → +1 urgency). - Manual-match SLA: auto-dismiss after 3 days unresolved; seed
facility_aliasesfrom the first weeks of matches; the CCN backfill makes most matches exact.
4. Layer 2 — Scoring & prioritization
facilities.pipeline_score (0-100) + pipeline_tier (1-3), computed by the pure
computeFacilityScore (src/lib/pipeline/score.ts, done) on ingest and on quarterly CMS refresh,
written under an optimistic guard (score_updated_at). Revised model (review fix — see
02-review-synthesis.md):
| Component | Source | Weight | Notes |
|---|---|---|---|
| Open MDS-family posting | job_signals | 0.36 | role-weighted (MDS 1.0 → QM 0.5) × urgency lift |
| QM star 1-2 | cms_qm_rating | 0.20 | MDS-driven |
| Recent F641 / CMP | last_f641 / last_f_tag_date | 0.18 | <180d = 5pts, 180-365 = 2pts |
| DON/NHA turnover <90d | don_name change / contact log | 0.12 | incoming leader audits |
| Staffing star 1-2 | cms_staffing_rating | 0.08 | reduced (partly subsumed by QM) |
| MDS-volume anomaly | CMS MDS Frequency (optional) | 0.06 | tie-breaker |
| CMS overall star | — | 0 | removed — composite of QM+staffing (no double-count) |
| NY Medicaid census | medicaid_pct | ×≤1.08 | multiplier, not a base factor |
Tiers are calibrated from the cohort distribution after the first full run (target ~15 Tier-1, ~60 Tier-2), not fixed thresholds. The Facility 360 "why this score" panel renders the per-factor breakdown the function returns.
5. Layer 3 — Follow-up automation
follow_up_sequences (templates) → follow_up_steps (Day/channel/persona/template) →
follow_up_enrollments (a facility XOR lead) → follow_up_tasks (the due touch). Seeded "NY SNF
21-Day" (9 touches: Day 1/2/4/7/9/11/14/16/21, phone-first, multi-threaded). A daily cron
materializes due tasks idempotently (unique (enrollment_id, step_no)); email steps are drafted
via the AI orchestrator into outbound_emails (status skipped = draft) and require a human send.
LinkedIn steps are manual-only, never auto-actioned. "Today's Follow-ups" = follow_up_tasks due
today, filtered to the logged-in rep's owner_id.
Capacity throttle (review fix): per-rep Tier-1 cap (≈8/rep), ≤2–3 new Tier-1 enrollments/rep/week, and a condensed 3-touch monthly Tier-2 nurture (email D1, call D15, email D30) — not the full 9-step sequence — so a 2-rep team stays under ~15 touches/day. Chain/management-group facilities are one enrollment at the Regional level, not N per facility.
6. The BD operating procedure (the flow)
- Signal arrives → matched to a facility (or manual queue).
- Active-client guard: if
facilities.status='active', route to an upsell lead (lead_source='upsell', assigned to the account owner), not a prospect lead. Existing clients are expansion/retention, never new-logo prospecting. - Prospect path: raise/refresh a
crm_lead; recompute score/tier. Contact-enrichment gate: block Tier-1 sequence enrollment until ≥1 Administrator/DON contact with a direct phone exists (seed Administrator names from the CMS backfill). - Non-signal path: a Prospect-Discovery view surfaces high-score facilities with no open posting (CMS-only signals) so they're not invisible.
- Work the cadence → log every touch to
crm_activities/facility_notes; stamplast_contacted_date+next_follow_up_date. - Won → require the intake form (
facility_intakes) by the Qualified stage; on win, flipfacilities.statusto active, linkconverted_facility_id, and stop the prospect enrollment — atomically (reusesrc/lib/crm/pipeline.ts). - Lost → record
lost_reason; a later signal on alostlead prompts re-open (not a duplicate), optionally starting mid-sequence.
7. Measurement (KPIs & dashboards)
- Funnel: signal→matched (≥80%), signal→lead (≥60%), lead→contacted (≤7d), contacted→qualified, qualified→proposal, proposal→won; signal→won (lagging).
- Activity: touch completion rate (≥85% same-day), sequence completion, touches-to-first-meeting, email draft→sent rate, manual-match queue age (<24h).
- Pipeline quality: time-in-stage, win rate by tier/source/
role_family, pipeline velocity, lost- reason distribution. - Account health (5 clients): upsell signals surfaced vs. converted, days-since-last-contact.
- Dashboards: (1) Pipeline Health (today's worklist), (2) Conversion Funnel (trailing 90d, by rep & source), (3) Score Distribution (histogram with tier bands; flags tier crossings on CMS refresh).
8. Phasing
Pure core (lib + tests) — done. Then: schema (0053) + type regen + widen contact-role schema →
ingestion (script + cron + actions) → scoring + Pipeline UI → follow-up engine + Sequences UI →
ADR + CI green. Detailed manifest and order in 03-implementation-plan.md.