Skip to main content

Facility Pipeline — design spec (revised)

Canonical design after review. For the decision record see ADR-0040; for the build steps see 03-implementation-plan.md.

1. Goal & shape

Turn the 576 NY prospect facilities into a worked, prioritized BD pipeline with a daily worklist, a defensible priority order, and an automated follow-up rhythm — by adding three layers to the existing CRM (ADR-0033) rather than a parallel system. The feature lives on the existing /admin/crm surface (Business-development nav group), adding tabs and reusing the Kanban, facility detail, crm_leads, crm_activities, and outbound_emails.

The thesis: a SNF posting for an MDS Coordinator / PDPM-RAC nurse / chart reviewer / QM clerk is a time-bound buying signal for exactly what Nightingale sells. Combined with public CMS quality data and the NY Medicaid case-mix transition (MDS accuracy starts driving Medicaid rates ~fall 2026), it lets us score and tier every facility and drive a research-backed cadence.

2. Current capabilities reused (don't rebuild)

facilities (581 rows; 576 NY prospect, 5 active) already carries CMS Five-Star + prospect fields (0023) and next_follow_up_date/last_contacted_date (0031). crm_leads/crm_activities (0032) are the Kanban + activity log (empty — the UI is the missing piece). facility_contacts (559), facility_notes, outbound_emails (0035), the app.* RLS/audit helpers (0001/0003), pg_cron (0019 → /api/cron/*, ADR-0024), the AI orchestrator (src/server/ai/, ADR-0026), and the scripts/import-prospects.mjs ingestion pattern all exist.

3. Layer 1 — Job-signal ingestion

scheduled fetch → job_raw (append-only) → normalize (role + urgency) → entity-resolve to facility → upsert job_signals → raise/refresh lead + recompute score → enqueue follow-up.

  • Sources (compliant only): CareerOneStop free API (start here), schema.org JobPosting on robots-allowed SNF chain career pages, optional licensed vendor APIs (Techmap/Coresignal), manual AAPACN CSV. Never Indeed/LinkedIn scraping. Store the existence of a posting, not candidate PII.
  • Normalization (src/lib/pipeline/job-signal.ts, done): classifyRoleFamily, scoreUrgency (0-4). Entity resolution (src/lib/pipeline/match-facility.ts, done): CCN exact match → blended token+trigram name similarity within state, +0.08 same-city; no confident match → facility_id NULL and the Signal Inbox manual-match queue (never dropped). facility_aliases (management-co/DBA → facility) shrinks the queue over time.
  • Semantic dedup: a role re-posted weekly refreshes the existing signal's last_seen_at (looked up by facility+role within 7 days in the ingest action), not a new row. 30-day expiry; extend to 45 days for roles re-seen within 14 days (chronic gap → +1 urgency).
  • Manual-match SLA: auto-dismiss after 3 days unresolved; seed facility_aliases from the first weeks of matches; the CCN backfill makes most matches exact.

4. Layer 2 — Scoring & prioritization

facilities.pipeline_score (0-100) + pipeline_tier (1-3), computed by the pure computeFacilityScore (src/lib/pipeline/score.ts, done) on ingest and on quarterly CMS refresh, written under an optimistic guard (score_updated_at). Revised model (review fix — see 02-review-synthesis.md):

ComponentSourceWeightNotes
Open MDS-family postingjob_signals0.36role-weighted (MDS 1.0 → QM 0.5) × urgency lift
QM star 1-2cms_qm_rating0.20MDS-driven
Recent F641 / CMPlast_f641 / last_f_tag_date0.18<180d = 5pts, 180-365 = 2pts
DON/NHA turnover <90ddon_name change / contact log0.12incoming leader audits
Staffing star 1-2cms_staffing_rating0.08reduced (partly subsumed by QM)
MDS-volume anomalyCMS MDS Frequency (optional)0.06tie-breaker
CMS overall star0removed — composite of QM+staffing (no double-count)
NY Medicaid censusmedicaid_pct×≤1.08multiplier, not a base factor

Tiers are calibrated from the cohort distribution after the first full run (target ~15 Tier-1, ~60 Tier-2), not fixed thresholds. The Facility 360 "why this score" panel renders the per-factor breakdown the function returns.

5. Layer 3 — Follow-up automation

follow_up_sequences (templates) → follow_up_steps (Day/channel/persona/template) → follow_up_enrollments (a facility XOR lead) → follow_up_tasks (the due touch). Seeded "NY SNF 21-Day" (9 touches: Day 1/2/4/7/9/11/14/16/21, phone-first, multi-threaded). A daily cron materializes due tasks idempotently (unique (enrollment_id, step_no)); email steps are drafted via the AI orchestrator into outbound_emails (status skipped = draft) and require a human send. LinkedIn steps are manual-only, never auto-actioned. "Today's Follow-ups" = follow_up_tasks due today, filtered to the logged-in rep's owner_id.

Capacity throttle (review fix): per-rep Tier-1 cap (≈8/rep), ≤2–3 new Tier-1 enrollments/rep/week, and a condensed 3-touch monthly Tier-2 nurture (email D1, call D15, email D30) — not the full 9-step sequence — so a 2-rep team stays under ~15 touches/day. Chain/management-group facilities are one enrollment at the Regional level, not N per facility.

6. The BD operating procedure (the flow)

  1. Signal arrives → matched to a facility (or manual queue).
  2. Active-client guard: if facilities.status='active', route to an upsell lead (lead_source='upsell', assigned to the account owner), not a prospect lead. Existing clients are expansion/retention, never new-logo prospecting.
  3. Prospect path: raise/refresh a crm_lead; recompute score/tier. Contact-enrichment gate: block Tier-1 sequence enrollment until ≥1 Administrator/DON contact with a direct phone exists (seed Administrator names from the CMS backfill).
  4. Non-signal path: a Prospect-Discovery view surfaces high-score facilities with no open posting (CMS-only signals) so they're not invisible.
  5. Work the cadence → log every touch to crm_activities/facility_notes; stamp last_contacted_date + next_follow_up_date.
  6. Won → require the intake form (facility_intakes) by the Qualified stage; on win, flip facilities.status to active, link converted_facility_id, and stop the prospect enrollment — atomically (reuse src/lib/crm/pipeline.ts).
  7. Lost → record lost_reason; a later signal on a lost lead prompts re-open (not a duplicate), optionally starting mid-sequence.

7. Measurement (KPIs & dashboards)

  • Funnel: signal→matched (≥80%), signal→lead (≥60%), lead→contacted (≤7d), contacted→qualified, qualified→proposal, proposal→won; signal→won (lagging).
  • Activity: touch completion rate (≥85% same-day), sequence completion, touches-to-first-meeting, email draft→sent rate, manual-match queue age (<24h).
  • Pipeline quality: time-in-stage, win rate by tier/source/role_family, pipeline velocity, lost- reason distribution.
  • Account health (5 clients): upsell signals surfaced vs. converted, days-since-last-contact.
  • Dashboards: (1) Pipeline Health (today's worklist), (2) Conversion Funnel (trailing 90d, by rep & source), (3) Score Distribution (histogram with tier bands; flags tier crossings on CMS refresh).

8. Phasing

Pure core (lib + tests) — done. Then: schema (0053) + type regen + widen contact-role schema → ingestion (script + cron + actions) → scoring + Pipeline UI → follow-up engine + Sequences UI → ADR + CI green. Detailed manifest and order in 03-implementation-plan.md.