Build a Healthcare Claim Appeals Agent.

This guide is not a one-off healthcare demo. It is the first vertical runbook for a larger pattern: guides assemble reusable Open Skills into a domain-specific workflow.

The working starter uses real public plan language and synthetic denial letters. The agent drafts and organizes; the human reviews, edits, and decides what to send.

Real plan docs: public sources. Synthetic claims: private-safe data. Human gate: no sending.

01 / Shared skeleton

The guide is a runbook shell.

Healthcare appeals should prove the larger pattern: guides assemble Open Skills primitives into a domain-specific workflow.

Start here before touching denial types. The reusable architecture is the product precedent: every future guide should improve the same primitive library instead of rebuilding the pipeline from scratch.

Name the enemy: fragmented context.

People lose high-friction paperwork fights because their information is scattered, unstructured, uncited, and incomplete. The fix is to own the context: collect the mess, normalize it, ground it in source documents, and produce the next human-reviewed action.

The healthcare version applies that pattern to a denied claim. The person does not need the agent to become an insurer, a lawyer, or a doctor. They need a clean case file that forces the appeal to respond to plan language and evidence.

You do

Bring the denial letter, plan docs, and supporting documents into the starter repo.

The AI does

Assemble a cited appeal packet from the reusable Open Skills chain.

<prompt>
  <task>Build a document-grounded case workflow from reusable Open Skills primitives.</task>
  <thesis>
    People lose because their information is scattered, unstructured, uncited, and incomplete.
    The workflow should help the person own their context, not outsource judgment to a black box.
  </thesis>
  <primitive_chain>
    <step>Ingest documents into markdown/text with raw source coordinates as anchors. Use PDF page/region, CSV line number, or form box identifiers, and embed the identical anchor scheme in the markdown that downstream citations will use. Keep one numbering scheme end to end.</step>
    <step>Chunk and tag source evidence by structure.</step>
    <step>Normalize the case facts into a ledger.</step>
    <step>Run the coverage gate. Every ingested document must produce at least one normalized record or be explicitly marked reference-only. Print the list of unconsumed documents and stop before drafting if any document is unaccounted for.</step>
    <step>Reconcile shared facts across sources before drafting. Compare the same fact anywhere it appears, turn every mismatch into a named review question, and record which source governs the tracked value.</step>
    <step>Store chunks, records, mappings, and outputs in SQLite by default.</step>
    <step>Optional: if you already run OB1, mirror the case store into Open Brain; otherwise skip this step entirely. SQLite is the complete beginner path.</step>
    <step>Retrieve relevant evidence deterministically before drafting.</step>
    <step>Validate citations before export. The citation guard returns pass / needs_review / fail verdicts. Any fail blocks packet export until fixed or converted to a named review question, and the guard verdict summary must appear in the packet README.</step>
    <step>Export an editable packet and stop at human review.</step>
  </primitive_chain>
  <constraint>The agent organizes and drafts. It does not sign, send, file, submit, authorize, or transmit sensitive data.</constraint>
</prompt>

Use the same primitive chain every time.

The runbook calls Open Skills in order: ingestion, chunking/tagging, normalization, the coverage gate, cross-source reconciliation, SQLite or Open Brain storage, deterministic retrieval, citation validation, packet export, and the human gate. Healthcare changes the labels and mappings, not the architecture.

Starter path: local files plus SQLite. Open Brain path: write the same chunks and normalized records into OB1 for people already running durable context.
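
A minimal sketch of the coverage gate on the SQLite starter path. The table names (documents, records), the reference_only column, and the demo file names are assumptions for illustration, not the starter's actual schema.

```python
import sqlite3


def build_demo_store() -> sqlite3.Connection:
    """Create an in-memory store with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (
            doc_id TEXT PRIMARY KEY,
            reference_only INTEGER NOT NULL DEFAULT 0
        );
        CREATE TABLE records (
            record_id INTEGER PRIMARY KEY,
            doc_id TEXT NOT NULL REFERENCES documents(doc_id)
        );
        INSERT INTO documents VALUES
            ('denial-letter.pdf', 0), ('eob.csv', 0), ('plan-eoc.pdf', 1);
        INSERT INTO records (doc_id) VALUES ('denial-letter.pdf');
        """
    )
    return conn


def unconsumed_documents(conn: sqlite3.Connection) -> list[str]:
    """Documents with no normalized record that are not marked reference-only."""
    rows = conn.execute(
        """
        SELECT d.doc_id
        FROM documents AS d
        LEFT JOIN records AS r ON r.doc_id = d.doc_id
        WHERE r.doc_id IS NULL AND d.reference_only = 0
        ORDER BY d.doc_id
        """
    ).fetchall()
    return [row[0] for row in rows]


def coverage_gate(conn: sqlite3.Connection) -> bool:
    """Print unaccounted-for documents; return False to stop before drafting."""
    missing = unconsumed_documents(conn)
    for doc_id in missing:
        print(f"UNCONSUMED: {doc_id}")
    return not missing
```

In the demo store, eob.csv has no record and is not reference-only, so coverage_gate returns False and drafting should not start.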

02 / Data strategy

Use real plan language and synthetic patient facts.

The demo is realistic where citation matters and synthetic where privacy matters.

Real where rules matter; fake where people matter.

Use public insurer plan documents, specifically the Summary of Benefits and Coverage (SBC) and Evidence of Coverage (EOC), for coverage language, exclusions, prior authorization rules, network rules, emergency services language, and appeal instructions. Use synthetic denial letters, patient identity, provider details, claim IDs, service dates, procedure details, and medical facts.

That split proves document-grounded drafting without touching real PHI.

Seed three denial cases.

Build synthetic cases for three denial types: administrative or coding error, missing prior authorization, and medical necessity. The first two can produce appeal arguments from plan language and claim facts. Medical necessity must stop at a doctor letter-of-medical-necessity template plus packet assembly; the agent does not invent clinical reasoning. These three seed cases leave the service-not-covered and out-of-network retrieval branches untested. Build those branches anyway, mark them untested, and smoke-test them when real documents arrive.

Chunk long plan documents in two tiers.

For long plan documents, keep page-level chunks so the whole document can be cited and audited, then add clause-level chunks for the sections named by the retrieval map. Store a granularity column with values such as page and clause so retrieval can choose the right surface and the citation guard can explain what it checked.
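
One way the two-tier chunk table could look in SQLite. The column names other than granularity, the anchor format, and the clause-first fallback order are assumptions made for this sketch.

```python
import sqlite3

# Hypothetical chunk schema; only the granularity column comes from the guide.
SCHEMA = """
CREATE TABLE chunks (
    chunk_id    INTEGER PRIMARY KEY,
    doc_id      TEXT NOT NULL,
    anchor      TEXT NOT NULL,
    granularity TEXT NOT NULL CHECK (granularity IN ('page', 'clause')),
    section     TEXT,
    text        TEXT NOT NULL
);
"""


def fetch_evidence(conn: sqlite3.Connection, section: str) -> list[tuple]:
    """Prefer clause-level chunks for a section; fall back to page-level."""
    for granularity in ("clause", "page"):
        rows = conn.execute(
            "SELECT anchor, granularity, text FROM chunks "
            "WHERE section = ? AND granularity = ? ORDER BY chunk_id",
            (section, granularity),
        ).fetchall()
        if rows:
            return rows
    return []
```

Returning the granularity with each row lets the citation guard report which surface it checked.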

Exclude the table of contents, cover pages, and front matter from evidence chunks. A cited chunk must contain operative plan language, not headings or dotted page listings. Smoke-test this by printing the text of every cited chunk and reading it before you trust the draft.
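
A rough heuristic for the front-matter filter, offered as a sketch rather than the starter's method. It assumes table-of-contents lines end in dot leaders plus a page number and that operative plan language contains at least one full sentence; the eight-word threshold is an arbitrary assumption. It does not replace reading the cited chunks yourself.

```python
import re

# Matches dotted page listings such as "Appeals ........ 42".
TOC_LINE = re.compile(r"\.{3,}\s*\d+\s*$")


def is_operative(text: str, min_words: int = 8) -> bool:
    """True if the chunk has at least one sentence-like span outside TOC lines."""
    kept = []
    for line in text.splitlines():
        line = line.strip()
        if line and not TOC_LINE.search(line):
            kept.append(line)
    # Rejoin wrapped lines, then split into spans ending in sentence punctuation.
    sentences = re.findall(r"[^.!?]+[.!?]", " ".join(kept))
    return any(len(sentence.split()) >= min_words for sentence in sentences)
```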

03 / Domain layer

Map denial type to plan sections.

The healthcare-specific intelligence lives in the denial-type retrieval map and output rules.

Retrieval starts from structure.

Administrative or coding error retrieves the appeals process and claim/EOB (Explanation of Benefits) line items. Service-not-covered retrieves covered benefits and excluded services. Out-of-network retrieves network and emergency services sections. Prior authorization retrieves prior authorization sections. Medical necessity retrieves covered benefits and triggers the doctor-template path.
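
The map above can be sketched as a plain dictionary. The key and section identifiers are hypothetical labels chosen for this example; the pairings come from the paragraph above.

```python
# Denial type -> plan sections to retrieve. Identifier spellings are assumptions.
RETRIEVAL_MAP = {
    "administrative_or_coding_error": ["appeals_process", "claim_eob_line_items"],
    "service_not_covered": ["covered_benefits", "excluded_services"],
    "out_of_network": ["network", "emergency_services"],
    "prior_authorization": ["prior_authorization"],
    "medical_necessity": ["covered_benefits"],
}

# Denial types that route to the doctor letter-of-medical-necessity template.
DOCTOR_TEMPLATE_TYPES = {"medical_necessity"}


def plan_retrieval(denial_type: str, available_sections: set[str]) -> dict:
    """Resolve targets for a denial type and flag any section that is missing."""
    targets = RETRIEVAL_MAP[denial_type]  # KeyError on an unmapped denial type
    return {
        "sections": [s for s in targets if s in available_sections],
        "missing": [s for s in targets if s not in available_sections],
        "doctor_template": denial_type in DOCTOR_TEMPLATE_TYPES,
    }
```

Keeping the map as data means retrieval stays deterministic, and a missing section is reported instead of silently skipped.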

Reconcile the denial letter against the EOB and plan.

For each claim number, compare denial-letter fields against EOB rows field by field: CPT code, service dates, amounts, and provider. Then compare the plan-section citations named in the denial letter against the real EOC table of contents. Any mismatch becomes a needs_review unresolved question carried into the packet, never a silent correction.

Name which source governs the tracked value so downstream stages still have one value to work with. For example, treat the letter the patient received as the governing source for the appeal narrative, while keeping the EOB disagreement visible as a review question. Without this step, a weak agent will see a denial letter with CPT 99214, miss that the EOB row says 99213, and argue the wrong code.
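
A small sketch of the field-by-field comparison, using the CPT 99214 versus 99213 example. The field names, the ReviewQuestion shape, and the default governing source are illustrative assumptions.

```python
from dataclasses import dataclass

# Fields compared between the denial letter and the EOB row (names assumed).
FIELDS = ("cpt_code", "service_date", "amount", "provider")


@dataclass
class ReviewQuestion:
    claim_id: str
    field: str
    letter_value: str
    eob_value: str
    governing_source: str
    status: str = "needs_review"


def reconcile(claim_id: str, letter: dict, eob: dict,
              governing_source: str = "denial_letter") -> list[ReviewQuestion]:
    """Turn every mismatch into a named review question; never auto-correct."""
    questions = []
    for field in FIELDS:
        if letter.get(field) != eob.get(field):
            questions.append(ReviewQuestion(
                claim_id, field,
                str(letter.get(field)), str(eob.get(field)),
                governing_source,
            ))
    return questions
```

Each question keeps both values and names the governing source, so the packet can carry the disagreement forward instead of hiding it.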

The output is an appeal packet, not a sent appeal.

The packet contains an editable appeal letter, the rendered appeal-letter PDF, citation map, deadline summary, supporting document checklist, unresolved questions, source manifest, and medical-necessity doctor template when needed. The deadline summary normalizes every appeal window to an absolute date by adding the stated days to the notice date, computes days remaining from the run date, flags anything under 14 days as URGENT at the top of the packet README, and orders cases by deadline.
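
The deadline arithmetic above can be sketched as follows. The function names and the dictionary layout are assumptions; the 14-day threshold and the notice-date-plus-days rule come from the paragraph above. This sketch also treats an already-passed deadline as urgent, which is a choice the guide does not specify.

```python
from datetime import date, timedelta

URGENT_THRESHOLD_DAYS = 14


def summarize_deadline(case_id: str, notice_date: date,
                       appeal_window_days: int, run_date: date) -> dict:
    """Normalize an appeal window to an absolute date and compute days remaining."""
    deadline = notice_date + timedelta(days=appeal_window_days)
    days_remaining = (deadline - run_date).days
    return {
        "case_id": case_id,
        "deadline": deadline,
        "days_remaining": days_remaining,
        # Negative values (deadline already passed) are flagged too.
        "urgent": days_remaining < URGENT_THRESHOLD_DAYS,
    }


def order_cases(summaries: list[dict]) -> list[dict]:
    """Order cases by deadline, earliest first."""
    return sorted(summaries, key=lambda s: s["deadline"])
```

For example, a notice dated 2026-01-01 with a 180-day window gives a deadline of 2026-06-30; run on 2026-06-20, that leaves 10 days and the case is flagged URGENT.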

The exported PDF is the rendered appeal letter itself: letter-formatted, with a sane page count, no browser artifacts, and identity fields taken from the normalized record rather than from string slicing. A weak agent without this check will turn a street address into the patient name or preserve a corrupted deadline like "y 27, 2026" because it copied text instead of validating fields.

Packet folder. One packet/ directory per case: README.md, appeal-letter.md, appeal-letter.pdf, citation-map.json, deadline-summary.md, checklist.md, unresolved-questions.md, and a sources/ manifest.

For medical-necessity cases, include a blank doctor letter-of-medical-necessity template as a required packet file. The appeal letter may quote the insurer's criteria and point to the template, but it never asserts that clinical criteria are met. Every coverage claim must cite a chunk. Anything ungrounded becomes a review question. The guard's verdict is a shipping gate: export refuses while any claim fails validation.
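
A sketch of a packet completeness check. The file names are taken from the packet folder listing; the doctor-template file name and the denial-type label are hypothetical, since the guide does not name them.

```python
from pathlib import Path

REQUIRED_FILES = [
    "README.md",
    "appeal-letter.md",
    "appeal-letter.pdf",
    "citation-map.json",
    "deadline-summary.md",
    "checklist.md",
    "unresolved-questions.md",
]

# Hypothetical file name for the blank letter-of-medical-necessity template.
DOCTOR_TEMPLATE = "doctor-letter-template.md"


def missing_packet_files(packet_dir: Path, denial_type: str) -> list[str]:
    """List required packet entries that are absent from the case directory."""
    required = list(REQUIRED_FILES)
    if denial_type == "medical_necessity":
        required.append(DOCTOR_TEMPLATE)
    missing = [name for name in required if not (packet_dir / name).is_file()]
    if not (packet_dir / "sources").is_dir():
        missing.append("sources/")
    return missing
```

An empty list means the folder is structurally complete; it says nothing about whether the contents are correct.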

04 / Sources and gates

Keep authority and agency visible.

The guide should show source authority, deadlines, and the human stop line instead of burying them.

Cite the process sources.

The guide should point readers to HealthCare.gov for internal appeals and external review, 45 CFR 147.136 for claim and appeal process requirements, KFF for ACA appeal/overturn framing, and the public insurer plan-document source used by the starter.

The human gate is a feature.

The workflow stops at review/edit/export. Sending an appeal means a person signs and transmits health data to an insurer. That is a product boundary, not a missing automation feature.

Save proof that the guard works.

The citation-guard test saves two reports as artifacts: one for the fully cited draft and one for the seeded fabricated-but-well-formed citation. The fully cited draft must pass with exit 0. The seeded fabricated citation must fail with nonzero exit, and the seeded sentence itself must appear as the failing item in the saved report.

A report that fails other claims does not count as proof. Fix the harness until the fabricated sentence is the reported failure, then keep both reports with the packet or test artifact bundle.
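
A toy harness for that proof, under a deliberately simple assumption: the guard here only checks that a cited chunk id exists in the store, whereas a real guard would also check that the chunk text supports the claim. The claim shape and function names are hypothetical.

```python
def guard(claims: list[dict], chunk_ids: set[str]) -> list[dict]:
    """Assign pass / needs_review / fail to each claim based on its citation."""
    report = []
    for claim in claims:
        cite = claim.get("chunk_id")
        if cite is None:
            verdict = "needs_review"  # uncited claim becomes a review question
        elif cite in chunk_ids:
            verdict = "pass"
        else:
            verdict = "fail"  # well-formed citation to a chunk that does not exist
        report.append({"text": claim["text"], "chunk_id": cite, "verdict": verdict})
    return report


def exit_code(report: list[dict]) -> int:
    """Nonzero when any claim fails validation."""
    return 1 if any(item["verdict"] == "fail" for item in report) else 0


def proves_boundary(clean_claims: list[dict], seeded_claim: dict,
                    chunk_ids: set[str]) -> bool:
    """True only if the clean draft passes and the seeded sentence alone fails."""
    clean_report = guard(clean_claims, chunk_ids)
    seeded_report = guard(clean_claims + [seeded_claim], chunk_ids)
    failing = [i["text"] for i in seeded_report if i["verdict"] == "fail"]
    return exit_code(clean_report) == 0 and failing == [seeded_claim["text"]]
```

Comparing the failing list to exactly the seeded sentence is what rules out a report that fails for unrelated claims.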

05 / Verification gates

Make done mean verified.

The workflow is not done when it drafts. It is done when every stage can show the evidence that keeps a weak agent from shipping confident garbage.

Run the per-stage prove-it checklist.

Ingestion. Does every source document appear in the index with an anchor? A missing document means the draft can ignore evidence without warning. Fix the ingestion manifest and rerun the coverage gate before normalization.

Chunking. Does a cited chunk contain operative plan language? A chunk that only shows headings, front matter, or dotted page listings cannot support a claim. Exclude that chunk from evidence and create a clause-level chunk from the real plan text.

Normalization. Does the patient name look like a name; do dates parse as real dates with days remaining computed; do amounts reconcile against line-item sums? A street address stored as the patient name, a corrupted date string, or mismatched totals means the record is not usable. Mark the field needs_review, keep the source anchor, and write the unresolved question.

Retrieval. Did every expected section arrive; are missing sections flagged? A branch that returns only generic appeal instructions cannot support a coverage argument. Add the missing retrieval-map target or mark the branch untested until real documents arrive.

Drafting plus guard. Does a clean draft pass and a seeded fabricated citation fail? A guard that passes both drafts or fails for unrelated claims is not proving the boundary. Save the two reports and fix the guard until the seeded sentence is the failing item.

Export. Does the PDF open as the rendered letter with a sane page count? A browser print artifact, blank file, or unexpected page count means the packet is not ready. Regenerate from the letter renderer and recheck identity fields from the normalized record.

Human gate. Does the packet stop at review with the checklist present? A packet that sends, files, or hides open questions crosses the product boundary. Stop at export, surface the checklist, and require a person to sign and transmit.

Refuse to ship invalid packets.

Normalization QA is not optional. Every extracted field carries a source anchor and confidence. Records with failed sanity checks become needs_review with a concrete unresolved question instead of keeping a default pending status.
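
A sketch of those sanity checks, assuming a record dictionary with hypothetical keys, amounts stored as integer cents, and deadlines written like "May 27, 2026" in an English locale. The name pattern is a loose heuristic, not a validator for every real name.

```python
import re
from datetime import datetime

# Two or more alphabetic tokens; rejects values that start with digits.
NAME_RE = re.compile(r"^[A-Za-z][A-Za-z'.\-]*(?: [A-Za-z][A-Za-z'.\-]*)+$")


def sanity_check(record: dict) -> tuple[str, list[str]]:
    """Return (status, unresolved questions) for one normalized record."""
    questions = []
    anchor = record["anchor"]

    if not NAME_RE.match(record["patient_name"]):
        questions.append(
            f"patient_name {record['patient_name']!r} does not look like a name "
            f"(source: {anchor})"
        )

    try:
        datetime.strptime(record["deadline"], "%B %d, %Y")
    except ValueError:
        questions.append(
            f"deadline {record['deadline']!r} is not a parseable date (source: {anchor})"
        )

    if sum(record["line_items_cents"]) != record["total_cents"]:
        questions.append(f"line items do not sum to the total (source: {anchor})")

    return ("needs_review" if questions else "ok", questions)
```

A street address in the name field or a truncated date such as "y 27, 2026" produces a concrete question tied to the source anchor instead of a silent default.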

Packet export refuses to ship, or stamps DRAFT-INVALID on the cover, while the guard reports any fail verdict. The packet README reproduces the guard's actual pass / needs_review / fail counts verbatim so the reviewer sees the same verdict the pipeline saw.
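
A minimal sketch of the shipping gate. The READY-FOR-REVIEW label and the README line format are assumptions; the DRAFT-INVALID stamp and the three verdict names come from the guide.

```python
from collections import Counter

VERDICTS = ("pass", "needs_review", "fail")


def guard_counts(verdicts: list[str]) -> dict[str, int]:
    """Count verdicts, always reporting all three categories."""
    counts = Counter(verdicts)
    return {name: counts.get(name, 0) for name in VERDICTS}


def export_status(verdicts: list[str]) -> tuple[str, str]:
    """Return the cover stamp and the README summary line for a packet."""
    counts = guard_counts(verdicts)
    stamp = "DRAFT-INVALID" if counts["fail"] > 0 else "READY-FOR-REVIEW"
    readme_line = (
        f"Guard verdicts: pass={counts['pass']} "
        f"needs_review={counts['needs_review']} fail={counts['fail']}"
    )
    return stamp, readme_line
```

Both values derive from the same counts, so the README cannot show a friendlier verdict than the one that gated export.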