Methodology.
Every claim score is an output of a small, auditable system: a persistent evidence corpus, a scraped-and-annotated post corpus, and a deterministic scoring function. This page shows where each piece comes from so reviewers can challenge any individual decision.
01 Ingestion
Posts are scraped directly from X using a persistent-profile Playwright
session (a personal account logged in once and reused; no X API).
Each scraped post records the original source_url and an
archive_url pointing at web.archive.org for that exact post.
Scraping a handle takes ~30–60 seconds and returns the latest 40 posts,
plus profile metadata.
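A minimal sketch of what one scraped-post record might look like. Only source_url and archive_url are named above; the other field names and the dataclass itself are illustrative assumptions, not the pipeline's actual schema.

```python
from dataclasses import dataclass

@dataclass
class ScrapedPost:
    # source_url / archive_url are the fields named in the text;
    # post_id and text are assumed for illustration.
    post_id: str
    text: str
    source_url: str   # the original X post URL
    archive_url: str  # web.archive.org snapshot of that exact post

post = ScrapedPost(
    post_id="1234567890",
    text="Example claim text",
    source_url="https://x.com/handle/status/1234567890",
    archive_url="https://web.archive.org/web/2024/https://x.com/handle/status/1234567890",
)
```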
02 Claim extraction
For each post we extract a small structured object: claim_family,
speaker_stance (supports / refutes), strength_language,
scope_language, citation_signal,
absolutes_present, anecdote_signal,
acknowledges_uncertainty, acknowledges_conflicting,
sells_matching_product.
In the current iteration these are annotated by hand (Opus 4.7 in-session)
so we can iterate fast on the algorithm. The production path is one
claude -p call per account with an Opus model, emitting the
same structured JSON. Both paths feed the same downstream scorer.
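The field names above imply a per-post object of roughly this shape; the values shown are illustrative examples, not real annotations.

```python
# One extracted claim object per post; keys are the fields listed above.
claim = {
    "claim_family": "seed-oils-cause-inflammation",  # example value
    "speaker_stance": "supports",                    # supports / refutes
    "strength_language": "strong",
    "scope_language": "universal",
    "citation_signal": False,
    "absolutes_present": True,
    "anecdote_signal": False,
    "acknowledges_uncertainty": False,
    "acknowledges_conflicting": False,
    "sells_matching_product": True,
}
```

Because both the hand-annotated path and the `claude -p` path emit this same shape, the downstream scorer does not need to know which one produced it.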
03 Evidence cards
The truth corpus is a single evidence/cards.json file. Each
card declares one normalized claim family with:
- evidence_level (H1–H5)
- direction (supports / contradicts / mixed / insufficient)
- certainty_modifiers (GRADE-style: risk_of_bias, inconsistency, indirectness, imprecision, publication_bias)
- scope (population, intervention, outcome, not_supported_for)
- safety_flag (low / low_to_moderate / moderate / high)
- commercial_sensitivity (generic / supplement-sold-frequently / procedure-sold / protocol-sold)
- sources (list of {type, url, label} citations)
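Putting those fields together, one card in evidence/cards.json might look like the following. The claim family, scope wording, and source entry are invented examples; only the field names and enumerated values come from the list above.

```python
# Illustrative shape of a single evidence card (values are examples).
card = {
    "claim_family": "statins-reduce-cv-events",
    "evidence_level": "H1",           # H1–H5
    "direction": "supports",          # supports / contradicts / mixed / insufficient
    "certainty_modifiers": {          # GRADE-style
        "risk_of_bias": "low",
        "inconsistency": "low",
        "indirectness": "low",
        "imprecision": "low",
        "publication_bias": "low",
    },
    "scope": {
        "population": "adults with elevated cardiovascular risk",
        "intervention": "statin therapy",
        "outcome": "major cardiovascular events",
        "not_supported_for": ["low-risk young adults"],
    },
    "safety_flag": "low",             # low / low_to_moderate / moderate / high
    "commercial_sensitivity": "generic",
    "sources": [
        {"type": "guideline",
         "url": "https://www.uspreventiveservicestaskforce.org/",
         "label": "USPSTF"},
    ],
}
```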
04 Source hierarchy
Sources are prioritized in this order when assigning a level:
- Major guidelines — USPSTF, ACC/AHA, ESC/EAS, WHO, FDA, CDC.
- Cochrane systematic reviews — structured certainty grading.
- Landmark RCTs — NEJM / Lancet / JAMA class; large, blinded.
- High-quality meta-analyses / reviews — independent, pre-registered where possible.
- Large prospective cohorts — Framingham, UK Biobank, PURE, NHS/HPFS, EPIC.
- Regulator advisories — FDA / FTC / EFSA / NIEHS / NIH ODS fact sheets.
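The ordering above can be expressed as a simple rank lookup. This helper and its source-type labels are a sketch of the idea, not code from the pipeline.

```python
# Priority order from the hierarchy above; lower rank = consulted first.
SOURCE_PRIORITY = [
    "major_guideline",     # USPSTF, ACC/AHA, ESC/EAS, WHO, FDA, CDC
    "cochrane_review",
    "landmark_rct",        # NEJM / Lancet / JAMA class
    "meta_analysis",
    "prospective_cohort",  # Framingham, UK Biobank, PURE, NHS/HPFS, EPIC
    "regulator_advisory",  # FDA / FTC / EFSA / NIEHS / NIH ODS
]

def source_rank(source_type: str) -> int:
    """Rank 0 is highest priority; unknown source types sort last."""
    try:
        return SOURCE_PRIORITY.index(source_type)
    except ValueError:
        return len(SOURCE_PRIORITY)
```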
05 Domain coverage (v1)
06 Scoring
See Algorithm for the per-claim 100-point
rubric (A–F components) and the account aggregation formula. The two hard
caps are documented there too. None of the scoring is hidden in a model
prompt — it is deterministic Python in pipeline/score.py and
pipeline/aggregate.py.
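To make "deterministic, not hidden in a prompt" concrete, here is a minimal sketch of the shape such a scorer takes: a sum of A–F component points clipped to 100, with a hard cap applied on one condition. The component weights, the cap threshold, and the cap condition shown are assumptions for illustration (the real rubric and both caps are specified on the Algorithm page).

```python
def score_claim(components: dict[str, float],
                evidence_level: str,
                sells_matching_product: bool) -> float:
    """Toy version of a per-claim score: A–F components on a 100-point scale."""
    total = sum(components[k] for k in "ABCDEF")
    # Illustrative hard cap: an H1-contradicting claim paired with a matching
    # product for sale cannot score above a fixed ceiling (20 here is made up).
    if evidence_level == "H1" and sells_matching_product:
        total = min(total, 20.0)
    return max(0.0, min(100.0, total))
```

The point is auditability: given the same annotation and the same evidence card, the function always returns the same number.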
07 How to challenge a score
- Open the account's Analyzer page and find the claim whose score you dispute.
- Click View on X to confirm the post text we analyzed.
- Click Archive snapshot to confirm the post existed and that we did not mistranscribe its text.
- Look at the card's Sources list. If we mis-graded the evidence, cite a stronger source.
- Look at the per-component breakdown (A–F). If you disagree with scoping/strength/uncertainty annotation, cite the exact wording in the post.
Every score is a linear combination of transparently extracted features
against a published evidence card. If an input is wrong, the fix is
visible in the diff to cards.json or the post annotation.
08 Known limitations (v1)
- Post extraction scoped to the last 40 tweets per account; long-tail history not considered.
- Reposts detected but not scored (endorsement modeling is future work).
- Long threads collapsed into single-tweet scope.
- Media (images, video) is ignored — many substantive claims live in podcast clips and threads rather than single tweets.
- The commercial-overlap map is maintained manually per account. At scale this should derive from bio / linktree scrapes.
- "Anti-establishment rhetoric" is not a separate detector; the H1-with-selling cap partially approximates it.